From hadi@cyberus.ca Wed Dec 1 05:23:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 05:23:19 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1DNB1C021461 for ; Wed, 1 Dec 2004 05:23:12 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CZURT-0003Du-Es for netdev@oss.sgi.com; Wed, 01 Dec 2004 08:22:47 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CZURP-0008Vw-8l; Wed, 01 Dec 2004 08:22:43 -0500 Subject: Re: [E1000-devel] Transmission limit From: jamal Reply-To: hadi@cyberus.ca To: Lennert Buytenhek Cc: Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , 
Giulio Galante , netdev@oss.sgi.com In-Reply-To: <20041201001107.GE4203@xi.wantstofly.org> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1101902900.1041.9.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 01 Dec 2004 07:08:20 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12363 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-11-30 at 19:11, Lennert Buytenhek wrote: > On Tue, Nov 30, 2004 at 09:25:54AM -0500, jamal wrote: > > > > > > Also from what I understand new HW and MSI can help in the case where > > > > > pass objects between CPU. Did I dream or did someone tell me that S2IO > > > > > could have several TX ring that could via MSI be routed to proper cpu? > > > > > > > > I am wondering if the per CPU tx/rx irqs are valuable at all. They sound > > > > like more hell to maintain. > > > > > > On the TX path you'd have qdiscs to deal with as well, no? > > > > I think management of it would be non-trivial in SMP. Youd have to start > > playing stupid loadbalancing tricks which would reduce the value of > > existence of tx irqs to begin with. > > You mean the management of qdiscs would be non-trivial? I mean it is useful in only the most ideal cases and if you want to actually do something useful in most cases with it you will have to muck around. 
Take the case of forwarding (maybe with a little or almost no localhost generated traffic) - then you end up allocating in CPUA, processing and queueing on egress. Tx softirq, which is what stashes the packet on tx DMA eventually, is not guaranteed to run on the same CPU. Now add a little latency between ingress and egress .. The ideal case is where you end up processing to completion from ingress to egress (which is known to happen in Linux when there's no congestion). > Probably the idea of these kinds of tricks is to skip the qdisc step > altogether. > Which is preached by the BSD folks - bogus in my opinion. If you want to do something as bland/boring as that you can probably afford a $500 DLINK router which can do it at wire rate (with the cost of you being locked into whatever features they have). cheers, jamal From hadi@cyberus.ca Wed Dec 1 05:40:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 05:40:46 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1Debev022056 for ; Wed, 1 Dec 2004 05:40:41 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CZUiL-0006Ct-BZ for netdev@oss.sgi.com; Wed, 01 Dec 2004 08:40:13 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CZUiH-000245-JX; Wed, 01 Dec 2004 08:40:09 -0500 Subject: Re: [E1000-devel] Transmission limit From: jamal Reply-To: hadi@cyberus.ca To: mellia@prezzemolo.polito.it Cc: Lennert Buytenhek , Harald Welte , P@draigBrady.com, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <1101804146.11111.23.camel@mellia.lipar.polito.it> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <1101483081.24742.174.camel@mellia.lipar.polito.it> 
<20041127092503.GA12592@sunbeam.de.gnumonks.org> <1101718412.14930.46.camel@verza.polito.it> <20041129145028.GC18788@xi.wantstofly.org> <1101804146.11111.23.camel@mellia.lipar.polito.it> Content-Type: text/plain Organization: jamalopolous Message-Id: <1101903944.1042.29.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 01 Dec 2004 07:25:44 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12364 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-11-30 at 03:42, Marco Mellia wrote: > On Mon, 2004-11-29 at 15:50, Lennert Buytenhek wrote: > > On Mon, Nov 29, 2004 at 09:53:33AM +0100, Marco Mellia wrote: > > > > > Th's our intuition too. > > > Notice that we get the same results with 3com (broadcom based) gigabit > > > cards. > > > We are thinking of sending packet in "bursts" instead of single > > > transfers. The only problem is to let the NIC know that there are more > > > than a packet in a burst... > > > > Jamal implemented exactly this for e1000 already, he might be persuaded > > into posting his patch here. Jamal? :) > > I guess that saying that we are _very_ interested in this might help. > :-) > We can offer as "beta-testers" as well... Sorry, I missed this (I wasn't CCed, so it went to a low-priority queue which I read on a best-effort basis). Let me clean up the patches a little bit this weekend. The patch is at least 4 months old; the latest reincarnation was due to issue1 on my SUCON presentation. Would a patch against the latest 2.6.x bitkeeper (whatever it is this weekend) be fine? If you are in a rush and don't mind a little ugliness then I will pass them as is. BTW, Scott posted an interesting patch yesterday, you may wanna give that a shot as well. 
cheers, jamal From buytenh@wantstofly.org Wed Dec 1 07:24:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 07:25:08 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1FOtOe011798 for ; Wed, 1 Dec 2004 07:24:56 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 09F592B0ED; Wed, 1 Dec 2004 16:24:33 +0100 (MET) Date: Wed, 1 Dec 2004 16:24:33 +0100 From: Lennert Buytenhek To: jamal Cc: Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit Message-ID: <20041201152433.GA12558@xi.wantstofly.org> References: <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101902900.1041.9.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1101902900.1041.9.camel@jzny.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 12365 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Wed, Dec 01, 2004 at 07:08:20AM -0500, jamal wrote: [ per-CPU TX/RX rings ] > > You mean the management of qdiscs would be non-trivial? > > I mean it is useful in only the most ideal cases and if you want to > actually do something useful in most cases with it you will have to > muck around. 
> Take the case of forwarding (maybe with a little or almost no localhost > generated traffic) - then you end allocating in CPUA, processing and > queueing on egress. Tx softirq, which is what stashes the packet on tx > DMA eventually, is not guaranteed to run on the same CPU. Now add a > little latency between ingress and egress .. > The ideal case is where you end up processing to completion from ingress > to egress (which is known to happen in Linux when theres no congestion). We disagreed on this topic at SUCON and I'm afraid we'll be disagreeing on it forever :) IMHO, on 10GbE any kind of qdisc is a waste of cycles. I don't think it's very likely that you'll be using that single 10GbE NIC for forwarding packets, doing that with a PC at this point in the history of PCs is just silly. If you do use it for forwarding, how likely is it that you'll be able to process an incoming burst of packets fast enough to require queueing on the egress interface? You have to be able to send a burst of packets bigger than the NIC's TX FIFO at >10GbE in the first place for queueing to be effective/useful at all. (Leaving the question of whether or not there'll be some room in the TX FIFO at TX time unanswered, what you're doing with per-CPU TX rings is basically just simulating the "N individual NICs each bound to its own CPU" case with a single NIC.) > > Probably the idea of these kinds of tricks is to skip the qdisc step > > altogether. > > Which is preached by the BSD folks - bogus in my opinion. If you want to > do something as bland/boring as that you can probably afford a $500 > DLINK router which can do it at wire rate with (with cost you being > locked in whatever features they have). That's an unfair comparison. Just because I don't need CBQ doesn't mean my $500 DLINK router does everything I'd want it to -- advanced firewalling is one thing that comes to mind. Last time I looked I couldn't load my own kernel modules on my DLINK router either. 
--L From Robert.Olsson@data.slu.se Wed Dec 1 07:34:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 07:34:58 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1FYmVv012294 for ; Wed, 1 Dec 2004 07:34:53 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB1FYCKO029728; Wed, 1 Dec 2004 16:34:12 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 5C1C6EC001; Wed, 1 Dec 2004 16:34:12 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16813.58484.343629.570703@robur.slu.se> Date: Wed, 1 Dec 2004 16:34:12 +0100 To: sfeldma@pobox.com Cc: Lennert Buytenhek , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit In-Reply-To: <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> X-Mailer: VM 7.18 under Emacs 21.3.1 X-archive-position: 12366 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Scott Feldman writes: > Hey, turns out, I know some e1000 tricks that might help get the kpps > numbers up. 
> > My problem is I only have a P4 desktop system with a 82544 nic running > at PCI 32/33Mhz, so I can't play with the big boys. But, attached is a > rework of the Tx path to eliminate 1) Tx interrupts, and 2) Tx > descriptor write-backs. For me, I see a nice jump in kpps, but I'd like > others to try with their setups. We should be able to get to wire speed > with 60-byte packets. > > System: Intel 865 (HT 2.6Ghz) > Nic: 82544 PCI 32-bit/33Mhz > Driver: linux-2.6.9 e1000 (5.3.19-k2-NAPI), no Interrupt Delays > 4096 descs > pkt_size = 60: 541618pps 277Mb/sec errors: 914 Hello! Nice, but I see no improvement w. 82546GB @ 133 MHz on 1.6 GHz Opteron, it seems. SMP kernel linux-2.6.9-rc2 Vanilla. 801077pps 410Mb/sec (410151424bps) errors: 95596 Patch TXD=4096 608690pps 311Mb/sec (311649280bps) errors: 0 Patch TXD=2048 624103pps 319Mb/sec (319540736bps) errors: 0 Patch TXD=1024 551289pps 282Mb/sec (282259968bps) errors: 4506 Error count is a bit confusing... --ro From sfeldma@pobox.com Wed Dec 1 08:47:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 08:47:45 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1GlafG018005 for ; Wed, 1 Dec 2004 08:47:41 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id EAD952F9553; Wed, 1 Dec 2004 11:47:14 -0500 (EST) Received: from [172.20.3.21] (66.239.228.194.ptr.us.xo.net [66.239.228.194]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id D00302F84CC; Wed, 1 Dec 2004 11:47:04 -0500 (EST) Subject: Re: [E1000-devel] Transmission limit From: Scott Feldman Reply-To: sfeldma@pobox.com To: Robert Olsson Cc: Lennert Buytenhek , jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <16813.58484.343629.570703@robur.slu.se> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> 
<41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <16813.58484.343629.570703@robur.slu.se> Content-Type: text/plain Message-Id: <1101919791.5198.15.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Wed, 01 Dec 2004 08:49:51 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12367 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Wed, 2004-12-01 at 07:34, Robert Olsson wrote: > Nice but I no improvements w. 82546GB @ 133 MHz on 1.6 GHz Opteron it seems. > SMP kernel linux-2.6.9-rc2 > > Vanilla. > 801077pps 410Mb/sec (410151424bps) errors: 95596 > > Patch TXD=4096 > 608690pps 311Mb/sec (311649280bps) errors: 0 Thank you Robert for trying it out. Well those results are counter-intuitive! We remove Tx interrupts and Tx descriptor DMA write-backs and get no re-tries, and performance drops? The only bus activities left are the DMA of buffers to device and the register writes to increment tail. I'm stumped. I'll need to get my hands on a faster system. Maybe there is a bus analyzer under the tree. 
:-) -scott From Robert.Olsson@data.slu.se Wed Dec 1 08:48:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 08:48:19 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1Gm7iB018063 for ; Wed, 1 Dec 2004 08:48:14 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB1GlOKO024025; Wed, 1 Dec 2004 17:47:24 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 34CC8EC001; Wed, 1 Dec 2004 17:47:24 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16813.62876.180869.404122@robur.slu.se> Date: Wed, 1 Dec 2004 17:47:24 +0100 To: "David S. Miller" Cc: Robert Olsson , hadi@cyberus.ca, P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, jorge.finochietto@polito.it, galante@polito.it, netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit In-Reply-To: <20041129121604.47eb6593.davem@davemloft.net> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <20041129121604.47eb6593.davem@davemloft.net> X-Mailer: VM 7.18 under Emacs 21.3.1 X-archive-position: 12368 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev David S. Miller writes: > > Did I dream or did someone tell me that S2IO > > could have several TX ring that could via MSI be routed to proper cpu? 
> > One of Sun's gigabit chips can do this too, except it isn't > via MSI, the driver has to read the descriptor to figure out > which cpu gets the software interrupt to process the packet. > > SGI had hardware which allowed you to do this kind of stuff too. > > Obviously the MSI version works much better. > > It is important, the cpu selection process. First of all, it must > be calculated such that flows always go through the same cpu. > Otherwise TCP sockets bounce between the cpus for a streaming > transfer. > > And even this doesn't avoid all such problems, TCP LISTEN state > sockets will still thrash between the cpus with such a "pick > a cpu based upon" flow scheme. > > Anyways, just some thoughts. Thanks for the info. Well, we'll be forced to get into those problems when the HW is capable. I guess it will be w. the 10 GIGE cards. --ro From wensong@linux-vs.org Wed Dec 1 08:49:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 08:50:02 -0800 (PST) Received: from lb1.ctrip.com ([218.244.111.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1Gnqeo018487 for ; Wed, 1 Dec 2004 08:49:53 -0800 Received: from penguin.linux-vs.org ([61.149.145.226]) by lb1.ctrip.com (8.12.10/8.12.10) with ESMTP id iB1GmjMh028406 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 2 Dec 2004 00:48:49 +0800 Received: from penguin.linux-vs.org (localhost.localdomain [127.0.0.1]) by penguin.linux-vs.org (8.12.8/8.12.8) with ESMTP id iB1GmZ9A001083; Thu, 2 Dec 2004 00:48:36 +0800 Received: from localhost (wensong@localhost) by penguin.linux-vs.org (8.12.8/8.12.8/Submit) with ESMTP id iB1GmRZa001079; Thu, 2 Dec 2004 00:48:31 +0800 X-Authentication-Warning: penguin.linux-vs.org: wensong owned process doing -bs Date: Thu, 2 Dec 2004 00:48:26 +0800 (CST) From: Wensong Zhang To: "David S. 
Miller" , netdev@oss.sgi.com Subject: [PATCH] [IPVS] add a sysctl variable to expire quiescent template Message-ID: MIME-Version: 1.0 Content-Type: MULTIPART/Mixed; BOUNDARY="-1463811584-572891765-1097000265=:2451" Content-ID: X-archive-position: 12369 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wensong@linux-vs.org Precedence: bulk X-list: netdev This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. ---1463811584-572891765-1097000265=:2451 Content-Type: TEXT/PLAIN; CHARSET=US-ASCII; format=flowed Content-ID: Hi Dave, Here is the patch from Horms to add a sysctl variable to expire quiescent templat. Please check and apply them to kernel 2.4 and 2.6 respectively. Thanks, Wensong ---1463811584-572891765-1097000265=:2451 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="linux-2.4-ipvs-quiescent_template.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="linux-2.4-ipvs-quiescent_template.patch" IyBUaGlzIGlzIGEgQml0S2VlcGVyIGdlbmVyYXRlZCBkaWZmIC1OcnUgc3R5 bGUgcGF0Y2guDQojDQojIENoYW5nZVNldA0KIyAgIDIwMDQvMTIvMDIgMDA6 MDI6NDgrMDg6MDAgd2Vuc29uZ0BsaW51eC12cy5vcmcgDQojICAgW0lQVlNd IGFkZCBhIHN5c2N0bCB2YXJpYWJsZSB0byBleHBpcmUgcXVpZXNjZW50IHRl bXBsYXRlDQojICAgDQojICAgVGhlIHBhdGNoIGlzIGZyb20gSG9ybXMgPGhv cm1zQHZlcmdlbmV0Lm5ldD4NCiMgDQojIG5ldC9pcHY0L2lwdnMvaXBfdnNf Y3RsLmMNCiMgICAyMDA0LzEyLzAyIDAwOjAyOjM4KzA4OjAwIHdlbnNvbmdA bGludXgtdnMub3JnICs0IC0wDQojICAgc2V0IHN5c2N0bF9pcF92c19leHBp cmVfcXVpZXNjZW50X3RlbXBsYXRlDQojIA0KIyBuZXQvaXB2NC9pcHZzL2lw X3ZzX2Nvbm4uYw0KIyAgIDIwMDQvMTIvMDIgMDA6MDI6MzcrMDg6MDAgd2Vu c29uZ0BsaW51eC12cy5vcmcgKzMgLTENCiMgICBkb24ndCB1c2UgcXVpZXNj ZW50IHRlbXBsYXRlIGlmIHRoZSBleHBpcmVfcXVpZXNjZW50X3RlbXBsYXRl IGlzIGVuYWJsZWQNCiMgDQojIGluY2x1ZGUvbmV0L2lwX3ZzLmgNCiMgICAy MDA0LzEyLzAyIDAwOjAyOjM3KzA4OjAwIHdlbnNvbmdAbGludXgtdnMub3Jn 
ICsyIC0wDQojICAgYWRkIHRoZSBzeXNjdGxfaXBfdnNfZXhwaXJlX3F1aWVz Y2VudF90ZW1wbGF0ZQ0KIyANCmRpZmYgLU5ydSBhL2luY2x1ZGUvbmV0L2lw X3ZzLmggYi9pbmNsdWRlL25ldC9pcF92cy5oDQotLS0gYS9pbmNsdWRlL25l dC9pcF92cy5oCTIwMDQtMTItMDIgMDA6MTY6MzYgKzA4OjAwDQorKysgYi9p bmNsdWRlL25ldC9pcF92cy5oCTIwMDQtMTItMDIgMDA6MTY6MzYgKzA4OjAw DQpAQCAtMzE3LDYgKzMxNyw3IEBADQogCU5FVF9JUFY0X1ZTX0VYUElSRV9O T0RFU1RfQ09OTj0yMywNCiAJTkVUX0lQVjRfVlNfU1lOQ19USFJFU0hPTEQ9 MjQsDQogCU5FVF9JUFY0X1ZTX05BVF9JQ01QX1NFTkQ9MjUsDQorCU5FVF9J UFY0X1ZTX0VYUElSRV9RVUlFU0NFTlRfVEVNUExBVEU9MjYsDQogCU5FVF9J UFY0X1ZTX0xBU1QNCiB9Ow0KIA0KQEAgLTcwMCw2ICs3MDEsNyBAQA0KICAq Lw0KIGV4dGVybiBpbnQgc3lzY3RsX2lwX3ZzX2NhY2hlX2J5cGFzczsNCiBl eHRlcm4gaW50IHN5c2N0bF9pcF92c19leHBpcmVfbm9kZXN0X2Nvbm47DQor ZXh0ZXJuIGludCBzeXNjdGxfaXBfdnNfZXhwaXJlX3F1aWVzY2VudF90ZW1w bGF0ZTsNCiBleHRlcm4gaW50IHN5c2N0bF9pcF92c19zeW5jX3RocmVzaG9s ZDsNCiBleHRlcm4gaW50IHN5c2N0bF9pcF92c19uYXRfaWNtcF9zZW5kOw0K IGV4dGVybiBzdHJ1Y3QgaXBfdnNfc3RhdHMgaXBfdnNfc3RhdHM7DQpkaWZm IC1OcnUgYS9uZXQvaXB2NC9pcHZzL2lwX3ZzX2Nvbm4uYyBiL25ldC9pcHY0 L2lwdnMvaXBfdnNfY29ubi5jDQotLS0gYS9uZXQvaXB2NC9pcHZzL2lwX3Zz X2Nvbm4uYwkyMDA0LTEyLTAyIDAwOjE2OjM2ICswODowMA0KKysrIGIvbmV0 L2lwdjQvaXB2cy9pcF92c19jb25uLmMJMjAwNC0xMi0wMiAwMDoxNjozNiAr MDg6MDANCkBAIC0xMTMxLDcgKzExMzEsOSBAQA0KIAkgKiBDaGVja2luZyB0 aGUgZGVzdCBzZXJ2ZXIgc3RhdHVzLg0KIAkgKi8NCiAJaWYgKChkZXN0ID09 IE5VTEwpIHx8DQotCSAgICAhKGRlc3QtPmZsYWdzICYgSVBfVlNfREVTVF9G X0FWQUlMQUJMRSkpIHsNCisJICAgICEoZGVzdC0+ZmxhZ3MgJiBJUF9WU19E RVNUX0ZfQVZBSUxBQkxFKSB8fCANCisJICAgIChzeXNjdGxfaXBfdnNfZXhw aXJlX3F1aWVzY2VudF90ZW1wbGF0ZSAmJiANCisJICAgICAoYXRvbWljX3Jl YWQoJmRlc3QtPndlaWdodCkgPT0gMCkpKSB7DQogCQlJUF9WU19EQkcoOSwg ImNoZWNrX3RlbXBsYXRlOiBkZXN0IG5vdCBhdmFpbGFibGUgZm9yICINCiAJ CQkgICJwcm90b2NvbCAlcyBzOiV1LiV1LiV1LiV1OiVkIHY6JXUuJXUuJXUu JXU6JWQgIg0KIAkJCSAgIi0+IGQ6JXUuJXUuJXUuJXU6JWRcbiIsDQpkaWZm IC1OcnUgYS9uZXQvaXB2NC9pcHZzL2lwX3ZzX2N0bC5jIGIvbmV0L2lwdjQv aXB2cy9pcF92c19jdGwuYw0KLS0tIGEvbmV0L2lwdjQvaXB2cy9pcF92c19j 
dGwuYwkyMDA0LTEyLTAyIDAwOjE2OjM2ICswODowMA0KKysrIGIvbmV0L2lw djQvaXB2cy9pcF92c19jdGwuYwkyMDA0LTEyLTAyIDAwOjE2OjM2ICswODow MA0KQEAgLTc0LDYgKzc0LDcgQEANCiBzdGF0aWMgaW50IHN5c2N0bF9pcF92 c19hbV9kcm9wcmF0ZSA9IDEwOw0KIGludCBzeXNjdGxfaXBfdnNfY2FjaGVf YnlwYXNzID0gMDsNCiBpbnQgc3lzY3RsX2lwX3ZzX2V4cGlyZV9ub2Rlc3Rf Y29ubiA9IDA7DQoraW50IHN5c2N0bF9pcF92c19leHBpcmVfcXVpZXNjZW50 X3RlbXBsYXRlID0gMDsNCiBpbnQgc3lzY3RsX2lwX3ZzX3N5bmNfdGhyZXNo b2xkID0gMzsNCiBpbnQgc3lzY3RsX2lwX3ZzX25hdF9pY21wX3NlbmQgPSAw Ow0KIA0KQEAgLTE0MzksNiArMTQ0MCw5IEBADQogCSAgJnByb2NfZG9pbnR2 ZWN9LA0KIAkge05FVF9JUFY0X1ZTX05BVF9JQ01QX1NFTkQsICJuYXRfaWNt cF9zZW5kIiwNCiAJICAmc3lzY3RsX2lwX3ZzX25hdF9pY21wX3NlbmQsIHNp emVvZihpbnQpLCAwNjQ0LCBOVUxMLA0KKwkgICZwcm9jX2RvaW50dmVjfSwN CisJIHtORVRfSVBWNF9WU19FWFBJUkVfUVVJRVNDRU5UX1RFTVBMQVRFLCAi ZXhwaXJlX3F1aWVzY2VudF90ZW1wbGF0ZSIsDQorCSAgJnN5c2N0bF9pcF92 c19leHBpcmVfcXVpZXNjZW50X3RlbXBsYXRlLCBzaXplb2YoaW50KSwgMDY0 NCwgTlVMTCwNCiAJICAmcHJvY19kb2ludHZlY30sDQogCSB7MH19LA0KIAl7 e05FVF9JUFY0X1ZTLCAidnMiLCBOVUxMLCAwLCAwNTU1LCBpcHY0X3ZzX3Rh YmxlLnZzX3ZhcnN9LA0K ---1463811584-572891765-1097000265=:2451 Content-Type: TEXT/PLAIN; charset=US-ASCII; name="linux-2.6-ipvs-quiescent_template.patch" Content-Transfer-Encoding: BASE64 Content-ID: Content-Description: Content-Disposition: attachment; filename="linux-2.6-ipvs-quiescent_template.patch" IyBUaGlzIGlzIGEgQml0S2VlcGVyIGdlbmVyYXRlZCBkaWZmIC1OcnUgc3R5 bGUgcGF0Y2guDQojDQojIENoYW5nZVNldA0KIyAgIDIwMDQvMTIvMDIgMDA6 NDI6MTUrMDg6MDAgd2Vuc29uZ0BsaW51eC12cy5vcmcgDQojICAgW0lQVlNd IGFkZCBhIHN5c2N0bCB2YXJpYWJsZSB0byBleHBpcmUgcXVpZXNjZW50IHRl bXBsYXRlDQojICAgDQojICAgVGhlIHBhdGNoIGlzIGZyb20gSG9ybXMgPGhv cm1zQHZlcmdlbmV0Lm5ldD4NCiMgDQojIG5ldC9pcHY0L2lwdnMvaXBfdnNf Y3RsLmMNCiMgICAyMDA0LzEyLzAyIDAwOjQxOjU2KzA4OjAwIHdlbnNvbmdA bGludXgtdnMub3JnICsyMCAtMTENCiMgICBzZXQgdGhlIHN5c2N0bF9pcF92 c19leHBpcmVfcXVpZXNjZW50X3RlbXBsYXRlDQojIA0KIyBuZXQvaXB2NC9p cHZzL2lwX3ZzX2Nvbm4uYw0KIyAgIDIwMDQvMTIvMDIgMDA6NDE6NTYrMDg6 
MDAgd2Vuc29uZ0BsaW51eC12cy5vcmcgKzMgLTENCiMgICBkb24ndCB1c2Ug cXVpZXNjZW50IHRlbXBsYXRlIGlmIHRoZSBleHBpcmVfcXVpZXNjZW50X3Rl bXBsYXRlIGlzIGVuYWJsZWQNCiMgDQojIGluY2x1ZGUvbmV0L2lwX3ZzLmgN CiMgICAyMDA0LzEyLzAyIDAwOjQxOjU2KzA4OjAwIHdlbnNvbmdAbGludXgt dnMub3JnICsyIC0wDQojICAgYWRkIHRoZSBzeXNjdGxfaXBfdnNfZXhwaXJl X3F1aWVzY2VudF90ZW1wbGF0ZSBwcm90b3R5cGUNCiMgDQpkaWZmIC1OcnUg YS9pbmNsdWRlL25ldC9pcF92cy5oIGIvaW5jbHVkZS9uZXQvaXBfdnMuaA0K LS0tIGEvaW5jbHVkZS9uZXQvaXBfdnMuaAkyMDA0LTEyLTAyIDAwOjQzOjE0 ICswODowMA0KKysrIGIvaW5jbHVkZS9uZXQvaXBfdnMuaAkyMDA0LTEyLTAy IDAwOjQzOjE0ICswODowMA0KQEAgLTM1OCw2ICszNTgsNyBAQA0KIAlORVRf SVBWNF9WU19FWFBJUkVfTk9ERVNUX0NPTk49MjMsDQogCU5FVF9JUFY0X1ZT X1NZTkNfVEhSRVNIT0xEPTI0LA0KIAlORVRfSVBWNF9WU19OQVRfSUNNUF9T RU5EPTI1LA0KKwlORVRfSVBWNF9WU19FWFBJUkVfUVVJRVNDRU5UX1RFTVBM QVRFPTI2LA0KIAlORVRfSVBWNF9WU19MQVNUDQogfTsNCiANCkBAIC04Nzks NiArODgwLDcgQEANCiAgKi8NCiBleHRlcm4gaW50IHN5c2N0bF9pcF92c19j YWNoZV9ieXBhc3M7DQogZXh0ZXJuIGludCBzeXNjdGxfaXBfdnNfZXhwaXJl X25vZGVzdF9jb25uOw0KK2V4dGVybiBpbnQgc3lzY3RsX2lwX3ZzX2V4cGly ZV9xdWllc2NlbnRfdGVtcGxhdGU7DQogZXh0ZXJuIGludCBzeXNjdGxfaXBf dnNfc3luY190aHJlc2hvbGRbMl07DQogZXh0ZXJuIGludCBzeXNjdGxfaXBf dnNfbmF0X2ljbXBfc2VuZDsNCiBleHRlcm4gc3RydWN0IGlwX3ZzX3N0YXRz IGlwX3ZzX3N0YXRzOw0KZGlmZiAtTnJ1IGEvbmV0L2lwdjQvaXB2cy9pcF92 c19jb25uLmMgYi9uZXQvaXB2NC9pcHZzL2lwX3ZzX2Nvbm4uYw0KLS0tIGEv bmV0L2lwdjQvaXB2cy9pcF92c19jb25uLmMJMjAwNC0xMi0wMiAwMDo0Mzox NCArMDg6MDANCisrKyBiL25ldC9pcHY0L2lwdnMvaXBfdnNfY29ubi5jCTIw MDQtMTItMDIgMDA6NDM6MTQgKzA4OjAwDQpAQCAtNDUzLDcgKzQ1Myw5IEBA DQogCSAqIENoZWNraW5nIHRoZSBkZXN0IHNlcnZlciBzdGF0dXMuDQogCSAq Lw0KIAlpZiAoKGRlc3QgPT0gTlVMTCkgfHwNCi0JICAgICEoZGVzdC0+Zmxh Z3MgJiBJUF9WU19ERVNUX0ZfQVZBSUxBQkxFKSkgew0KKwkgICAgIShkZXN0 LT5mbGFncyAmIElQX1ZTX0RFU1RfRl9BVkFJTEFCTEUpIHx8IA0KKwkgICAg KHN5c2N0bF9pcF92c19leHBpcmVfcXVpZXNjZW50X3RlbXBsYXRlICYmIA0K KwkgICAgIChhdG9taWNfcmVhZCgmZGVzdC0+d2VpZ2h0KSA9PSAwKSkpIHsN CiAJCUlQX1ZTX0RCRyg5LCAiY2hlY2tfdGVtcGxhdGU6IGRlc3Qgbm90IGF2 
YWlsYWJsZSBmb3IgIg0KIAkJCSAgInByb3RvY29sICVzIHM6JXUuJXUuJXUu JXU6JWQgdjoldS4ldS4ldS4ldTolZCAiDQogCQkJICAiLT4gZDoldS4ldS4l dS4ldTolZFxuIiwNCmRpZmYgLU5ydSBhL25ldC9pcHY0L2lwdnMvaXBfdnNf Y3RsLmMgYi9uZXQvaXB2NC9pcHZzL2lwX3ZzX2N0bC5jDQotLS0gYS9uZXQv aXB2NC9pcHZzL2lwX3ZzX2N0bC5jCTIwMDQtMTItMDIgMDA6NDM6MTQgKzA4 OjAwDQorKysgYi9uZXQvaXB2NC9pcHZzL2lwX3ZzX2N0bC5jCTIwMDQtMTIt MDIgMDA6NDM6MTQgKzA4OjAwDQpAQCAtNzUsNiArNzUsNyBAQA0KIHN0YXRp YyBpbnQgc3lzY3RsX2lwX3ZzX2FtX2Ryb3ByYXRlID0gMTA7DQogaW50IHN5 c2N0bF9pcF92c19jYWNoZV9ieXBhc3MgPSAwOw0KIGludCBzeXNjdGxfaXBf dnNfZXhwaXJlX25vZGVzdF9jb25uID0gMDsNCitpbnQgc3lzY3RsX2lwX3Zz X2V4cGlyZV9xdWllc2NlbnRfdGVtcGxhdGUgPSAwOw0KIGludCBzeXNjdGxf aXBfdnNfc3luY190aHJlc2hvbGRbMl0gPSB7IDMsIDUwIH07DQogaW50IHN5 c2N0bF9pcF92c19uYXRfaWNtcF9zZW5kID0gMDsNCiANCkBAIC0xNDQ3LDkg KzE0NDgsOSBAQA0KIAl7DQogCQkuY3RsX25hbWUJPSBORVRfSVBWNF9WU19U T19FUywNCiAJCS5wcm9jbmFtZQk9ICJ0aW1lb3V0X2VzdGFibGlzaGVkIiwN Ci0JICAJLmRhdGEJPSAmdnNfdGltZW91dF90YWJsZV9kb3MudGltZW91dFtJ UF9WU19TX0VTVEFCTElTSEVEXSwNCisJCS5kYXRhCT0gJnZzX3RpbWVvdXRf dGFibGVfZG9zLnRpbWVvdXRbSVBfVlNfU19FU1RBQkxJU0hFRF0sDQogCQku bWF4bGVuCQk9IHNpemVvZihpbnQpLA0KLQkJLm1vZGUJCT0gMDY0NCwgDQor CQkubW9kZQkJPSAwNjQ0LA0KIAkJLnByb2NfaGFuZGxlcgk9ICZwcm9jX2Rv aW50dmVjX2ppZmZpZXMsDQogCX0sDQogCXsNCkBAIC0xNDU3LDcgKzE0NTgs NyBAQA0KIAkJLnByb2NuYW1lCT0gInRpbWVvdXRfc3luc2VudCIsDQogCQku ZGF0YQk9ICZ2c190aW1lb3V0X3RhYmxlX2Rvcy50aW1lb3V0W0lQX1ZTX1Nf U1lOX1NFTlRdLA0KIAkJLm1heGxlbgkJPSBzaXplb2YoaW50KSwNCi0JCS5t b2RlCQk9IDA2NDQsIA0KKwkJLm1vZGUJCT0gMDY0NCwNCiAJCS5wcm9jX2hh bmRsZXIJPSAmcHJvY19kb2ludHZlY19qaWZmaWVzLA0KIAl9LA0KIAl7DQpA QCAtMTQ2NSw3ICsxNDY2LDcgQEANCiAJCS5wcm9jbmFtZQk9ICJ0aW1lb3V0 X3N5bnJlY3YiLA0KIAkJLmRhdGEJPSAmdnNfdGltZW91dF90YWJsZV9kb3Mu dGltZW91dFtJUF9WU19TX1NZTl9SRUNWXSwNCiAJCS5tYXhsZW4JCT0gc2l6 ZW9mKGludCksDQotCQkubW9kZQkJPSAwNjQ0LCANCisJCS5tb2RlCQk9IDA2 NDQsDQogCQkucHJvY19oYW5kbGVyCT0gJnByb2NfZG9pbnR2ZWNfamlmZmll cywNCiAJfSwNCiAJew0KQEAgLTE0NzMsNyArMTQ3NCw3IEBADQogCQkucHJv 
Y25hbWUJPSAidGltZW91dF9maW53YWl0IiwNCiAJCS5kYXRhCT0gJnZzX3Rp bWVvdXRfdGFibGVfZG9zLnRpbWVvdXRbSVBfVlNfU19GSU5fV0FJVF0sDQog CQkubWF4bGVuCQk9IHNpemVvZihpbnQpLA0KLQkJLm1vZGUJCT0gMDY0NCwg DQorCQkubW9kZQkJPSAwNjQ0LA0KIAkJLnByb2NfaGFuZGxlcgk9ICZwcm9j X2RvaW50dmVjX2ppZmZpZXMsDQogCX0sDQogCXsNCkBAIC0xNDg5LDcgKzE0 OTAsNyBAQA0KIAkJLnByb2NuYW1lCT0gInRpbWVvdXRfY2xvc2UiLA0KIAkJ LmRhdGEJPSAmdnNfdGltZW91dF90YWJsZV9kb3MudGltZW91dFtJUF9WU19T X0NMT1NFXSwNCiAJCS5tYXhsZW4JCT0gc2l6ZW9mKGludCksDQotCQkubW9k ZQkJPSAwNjQ0LCANCisJCS5tb2RlCQk9IDA2NDQsDQogCQkucHJvY19oYW5k bGVyCT0gJnByb2NfZG9pbnR2ZWNfamlmZmllcywNCiAJfSwNCiAJew0KQEAg LTE0OTcsNyArMTQ5OCw3IEBADQogCQkucHJvY25hbWUJPSAidGltZW91dF9j bG9zZXdhaXQiLA0KIAkJLmRhdGEJPSAmdnNfdGltZW91dF90YWJsZV9kb3Mu dGltZW91dFtJUF9WU19TX0NMT1NFX1dBSVRdLA0KIAkJLm1heGxlbgkJPSBz aXplb2YoaW50KSwNCi0JCS5tb2RlCQk9IDA2NDQsIA0KKwkJLm1vZGUJCT0g MDY0NCwNCiAJCS5wcm9jX2hhbmRsZXIJPSAmcHJvY19kb2ludHZlY19qaWZm aWVzLA0KIAl9LA0KIAl7DQpAQCAtMTUwNSw3ICsxNTA2LDcgQEANCiAJCS5w cm9jbmFtZQk9ICJ0aW1lb3V0X2xhc3RhY2siLA0KIAkJLmRhdGEJPSAmdnNf dGltZW91dF90YWJsZV9kb3MudGltZW91dFtJUF9WU19TX0xBU1RfQUNLXSwN CiAJCS5tYXhsZW4JCT0gc2l6ZW9mKGludCksDQotCQkubW9kZQkJPSAwNjQ0 LCANCisJCS5tb2RlCQk9IDA2NDQsDQogCQkucHJvY19oYW5kbGVyCT0gJnBy b2NfZG9pbnR2ZWNfamlmZmllcywNCiAJfSwNCiAJew0KQEAgLTE1MTMsNyAr MTUxNCw3IEBADQogCQkucHJvY25hbWUJPSAidGltZW91dF9saXN0ZW4iLA0K IAkJLmRhdGEJPSAmdnNfdGltZW91dF90YWJsZV9kb3MudGltZW91dFtJUF9W U19TX0xJU1RFTl0sDQogCQkubWF4bGVuCQk9IHNpemVvZihpbnQpLA0KLQkJ Lm1vZGUJCT0gMDY0NCwgDQorCQkubW9kZQkJPSAwNjQ0LA0KIAkJLnByb2Nf aGFuZGxlcgk9ICZwcm9jX2RvaW50dmVjX2ppZmZpZXMsDQogCX0sDQogCXsN CkBAIC0xNTIxLDcgKzE1MjIsNyBAQA0KIAkJLnByb2NuYW1lCT0gInRpbWVv dXRfc3luYWNrIiwNCiAJCS5kYXRhCT0gJnZzX3RpbWVvdXRfdGFibGVfZG9z LnRpbWVvdXRbSVBfVlNfU19TWU5BQ0tdLA0KIAkJLm1heGxlbgkJPSBzaXpl b2YoaW50KSwNCi0JCS5tb2RlCQk9IDA2NDQsIA0KKwkJLm1vZGUJCT0gMDY0 NCwNCiAJCS5wcm9jX2hhbmRsZXIJPSAmcHJvY19kb2ludHZlY19qaWZmaWVz LA0KIAl9LA0KIAl7DQpAQCAtMTUyOSw3ICsxNTMwLDcgQEANCiAJCS5wcm9j 
bmFtZQk9ICJ0aW1lb3V0X3VkcCIsDQogCQkuZGF0YQk9ICZ2c190aW1lb3V0 X3RhYmxlX2Rvcy50aW1lb3V0W0lQX1ZTX1NfVURQXSwNCiAJCS5tYXhsZW4J CT0gc2l6ZW9mKGludCksDQotCQkubW9kZQkJPSAwNjQ0LCANCisJCS5tb2Rl CQk9IDA2NDQsDQogCQkucHJvY19oYW5kbGVyCT0gJnByb2NfZG9pbnR2ZWNf amlmZmllcywNCiAJfSwNCiAJew0KQEAgLTE1NTMsNiArMTU1NCwxNCBAQA0K IAkJLmN0bF9uYW1lCT0gTkVUX0lQVjRfVlNfRVhQSVJFX05PREVTVF9DT05O LA0KIAkJLnByb2NuYW1lCT0gImV4cGlyZV9ub2Rlc3RfY29ubiIsDQogCQku ZGF0YQkJPSAmc3lzY3RsX2lwX3ZzX2V4cGlyZV9ub2Rlc3RfY29ubiwNCisJ CS5tYXhsZW4JCT0gc2l6ZW9mKGludCksDQorCQkubW9kZQkJPSAwNjQ0LA0K KwkJLnByb2NfaGFuZGxlcgk9ICZwcm9jX2RvaW50dmVjLA0KKwl9LA0KKwl7 DQorCQkuY3RsX25hbWUJPSBORVRfSVBWNF9WU19FWFBJUkVfUVVJRVNDRU5U X1RFTVBMQVRFLA0KKwkJLnByb2NuYW1lCT0gImV4cGlyZV9xdWllc2NlbnRf dGVtcGxhdGUiLA0KKwkJLmRhdGEJCT0gJnN5c2N0bF9pcF92c19leHBpcmVf cXVpZXNjZW50X3RlbXBsYXRlLA0KIAkJLm1heGxlbgkJPSBzaXplb2YoaW50 KSwNCiAJCS5tb2RlCQk9IDA2NDQsDQogCQkucHJvY19oYW5kbGVyCT0gJnBy b2NfZG9pbnR2ZWMsDQo= ---1463811584-572891765-1097000265=:2451-- From Robert.Olsson@data.slu.se Wed Dec 1 09:38:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 09:38:10 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1Hc487020583 for ; Wed, 1 Dec 2004 09:38:06 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB1HbaKO016341; Wed, 1 Dec 2004 18:37:37 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 0CEF4EC001; Wed, 1 Dec 2004 18:37:37 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16814.353.15446.489489@robur.slu.se> Date: Wed, 1 Dec 2004 18:37:37 +0100 To: sfeldma@pobox.com Cc: Robert Olsson , Lennert Buytenhek , jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit 
In-Reply-To: <1101919791.5198.15.camel@localhost.localdomain>
References: <1101467291.24742.70.camel@mellia.lipar.polito.it>
	<41A73826.3000109@draigBrady.com>
	<16807.20052.569125.686158@robur.slu.se>
	<1101484740.24742.213.camel@mellia.lipar.polito.it>
	<41A76085.7000105@draigBrady.com>
	<1101499285.1079.45.camel@jzny.localdomain>
	<16811.8052.678955.795327@robur.slu.se>
	<1101821501.1043.43.camel@jzny.localdomain>
	<20041130134600.GA31515@xi.wantstofly.org>
	<1101824754.1044.126.camel@jzny.localdomain>
	<20041201001107.GE4203@xi.wantstofly.org>
	<1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net>
	<16813.58484.343629.570703@robur.slu.se>
	<1101919791.5198.15.camel@localhost.localdomain>
X-Mailer: VM 7.18 under Emacs 21.3.1
X-archive-position: 12370
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Robert.Olsson@data.slu.se
Precedence: bulk
X-list: netdev

Scott Feldman writes:
 > Thank you Robert for trying it out.
 >
 > Well those results are counter-intuitive! We remove Tx interrupts and
 > Tx descriptor DMA write-backs and get no re-tries, and performance
 > drops? The only bus activities left are the DMA of buffers to device
 > and the register writes to increment tail. I'm stumped. I'll need to
 > get my hands on a faster system. Maybe there is a bus analyzer under
 > the tree. :-)

 Huh. I've got a deja-vu feeling. What will happen if we remove almost all
 events (interrupts) and just have the timer waking up once-in-a-while?

					--ro

From buytenh@wantstofly.org Wed Dec 1 10:30:10 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 10:30:14 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1IU5tj022456
	for ; Wed, 1 Dec 2004 10:30:09 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500)
	id 80D2B2B0ED; Wed, 1 Dec 2004 19:29:43 +0100 (MET)
Date: Wed, 1 Dec 2004 19:29:43 +0100
From: Lennert Buytenhek
To: Scott Feldman
Cc: jamal, Robert Olsson, P@draigBrady.com,
	mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net,
	Jorge Manuel Finochietto, Giulio Galante, netdev@oss.sgi.com
Subject: Re: [E1000-devel] Transmission limit
Message-ID: <20041201182943.GA14470@xi.wantstofly.org>
References: <16807.20052.569125.686158@robur.slu.se>
	<1101484740.24742.213.camel@mellia.lipar.polito.it>
	<41A76085.7000105@draigBrady.com>
	<1101499285.1079.45.camel@jzny.localdomain>
	<16811.8052.678955.795327@robur.slu.se>
	<1101821501.1043.43.camel@jzny.localdomain>
	<20041130134600.GA31515@xi.wantstofly.org>
	<1101824754.1044.126.camel@jzny.localdomain>
	<20041201001107.GE4203@xi.wantstofly.org>
	<1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net>
User-Agent: Mutt/1.4.1i
X-archive-position: 12371
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

On Tue, Nov 30, 2004 at 05:09:59PM -0800, Scott Feldman wrote:

> This doubles the kpps numbers for 60-byte packets. I'd like to see what
> happens on higher bus bandwidth systems. Anyone?

Dual Xeon 2.4GHz, a 82540EM and a 82541GI both on 32/66 on separate
PCI buses.

BEFORE performance is approx the same for both, ~620kpps.
AFTER performance is ~730kpps, also approx the same for both.

(Note: only sending with one NIC at a time.)

Once or twice it went into a state where it started spitting out these
kinds of messages and never recovered:

Dec 1 19:13:18 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
[...]
Dec 1 19:13:31 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
[...]
Dec 1 19:13:43 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out

But overall, looks good. Strange thing that Robert's numbers didn't
improve. Doing some more measurements right now.


--L

From buytenh@wantstofly.org Wed Dec 1 11:56:12 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 11:56:17 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1JuBrN024519
	for ; Wed, 1 Dec 2004 11:56:12 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500)
	id 5BE772B100; Wed, 1 Dec 2004 20:55:50 +0100 (MET)
Date: Wed, 1 Dec 2004 20:55:50 +0100
From: Lennert Buytenhek
To: Ben Greear
Cc: netdev@oss.sgi.com
Subject: Re: via-rhine unable to send back-to-back packets?
Message-ID: <20041201195550.GC14470@xi.wantstofly.org>
References: <20041129222700.GA22918@xi.wantstofly.org>
	<20041129172540.6b959858.davem@davemloft.net>
	<20041130064823.GA27872@xi.wantstofly.org>
	<41ACB0C3.5020408@candelatech.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <41ACB0C3.5020408@candelatech.com>
User-Agent: Mutt/1.4.1i
X-archive-position: 12372
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

On Tue, Nov 30, 2004 at 09:41:23AM -0800, Ben Greear wrote:

> >Yeah, preamble (8 bytes), CRC (4 bytes), inter-packet gap (12 bytes).
> >
> >Perhaps the via-rhine simply can't send out packets back-to-back and
> >needs 14 byte times of inter-packet gap. I couldn't find any stray +2
> >in the driver anywhere but I'm just checking.
>
> Couldn't you just sniff the resulting traffic to see if it is too big?

Sorry, forgot to mention: I did check that, and the data portion of
the packet doesn't appear to be too big.


--L

From buytenh@wantstofly.org Wed Dec 1 12:02:10 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 12:02:15 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1K2AuT024983
	for ; Wed, 1 Dec 2004 12:02:10 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500)
	id 088982B100; Wed, 1 Dec 2004 21:01:49 +0100 (MET)
Date: Wed, 1 Dec 2004 21:01:48 +0100
From: Lennert Buytenhek
To: Roger Luethi
Cc: "David S. Miller", netdev@oss.sgi.com
Subject: Re: via-rhine unable to send back-to-back packets?
Message-ID: <20041201200148.GE14470@xi.wantstofly.org>
References: <20041129222700.GA22918@xi.wantstofly.org>
	<20041129172540.6b959858.davem@davemloft.net>
	<20041130064823.GA27872@xi.wantstofly.org>
	<20041130122503.0adac947.davem@davemloft.net>
	<20041130220644.GC29947@k3.hellgate.ch>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041130220644.GC29947@k3.hellgate.ch>
User-Agent: Mutt/1.4.1i
X-archive-position: 12373
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

On Tue, Nov 30, 2004 at 11:06:44PM +0100, Roger Luethi wrote:

> > > Perhaps the via-rhine simply can't send out packets back-to-back and
> > > needs 14 byte times of inter-packet gap. I couldn't find any stray +2
> > > in the driver anywhere but I'm just checking.
> >
> > Or the via-rhine driver is not programming one of the registers
> > proper to get optimal spacing.
> >
> > As with most Donald Becker drivers, many of the register layouts
> > are not documented in the sources so it's not possible to just
> > scan the driver looking for potential problems like this.
> > For example, maybe the TxConfig register has an "IPG" field but
> > we'll never know by reading anything in the driver source.
>
> Presumably Donald Becker had only access to the publicly available
> documentation at the time which is very incomplete and buggy. What
> little time my day job leaves for hacking via-rhine is consumed by the
> WOL issues that have come up with 2.6.9+, but if you have a specific
> question that can be answered by someone who knows the chip but not
> necessarily Linux I can try and poke my contacts.

"Is the hardware capable of sending back-to-back packets (i.e. with an
inter-packet gap of no more than 96 bit times)?"

"Can misprogramming the chip lead to the effect that the inter-packet
gap is never less than 112 bit times?"

Thanks in advance.


> Of course, you can always check if VIA's driver has the same issue. If
> it doesn't, chances are we can borrow the fix.

Hmm, didn't know they had such a driver. Where can I find it?

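
[Aside, a rough sketch and not anything taken from the via-rhine driver: the two figures in the questions above differ by 16 bit times, which is the 12-byte versus 14-byte gap discussed earlier in this thread. The C below estimates what that costs in frames per second, using the preamble (8 bytes) and CRC (4 bytes) figures quoted in the thread. The 100 Mbit/s link speed in the note that follows and the helper name max_frames_per_sec() are assumptions made purely for illustration.]

```c
#include <stdint.h>

#define PREAMBLE_BYTES 8u /* preamble incl. start-of-frame delimiter */
#define FCS_BYTES      4u /* Ethernet CRC appended by the hardware   */

/*
 * Hypothetical helper: upper bound on frames per second for a link of
 * link_bps bits/s, a frame of frame_len bytes (excluding the CRC) and an
 * inter-packet gap of ipg_bit_times bit times. Integer division rounds
 * down.
 */
static uint64_t max_frames_per_sec(uint64_t link_bps, unsigned int frame_len,
				   unsigned int ipg_bit_times)
{
	uint64_t bits_per_frame =
		(uint64_t)(PREAMBLE_BYTES + frame_len + FCS_BYTES) * 8u +
		ipg_bit_times;

	return link_bps / bits_per_frame;
}
```

Under these assumptions, max_frames_per_sec(100000000, 60, 96) gives 148809 and max_frames_per_sec(100000000, 60, 112) gives 145348, so a gap stuck at 112 bit times would cost a little over 2% of the frame rate for 60-byte frames.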

cheers,
Lennert

From buytenh@wantstofly.org Wed Dec 1 13:36:15 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 13:36:20 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1LaBxK001792
	for ; Wed, 1 Dec 2004 13:36:14 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500)
	id 549B22B100; Wed, 1 Dec 2004 22:35:50 +0100 (MET)
Date: Wed, 1 Dec 2004 22:35:50 +0100
From: Lennert Buytenhek
To: Scott Feldman
Cc: jamal, Robert Olsson, P@draigBrady.com,
	mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net,
	Jorge Manuel Finochietto, Giulio Galante, netdev@oss.sgi.com
Subject: Re: [E1000-devel] Transmission limit
Message-ID: <20041201213550.GF14470@xi.wantstofly.org>
References: <1101484740.24742.213.camel@mellia.lipar.polito.it>
	<41A76085.7000105@draigBrady.com>
	<1101499285.1079.45.camel@jzny.localdomain>
	<16811.8052.678955.795327@robur.slu.se>
	<1101821501.1043.43.camel@jzny.localdomain>
	<20041130134600.GA31515@xi.wantstofly.org>
	<1101824754.1044.126.camel@jzny.localdomain>
	<20041201001107.GE4203@xi.wantstofly.org>
	<1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net>
	<20041201182943.GA14470@xi.wantstofly.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="uZ3hkaAS1mZxFaxD"
Content-Disposition: inline
In-Reply-To: <20041201182943.GA14470@xi.wantstofly.org>
User-Agent: Mutt/1.4.1i
X-archive-position: 12374
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

--uZ3hkaAS1mZxFaxD
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Dec 01, 2004 at 07:29:43PM +0100, Lennert Buytenhek wrote:

> > This doubles the kpps numbers for 60-byte packets. I'd like to see what
> > happens on higher bus bandwidth systems. Anyone?
>
> Dual Xeon 2.4GHz, a 82540EM and a 82541GI both on 32/66 on separate
> PCI buses.
>
> BEFORE performance is approx the same for both, ~620kpps.
> AFTER performance is ~730kpps, also approx the same for both.

Pretty graph attached. From ~220B packets or so it does wire speed, but
there's still an odd drop in performance around 256B packets (which is
also there without your patch.) From 350B packets or so, performance is
identical with or without your patch (wire speed.)

So. Do you have any other good plans perhaps? :)


> Once or twice it went into a state where it started spitting out these
> kinds of messages and never recovered:
>
> Dec 1 19:13:18 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
> [...]
> Dec 1 19:13:31 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
> [...]
> Dec 1 19:13:43 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out

Didn't see this happen anymore. (ifconfig down and then up recovered it
both times I saw it happen.)


thanks,
Lennert

--uZ3hkaAS1mZxFaxD
Content-Type: image/png
Content-Disposition: attachment; filename="feldman.png"
Content-Transfer-Encoding: base64

[base64-encoded PNG data for the attached graph "feldman.png" omitted]

--uZ3hkaAS1mZxFaxD--

From buytenh@wantstofly.org Wed Dec 1 13:59:45 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 13:59:50 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1Lxirr002754
	for ; Wed, 1 Dec 2004 13:59:45 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500)
	id 7BA6F2B101; Wed, 1 Dec 2004 22:59:23 +0100 (MET)
Date: Wed, 1 Dec 2004 22:59:23 +0100
From: Lennert Buytenhek
To: Robert Olsson
Cc: netdev@oss.sgi.com
Subject: Re: [PATCH,pktgen] account for preamble and inter-packet gap
Message-ID: <20041201215923.GH14470@xi.wantstofly.org>
References: <20041128213251.GA9330@xi.wantstofly.org>
	<16811.19277.594137.939120@robur.slu.se>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <16811.19277.594137.939120@robur.slu.se>
User-Agent: Mutt/1.4.1i
X-archive-position: 12375
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

On Mon, Nov 29, 2004 at 05:16:13PM +0100, Robert Olsson wrote:

> I've heard some boxes that didn't do the ipg correctly. Don't know what NIC
> this was. We can only estimate stats at the point where we are delivering
> packets. Other devices has "true" L2 stats.

Just found out the other day that via-rhine appears to do this.


> OK. Adding FCS yes kinda compromise and maybe was wrong in this aspect.

Let's remove it then?


> The only solution I can think of is of having a selectable option in
> the config for output stats. To support different L2 layers and with
> warning if we are "predicting" L2 statistics. Feel free to extend your
> patch.

At that point it might be easier to just do it in userspace, no?


--L

From sjackman@gmail.com Wed Dec 1 14:03:42 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 14:03:48 -0800 (PST)
Received: from mproxy.gmail.com (mproxy.gmail.com [216.239.56.249])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB1M3fSQ003184
	for ; Wed, 1 Dec 2004 14:03:42 -0800
Received: by mproxy.gmail.com with SMTP id w41so88283cwb
	for ; Wed, 01 Dec 2004 14:03:18 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com;
	h=received:message-id:date:from:reply-to:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:references;
	b=SdwSV4xL+mxR4tLQ/vd+ZennOM9ZCci6fHRQLD19YMy5YRmYsoPTXwFxMUPmx/1CHUh6LVIUJa766uCakASbKSy1Mowr7dtF5GwaWDRgn69iGZX/sxowkM1Cwo5+7wKsGrfa0IWh0aVFhGeKsPma6K4e/su00/CSvlw4eqleM5M=
Received: by 10.11.116.64 with SMTP id o64mr326004cwc;
	Wed, 01 Dec 2004 14:03:17 -0800 (PST)
Received: by 10.11.99.50 with HTTP; Wed, 1 Dec 2004 14:03:17 -0800 (PST)
Message-ID: <7f45d939041201140329d0273f@mail.gmail.com>
Date: Wed, 1 Dec 2004 14:03:17 -0800
From: Shaun Jackman
Reply-To: Shaun Jackman
To: netdev@oss.sgi.com, Andrew Morton
Subject: Re: Multicast filtering for tun.c [PATCH]
In-Reply-To: <20041127011006.03232bb6.akpm@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
References: <7f45d9390411251715138b35d0@mail.gmail.com>
	<20041127011006.03232bb6.akpm@osdl.org>
X-archive-position: 12376
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sjackman@gmail.com
Precedence: bulk
X-list: netdev

This patch adds multicast filtering to the TUN network driver, for
packets being sent from the network device to the character device. It
applies against the 2.6.8.1 kernel tree.

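
[For readers unfamiliar with hashed multicast filters, here is a minimal user-space sketch of the idea the patch below relies on: hash the 6-byte address down to a 6-bit index, then set, clear or test that bit in a 64-bit table. It is not the patch's code. The names crc32_le_sketch(), bitrev32_sketch() and mc_filter_*() are made up for illustration, and the hash is only assumed to resemble the kernel's ether_crc(), so treat the exact bit positions as unverified.]

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise little-endian CRC-32 (polynomial 0xEDB88320), seeded with ~0
 * and with no final inversion. Hypothetical stand-in for a kernel CRC. */
static uint32_t crc32_le_sketch(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ ((crc & 1u) ? 0xEDB88320u : 0u);
	}
	return crc;
}

/* Reverse the bit order of a 32-bit word. */
static uint32_t bitrev32_sketch(uint32_t x)
{
	uint32_t r = 0;

	for (int i = 0; i < 32; i++) {
		r = (r << 1) | (x & 1u);
		x >>= 1;
	}
	return r;
}

/* Map a 6-byte Ethernet address to a bit number in the range 0..63. */
static unsigned int mc_hash_bit(const uint8_t addr[6])
{
	return bitrev32_sketch(crc32_le_sketch(addr, 6)) >> 26;
}

/* Set the filter bit for this address. */
static void mc_filter_add(uint32_t filter[2], const uint8_t addr[6])
{
	unsigned int bit_nr = mc_hash_bit(addr);

	filter[bit_nr >> 5] |= UINT32_C(1) << (bit_nr & 31);
}

/* Clear the filter bit for this address. */
static void mc_filter_del(uint32_t filter[2], const uint8_t addr[6])
{
	unsigned int bit_nr = mc_hash_bit(addr);

	filter[bit_nr >> 5] &= ~(UINT32_C(1) << (bit_nr & 31));
}

/* Return non-zero if the filter bit for this address is set. */
static int mc_filter_test(const uint32_t filter[2], const uint8_t addr[6])
{
	unsigned int bit_nr = mc_hash_bit(addr);

	return (filter[bit_nr >> 5] >> (bit_nr & 31)) & 1u;
}
```

Because several addresses can hash to the same bit, a filter like this may accept frames for groups that were never added, and clearing a bit on removal also drops any other address that happens to share it.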

Cheers,
Shaun

On Sat, 27 Nov 2004 01:10:06 -0800, Andrew Morton wrote:
> Shaun Jackman wrote:
> >
> > This patch adds multicast filtering to the TUN network driver,
>
> You should cc netdev@oss.sgi.com on networking stuff.
>
> You may not get any feedback on this work. Persist. If you get nowhere,
> ping me and I'll help push things along.
>
> > This is my first attempt at sending a patch using gmail.
>
> Send the patch to yourself first, check that the result applies OK.

2004-11-25  Shaun Jackman

	* drivers/net/tun.c: Add multicast filtering for packets travelling
	from the network device to the character device.
	* include/linux/if_tun.h (tun_struct): Add interface flags, a
	hardware device address, and a multicast filter.

diff -ur linux-2.6.8.1.orig/drivers/net/tun.c linux-2.6.8.1/drivers/net/tun.c
--- linux-2.6.8.1.orig/drivers/net/tun.c	2004-08-14 03:55:23.000000000 -0700
+++ linux-2.6.8.1/drivers/net/tun.c	2004-11-25 17:00:22.000000000 -0800
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -104,11 +105,42 @@
 	return 0;
 }

-static void tun_net_mclist(struct net_device *dev)
+/** Add the specified Ethernet address to this multicast filter. */
+static void
+add_multi(u32* filter, const u8* addr)
 {
-	/* Nothing to do for multicast filters.
-	 * We always accept all frames. */
-	return;
+	int bit_nr = ether_crc(ETH_ALEN, addr) >> 26;
+	filter[bit_nr >> 5] |= 1 << (bit_nr & 31);
+}
+
+/** Remove the specified Ethernet address from this multicast filter. */
+static void
+del_multi(u32* filter, const u8* addr)
+{
+	int bit_nr = ether_crc(ETH_ALEN, addr) >> 26;
+	filter[bit_nr >> 5] &= ~(1 << (bit_nr & 31));
+}
+
+/** Update the list of multicast groups to which the network device belongs.
+ * This list is used to filter packets being sent from the character device to
+ * the network device. */
+static void
+tun_net_mclist(struct net_device *dev)
+{
+	struct tun_struct *tun = netdev_priv(dev);
+	const struct dev_mc_list *mclist;
+	int i;
+	DBG(KERN_DEBUG "%s: tun_net_mclist: mc_count %d\n",
+			dev->name, dev->mc_count);
+	memset(tun->chr_filter, 0, sizeof tun->chr_filter);
+	for (i = 0, mclist = dev->mc_list; i < dev->mc_count && mclist != NULL;
+			i++, mclist = mclist->next) {
+		add_multi(tun->net_filter, mclist->dmi_addr);
+		DBG(KERN_DEBUG "%s: tun_net_mclist: %x:%x:%x:%x:%x:%x\n",
+				dev->name,
+				mclist->dmi_addr[0], mclist->dmi_addr[1], mclist->dmi_addr[2],
+				mclist->dmi_addr[3], mclist->dmi_addr[4], mclist->dmi_addr[5]);
+	}
 }

 static struct net_device_stats *tun_net_stats(struct net_device *dev)
@@ -301,6 +333,10 @@

 	add_wait_queue(&tun->read_wait, &wait);
 	while (len) {
+		const u8 ones[ ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
+		u8 addr[ ETH_ALEN];
+		int bit_nr;
+
 		current->state = TASK_INTERRUPTIBLE;

 		/* Read frames from the queue */
@@ -320,10 +356,37 @@
 		}
 		netif_start_queue(tun->dev);

-		ret = tun_put_user(tun, skb, (struct iovec *) iv, len);
-
-		kfree_skb(skb);
-		break;
+		/** Decide whether to accept this packet. This code is designed to
+		 * behave identically to an Ethernet interface. Accept the packet if
+		 * - we are promiscuous.
+		 * - the packet is addressed to us.
+		 * - the packet is broadcast.
+		 * - the packet is multicast and
+		 *   - we are multicast promiscuous.
+		 *   - we belong to the multicast group.
+		 */
+		memcpy(addr, skb->data, min(sizeof addr, skb->len));
+		bit_nr = ether_crc(sizeof addr, addr) >> 26;
+		if ((tun->if_flags & IFF_PROMISC) ||
+				memcmp(addr, tun->dev_addr, sizeof addr) == 0 ||
+				memcmp(addr, ones, sizeof addr) == 0 ||
+				(((addr[0] == 1 && addr[1] == 0 && addr[2] == 0x5e) ||
+				  (addr[0] == 0x33 && addr[1] == 0x33)) &&
+				 ((tun->if_flags & IFF_ALLMULTI) ||
+				  (tun->chr_filter[bit_nr >> 5] & (1 << (bit_nr & 31)))))) {
+			DBG(KERN_DEBUG "%s: tun_chr_readv: accepted: %x:%x:%x:%x:%x:%x\n",
+					tun->dev->name, addr[0], addr[1], addr[2],
+					addr[3], addr[4], addr[5]);
+			ret = tun_put_user(tun, skb, (struct iovec *) iv, len);
+			kfree_skb(skb);
+			break;
+		} else {
+			DBG(KERN_DEBUG "%s: tun_chr_readv: rejected: %x:%x:%x:%x:%x:%x\n",
+					tun->dev->name, addr[0], addr[1], addr[2],
+					addr[3], addr[4], addr[5]);
+			kfree_skb(skb);
+			continue;
+		}
 	}

 	current->state = TASK_RUNNING;
@@ -417,6 +480,12 @@
 		tun = netdev_priv(dev);
 		tun->dev = dev;
 		tun->flags = flags;
+		/* Be promiscuous by default to maintain previous behaviour. */
+		tun->if_flags = IFF_PROMISC;
+		/* Generate random Ethernet address. */
+		*(u16 *)tun->dev_addr = htons(0x00FF);
+		get_random_bytes(tun->dev_addr + sizeof(u16), 4);
+		memset(tun->chr_filter, 0, sizeof tun->chr_filter);

 		tun_net_init(dev);

@@ -457,13 +526,16 @@
 			unsigned int cmd, unsigned long arg)
 {
 	struct tun_struct *tun = file->private_data;
+	void __user* argp = (void __user*)arg;
+	struct ifreq ifr;
+
+	if (cmd == TUNSETIFF || _IOC_TYPE(cmd) == 0x89)
+		if (copy_from_user(&ifr, argp, sizeof ifr))
+			return -EFAULT;

 	if (cmd == TUNSETIFF && !tun) {
-		struct ifreq ifr;
 		int err;

-		if (copy_from_user(&ifr, (void __user *)arg, sizeof(ifr)))
-			return -EFAULT;
 		ifr.ifr_name[IFNAMSIZ-1] = '\0';

 		rtnl_lock();
@@ -473,7 +545,7 @@
 		if (err)
 			return err;

-		if (copy_to_user((void __user *)arg, &ifr, sizeof(ifr)))
+		if (copy_to_user(argp, &ifr, sizeof(ifr)))
 			return -EFAULT;
 		return 0;
 	}
@@ -519,6 +591,61 @@
 		break;
 #endif

+	case SIOCGIFFLAGS:
+		ifr.ifr_flags = tun->if_flags;
+		if (copy_to_user( argp, &ifr, sizeof ifr))
+			return -EFAULT;
+		return 0;
+
+	case SIOCSIFFLAGS:
+		/** Set the character device's interface flags. Currently only
+		 * IFF_PROMISC and IFF_ALLMULTI are used. */
+		tun->if_flags = ifr.ifr_flags;
+		DBG(KERN_INFO "%s: interface flags 0x%lx\n",
+				tun->dev->name, tun->if_flags);
+		return 0;
+
+	case SIOCGIFHWADDR:
+		memcpy(ifr.ifr_hwaddr.sa_data, tun->dev_addr,
+				min(sizeof ifr.ifr_hwaddr.sa_data, sizeof tun->dev_addr));
+		if (copy_to_user( argp, &ifr, sizeof ifr))
+			return -EFAULT;
+		return 0;
+
+	case SIOCSIFHWADDR:
+		/** Set the character device's hardware address. This is used when
+		 * filtering packets being sent from the network device to the character
+		 * device. */
+		memcpy(tun->dev_addr, ifr.ifr_hwaddr.sa_data,
+				min(sizeof ifr.ifr_hwaddr.sa_data, sizeof tun->dev_addr));
+		DBG(KERN_DEBUG "%s: set hardware address: %x:%x:%x:%x:%x:%x\n",
+				tun->dev->name,
+				tun->dev_addr[0], tun->dev_addr[1], tun->dev_addr[2],
+				tun->dev_addr[3], tun->dev_addr[4], tun->dev_addr[5]);
+		return 0;
+
+	case SIOCADDMULTI:
+		/** Add the specified group to the character device's multicast filter
+		 * list. */
+		add_multi(tun->chr_filter, ifr.ifr_hwaddr.sa_data);
+		DBG(KERN_DEBUG "%s: add multi: %x:%x:%x:%x:%x:%x\n",
+				tun->dev->name,
+				(u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1],
+				(u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3],
+				(u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]);
+		return 0;
+
+	case SIOCDELMULTI:
+		/** Remove the specified group from the character device's multicast
+		 * filter list. */
+		del_multi(tun->chr_filter, ifr.ifr_hwaddr.sa_data);
+		DBG(KERN_DEBUG "%s: del multi: %x:%x:%x:%x:%x:%x\n",
+				tun->dev->name,
+				(u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1],
+				(u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3],
+				(u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]);
+		return 0;
+
 	default:
 		return -EINVAL;
 	};
diff -ur linux-2.6.8.1.orig/include/linux/if_tun.h linux-2.6.8.1/include/linux/if_tun.h
--- linux-2.6.8.1.orig/include/linux/if_tun.h	2004-08-14 03:55:09.000000000 -0700
+++ linux-2.6.8.1/include/linux/if_tun.h	2004-11-25 16:47:31.000000000 -0800
@@ -45,6 +45,11 @@

 	struct fasync_struct *fasync;

+	unsigned long if_flags;
+	u8 dev_addr[ETH_ALEN];
+	u32 chr_filter[2];
+	u32 net_filter[2];
+
 #ifdef TUN_DEBUG
 	int debug;
 #endif

From jketreno@linux.intel.com Wed Dec 1 17:35:30 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 17:35:45 -0800 (PST)
Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB21ZQbP012623
	for ; Wed, 1 Dec 2004 17:35:30 -0800
Received: from
talaria.jf.intel.com (talaria.jf.intel.com [10.7.209.7]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id iB21Z0xK001499 for ; Thu, 2 Dec 2004 01:35:00 GMT
Received: from linux.intel.com (vpnfm001-139-dhcp-client.fm.intel.com [10.19.13.139]) by talaria.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.11 2004/07/29 22:51:53 root Exp $) with ESMTP id iB21Q9uc018759 for ; Thu, 2 Dec 2004 01:26:09 GMT
Message-ID: <41AE7143.80505@linux.intel.com>
Date: Wed, 01 Dec 2004 19:34:59 -0600
From: James Ketrenos
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6b) Gecko/20031205 Thunderbird/0.4
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Netdev
Subject: Steps for netdev-2.6 inclusion?
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang)
X-archive-position: 12377
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jketreno@linux.intel.com
Precedence: bulk
X-list: netdev

Ok, it's been a long time coming, but it appears the ipw* wireless drivers are to the point where being more proactive at getting them into the kernel is appropriate (at least based on the frequency of emails I'm getting of 'why isn't this in mainline?').

So, what would be the set of steps required to get a version in the queue for inclusion?

The drivers we have are the ipw2100 (supporting the Intel PRO/Wireless 2100 Network Connection adapter) and ipw2200 (supporting the Intel PRO/Wireless 2200BG and 2915ABG Network Connection adapters). There is also a shared generic IEEE 802.11 stack (ieee80211*.ko) supporting 802.11b, g, and a for BSS and IBSS modes.

The ipw2100 recently was stamped 1.0.0, which means we've put it through a validation and stabilization phase.
As we want to try and ensure end users don't have to spend all night fighting with their wireless connection just because they've updated the kernel, we've adopted the following version numbering scheme for the ipw* drivers in the form of x.y.z, where:

.z increases from snapshot to snapshot (pushed out as tarballs on SourceForge).
.y increases (and sets .z to 0) when a snapshot has gone through a regression validation cycle.
.x increases if there are significant functionality changes to the driver.

The idea is to then only have x.y.0 (stable/tested) versions go out for wider distribution (kernel inclusion).

For those that are curious, the tip of development for the ipw2100, ipw2200, and ieee80211 stack is available at bk://ipw.bkbits.net/ipw-2.6.

Given what I described above, would it be most appropriate to create an ipw-2.6-stable bk tree with the parent as netdev-2.6 that we put the stable versions of the ipw* drivers into, and then request that be pulled? The ipw-2.6 tree could then continue to represent the development tip. I've searched around for some BKMs on this, but haven't found a whole lot.

The ipw2100 1.0.0 snapshot (and newer versions) can be found at http://ipw2100.sf.net/#downloads. The latest ipw2200 snapshot (0.15) is available at http://ipw2200.sf.net/#downloads.

Also, we have a Bugzilla database set up for the above drivers at http://bughost.org for those that are curious about current remaining issues, etc.

Thanks,
James

From horms@koto.vergenet.net Wed Dec 1 20:31:46 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 20:31:54 -0800 (PST)
Received: from koto.vergenet.net (koto.vergenet.net [210.128.90.7]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB24VjpX020259 for ; Wed, 1 Dec 2004 20:31:45 -0800
Received: (qmail 26081 invoked by uid 7100); 2 Dec 2004 04:15:29 -0000
Date: Thu, 2 Dec 2004 13:07:20 +0900
From: Horms
To: Wensong Zhang
Cc: "David S.
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] [IPVS] add a sysctl variable to expire quiescent template Message-ID: <20041202040717.GF32190@verge.net.au> Mail-Followup-To: Wensong Zhang , "David S. Miller" , netdev@oss.sgi.com References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Cluestick: seven User-Agent: Mutt/1.5.6+20040907i X-archive-position: 12378 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: horms@verge.net.au Precedence: bulk X-list: netdev On Thu, Dec 02, 2004 at 12:48:26AM +0800, Wensong Zhang wrote: > > > Hi Dave, > > Here is the patch from Horms to add a sysctl > variable to expire quiescent templat. Please check and apply them to > kernel 2.4 and 2.6 respectively. > > Thanks, > > Wensong I can do this too, just in case you need it. Signed-off-by: Horms > # This is a BitKeeper generated diff -Nru style patch. > # > # ChangeSet > # 2004/12/02 00:02:48+08:00 wensong@linux-vs.org > # [IPVS] add a sysctl variable to expire quiescent template > # > # The patch is from Horms > # > # net/ipv4/ipvs/ip_vs_ctl.c > # 2004/12/02 00:02:38+08:00 wensong@linux-vs.org +4 -0 > # set sysctl_ip_vs_expire_quiescent_template > # > # net/ipv4/ipvs/ip_vs_conn.c > # 2004/12/02 00:02:37+08:00 wensong@linux-vs.org +3 -1 > # don't use quiescent template if the expire_quiescent_template is enabled > # > # include/net/ip_vs.h > # 2004/12/02 00:02:37+08:00 wensong@linux-vs.org +2 -0 > # add the sysctl_ip_vs_expire_quiescent_template > # > diff -Nru a/include/net/ip_vs.h b/include/net/ip_vs.h > --- a/include/net/ip_vs.h 2004-12-02 00:16:36 +08:00 > +++ b/include/net/ip_vs.h 2004-12-02 00:16:36 +08:00 > @@ -317,6 +317,7 @@ > NET_IPV4_VS_EXPIRE_NODEST_CONN=23, > NET_IPV4_VS_SYNC_THRESHOLD=24, > NET_IPV4_VS_NAT_ICMP_SEND=25, > + NET_IPV4_VS_EXPIRE_QUIESCENT_TEMPLATE=26, > NET_IPV4_VS_LAST > }; > > @@ -700,6 +701,7 @@ > */ > extern int 
sysctl_ip_vs_cache_bypass; > extern int sysctl_ip_vs_expire_nodest_conn; > +extern int sysctl_ip_vs_expire_quiescent_template; > extern int sysctl_ip_vs_sync_threshold; > extern int sysctl_ip_vs_nat_icmp_send; > extern struct ip_vs_stats ip_vs_stats; > diff -Nru a/net/ipv4/ipvs/ip_vs_conn.c b/net/ipv4/ipvs/ip_vs_conn.c > --- a/net/ipv4/ipvs/ip_vs_conn.c 2004-12-02 00:16:36 +08:00 > +++ b/net/ipv4/ipvs/ip_vs_conn.c 2004-12-02 00:16:36 +08:00 > @@ -1131,7 +1131,9 @@ > * Checking the dest server status. > */ > if ((dest == NULL) || > - !(dest->flags & IP_VS_DEST_F_AVAILABLE)) { > + !(dest->flags & IP_VS_DEST_F_AVAILABLE) || > + (sysctl_ip_vs_expire_quiescent_template && > + (atomic_read(&dest->weight) == 0))) { > IP_VS_DBG(9, "check_template: dest not available for " > "protocol %s s:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d " > "-> d:%u.%u.%u.%u:%d\n", > diff -Nru a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c > --- a/net/ipv4/ipvs/ip_vs_ctl.c 2004-12-02 00:16:36 +08:00 > +++ b/net/ipv4/ipvs/ip_vs_ctl.c 2004-12-02 00:16:36 +08:00 > @@ -74,6 +74,7 @@ > static int sysctl_ip_vs_am_droprate = 10; > int sysctl_ip_vs_cache_bypass = 0; > int sysctl_ip_vs_expire_nodest_conn = 0; > +int sysctl_ip_vs_expire_quiescent_template = 0; > int sysctl_ip_vs_sync_threshold = 3; > int sysctl_ip_vs_nat_icmp_send = 0; > > @@ -1439,6 +1440,9 @@ > &proc_dointvec}, > {NET_IPV4_VS_NAT_ICMP_SEND, "nat_icmp_send", > &sysctl_ip_vs_nat_icmp_send, sizeof(int), 0644, NULL, > + &proc_dointvec}, > + {NET_IPV4_VS_EXPIRE_QUIESCENT_TEMPLATE, "expire_quiescent_template", > + &sysctl_ip_vs_expire_quiescent_template, sizeof(int), 0644, NULL, > &proc_dointvec}, > {0}}, > {{NET_IPV4_VS, "vs", NULL, 0, 0555, ipv4_vs_table.vs_vars}, > # This is a BitKeeper generated diff -Nru style patch. 
> # > # ChangeSet > # 2004/12/02 00:42:15+08:00 wensong@linux-vs.org > # [IPVS] add a sysctl variable to expire quiescent template > # > # The patch is from Horms > # > # net/ipv4/ipvs/ip_vs_ctl.c > # 2004/12/02 00:41:56+08:00 wensong@linux-vs.org +20 -11 > # set the sysctl_ip_vs_expire_quiescent_template > # > # net/ipv4/ipvs/ip_vs_conn.c > # 2004/12/02 00:41:56+08:00 wensong@linux-vs.org +3 -1 > # don't use quiescent template if the expire_quiescent_template is enabled > # > # include/net/ip_vs.h > # 2004/12/02 00:41:56+08:00 wensong@linux-vs.org +2 -0 > # add the sysctl_ip_vs_expire_quiescent_template prototype > # > diff -Nru a/include/net/ip_vs.h b/include/net/ip_vs.h > --- a/include/net/ip_vs.h 2004-12-02 00:43:14 +08:00 > +++ b/include/net/ip_vs.h 2004-12-02 00:43:14 +08:00 > @@ -358,6 +358,7 @@ > NET_IPV4_VS_EXPIRE_NODEST_CONN=23, > NET_IPV4_VS_SYNC_THRESHOLD=24, > NET_IPV4_VS_NAT_ICMP_SEND=25, > + NET_IPV4_VS_EXPIRE_QUIESCENT_TEMPLATE=26, > NET_IPV4_VS_LAST > }; > > @@ -879,6 +880,7 @@ > */ > extern int sysctl_ip_vs_cache_bypass; > extern int sysctl_ip_vs_expire_nodest_conn; > +extern int sysctl_ip_vs_expire_quiescent_template; > extern int sysctl_ip_vs_sync_threshold[2]; > extern int sysctl_ip_vs_nat_icmp_send; > extern struct ip_vs_stats ip_vs_stats; > diff -Nru a/net/ipv4/ipvs/ip_vs_conn.c b/net/ipv4/ipvs/ip_vs_conn.c > --- a/net/ipv4/ipvs/ip_vs_conn.c 2004-12-02 00:43:14 +08:00 > +++ b/net/ipv4/ipvs/ip_vs_conn.c 2004-12-02 00:43:14 +08:00 > @@ -453,7 +453,9 @@ > * Checking the dest server status. 
> */ > if ((dest == NULL) || > - !(dest->flags & IP_VS_DEST_F_AVAILABLE)) { > + !(dest->flags & IP_VS_DEST_F_AVAILABLE) || > + (sysctl_ip_vs_expire_quiescent_template && > + (atomic_read(&dest->weight) == 0))) { > IP_VS_DBG(9, "check_template: dest not available for " > "protocol %s s:%u.%u.%u.%u:%d v:%u.%u.%u.%u:%d " > "-> d:%u.%u.%u.%u:%d\n", > diff -Nru a/net/ipv4/ipvs/ip_vs_ctl.c b/net/ipv4/ipvs/ip_vs_ctl.c > --- a/net/ipv4/ipvs/ip_vs_ctl.c 2004-12-02 00:43:14 +08:00 > +++ b/net/ipv4/ipvs/ip_vs_ctl.c 2004-12-02 00:43:14 +08:00 > @@ -75,6 +75,7 @@ > static int sysctl_ip_vs_am_droprate = 10; > int sysctl_ip_vs_cache_bypass = 0; > int sysctl_ip_vs_expire_nodest_conn = 0; > +int sysctl_ip_vs_expire_quiescent_template = 0; > int sysctl_ip_vs_sync_threshold[2] = { 3, 50 }; > int sysctl_ip_vs_nat_icmp_send = 0; > > @@ -1447,9 +1448,9 @@ > { > .ctl_name = NET_IPV4_VS_TO_ES, > .procname = "timeout_established", > - .data = &vs_timeout_table_dos.timeout[IP_VS_S_ESTABLISHED], > + .data = &vs_timeout_table_dos.timeout[IP_VS_S_ESTABLISHED], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1457,7 +1458,7 @@ > .procname = "timeout_synsent", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_SYN_SENT], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1465,7 +1466,7 @@ > .procname = "timeout_synrecv", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_SYN_RECV], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1473,7 +1474,7 @@ > .procname = "timeout_finwait", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_FIN_WAIT], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1489,7 +1490,7 @@ > .procname = "timeout_close", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_CLOSE], > .maxlen = sizeof(int), > - .mode = 0644, > 
+ .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1497,7 +1498,7 @@ > .procname = "timeout_closewait", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_CLOSE_WAIT], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1505,7 +1506,7 @@ > .procname = "timeout_lastack", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_LAST_ACK], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1513,7 +1514,7 @@ > .procname = "timeout_listen", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_LISTEN], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1521,7 +1522,7 @@ > .procname = "timeout_synack", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_SYNACK], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1529,7 +1530,7 @@ > .procname = "timeout_udp", > .data = &vs_timeout_table_dos.timeout[IP_VS_S_UDP], > .maxlen = sizeof(int), > - .mode = 0644, > + .mode = 0644, > .proc_handler = &proc_dointvec_jiffies, > }, > { > @@ -1553,6 +1554,14 @@ > .ctl_name = NET_IPV4_VS_EXPIRE_NODEST_CONN, > .procname = "expire_nodest_conn", > .data = &sysctl_ip_vs_expire_nodest_conn, > + .maxlen = sizeof(int), > + .mode = 0644, > + .proc_handler = &proc_dointvec, > + }, > + { > + .ctl_name = NET_IPV4_VS_EXPIRE_QUIESCENT_TEMPLATE, > + .procname = "expire_quiescent_template", > + .data = &sysctl_ip_vs_expire_quiescent_template, > .maxlen = sizeof(int), > .mode = 0644, > .proc_handler = &proc_dointvec, -- Horms From sfeldma@pobox.com Wed Dec 1 22:11:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 22:11:23 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB26BGK4022289 for ; Wed, 1 Dec 2004 22:11:17 -0800 Received: from orb (localhost 
[127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 2F5622FA235; Thu, 2 Dec 2004 01:10:55 -0500 (EST) Received: from [192.168.2.72] (adsl-68-127-20-190.dsl.pltn13.pacbell.net [68.127.20.190]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 585752F9F7E; Thu, 2 Dec 2004 01:10:45 -0500 (EST) Subject: Re: [E1000-devel] Transmission limit From: Scott Feldman Reply-To: sfeldma@pobox.com To: Lennert Buytenhek Cc: jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <20041201213550.GF14470@xi.wantstofly.org> References: <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> Content-Type: text/plain Message-Id: <1101967983.4782.9.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Wed, 01 Dec 2004 22:13:33 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12379 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Wed, 2004-12-01 at 13:35, Lennert Buytenhek wrote: > Pretty graph attached. From ~220B packets or so it does wire speed, but > there's still an odd drop in performance around 256B packets (which is > also there without your patch.) From 350B packets or so, performance is > identical with or without your patch (wire speed.) Seems this is helping PCI nics but not PCI-X. I was using PCI 32/33. Can't explain the dip around 256B. > So. 
Do you have any other good plans perhaps? :)

Idea#1

Is the write of TDT causing interference with DMA transactions?

In addition to my patch, what happens if you bump the Tx tail every n packets, where n is like 16 or 32 or 64?

    if((i % 16) == 0)
        E1000_REG_WRITE(&adapter->hw, TDT, i);

This might piss the NETDEV timer off if the send count isn't a multiple of n, so you might want to disable netdev->tx_timeout.

Idea#2

The Ultimate: queue up 4096 packets and then write TDT once to send all 4096 in one shot. Well, maybe a few fewer than 4096 so we don't wrap the ring. How about pkt_size = 4000?

Take my patch and change the timer call in e1000_xmit_frame from

    jiffies + 1

to

    jiffies + HZ

This will schedule the cleanup of the skbs 1 second after the first queue, so we shouldn't be doing any cleanup while the 4000 packets are DMA'ed.

Oh, and change the tail write to

    if((i % 4000) == 0)
        E1000_REG_WRITE(&adapter->hw, TDT, i);

Of course you'll need to close/open the driver after each run.

Idea#3

http://www.mail-archive.com/freebsd-net@freebsd.org/msg10826.html

Set TXDMAC to 0 in e1000_configure_tx.

> > Once or twice it went into a state where it started spitting out these
> > kinds of messages and never recovered:
> >
> > Dec 1 19:13:18 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
> > [...]
> > Dec 1 19:13:31 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
> > [...]
> > Dec 1 19:13:43 phi kernel: NETDEV WATCHDOG: eth1: transmit timed out
>
> Didn't see this happen anymore. (ifconfig down and then up recovered it
> both times I saw it happen.)

Well, it's probably not a HW bug that's causing the reset; it's probably some bug with my patch.
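A minimal sketch of the tail batching in Idea#1, written as a user-space simulation rather than driver code. The names struct tx_ring_sim, sim_write_tail() and sim_queue_packets() are hypothetical stand-ins; the real driver writes TDT through E1000_REG_WRITE. The final flush is an assumption added here: it pushes out the remainder when the send count is not a multiple of n, which is the case flagged above as likely to upset the NETDEV watchdog.

```c
#include <stdint.h>

/* Hypothetical model of a Tx descriptor ring; not the e1000 structures. */
struct tx_ring_sim {
    uint32_t next_to_use;  /* software index of the next free descriptor */
    uint32_t tail;         /* last value written to the simulated TDT */
    uint32_t tail_writes;  /* how many register writes were issued */
    uint32_t count;        /* number of descriptors in the ring */
};

/* Stand-in for E1000_REG_WRITE(&adapter->hw, TDT, val). */
static void sim_write_tail(struct tx_ring_sim *r, uint32_t val)
{
    r->tail = val;
    r->tail_writes++;
}

/* Queue npkts descriptors, bumping the tail only once per batch. */
static void sim_queue_packets(struct tx_ring_sim *r, uint32_t npkts,
                              uint32_t batch)
{
    if (batch == 0)
        batch = 1;  /* avoid dividing by zero; degenerate to per-packet writes */

    for (uint32_t i = 0; i < npkts; i++) {
        r->next_to_use = (r->next_to_use + 1) % r->count;
        if (((i + 1) % batch) == 0)
            sim_write_tail(r, r->next_to_use);
    }

    /* Assumed extra step: flush the partial batch so no packet is stranded. */
    if (r->tail != r->next_to_use)
        sim_write_tail(r, r->next_to_use);
}
```

With a 4096-entry ring, 100 packets and a batch of 16, this issues six batched tail writes plus one flush instead of 100 per-packet writes.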
-scott From jgarzik@pobox.com Wed Dec 1 22:19:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 22:19:11 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB26J6O8022740 for ; Wed, 1 Dec 2004 22:19:06 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CZkId-00016k-W9; Thu, 02 Dec 2004 06:18:44 +0000 Message-ID: <41AEB3B8.2000406@pobox.com> Date: Thu, 02 Dec 2004 01:18:32 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: James Ketrenos CC: Netdev Subject: Re: Steps for netdev-2.6 inclusion? References: <41AE7143.80505@linux.intel.com> In-Reply-To: <41AE7143.80505@linux.intel.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 12380 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev It's fairly easy, just email me and netdev the patch for inclusion, and it'll get reviewed. Once review issues are addressed, I'll merge it immediately, which causes it to be automatically propagated to Andrew Morton's -mm tree for testing. Once consensus agrees that we can push this + HostAP upstream, that's an easy 10-minute task. One potential showstopper is firmware crapola: I'm concerned about a situation where we have drivers in the kernel, but the firmware must be downloaded from SourceForge or somesuch. IOW, the kernel driver as-is is useless without a differently-licensed firmware. Comments? 
Jeff From akpm@osdl.org Wed Dec 1 23:34:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 01 Dec 2004 23:34:55 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB27YpGI024362 for ; Wed, 1 Dec 2004 23:34:51 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iB27YM912409; Wed, 1 Dec 2004 23:34:22 -0800 Date: Wed, 1 Dec 2004 23:34:00 -0800 From: Andrew Morton To: Shaun Jackman Cc: netdev@oss.sgi.com Subject: Re: Multicast filtering for tun.c [PATCH] Message-Id: <20041201233400.45078efe.akpm@osdl.org> In-Reply-To: <7f45d939041201140329d0273f@mail.gmail.com> References: <7f45d9390411251715138b35d0@mail.gmail.com> <20041127011006.03232bb6.akpm@osdl.org> <7f45d939041201140329d0273f@mail.gmail.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12381 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Shaun Jackman wrote: > > This patch adds multicast filtering to the TUN network driver, for > packets being sent from the network device to the character device. It > applies against the 2.6.8.1 kernel tree. > > ... Minor points: > diff -ur linux-2.6.8.1.orig/drivers/net/tun.c linux-2.6.8.1/drivers/net/tun.c > --- linux-2.6.8.1.orig/drivers/net/tun.c 2004-08-14 03:55:23.000000000 -0700 > +++ linux-2.6.8.1/drivers/net/tun.c 2004-11-25 17:00:22.000000000 -0800 > @@ -41,6 +41,7 @@ > #include > #include > #include > +#include You're sure this shouldn't be using crc-ccitt? > +del_multi(u32* filter, const u8* addr) del_multi(u32 *filter, const u8 *addr) would be a more typical layout. 
> static struct net_device_stats *tun_net_stats(struct net_device *dev)
> @@ -301,6 +333,10 @@
>
> add_wait_queue(&tun->read_wait, &wait);
> while (len) {
> + const u8 ones[ ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

The space after the `[' is gratuitous.

This array will be allocated on the stack and populated by hand each time this function is called. It should be made static.

> + u8 addr[ ETH_ALEN];

extraneous space.

> + memcpy(addr, skb->data, min(sizeof addr, skb->len));

We normally put parentheses around the argument of the sizeof operator, for no particular reason ;)

> + if (copy_from_user(&ifr, argp, sizeof ifr))

Ditto

> + case SIOCGIFFLAGS:
> + ifr.ifr_flags = tun->if_flags;
> + if (copy_to_user( argp, &ifr, sizeof ifr))

extraneous space.

> + if (copy_to_user( argp, &ifr, sizeof ifr))

ditto

> + DBG(KERN_DEBUG "%s: add multi: %x:%x:%x:%x:%x:%x\n",
> + tun->dev->name,
> + (u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1],
> + (u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3],
> + (u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]);

Why all the typecasts?

> + case SIOCDELMULTI:
> + /** Remove the specified group from the character device's multicast
> + * filter list. */
> + del_multi(tun->chr_filter, ifr.ifr_hwaddr.sa_data);
> + DBG(KERN_DEBUG "%s: del multi: %x:%x:%x:%x:%x:%x\n",
> + tun->dev->name,
> + (u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1],
> + (u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3],
> + (u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]);

ditto.

From cranium2003@yahoo.com Thu Dec 2 01:44:59 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 01:45:11 -0800 (PST)
Received: from web41411.mail.yahoo.com (web41411.mail.yahoo.com [66.218.93.77]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB29ix1n000503 for ; Thu, 2 Dec 2004 01:44:59 -0800
Received: (qmail 39134 invoked by uid 60001); 2 Dec 2004 09:44:33 -0000
Comment: DomainKeys?
See http://antispam.yahoo.com/domainkeys
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=k92BT0y43RKYSYKMggOHP5GR4uBRRgWyRcOAX5JmqyP/9qc7ObXyWrE7P9ymSzL+66OC8Ent972edeO7XCBfxbGmHyDKsnRKOhZZP1xOKL0zYIpVe3+rRwO6a4AMd3PpgkzKY/dpAceYFejgq5HEi0RIiLQRnYBcF1ypdL5sjWo= ;
Message-ID: <20041202094433.39132.qmail@web41411.mail.yahoo.com>
Received: from [202.56.231.117] by web41411.mail.yahoo.com via HTTP; Thu, 02 Dec 2004 01:44:33 PST
Date: Thu, 2 Dec 2004 01:44:33 -0800 (PST)
From: cranium2003
Subject: Maximum packet size in kernel
To: kernerl mail
Cc: net dev
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-archive-position: 12382
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: cranium2003@yahoo.com
Precedence: bulk
X-list: netdev

hello,

I want to know the maximum packet size that alloc_skb will allocate in the kernel. Is there any extra space left over in the memory allocated for the packet? Does it depend on the CPU cache?

regards,
cranium.

From rl@hellgate.ch Thu Dec 2 03:02:30 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 03:02:36 -0800 (PST)
Received: from mail2.bluewin.ch (mail2.bluewin.ch [195.186.4.73]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2B2TXn004351 for ; Thu, 2 Dec 2004 03:02:30 -0800
Received: from k3.hellgate.ch (83.77.143.24) by mail2.bluewin.ch (Bluewin AG 7.0.031.3) id 41888440002B727B; Thu, 2 Dec 2004 11:01:17 +0000
Received: by k3.hellgate.ch (Postfix, from userid 1000) id 8BCF691F5FB; Thu, 2 Dec 2004 12:01:20 +0100 (CET)
Date: Thu, 2 Dec 2004 12:01:20 +0100
From: Roger Luethi
To: Lennert Buytenhek
Cc: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: via-rhine unable to send back-to-back packets?
Message-ID: <20041202110120.GA16597@k3.hellgate.ch>
References: <20041129222700.GA22918@xi.wantstofly.org> <20041129172540.6b959858.davem@davemloft.net> <20041130064823.GA27872@xi.wantstofly.org> <20041130122503.0adac947.davem@davemloft.net> <20041130220644.GC29947@k3.hellgate.ch> <20041201200148.GE14470@xi.wantstofly.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041201200148.GE14470@xi.wantstofly.org>
X-Operating-System: Linux 2.6.10-rc2-bk11 on i686
X-GPG-Fingerprint: 92 F4 DC 20 57 46 7B 95 24 4E 9E E7 5A 54 DC 1B
X-GPG: 1024/80E744BD wwwkeys.ch.pgp.net
User-Agent: Mutt/1.5.6i
X-archive-position: 12383
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: rl@hellgate.ch
Precedence: bulk
X-list: netdev

[Thanks to A.J. from VNT for the answer which I am relaying below.]

On Wed, 01 Dec 2004 21:01:48 +0100, Lennert Buytenhek wrote:
> "Is the hardware capable of sending back-to-back packets (i.e. with
> an inter-packet gap of no more than 96 bit times)?"

=> YES. All the VIA Rhine Fast Ethernet controllers follow the IEEE standard (inter-frame gap is 96 bit times) and it is hardware fixed and NOT programmable.

> "Can misprogramming the chip lead to the effect that the inter-packet
> gap is never less than 112 bit times?"

=> NO, it is fixed, and neither a software nor a board-level change could modify the inter-frame gap of a VIA Rhine Fast Ethernet controller.

> > Of course, you can always check if VIA's driver has the same issue. If
> > it doesn't, chances are we can borrow the fix.
>
> Hmm, didn't know they had such a driver. Where can I find it?

=> At http://www.viaarena.com/ , choose "downloads" => "Drivers" => choose target OS (e.g. Fedora Core Linux) => "Ethernet (Networking/LAN)" => choose the type of your Ethernet controller => click to download the combo driver software package.
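To put the 96 versus 112 bit-time question in numbers, here is a small sketch that computes the theoretical frame rate on the wire for a given inter-frame gap. It is not taken from the via-rhine or VIA drivers; max_frames_per_sec() is a hypothetical helper, and it assumes the usual 8 bytes of preamble plus start-of-frame delimiter in front of every frame.

```c
#include <stdint.h>

/*
 * Theoretical maximum frames per second on an Ethernet link.
 *
 * link_bps    - link speed in bits per second (100000000 for Fast Ethernet)
 * frame_bytes - frame length including the FCS (64 is the minimum)
 * ifg_bits    - inter-frame gap in bit times (96 is the standard minimum)
 *
 * Each frame occupies 8 bytes of preamble/SFD, the frame itself, and the
 * gap that follows it. The result is truncated to a whole number.
 */
static uint64_t max_frames_per_sec(uint64_t link_bps, uint32_t frame_bytes,
                                   uint32_t ifg_bits)
{
    uint64_t bits_per_frame = ((uint64_t)frame_bytes + 8) * 8 + ifg_bits;

    return link_bps / bits_per_frame;
}
```

At 100 Mbit/s with minimum-size 64-byte frames, a 96 bit-time gap gives 148809 frames/s and a 112 bit-time gap gives 145348 frames/s, a difference of roughly 2.3%.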
From buytenh@wantstofly.org Thu Dec 2 03:09:59 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 03:10:04 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2B9wjF004946 for ; Thu, 2 Dec 2004 03:09:59 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500) id 17EA82B0ED; Thu, 2 Dec 2004 12:09:37 +0100 (MET)
Date: Thu, 2 Dec 2004 12:09:37 +0100
From: Lennert Buytenhek
To: Roger Luethi
Cc: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: via-rhine unable to send back-to-back packets?
Message-ID: <20041202110937.GC24069@xi.wantstofly.org>
References: <20041129222700.GA22918@xi.wantstofly.org> <20041129172540.6b959858.davem@davemloft.net> <20041130064823.GA27872@xi.wantstofly.org> <20041130122503.0adac947.davem@davemloft.net> <20041130220644.GC29947@k3.hellgate.ch> <20041201200148.GE14470@xi.wantstofly.org> <20041202110120.GA16597@k3.hellgate.ch>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041202110120.GA16597@k3.hellgate.ch>
User-Agent: Mutt/1.4.1i
X-archive-position: 12384
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

On Thu, Dec 02, 2004 at 12:01:20PM +0100, Roger Luethi wrote:

> On Wed, 01 Dec 2004 21:01:48 +0100, Lennert Buytenhek wrote:
> > "Is the hardware capable of sending back-to-back packets (i.e. with
> > an inter-packet gap of no more than 96 bit times)?"
>
> => YES. All the VIA Rhine Fast Ethernet controllers follow the IEEE
> standard (inter-frame gap is 96 bit times) and it is hardware fixed and
> NOT programmable.

As far as I know, the IEEE standard only mandates a certain minimum inter-frame gap.

Thanks for the information, I'll have a look at the VIA driver.
--L

From jgarzik@pobox.com Thu Dec 2 03:25:43 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 03:25:49 -0800 (PST)
Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2BPgmF005571 for ; Thu, 2 Dec 2004 03:25:43 -0800
Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CZp5N-00027Y-1w; Thu, 02 Dec 2004 11:25:21 +0000
Message-ID: <41AEFB95.8000100@pobox.com>
Date: Thu, 02 Dec 2004 06:25:09 -0500
From: Jeff Garzik
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Stephen Hemminger
CC: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: [PATCH] b44: allow ethtool get_settings when down
References: <20041129094523.3185c64c@zqx3.pdx.osdl.net>
In-Reply-To: <20041129094523.3185c64c@zqx3.pdx.osdl.net>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 12385
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jgarzik@pobox.com
Precedence: bulk
X-list: netdev

Stephen Hemminger wrote:
> The FC and Suse startup scripts use ethtool to check for link present. This has
> problems on my laptop with Broadcom because it queries settings before
> bringing link up. The problem is driver returns EAGAIN when queried for
> settings but not up. Just go ahead and return values anyway, the supported and link
> state values will be correct, speed will end up being 10BaseT/Half which is a
> reasonable default.
> > Signed-off-by: Stephen Hemminger > > diff -Nru a/drivers/net/b44.c b/drivers/net/b44.c > --- a/drivers/net/b44.c 2004-11-29 09:41:27 -08:00 > +++ b/drivers/net/b44.c 2004-11-29 09:41:27 -08:00 > @@ -1487,8 +1487,6 @@ > { > struct b44 *bp = netdev_priv(dev); > > - if (!(bp->flags & B44_FLAG_INIT_COMPLETE)) > - return -EAGAIN; > cmd->supported = (SUPPORTED_Autoneg); > cmd->supported |= (SUPPORTED_100baseT_Half | > SUPPORTED_100baseT_Full | I'm not so sure about this one... This sounds like working around stupid userland in the kernel? Jeff From mellia@prezzemolo.polito.it Thu Dec 2 05:40:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 05:41:04 -0800 (PST) Received: from prezzemolo.polito.it (IDENT:root@prezzemolo.polito.it [130.192.9.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2DeuAB012945 for ; Thu, 2 Dec 2004 05:40:57 -0800 Received: from mellia.lipar.polito.it ([192.168.85.105]) by prezzemolo.polito.it (8.12.10/8.12.10) with ESMTP id iB2DdWdW015837; Thu, 2 Dec 2004 14:39:35 +0100 Subject: Re: [E1000-devel] Transmission limit From: Marco Mellia Reply-To: mellia@prezzemolo.polito.it To: hadi@cyberus.ca Cc: mellia@prezzemolo.polito.it, Lennert Buytenhek , Harald Welte , P@draigBrady.com, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <1101903944.1042.29.camel@jzny.localdomain> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <1101483081.24742.174.camel@mellia.lipar.polito.it> <20041127092503.GA12592@sunbeam.de.gnumonks.org> <1101718412.14930.46.camel@verza.polito.it> <20041129145028.GC18788@xi.wantstofly.org> <1101804146.11111.23.camel@mellia.lipar.polito.it> <1101903944.1042.29.camel@jzny.localdomain> Content-Type: text/plain Organization: Message-Id: <1101994772.18491.16.camel@mellia.lipar.polito.it> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 02 Dec 2004 14:39:32 +0100 Content-Transfer-Encoding: 
7bit X-TLC-MailScanner-Information: Please contact the ISP for more information X-TLC-MailScanner: Found to be clean X-TLC-MailScanner-SpamCheck: not spam, SpamAssassin (score=-4.785, required 5.5, AWL 0.12, BAYES_00 -4.90) X-archive-position: 12386 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mellia@prezzemolo.polito.it Precedence: bulk X-list: netdev > > > > We are thinking of sending packet in "bursts" instead of single > > > > transfers. The only problem is to let the NIC know that there are more > > > > than a packet in a burst... > > > > > > Jamal implemented exactly this for e1000 already, he might be persuaded > > > into posting his patch here. Jamal? :) > > > > I guess that saying that we are _very_ interested in this might help. > > :-) > > We can offer as "beta-testers" as well... > > Sorry missed this (I wasnt CCed so it went to a low priority queue which > i read on a best effort basis). > Let me clean up the patches a little bit this weekend. The patch is at > least 4 months old; latest reincarnation was due to issue1 on my SUCON > presentation. Would a patch against latest 2.6.x bitkeeper (whatever it > is this weekend) be fine? If you are in a rush and dont mind a little > ugliness then i will pass them as is. > We'll be glad to spend some time trying this out. However, we are not very comfortable with the Linux BitKeeper maintenance method. Could you provide us with a patch against a standard kernel/driver (whichever you prefer...)? A complete source sub-tree would also be ok ;-) > BTW, Scott posted a interesting patch yesterday, you may wanna give that > a shot as well. We're trying that out right now... (which means that we'll try it in a couple of days ;-)) Thanks a lot. -- Ciao, /\/\/\rco +-----------------------------------+ | Marco Mellia - Assistant Professor| | Tel: 39-011-2276-608 | | Tel: 39-011-564-4173 | | Cel: 39-340-9674888 | /"\ .. . . . . . . . . . . . . 
| Politecnico di Torino | \ / . ASCII Ribbon Campaign . | Corso Duca degli Abruzzi 24 | X .- NO HTML/RTF in e-mail . | Torino - 10129 - Italy | / \ .- NO Word docs in e-mail. | http://www1.tlc.polito.it/mellia | .. . . . . . . . . . . . . +-----------------------------------+ The box said "Requires Windows 95 or Better." So I installed Linux. From wensong@linux-vs.org Thu Dec 2 06:53:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 06:53:22 -0800 (PST) Received: from lb1.ctrip.com ([218.244.111.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2ErDuh020118 for ; Thu, 2 Dec 2004 06:53:15 -0800 Received: from penguin.linux-vs.org ([61.149.157.27]) by lb1.ctrip.com (8.12.10/8.12.10) with ESMTP id iB2EpwMh015802 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 2 Dec 2004 22:52:06 +0800 Received: from penguin.linux-vs.org (localhost.localdomain [127.0.0.1]) by penguin.linux-vs.org (8.12.8/8.12.8) with ESMTP id iB2EpiPX001140; Thu, 2 Dec 2004 22:51:44 +0800 Received: from localhost (wensong@localhost) by penguin.linux-vs.org (8.12.8/8.12.8/Submit) with ESMTP id iB2EpZmc001136; Thu, 2 Dec 2004 22:51:40 +0800 X-Authentication-Warning: penguin.linux-vs.org: wensong owned process doing -bs Date: Thu, 2 Dec 2004 22:51:34 +0800 (CST) From: Wensong Zhang To: Horms cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] [IPVS] add a sysctl variable to expire quiescent template In-Reply-To: <20041202040717.GF32190@verge.net.au> Message-ID: References: <20041202040717.GF32190@verge.net.au> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-archive-position: 12387 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wensong@linux-vs.org Precedence: bulk X-list: netdev On Thu, 2 Dec 2004, Horms wrote: > On Thu, Dec 02, 2004 at 12:48:26AM +0800, Wensong Zhang wrote: >> >> >> Hi Dave, >> >> Here is the patch from Horms to add a sysctl >> variable to expire quiescent templat. Please check and apply them to >> kernel 2.4 and 2.6 respectively. >> >> Thanks, >> >> Wensong > > I can do this too, just in case you need it. > > Signed-off-by: Horms > Sure. :) I will ask you to make the patches for kernel 2.4 and 2.6 next time, and I just need to verify and test them. :) Cheers, Wensong From mellia@prezzemolo.polito.it Thu Dec 2 09:26:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 09:26:16 -0800 (PST) Received: from prezzemolo.polito.it (IDENT:root@prezzemolo.polito.it [130.192.9.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2HQ8oi027479 for ; Thu, 2 Dec 2004 09:26:09 -0800 Received: from mellia.lipar.polito.it ([192.168.85.105]) by prezzemolo.polito.it (8.12.10/8.12.10) with ESMTP id iB2HOYdW032567; Thu, 2 Dec 2004 18:24:49 +0100 Subject: Re: [E1000-devel] Transmission limit From: Marco Mellia Reply-To: mellia@prezzemolo.polito.it To: hadi@cyberus.ca Cc: mellia@prezzemolo.polito.it, P@draigBrady.com, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <1101822364.1044.60.camel@jzny.localdomain> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <1101483081.24742.174.camel@mellia.lipar.polito.it> 
<1101498963.1076.39.camel@jzny.localdomain> <1101738118.14930.142.camel@verza.polito.it> <1101822364.1044.60.camel@jzny.localdomain> Content-Type: text/plain Organization: Message-Id: <1102008274.19646.14.camel@mellia.lipar.polito.it> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 02 Dec 2004 18:24:34 +0100 Content-Transfer-Encoding: 7bit X-TLC-MailScanner-Information: Please contact the ISP for more information X-TLC-MailScanner: Found to be clean X-TLC-MailScanner-SpamCheck: not spam, SpamAssassin (score=-4.788, required 5.5, AWL 0.11, BAYES_00 -4.90) X-archive-position: 12388 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mellia@prezzemolo.polito.it Precedence: bulk X-list: netdev > > In our experiments, we modified the kernel to drop packets just after > > receiving them. skb are just deallocated (using standerd kernel > > routines, i.e., no recycling is used). Logically, that happen when the > > netif_rx() is called. > > > > Now, we have three cases > > 1) just mofify the netif_rx() to drop packets. > > 2) as in one, plus remove the protocol check in the driver > > (i.e., comment the line > > skb->protocol = eth_type_trans(skb, netdev); > > ) to avoid to access the real packet data. > > 3) as in 2, but dealloc is performed at the driver level, instead of > > calling the netif_rx() > > > > In the first case, we can receive about 1.1Mpps (~80% of packets) > > Possible. I was able to receive 900Kpps or so in my experiments with > gact drop which is slightly above this with a 2.4 Ghz machine with IRQ > affinity. I double checked with the people that actually did the job. They indeed tested both cases, i.e., dropping packets either using IRQ (therefore using netif_rx()) or using NAPI (therefore using netif_receive_skb()). In both cases, disabling the eth_type_trans() check, we receive 100% of packets... > > In the third case, we can NOT receive 100% of packets! 
> > The only difference is that we actually _REMOVED_ a function call. This > > reduces the overhead, and the compiler/cpu/whatever can not optimize the > > data path to access to the skb which must be freed. > > It doesnt seem like you were running NAPI if you depended on calling > netif_rx > In that case, #3 would be freeing in hard IRQ context while #2 is > softIRQ. Again, it was my mistake. Case #3 was performed using the NAPI stack, i.e., freeing the skb in the driver instead of calling netif_receive_skb(). Doing that, we observed a performance drop, which we attribute to caching issues. Indeed, profiling with OProfile shows about twice as many cache misses in case #3 as in case #2. Again, we do not have a clear explanation, but our intuition is that adding a function call with a pointer argument might allow the compiler/CPU to prefetch the skb and speed up the memory release... > > Our guess is that by freeing up the skb in the netif_rx() function > > actually allows the compiler/cpu to prefetch the skb itself, and > > therefore keep the pipeline working... > > > > My guess is that if you change compiler, cpu, memory subsystem, you may > > get very counterintuitive results... > > Refer to my comment above. > Repeat tests with NAPI and see if you get same results. We were using NAPI. Sorry for the misunderstanding. Hope this helps. -- Ciao, /\/\/\rco +-----------------------------------+ | Marco Mellia - Assistant Professor| | Tel: 39-011-2276-608 | | Tel: 39-011-564-4173 | | Cel: 39-340-9674888 | /"\ .. . . . . . . . . . . . . | Politecnico di Torino | \ / . ASCII Ribbon Campaign . | Corso Duca degli Abruzzi 24 | X .- NO HTML/RTF in e-mail . | Torino - 10129 - Italy | / \ .- NO Word docs in e-mail. | http://www1.tlc.polito.it/mellia | .. . . . . . . . . . . . . +-----------------------------------+ The box said "Requires Windows 95 or Better." So I installed Linux. 
From sjackman@gmail.com Thu Dec 2 09:28:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 09:29:04 -0800 (PST) Received: from mproxy.gmail.com (mproxy.gmail.com [216.239.56.244]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2HSxL0027742 for ; Thu, 2 Dec 2004 09:28:59 -0800 Received: by mproxy.gmail.com with SMTP id w41so155372cwb for ; Thu, 02 Dec 2004 09:28:36 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=tt+V2mOZVtqNGj2lUTzutEa6y3bD2GO0hWS5b6Xur4NSQ2logiGPSVmg9tnuVxTgrY0xztUkGWxBv96sLTt11zWL2ksmpN/pd54vyP9sNEko8JgvzIXwJaePCalzBBPVOwGV8Yhvuj/nO6Li+oeAl6NTpGyqMEDf2axP1uCdbWQ= Received: by 10.11.116.64 with SMTP id o64mr418223cwc; Thu, 02 Dec 2004 09:28:36 -0800 (PST) Received: by 10.11.99.50 with HTTP; Thu, 2 Dec 2004 09:28:36 -0800 (PST) Message-ID: <7f45d9390412020928f298944@mail.gmail.com> Date: Thu, 2 Dec 2004 09:28:36 -0800 From: Shaun Jackman Reply-To: Shaun Jackman To: netdev@oss.sgi.com Subject: Re: Multicast filtering for tun.c [PATCH] In-Reply-To: <20041201233400.45078efe.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <7f45d9390411251715138b35d0@mail.gmail.com> <20041127011006.03232bb6.akpm@osdl.org> <7f45d939041201140329d0273f@mail.gmail.com> <20041201233400.45078efe.akpm@osdl.org> X-archive-position: 12389 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sjackman@gmail.com Precedence: bulk X-list: netdev Thanks for your careful review, Andrew. I appreciate your help. > You're sure this shouldn't be using crc-ccitt? I used ether_crc because every driver in drivers/net except ppp_async.c does. Either would work as long as the chosen hash function is used consistently. 
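To illustrate the point about using the hash consistently, here is a minimal user-space sketch of the 64-bit hash filter that add_multi() and del_multi() in the patch maintain. It assumes that ether_crc() is the bit-reversed little-endian CRC-32 seeded with ~0; the crc32_le(), bitrev32(), hash_bit() and test_multi() helpers below are stand-ins written for this sketch, not the kernel's implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Bitwise little-endian CRC-32 (polynomial 0xEDB88320), no final XOR.
 * Stand-in for the kernel's crc32_le(); an assumption of this sketch. */
static uint32_t crc32_le(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--) {
		crc ^= *p++;
		for (int i = 0; i < 8; i++)
			crc = (crc >> 1) ^ ((crc & 1) ? 0xEDB88320u : 0u);
	}
	return crc;
}

/* Reverse the bit order of a 32-bit word. */
static uint32_t bitrev32(uint32_t x)
{
	uint32_t r = 0;

	for (int i = 0; i < 32; i++) {
		r = (r << 1) | (x & 1);
		x >>= 1;
	}
	return r;
}

/* Assumed equivalent of ether_crc(): bit-reversed crc32_le seeded with ~0. */
static uint32_t ether_crc(size_t len, const uint8_t *data)
{
	return bitrev32(crc32_le(~0u, data, len));
}

/* Map an Ethernet address to one of the 64 filter bits (top 6 CRC bits). */
static unsigned int hash_bit(const uint8_t *addr)
{
	unsigned int bit_nr = ether_crc(ETH_ALEN, addr) >> 26;

	assert(bit_nr < 64);
	return bit_nr;
}

/* Set the filter bit for this address. */
static void add_multi(uint32_t filter[2], const uint8_t *addr)
{
	unsigned int bit_nr = hash_bit(addr);

	filter[bit_nr >> 5] |= UINT32_C(1) << (bit_nr & 31);
}

/* Clear the filter bit for this address. */
static void del_multi(uint32_t filter[2], const uint8_t *addr)
{
	unsigned int bit_nr = hash_bit(addr);

	filter[bit_nr >> 5] &= ~(UINT32_C(1) << (bit_nr & 31));
}

/* Return 1 if the filter bit for this address is set (hypothetical helper). */
static int test_multi(const uint32_t filter[2], const uint8_t *addr)
{
	unsigned int bit_nr = hash_bit(addr);

	return (filter[bit_nr >> 5] >> (bit_nr & 31)) & 1;
}
```

Because the filter is a hash, two addresses can map to the same bit, so the lookup may accept frames for groups that were never added, and del_multi() on one address also clears the bit for any other address sharing it. Whatever hash is chosen, the add, delete and lookup paths all have to use the same one.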
> del_multi(u32 *filter, const u8 *addr) > > would be a more typical layout. I prefer the asterisk with the rest of the type information, but I know the latter is more typical. Fixed. > > + const u8 ones[ ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; > > The space after the `[' is gratuitous. > > This array will be allocated onthe stack and populated by hand each time > this function is called. It should be made static. > > > + u8 addr[ ETH_ALEN]; > > extraneous space. Fixed. > > + memcpy(addr, skb->data, min(sizeof addr, skb->len)); > > We normally put parentheses around the argument of the sizeof operator, for > no particular reason ;) > > > + if (copy_from_user(&ifr, argp, sizeof ifr)) > > Ditto I prefer not to add the extraneous punctuation. I find it makes it harder to read. > > + case SIOCGIFFLAGS: > > + ifr.ifr_flags = tun->if_flags; > > + if (copy_to_user( argp, &ifr, sizeof ifr)) > > extraneous space. > > > + if (copy_to_user( argp, &ifr, sizeof ifr)) > > ditto Fixed. > > + DBG(KERN_DEBUG "%s: add multi: %x:%x:%x:%x:%x:%x\n", > > + tun->dev->name, > > + (u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1], > > + (u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3], > > + (u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]); > > Why all the typecasts? [clip] > ditto. sa_data is signed for some reason, which causes the signed byte to be sign extended resulting in the mac address being printed as 0:c:6e:14:ffffffa0:5a, for example. Here's a short diff of the changes to my patch. The full patch follows. 
27c27 < +add_multi(u32* filter, const u8* addr) --- > +add_multi(u32 *filter, const u8 *addr) 38c38 < +del_multi(u32* filter, const u8* addr) --- > +del_multi(u32 *filter, const u8 *addr) 71,72c71,72 < + const u8 ones[ ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; < + u8 addr[ ETH_ALEN]; --- > + static const u8 ones[ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; > + u8 addr[ETH_ALEN]; 137c137 < + void __user* argp = (void __user*)arg; --- > + void __user *argp = (void __user*)arg; 168c168 < + if (copy_to_user( argp, &ifr, sizeof ifr)) --- > + if (copy_to_user(argp, &ifr, sizeof ifr)) 183c183 < + if (copy_to_user( argp, &ifr, sizeof ifr)) --- > + if (copy_to_user(argp, &ifr, sizeof ifr)) Cheers, Shaun 2004-11-25 Shaun Jackman * drivers/net/tun.c: Add multicast filtering for packets travelling from the network device to the character device. * include/linux/if_tun.h (tun_struct): Add interface flags, a hardware device address, and a multicast filter. diff -ur linux-2.6.8.1.orig/drivers/net/tun.c linux-2.6.8.1/drivers/net/tun.c --- linux-2.6.8.1.orig/drivers/net/tun.c 2004-08-14 03:55:23.000000000 -0700 +++ linux-2.6.8.1/drivers/net/tun.c 2004-11-25 17:00:22.000000000 -0800 @@ -41,6 +41,7 @@ #include #include #include +#include #include #include @@ -104,11 +105,42 @@ return 0; } -static void tun_net_mclist(struct net_device *dev) +/** Add the specified Ethernet address to this multicast filter. */ +static void +add_multi(u32 *filter, const u8 *addr) { - /* Nothing to do for multicast filters. - * We always accept all frames. */ - return; + int bit_nr = ether_crc(ETH_ALEN, addr) >> 26; + filter[bit_nr >> 5] |= 1 << (bit_nr & 31); +} + +/** Remove the specified Ethernet address from this multicast filter. */ +static void +del_multi(u32 *filter, const u8 *addr) +{ + int bit_nr = ether_crc(ETH_ALEN, addr) >> 26; + filter[bit_nr >> 5] &= ~(1 << (bit_nr & 31)); +} + +/** Update the list of multicast groups to which the network device belongs. 
+ * This list is used to filter packets being sent from the character device to + * the network device. */ +static void +tun_net_mclist(struct net_device *dev) +{ + struct tun_struct *tun = netdev_priv(dev); + const struct dev_mc_list *mclist; + int i; + DBG(KERN_DEBUG "%s: tun_net_mclist: mc_count %d\n", + dev->name, dev->mc_count); + memset(tun->net_filter, 0, sizeof tun->net_filter); + for (i = 0, mclist = dev->mc_list; i < dev->mc_count && mclist != NULL; + i++, mclist = mclist->next) { + add_multi(tun->net_filter, mclist->dmi_addr); + DBG(KERN_DEBUG "%s: tun_net_mclist: %x:%x:%x:%x:%x:%x\n", + dev->name, + mclist->dmi_addr[0], mclist->dmi_addr[1], mclist->dmi_addr[2], + mclist->dmi_addr[3], mclist->dmi_addr[4], mclist->dmi_addr[5]); + } } static struct net_device_stats *tun_net_stats(struct net_device *dev) @@ -301,6 +333,10 @@ add_wait_queue(&tun->read_wait, &wait); while (len) { + static const u8 ones[ETH_ALEN] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff }; + u8 addr[ETH_ALEN]; + int bit_nr; + current->state = TASK_INTERRUPTIBLE; /* Read frames from the queue */ @@ -320,10 +356,37 @@ } netif_start_queue(tun->dev); - ret = tun_put_user(tun, skb, (struct iovec *) iv, len); - - kfree_skb(skb); - break; + /** Decide whether to accept this packet. This code is designed to + * behave identically to an Ethernet interface. Accept the packet if + * - we are promiscuous. + * - the packet is addressed to us. + * - the packet is broadcast. + * - the packet is multicast and + * - we are multicast promiscuous. + * - we belong to the multicast group. 
+ */ + memcpy(addr, skb->data, min(sizeof addr, skb->len)); + bit_nr = ether_crc(sizeof addr, addr) >> 26; + if ((tun->if_flags & IFF_PROMISC) || + memcmp(addr, tun->dev_addr, sizeof addr) == 0 || + memcmp(addr, ones, sizeof addr) == 0 || + (((addr[0] == 1 && addr[1] == 0 && addr[2] == 0x5e) || + (addr[0] == 0x33 && addr[1] == 0x33)) && + ((tun->if_flags & IFF_ALLMULTI) || + (tun->chr_filter[bit_nr >> 5] & (1 << (bit_nr & 31)))))) { + DBG(KERN_DEBUG "%s: tun_chr_readv: accepted: %x:%x:%x:%x:%x:%x\n", + tun->dev->name, addr[0], addr[1], addr[2], + addr[3], addr[4], addr[5]); + ret = tun_put_user(tun, skb, (struct iovec *) iv, len); + kfree_skb(skb); + break; + } else { + DBG(KERN_DEBUG "%s: tun_chr_readv: rejected: %x:%x:%x:%x:%x:%x\n", + tun->dev->name, addr[0], addr[1], addr[2], + addr[3], addr[4], addr[5]); + kfree_skb(skb); + continue; + } } current->state = TASK_RUNNING; @@ -417,6 +480,12 @@ tun = netdev_priv(dev); tun->dev = dev; tun->flags = flags; + /* Be promiscuous by default to maintain previous behaviour. */ + tun->if_flags = IFF_PROMISC; + /* Generate random Ethernet address. 
*/ + *(u16 *)tun->dev_addr = htons(0x00FF); + get_random_bytes(tun->dev_addr + sizeof(u16), 4); + memset(tun->chr_filter, 0, sizeof tun->chr_filter); tun_net_init(dev); @@ -457,13 +526,16 @@ unsigned int cmd, unsigned long arg) { struct tun_struct *tun = file->private_data; + void __user *argp = (void __user*)arg; + struct ifreq ifr; + + if (cmd == TUNSETIFF || _IOC_TYPE(cmd) == 0x89) + if (copy_from_user(&ifr, argp, sizeof ifr)) + return -EFAULT; if (cmd == TUNSETIFF && !tun) { - struct ifreq ifr; int err; - if (copy_from_user(&ifr, (void __user *)arg, sizeof(ifr))) - return -EFAULT; ifr.ifr_name[IFNAMSIZ-1] = '\0'; rtnl_lock(); @@ -473,7 +545,7 @@ if (err) return err; - if (copy_to_user((void __user *)arg, &ifr, sizeof(ifr))) + if (copy_to_user(argp, &ifr, sizeof(ifr))) return -EFAULT; return 0; } @@ -519,6 +591,61 @@ break; #endif + case SIOCGIFFLAGS: + ifr.ifr_flags = tun->if_flags; + if (copy_to_user(argp, &ifr, sizeof ifr)) + return -EFAULT; + return 0; + + case SIOCSIFFLAGS: + /** Set the character device's interface flags. Currently only + * IFF_PROMISC and IFF_ALLMULTI are used. */ + tun->if_flags = ifr.ifr_flags; + DBG(KERN_INFO "%s: interface flags 0x%lx\n", + tun->dev->name, tun->if_flags); + return 0; + + case SIOCGIFHWADDR: + memcpy(ifr.ifr_hwaddr.sa_data, tun->dev_addr, + min(sizeof ifr.ifr_hwaddr.sa_data, sizeof tun->dev_addr)); + if (copy_to_user(argp, &ifr, sizeof ifr)) + return -EFAULT; + return 0; + + case SIOCSIFHWADDR: + /** Set the character device's hardware address. This is used when + * filtering packets being sent from the network device to the character + * device. 
*/ + memcpy(tun->dev_addr, ifr.ifr_hwaddr.sa_data, + min(sizeof ifr.ifr_hwaddr.sa_data, sizeof tun->dev_addr)); + DBG(KERN_DEBUG "%s: set hardware address: %x:%x:%x:%x:%x:%x\n", + tun->dev->name, + tun->dev_addr[0], tun->dev_addr[1], tun->dev_addr[2], + tun->dev_addr[3], tun->dev_addr[4], tun->dev_addr[5]); + return 0; + + case SIOCADDMULTI: + /** Add the specified group to the character device's multicast filter + * list. */ + add_multi(tun->chr_filter, ifr.ifr_hwaddr.sa_data); + DBG(KERN_DEBUG "%s: add multi: %x:%x:%x:%x:%x:%x\n", + tun->dev->name, + (u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1], + (u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3], + (u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]); + return 0; + + case SIOCDELMULTI: + /** Remove the specified group from the character device's multicast + * filter list. */ + del_multi(tun->chr_filter, ifr.ifr_hwaddr.sa_data); + DBG(KERN_DEBUG "%s: del multi: %x:%x:%x:%x:%x:%x\n", + tun->dev->name, + (u8)ifr.ifr_hwaddr.sa_data[0], (u8)ifr.ifr_hwaddr.sa_data[1], + (u8)ifr.ifr_hwaddr.sa_data[2], (u8)ifr.ifr_hwaddr.sa_data[3], + (u8)ifr.ifr_hwaddr.sa_data[4], (u8)ifr.ifr_hwaddr.sa_data[5]); + return 0; + default: return -EINVAL; }; diff -ur linux-2.6.8.1.orig/include/linux/if_tun.h linux-2.6.8.1/include/linux/if_tun.h --- linux-2.6.8.1.orig/include/linux/if_tun.h 2004-08-14 03:55:09.000000000 -0700 +++ linux-2.6.8.1/include/linux/if_tun.h 2004-11-25 16:47:31.000000000 -0800 @@ -45,6 +45,11 @@ struct fasync_struct *fasync; + unsigned long if_flags; + u8 dev_addr[ETH_ALEN]; + u32 chr_filter[2]; + u32 net_filter[2]; + #ifdef TUN_DEBUG int debug; #endif From mellia@prezzemolo.polito.it Thu Dec 2 09:33:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 09:33:34 -0800 (PST) Received: from prezzemolo.polito.it (IDENT:root@prezzemolo.polito.it [130.192.9.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2HXSrW028218 for ; Thu, 2 Dec 2004 09:33:28 -0800 Received: 
from mellia.lipar.polito.it ([192.168.85.105]) by prezzemolo.polito.it (8.12.10/8.12.10) with ESMTP id iB2HVUdW000522; Thu, 2 Dec 2004 18:31:30 +0100 Subject: Re: [E1000-devel] Transmission limit From: Marco Mellia Reply-To: mellia@prezzemolo.polito.it To: sfeldma@pobox.com Cc: birke@serveliper.polito.it, Lennert Buytenhek , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> Content-Type: text/plain Organization: Message-Id: <1102008690.19646.22.camel@mellia.lipar.polito.it> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 (1.2.2-5) Date: 02 Dec 2004 18:31:30 +0100 Content-Transfer-Encoding: 7bit X-TLC-MailScanner-Information: Please contact the ISP for more information X-TLC-MailScanner: Found to be clean X-TLC-MailScanner-SpamCheck: not spam, SpamAssassin (score=-4.788, required 5.5, AWL 0.11, BAYES_00 -4.90) X-archive-position: 12390 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mellia@prezzemolo.polito.it Precedence: bulk X-list: netdev On Wed, 2004-12-01 at 02:09, Scott Feldman wrote: > Hey, turns out, I know some e1000 tricks that might help get the kpps > numbers up. 
> > My problem is I only have a P4 desktop system with a 82544 nic running > at PCI 32/33Mhz, so I can't play with the big boys. But, attached is a > rework of the Tx path to eliminate 1) Tx interrupts, and 2) Tx > descriptor write-backs. For me, I see a nice jump in kpps, but I'd like > others to try with their setups. We should be able to get to wire speed > with 60-byte packets. > Here are the numbers in our setup: vanilla kernel [2.4.20 + packetgen + driver e1000 5.4.11] 4096 Descr => 356 Mbps (60-byte frames) => 941 Mbps (1500-byte frames) 256 Descr => 354 Mbps (60-byte frames) => 941 Mbps (1500-byte frames) Patched driver [2.4.20 + packetgen + driver e1000 5.4.11 patched] 4096 Descr => 357 Mbps (60-byte frames) => 941 Mbps (1500-byte frames) I guess that was _not_ the bottleneck, sigh... at least with a PCI-X bus. Again, could it be a latency issue in the DMA transfer from RAM to NIC? -- Ciao, /\/\/\rco +-----------------------------------+ | Marco Mellia - Assistant Professor| | Tel: 39-011-2276-608 | | Tel: 39-011-564-4173 | | Cel: 39-340-9674888 | /"\ .. . . . . . . . . . . . . | Politecnico di Torino | \ / . ASCII Ribbon Campaign . | Corso Duca degli Abruzzi 24 | X .- NO HTML/RTF in e-mail . | Torino - 10129 - Italy | / \ .- NO Word docs in e-mail. | http://www1.tlc.polito.it/mellia | .. . . . . . . . . . . . . +-----------------------------------+ The box said "Requires Windows 95 or Better." So I installed Linux. 
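For context on the throughput figures above, here is a rough sketch of the theoretical frame rate on gigabit Ethernet. It assumes that each frame costs an extra 4 bytes of CRC, 8 bytes of preamble and 12 bytes (96 bit times) of inter-frame gap on the wire, and that the Mbps figures count 60 bytes per frame; neither assumption is confirmed by the measurements themselves, and the function names are made up for illustration.

```c
/* Per-frame wire overhead on Ethernet, in bytes (assumed accounting). */
#define LINK_BPS        1000000000.0  /* 1 Gb/s link */
#define CRC_BYTES       4
#define PREAMBLE_BYTES  8
#define IFG_BYTES       12            /* 96 bit times */

/* Maximum frames per second for frames of frame_bytes (excluding CRC). */
static double max_pps(unsigned int frame_bytes)
{
	unsigned int wire_bytes =
		frame_bytes + CRC_BYTES + PREAMBLE_BYTES + IFG_BYTES;

	return LINK_BPS / (8.0 * wire_bytes);
}

/* Frames per second implied by a throughput figure that counts
 * counted_bytes per frame. */
static double mbps_to_pps(double mbps, unsigned int counted_bytes)
{
	return mbps * 1e6 / (8.0 * counted_bytes);
}
```

Under these assumptions max_pps(60) is about 1.488 Mpps, while mbps_to_pps(356.0, 60) is about 742 kpps, i.e. roughly half of wire speed for minimum-size frames.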
From shemminger@osdl.org Thu Dec 2 09:52:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 09:52:34 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2HqRXO028837 for ; Thu, 2 Dec 2004 09:52:28 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB2Hpw908926; Thu, 2 Dec 2004 09:51:58 -0800 Date: Thu, 2 Dec 2004 09:51:58 -0800 From: Stephen Hemminger To: Jeff Garzik Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] b44: allow ethtool get_settings when down Message-Id: <20041202095158.0722d176@dxpl.pdx.osdl.net> In-Reply-To: <41AEFB95.8000100@pobox.com> References: <20041129094523.3185c64c@zqx3.pdx.osdl.net> <41AEFB95.8000100@pobox.com> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12391 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Thu, 02 Dec 2004 06:25:09 -0500 Jeff Garzik wrote: > Stephen Hemminger wrote: > > The FC and Suse startup scripts use ethtool to check for link present. This has > > problems on my laptop with Broadcom because it quieries settings before > > bringing link up. The problem is driver returns EAGAIN when queried for > > settings but not up. Just go ahead and return values anyway, the supported and link > > state values will be correct, speed will end up being 10BaseT/Half which is a > > reasonable default. 
> > > > Signed-off-by: Stephen Hemminger > > > > diff -Nru a/drivers/net/b44.c b/drivers/net/b44.c > > --- a/drivers/net/b44.c 2004-11-29 09:41:27 -08:00 > > +++ b/drivers/net/b44.c 2004-11-29 09:41:27 -08:00 > > @@ -1487,8 +1487,6 @@ > > { > > struct b44 *bp = netdev_priv(dev); > > > > - if (!(bp->flags & B44_FLAG_INIT_COMPLETE)) > > - return -EAGAIN; > > cmd->supported = (SUPPORTED_Autoneg); > > cmd->supported |= (SUPPORTED_100baseT_Half | > > SUPPORTED_100baseT_Full | > > I'm not so sure about this one... > > This sounds like working around stupid userland in the kernel? > > Jeff Don't bother with the patch; if I use smart userland code like NetworkManager then there is no problem. EAGAIN still seems like a poor choice of errno, though. How about ENETDOWN or ENONET? From Robert.Olsson@data.slu.se Thu Dec 2 09:55:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 09:55:12 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2Ht7ex029235 for ; Thu, 2 Dec 2004 09:55:07 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB2HsVKO021575; Thu, 2 Dec 2004 18:54:32 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 36228EC001; Thu, 2 Dec 2004 18:54:32 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16815.22232.190929.306486@robur.slu.se> Date: Thu, 2 Dec 2004 18:54:32 +0100 To: sfeldma@pobox.com Cc: Robert Olsson , Lennert Buytenhek , jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit In-Reply-To: <1101919791.5198.15.camel@localhost.localdomain> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> 
<1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <16813.58484.343629.570703@robur.slu.se> <1101919791.5198.15.camel@localhost.localdomain> X-Mailer: VM 7.18 under Emacs 21.3.1 X-archive-position: 12392 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Scott Feldman writes: > Thank you Robert for trying it out. Scott! I've rerun some of the tests. I've set maxcpus=1 to make sure everything happens on one CPU. Same HW as yesterday. I now see a lot of variation in the results from your patch. vanilla 804353pps 411Mb/sec (411828736bps) errors: 98877 patch TXD=4096 Sometimes: 882362pps 451Mb/sec (451769344bps) errors: 0 patch TXD=2048 Sometimes: 943007pps 482Mb/sec (482819584bps) errors: 0 But it very often runs at around 500 kpps with the patch. This smells like scheduling to me, as smaller rings usually mean higher performance, but the ring needs to be big enough to hide latencies. See also my next mail... 
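The bps figures in these results are consistent with counting 64 bytes per packet (a 60-byte frame plus 4 bytes of CRC), without preamble or inter-frame gap. A small sketch of that conversion, assuming this is how the packet generator accounts for packet size (the function name is made up for illustration):

```c
#include <stdint.h>

/* Convert packets per second to bits per second, counting the frame
 * plus a 4-byte CRC but no preamble or inter-frame gap (assumed). */
static uint64_t pps_to_bps(uint64_t pps, unsigned int frame_bytes)
{
	return pps * (frame_bytes + 4u) * 8u;
}
```

For example, pps_to_bps(804353, 60) gives 411828736, which matches the vanilla result above.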
--ro From Robert.Olsson@data.slu.se Thu Dec 2 10:23:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 10:24:00 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2INsbO030225 for ; Thu, 2 Dec 2004 10:23:55 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB2INOKO003502; Thu, 2 Dec 2004 19:23:24 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 1F693EC001; Thu, 2 Dec 2004 19:23:24 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16815.23964.93437.411404@robur.slu.se> Date: Thu, 2 Dec 2004 19:23:24 +0100 To: sfeldma@pobox.com Cc: Robert Olsson , Lennert Buytenhek , jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit In-Reply-To: <1101919791.5198.15.camel@localhost.localdomain> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <16813.58484.343629.570703@robur.slu.se> <1101919791.5198.15.camel@localhost.localdomain> X-Mailer: VM 7.18 under Emacs 21.3.1 X-archive-position: 12393 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Hello! 

Below is a little patch to clean skbs at xmit. It's an old jungle trick Jamal and I used with tulip. Note that we can now even decrease the size of the TX ring.

It can increase TX performance from 800 kpps to

1125128pps 576Mb/sec (576065536bps) errors: 0
1124946pps 575Mb/sec (575972352bps) errors: 0

But it suffers from the same scheduling problems as the previous patch. Often we just get

582108pps 298Mb/sec (298039296bps) errors: 0

When the sending CPU frees its own skbs, we might get some "TX free affinity", which is unrelated to irq affinity; of course it is not 100% perfect. And parts of Scott's patch may still be used.

--- drivers/net/e1000/e1000.h.orig 2004-12-01 13:59:36.000000000 +0100
+++ drivers/net/e1000/e1000.h 2004-12-02 20:11:31.000000000 +0100
@@ -103,7 +103,7 @@
 #define E1000_MAX_INTR 10
 
 /* TX/RX descriptor defines */
-#define E1000_DEFAULT_TXD 256
+#define E1000_DEFAULT_TXD 128
 #define E1000_MAX_TXD 256
 #define E1000_MIN_TXD 80
 #define E1000_MAX_82544_TXD 4096
--- drivers/net/e1000/e1000_main.c.orig 2004-12-01 13:59:36.000000000 +0100
+++ drivers/net/e1000/e1000_main.c 2004-12-02 20:37:40.000000000 +0100
@@ -1820,6 +1820,10 @@
 		return NETDEV_TX_LOCKED;
 	}
 
+
+	if( adapter->tx_ring.next_to_use - adapter->tx_ring.next_to_clean > 80 )
+		e1000_clean_tx_ring(adapter);
+
 	/* need: count + 2 desc gap to keep tail from touching
 	 * head, otherwise try next time */
 	if(E1000_DESC_UNUSED(&adapter->tx_ring) < count + 2) {

--ro

From jason.mcmullan@timesys.com Thu Dec 2 10:29:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 10:30:05 -0800 (PST) Received: from exchange.timesys.com (mail.timesys.com [65.117.135.102]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2ITouC030675 for ; Thu, 2 Dec 2004 10:29:54 -0800 Received: from jmcmullan by owa.timesys.com; 02 Dec 2004 13:29:23 -0500 Subject: [PATCH] MII bus API for PHY devices Rev 2.0 From: Jason McMullan To: Andy Fleming Cc: Benjamin Herrenschmidt , Netdev In-Reply-To: References: <069B6F33-341C-11D9-9652-000393DBC2E8@freescale.com> 
<9B0D9272-398A-11D9-96F6-000393C30512@freescale.com> <1100820391.25521.14.camel@gaston> <97DA0EF0-3A70-11D9-B023-000393C30512@freescale.com> <1100904184.3856.46.camel@gaston> Content-Type: multipart/mixed; boundary="=-UCEP7LL7eTaCc7pqknBv" Date: Thu, 02 Dec 2004 13:29:22 -0500 Message-Id: <1102012163.6056.39.camel@jmcmullan> Mime-Version: 1.0 X-Mailer: Evolution 2.0.1-1mdk X-archive-position: 12394 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jason.mcmullan@timesys.com Precedence: bulk X-list: netdev --=-UCEP7LL7eTaCc7pqknBv Content-Type: text/plain Content-Transfer-Encoding: 7bit Ok, given previous input, I now release mii-bus 2.0 * CIS8201 interrupt support * The PHY device api is now similar to 'sungem' * Set up your PHY as autoneg or forced speeds. -- Jason McMullan --=-UCEP7LL7eTaCc7pqknBv Content-Disposition: attachment; filename=driver-mii-bus.patch Content-Type: text/x-patch; name=driver-mii-bus.patch; charset=ISO-8859-1 Content-Transfer-Encoding: base64 IyMjIyBBdXRvLWdlbmVyYXRlZCBwYXRjaCAjIyMjDQpTaWduZWQtb2ZmLWJ5OiBKYXNvbiBNY011 bGxhbiA8am1jbXVsbGFuQHRpbWVzeXMuY29tPg0KRGF0ZTogICAgICAgICAgVGh1LCAwMiBEZWMg MjAwNCAxMzoyMDo0OSAtMDUwMA0KRGVzY3JpcHRpb246ICAgTUlJIEJ1cyBpbnRlcmZhY2UNCkRl cGVuZHM6DQoNCiMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMNCg0KSW5kZXggb2YgY2hh bmdlczoNCg0KIGRyaXZlcnMvbmV0L01ha2VmaWxlICAgICAgICAgICAgfCAgICA0IA0KIGxpbnV4 L2RyaXZlcnMvbmV0L21paV9iaXRiYW5nLmMgfCAgMTM0ICsrKysrKysrDQogbGludXgvZHJpdmVy cy9uZXQvbWlpX2JpdGJhbmcuaCB8ICAgNDAgKysNCiBsaW51eC9kcml2ZXJzL25ldC9taWlfYnVz LmMgICAgIHwgIDYzOSArKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrDQog bGludXgvZHJpdmVycy9uZXQvcGh5X2NpY2FkYS5jICB8ICAxNzcgKysrKysrKysrKysNCiBsaW51 eC9kcml2ZXJzL25ldC9waHlfZGF2aWNvbS5jIHwgIDE0MCArKysrKysrKw0KIGxpbnV4L2RyaXZl cnMvbmV0L3BoeV9seHQ5N3guYyAgfCAgMjEwICsrKysrKysrKysrKysNCiBsaW51eC9kcml2ZXJz L25ldC9waHlfbWFydmVsbC5jIHwgIDEyNSArKysrKysrDQogbGludXgvaW5jbHVkZS9saW51eC9t 
aWlfYnVzLmggICB8ICAxOTEgKysrKysrKysrKysNCiA5IGZpbGVzIGNoYW5nZWQsIDE2NTkgaW5z ZXJ0aW9ucygrKSwgMSBkZWxldGlvbigtKQ0KDQoNCi0tLSBsaW51eC1vcmlnL2RyaXZlcnMvbmV0 L01ha2VmaWxlDQorKysgbGludXgvZHJpdmVycy9uZXQvTWFrZWZpbGUNCkBAIC02Miw3ICs2Miw5 IEBADQogIyBlbmQgbGluayBvcmRlciBzZWN0aW9uDQogIw0KIA0KLW9iai0kKENPTkZJR19NSUkp ICs9IG1paS5vDQorb2JqLSQoQ09ORklHX01JSSkgKz0gbWlpLm8gbWlpX2J1cy5vIG1paV9iaXRi YW5nLm8gXA0KKwkJICAgICBwaHlfZGF2aWNvbS5vIHBoeV9tYXJ2ZWxsLm8gcGh5X2NpY2FkYS5v IFwNCisJCSAgICAgcGh5X2x4dDk3eC5vDQogDQogb2JqLSQoQ09ORklHX1NVTkRBTkNFKSArPSBz dW5kYW5jZS5vDQogb2JqLSQoQ09ORklHX0hBTUFDSEkpICs9IGhhbWFjaGkubw0KLS0tIC9kZXYv bnVsbA0KKysrIGxpbnV4L2RyaXZlcnMvbmV0L21paV9iaXRiYW5nLmMNCkBAIC0wLDAgKzEsMTM0 IEBADQorLyogDQorICogZHJpdmVycy9uZXQvbWlpX2JpdGJhbmcuYw0KKyAqDQorICogQXV0aG9y OiBKYXNvbiBNY011bGxhbg0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IFRpbWVzeXMgQ29y cC4NCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3UgY2FuIHJlZGlz dHJpYnV0ZSAgaXQgYW5kL29yIG1vZGlmeSBpdA0KKyAqIHVuZGVyICB0aGUgdGVybXMgb2YgIHRo ZSBHTlUgR2VuZXJhbCAgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5IHRoZQ0KKyAqIEZy ZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgIGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhlICBMaWNlbnNl LCBvciAoYXQgeW91cg0KKyAqIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQorICoNCisgKi8N CisNCisjaW5jbHVkZSA8bGludXgva2VybmVsLmg+DQorI2luY2x1ZGUgPGFzbS9zdHJpbmcuaD4N CisNCisjaW5jbHVkZSAibWlpX2JpdGJhbmcuaCINCisNCisjdW5kZWYgUEhZX1JFQUQNCisjdW5k ZWYgUEhZX1dSSVRFDQorI2RlZmluZSBQSFlfUkVBRAkwDQorI2RlZmluZSBQSFlfV1JJVEUJMQ0K Kw0KKy8qIA0KKyAqIDFzdCBieXRlOiAwMTAxUFBQUCBvbiB3cml0ZXMsIHdoZXJlIFBQUFAgaXMg dGhlIE1TQiBvZiB0aGUgcGh5LWlkDQorICogICAgICAgICAgIDAxMTBQUFBQIG9uIHJlYWRzLCB3 aGVyZSAgUFBQUCBpcyB0aGUgTVNCIG9mIHRoZSBwaHktaWQNCisgKiAybmQgYnl0ZTogUFJSUlJS MTAsIFAgaXMgdGhlIExTQiBvZiB0aGUgcGh5LWlkLCBSIGlzIHRoZSByZWdpc3Rlcg0KKyAqIDNy ZCw0dGggYnl0ZXM6IGNvbnRyb2wgb24gd3JpdGVzLCB2YWx1ZXMgb24gcmVhZHMNCisgKi8NCisN CitzdGF0aWMgaW5saW5lIHZvaWQgbWlpX2JpdGJhbmdfbWFyayh2b2lkICpwcml2KQ0KK3sNCisJ 
c3RydWN0IG1paV9iaXRiYW5nICppbmZvID0gcHJpdjsNCisJaW50IGk7DQorDQorCS8qIFdyaXRl IHByZWFtYmxlICovDQorCWZvciAoaSA9IDA7IGkgPCAzMjsgaSsrKQ0KKwkJaW5mby0+c2VuZChp bmZvLT5wcml2LCAxKTsNCit9DQorDQorc3RhdGljIGlubGluZSB2b2lkIG1paV9iaXRiYW5nX3Bo eV9pZCh2b2lkICpwcml2LCBpbnQgcGh5X2lkLCBpbnQgcmVnLCBpbnQgaXNfd3JpdGUpDQorew0K KwlzdHJ1Y3QgbWlpX2JpdGJhbmcgKmluZm8gPSBwcml2Ow0KKwlpbnQgaTsNCisNCisJLyogUHJl YW1ibGUgKi8NCisJaW5mby0+c2VuZChpbmZvLT5wcml2LDApOw0KKwlpbmZvLT5zZW5kKGluZm8t PnByaXYsMSk7DQorCWlmIChpc193cml0ZSkgew0KKwkJaW5mby0+c2VuZChpbmZvLT5wcml2LDAp Ow0KKwkJaW5mby0+c2VuZChpbmZvLT5wcml2LDEpOw0KKwl9IGVsc2Ugew0KKwkJaW5mby0+c2Vu ZChpbmZvLT5wcml2LDEpOw0KKwkJaW5mby0+c2VuZChpbmZvLT5wcml2LDApOw0KKwl9DQorDQor CS8qIFdyaXRlIFBIWSBhZGRyICovDQorCWZvciAoaSA9IDA7IGkgPCA1OyBpKyspDQorCQlpbmZv LT5zZW5kKGluZm8tPnByaXYsIChwaHlfaWQgPj4gKDQtaSkpICYgMSk7DQorDQorCS8qIFdyaXRl IHRoZSByZWdpc3RlciAqLw0KKwlmb3IgKGkgPSAwOyBpIDwgNTsgaSsrKQ0KKwkJaW5mby0+c2Vu ZChpbmZvLT5wcml2LCAocmVnID4+ICg0LWkpKSAmIDEpOw0KKw0KKwlpbmZvLT5zZW5kKGluZm8t PnByaXYsMSk7DQorCWluZm8tPnNlbmQoaW5mby0+cHJpdiwwKTsNCit9DQorDQorc3RhdGljIGlu dCBtaWlfYml0YmFuZ19yZWFkKHZvaWQgKnByaXYsIGludCBwaHlfaWQsIGludCByZWcpDQorew0K KwlzdHJ1Y3QgbWlpX2JpdGJhbmcgKmluZm8gPSBwcml2Ow0KKwlpbnQgaTsNCisJaW50IHJldHZh bD0wOw0KKw0KKwltaWlfYml0YmFuZ19tYXJrKHByaXYpOw0KKwltaWlfYml0YmFuZ19waHlfaWQo cHJpdiwgcGh5X2lkLCByZWcsIFBIWV9SRUFEKTsNCisNCisJZm9yIChpID0gMDsgaSA8IDE2OyBp KyspDQorCQlyZXR2YWwgPSAocmV0dmFsIDw8IDEpIHwgKGluZm8tPnJlY3YoaW5mby0+cHJpdikg JiAxKTsNCisNCisJbWlpX2JpdGJhbmdfbWFyayhwcml2KTsNCisNCisJcmV0dXJuIHJldHZhbDsN Cit9DQorDQorc3RhdGljIGludCBtaWlfYml0YmFuZ193cml0ZSh2b2lkICpwcml2LCBpbnQgcGh5 X2lkLCBpbnQgcmVnLCB1aW50MTZfdCB2YWwpDQorew0KKwlzdHJ1Y3QgbWlpX2JpdGJhbmcgKmlu Zm8gPSBwcml2Ow0KKwlpbnQgaTsNCisNCisJbWlpX2JpdGJhbmdfbWFyayhwcml2KTsNCisJbWlp X2JpdGJhbmdfcGh5X2lkKHByaXYsIHBoeV9pZCwgcmVnLCBQSFlfV1JJVEUpOw0KKw0KKwlmb3Ig KGk9MDsgaSA8IDE2OyBpKyspDQorCQlpbmZvLT5zZW5kKGluZm8tPnByaXYsICh2YWwgPj4gKDE1 
LWkpKSAmIDEpOw0KKw0KKwltaWlfYml0YmFuZ19tYXJrKHByaXYpOw0KKw0KKwlyZXR1cm4gMDsN Cit9DQorDQorc3RhdGljIHZvaWQgbWlpX2JpdGJhbmdfcmVzZXQodm9pZCAqcHJpdikNCit7DQor CXN0cnVjdCBtaWlfYml0YmFuZyAqaW5mbyA9IHByaXY7DQorDQorCWluZm8tPnJlc2V0KGluZm8t PnByaXYpOw0KK30NCisNCisvKiBDcmVhdGVzIGEgYml0YmFuZyBNSUkgYnVzDQorICogUmV0dXJu cyA8IDAgb24gZXJyb3IsIG90aGVyd2lzZSBhIGJ1cyBJRA0KKyAqLw0KK2ludCBtaWlfYml0YmFu Z19yZWdpc3RlcihzdHJ1Y3QgbWlpX2JpdGJhbmcgKmluZm8pDQorew0KKwltZW1zZXQoJmluZm8t PmJ1cywgMCwgc2l6ZW9mKGluZm8tPmJ1cykpOw0KKw0KKwlpbmZvLT5idXMubmFtZSA9IGluZm8t Pm5hbWU7DQorCWluZm8tPmJ1cy5wcml2ID0gaW5mbzsNCisJaW5mby0+YnVzLnJlYWQgPSBtaWlf Yml0YmFuZ19yZWFkOw0KKwlpbmZvLT5idXMud3JpdGUgPSBtaWlfYml0YmFuZ193cml0ZTsNCisJ aW5mby0+YnVzLnJlc2V0ID0gbWlpX2JpdGJhbmdfcmVzZXQ7DQorDQorCXJldHVybiBtaWlfYnVz X3JlZ2lzdGVyKCZpbmZvLT5idXMpOw0KK30NCisNCisNCisvKiBVbnJlZ2lzdGVycyBhIGJpdGJh bmcgTUlJIGJ1cw0KKyAqLw0KK3ZvaWQgbWlpX2JpdGJhbmdfdW5yZWdpc3RlcihzdHJ1Y3QgbWlp X2JpdGJhbmcgKmluZm8pDQorew0KKwltaWlfYnVzX3VucmVnaXN0ZXIoJmluZm8tPmJ1cyk7DQor fQ0KKw0KKw0KDQotLS0gL2Rldi9udWxsDQorKysgbGludXgvZHJpdmVycy9uZXQvbWlpX2JpdGJh bmcuaA0KQEAgLTAsMCArMSw0MCBAQA0KKy8qIA0KKyAqIGRyaXZlcnMvbmV0L21paV9iaXRiYW5n LmgNCisgKg0KKyAqIEF1dGhvcjogSmFzb24gTWNNdWxsYW4NCisgKg0KKyAqIENvcHlyaWdodCAo YykgMjAwNCBUaW1lc3lzIENvcnAuDQorICoNCisgKiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0 d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgIGl0IGFuZC9vciBtb2RpZnkgaXQNCisgKiB1bmRl ciAgdGhlIHRlcm1zIG9mICB0aGUgR05VIEdlbmVyYWwgIFB1YmxpYyBMaWNlbnNlIGFzIHB1Ymxp c2hlZCBieSB0aGUNCisgKiBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb247ICBlaXRoZXIgdmVyc2lv biAyIG9mIHRoZSAgTGljZW5zZSwgb3IgKGF0IHlvdXINCisgKiBvcHRpb24pIGFueSBsYXRlciB2 ZXJzaW9uLg0KKyAqDQorICovDQorI2lmbmRlZiBfTkVUX01JSV9CSVRCQU5HX0gNCisjZGVmaW5l IF9ORVRfTUlJX0JJVEJBTkdfSA0KKw0KKyNpbmNsdWRlIDxsaW51eC9taWlfYnVzLmg+DQorDQor c3RydWN0IG1paV9iaXRiYW5nIHsNCisJY29uc3QgY2hhciAqbmFtZTsJLyogTmFtZSBvZiBkZXZp Y2UgKi8NCisJdm9pZCAqcHJpdjsJCS8qIFByaXZhdGUgZGF0YSAqLw0KKw0KKwl2b2lkICgqc2Vu 
ZCkodm9pZCAqcHJpdiwgaW50IGJpdCk7CS8qIFNlbmQgb25lIGJpdCAqLw0KKwlpbnQgKCpyZWN2 KSh2b2lkICpwcml2KTsJCS8qIFJlY3Ygb25lIGJpdCAqLw0KKwl2b2lkICgqcmVzZXQpKHZvaWQg KnByaXYpOw0KKw0KKwkvKiBBdXRvLWZpbGxlZC1pbiBpbmZvcm1hdGlvbiAqLw0KKwlzdHJ1Y3Qg bWlpX2J1cyBidXM7DQorfTsNCisNCisvKiBDcmVhdGVzIGEgYml0YmFuZyBNSUkgYnVzDQorICog UmV0dXJucyA8IDAgb24gZXJyb3IsIG90aGVyd2lzZSBhIGJ1cyBJRA0KKyAqLw0KK2V4dGVybiBp bnQgbWlpX2JpdGJhbmdfcmVnaXN0ZXIoc3RydWN0IG1paV9iaXRiYW5nICppbmZvKTsNCisNCisv KiBVbnJlZ2lzdGVycyBhIGJpdGJhbmcgTUlJIGJ1cw0KKyAqLw0KK2V4dGVybiB2b2lkIG1paV9i aXRiYW5nX3VucmVnaXN0ZXIoc3RydWN0IG1paV9iaXRiYW5nICppbmZvKTsNCisNCisjZW5kaWYg LyogX05FVF9NSUlfQklUQkFOR19IICovDQoNCi0tLSAvZGV2L251bGwNCisrKyBsaW51eC9kcml2 ZXJzL25ldC9taWlfYnVzLmMNCkBAIC0wLDAgKzEsNjM5IEBADQorLyogDQorICogZHJpdmVycy9u ZXQvbWlpX2J1cy5jDQorICoNCisgKiBBZGFwZXRlZCBmcm9tIGRyaXZlcnMvbmV0L2dpYW5mYXJf bWlpLmMsIGJ5IEFuZHkgRmxlbWluZw0KKyAqDQorICogQXV0aG9yOiBKYXNvbiBNY011bGxhbiAo amFzb24ubWNtdWxsYW5AdGltZXN5cy5jb20pIHRvIA0KKyAqIAkgICBiZSBhIGdlbmVyaWMgbWlp IGludGVyZmFjZQ0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IFRpbWVzeXMgSW5jDQorICoN CisgKiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUg IGl0IGFuZC9vciBtb2RpZnkgaXQNCisgKiB1bmRlciAgdGhlIHRlcm1zIG9mICB0aGUgR05VIEdl bmVyYWwgIFB1YmxpYyBMaWNlbnNlIGFzIHB1Ymxpc2hlZCBieSB0aGUNCisgKiBGcmVlIFNvZnR3 YXJlIEZvdW5kYXRpb247ICBlaXRoZXIgdmVyc2lvbiAyIG9mIHRoZSAgTGljZW5zZSwgb3IgKGF0 IHlvdXINCisgKiBvcHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0KKyAqDQorICovDQorI2luY2x1 ZGUgPGxpbnV4L2NvbmZpZy5oPg0KKyNpbmNsdWRlIDxsaW51eC9rZXJuZWwuaD4NCisjaW5jbHVk ZSA8bGludXgvc3RyaW5nLmg+DQorI2luY2x1ZGUgPGxpbnV4L2Vycm5vLmg+DQorI2luY2x1ZGUg PGxpbnV4L3NsYWIuaD4NCisjaW5jbHVkZSA8bGludXgvaW50ZXJydXB0Lmg+DQorI2luY2x1ZGUg PGxpbnV4L2luaXQuaD4NCisjaW5jbHVkZSA8bGludXgvZGVsYXkuaD4NCisjaW5jbHVkZSA8bGlu dXgvbmV0ZGV2aWNlLmg+DQorI2luY2x1ZGUgPGxpbnV4L2V0aGVyZGV2aWNlLmg+DQorI2luY2x1 ZGUgPGxpbnV4L3NrYnVmZi5oPg0KKyNpbmNsdWRlIDxsaW51eC9zcGlubG9jay5oPg0KKyNpbmNs 
dWRlIDxsaW51eC9tbS5oPg0KKyNpbmNsdWRlIDxsaW51eC9taWlfYnVzLmg+DQorDQorI2luY2x1 ZGUgPGFzbS9pby5oPg0KKyNpbmNsdWRlIDxhc20vaXJxLmg+DQorI2luY2x1ZGUgPGFzbS91YWNj ZXNzLmg+DQorI2luY2x1ZGUgPGxpbnV4L21vZHVsZS5oPg0KKw0KKyN1bmRlZiBERUJVRw0KKw0K K3N0YXRpYyBzdHJ1Y3QgbWlpX2J1cyAqbWlpX2J1c1s4XTsNCisNCisjZGVmaW5lIE1JSV9CVVNf TUFYCShzaXplb2YobWlpX2J1cykvc2l6ZW9mKHN0cnVjdCBtaWlfYnVzICopKQ0KKw0KK3N0YXRp YyBpbmxpbmUgc3RydWN0IHBoeV9pbmZvICptaWlfcGh5X29mKHN0cnVjdCBtaWlfaWZfaW5mbyAq bWlpKQ0KK3sNCisJaWYgKG1paSAhPSBOVUxMKSB7DQorCQlpbnQgYnVzLGlkOw0KKwkJYnVzID0g TUlJX0JVUyhtaWktPnBoeV9pZCk7DQorCQlpZCAgPSBNSUlfSUQobWlpLT5waHlfaWQpOw0KKwkJ cmV0dXJuIG1paV9idXNbYnVzXS0+cGh5W2lkXTsNCisJfQ0KKw0KKwlyZXR1cm4gTlVMTDsNCit9 DQorDQorLyogV3JpdGUgdmFsdWUgdG8gdGhlIFBIWSBmb3IgdGhpcyBkZXZpY2UgdG8gdGhlIHJl Z2lzdGVyIGF0IHJlZ251bSwNCisgKiB3YWl0aW5nIHVudGlsIHRoZSB3cml0ZSBpcyBkb25lIGJl Zm9yZSBpdCByZXR1cm5zLiAgQWxsIFBIWSANCisgKiBjb25maWd1cmF0aW9uIGhhcyB0byBiZSBk b25lIHRocm91Z2ggdGhlIFRTRUMxIE1JSU0gcmVncyAqLw0KK0VYUE9SVF9TWU1CT0wobWlpX2J1 c193cml0ZSk7DQoraW50IG1paV9idXNfd3JpdGUoaW50IGJ1cywgaW50IGlkLCBpbnQgcmVnbnVt LCB1aW50MTZfdCB2YWx1ZSkNCit7DQorCWlmIChtaWlfYnVzW2J1c10gPT0gTlVMTCkNCisJCXJl dHVybiAtRUlOVkFMOw0KKw0KKyNpZmRlZiBERUJVRw0KKwlwcmludGsoS0VSTl9JTkZPICJwaHkl ZC4lZDogV3JpdGUgMHglLjJ4IDwtIDB4JS40eFxuIiwgYnVzLCBpZCwgcmVnbnVtLCB2YWx1ZSk7 DQorI2VuZGlmDQorCW1paV9idXNbYnVzXS0+d3JpdGUobWlpX2J1c1tidXNdLT5wcml2LCBpZCwg cmVnbnVtLCB2YWx1ZSk7DQorCXJldHVybiAwOw0KK30NCisNCisvKiBSZWFkcyBmcm9tIHJlZ2lz dGVyIHJlZ251bSBpbiB0aGUgUEhZIGZvciBkZXZpY2UgZGV2LA0KKyAqIHJldHVybmluZyB0aGUg dmFsdWUuICBDbGVhcnMgbWlpbWNvbSBmaXJzdC4gIEFsbCBQSFkNCisgKiBjb25maWd1cmF0aW9u IGhhcyB0byBiZSBkb25lIHRocm91Z2ggdGhlIFRTRUMxIE1JSU0gcmVncyAqLw0KK0VYUE9SVF9T WU1CT0wobWlpX2J1c19yZWFkKTsNCitpbnQgbWlpX2J1c19yZWFkKGludCBidXMsIGludCBpZCwg aW50IHJlZ251bSkNCit7DQorCWlmIChtaWlfYnVzW2J1c10gPT0gTlVMTCkNCisJCXJldHVybiAt RUlOVkFMOw0KKw0KKyNpZmRlZiBERUJVRw0KKwl7DQorCQlpbnQgcnY7DQorCQlydiA9IG1paV9i 
dXNbYnVzXS0+cmVhZChtaWlfYnVzW2J1c10tPnByaXYsIGlkLCByZWdudW0pOw0KKwkJcHJpbnRr KEtFUk5fSU5GTyAicGh5JWQuJWQ6IFJlYWQgIDB4JS4yeCAtPiAweCUuNHhcbiIsIGJ1cywgaWQs IHJlZ251bSwgcnYpOw0KKwkJcmV0dXJuIHJ2Ow0KKwl9DQorI2Vsc2UNCisJcmV0dXJuIG1paV9i dXNbYnVzXS0+cmVhZChtaWlfYnVzW2J1c10tPnByaXYsIGlkLCByZWdudW0pOw0KKyNlbmRpZg0K K30NCisNCisvKiBIZWxwZXIgZnVuY3Rpb24gKi8NCitzdGF0aWMgaW50IHBoeV9zZXRfYXV0b25l ZyhzdHJ1Y3QgcGh5X2luZm8gKnBoeSwgdWludDMyX3QgYWR2ZXJ0aXNlKQ0KK3sNCisJaW50IGVy cjsNCisNCisJZXJyID0gcGh5LT5vcHMtPnNldF9hdXRvbmVnKHBoeSwgYWR2ZXJ0aXNlKTsNCisJ aWYgKGVyciA8IDApDQorCQlyZXR1cm4gZXJyOw0KKw0KKwlwaHktPm5lZ290aWF0ZS5hZHZlcnRp c2UgPSBhZHZlcnRpc2U7DQorCXBoeS0+bmVnb3RpYXRlLmF1dG9uZWcgPSBBVVRPTkVHX0VOQUJM RTsNCisJcGh5LT5uZWdvdGlhdGUudGltZW91dCA9IGppZmZpZXMgKyBNSUlfVElNRU9VVDsNCisJ cGh5LT5zdGF0ZS5hdXRvbmVnID0gMTsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KKyNkZWZpbmUg QlJJRUZfTUlJX0VSUk9SUw0KK0VYUE9SVF9TWU1CT0wocGh5X2dlbl9wb2xsKTsNCisvKiBXYWl0 IGZvciBhdXRvLW5lZ290aWF0aW9uIHRvIGNvbXBsZXRlICovDQoraW50IHBoeV9nZW5fcG9sbChz dHJ1Y3QgcGh5X2luZm8gKnBoeSkNCit7DQorCXN0cnVjdCBwaHlfc3RhdGUgKnBzdGF0ZTsNCisJ dWludDE2X3QgdmFsOw0KKw0KKwlwc3RhdGUgPSAmcGh5LT5zdGF0ZTsNCisNCisJcGh5X3JlYWQo cGh5LCBNSUlfQk1TUik7CS8qIER1bW15IHJlYWQgKi8NCisJdmFsID0gcGh5X3JlYWQocGh5LCBN SUlfQk1TUik7DQorDQorCS8qIElmIHRoZSBsaW5rIGp1c3QgY2FtZSB1cCwgcmVzdGFydCB0aGUg YXV0by1uZWcgcHJvY2VkdXJlDQorCSAqLw0KKwlpZiAodmFsICYgQk1TUl9MU1RBVFVTKSB7DQor CQlpZiAocHN0YXRlLT5saW5rID09IDAgJiYgDQorCQkgICAgcHN0YXRlLT5hdXRvbmVnID09IDAg JiYgcGh5LT5uZWdvdGlhdGUuYXV0b25lZykgew0KKwkJCXBoeV9zZXRfYXV0b25lZyhwaHksIHBo eS0+bmVnb3RpYXRlLmFkdmVydGlzZSk7DQorCQkJcmV0dXJuIC1FQUdBSU47DQorCQl9DQorCQlw c3RhdGUtPmxpbmsgPSAxOw0KKwl9IGVsc2Ugew0KKwkJcHN0YXRlLT5saW5rID0gMDsNCisJfQ0K Kw0KKwkvKiBPbmx5IGF1dG8tbmVnb3RpYXRlIGlmIHRoZSBsaW5rIGhhcyBqdXN0IGdvbmUgdXAg Ki8NCisJaWYgKHBzdGF0ZS0+bGluayAmJiBwc3RhdGUtPmF1dG9uZWcgJiYgDQorCSAgICAodGlt ZV9hZnRlcihqaWZmaWVzLHBoeS0+bmVnb3RpYXRlLnRpbWVvdXQpIHx8IA0KKwkgICAgICh2YWwg 
JiBCTVNSX0FORUdDT01QTEVURSkpKSB7DQorI2lmZGVmIEJSSUVGX01JSV9FUlJPUlMNCisJCWlm ICh2YWwgJiBCTVNSX0FORUdDT01QTEVURSkgew0KKwkJCXByaW50ayhLRVJOX0lORk8gIiVzOiBB dXRvLW5lZ290aWF0aW9uIGRvbmVcbiIsDQorCQkJICAgICAgIHBoeS0+bmFtZSk7DQorCQl9IGVs c2Ugew0KKwkJCXByaW50ayhLRVJOX0lORk8NCisJCQkgICAgICAgIiVzOiBBdXRvLW5lZ290aWF0 aW9uIHRpbWVkIG91dFxuIiwNCisJCQkgICAgICAgcGh5LT5uYW1lKTsNCisJCX0NCisjZW5kaWYN CisNCisJCXBzdGF0ZS0+YXV0b25lZyA9IDA7DQorDQorCQlpZiAodmFsICYgQk1TUl9BTkVHQ09N UExFVEUpIHsNCisJCQl2YWwgPSBwaHlfcmVhZChwaHksIE1JSV9MUEEpOw0KKwkJCXZhbCAmPSBw aHlfcmVhZChwaHksIE1JSV9BRFZFUlRJU0UpOw0KKw0KKwkJCS8qIEFjY29yZGluZyB0byBJRUVF IDgwMi4zLCBMUEEgZGVjaXNpb25zDQorCQkJICogbXVzdCBiZSBkb25lIGluIHRoaXMgb3JkZXIN CisJCQkgKi8NCisJCQlpZiAodmFsICYgTFBBXzEwMEZVTEwpIHsNCisJCQkJcHN0YXRlLT5zcGVl ZCA9IFNQRUVEXzEwMDsNCisJCQkJcHN0YXRlLT5kdXBsZXggPSBEVVBMRVhfRlVMTDsNCisJCQl9 IGVsc2UgaWYgKHZhbCAmIExQQV8xMDBIQUxGKSB7DQorCQkJCXBzdGF0ZS0+c3BlZWQgPSBTUEVF RF8xMDA7DQorCQkJCXBzdGF0ZS0+ZHVwbGV4ID0gRFVQTEVYX0hBTEY7DQorCQkJfSBlbHNlIGlm ICh2YWwgJiBMUEFfMTBGVUxMKSB7DQorCQkJCXBzdGF0ZS0+c3BlZWQgPSBTUEVFRF8xMDsNCisJ CQkJcHN0YXRlLT5kdXBsZXggPSBEVVBMRVhfRlVMTDsNCisJCQl9IGVsc2UgaWYgKHZhbCAmIExQ QV8xMEhBTEYpIHsNCisJCQkJcHN0YXRlLT5zcGVlZCA9IFNQRUVEXzEwOw0KKwkJCQlwc3RhdGUt PmR1cGxleCA9IERVUExFWF9IQUxGOw0KKwkJCX0gZWxzZSB7DQorCQkJCXBzdGF0ZS0+c3BlZWQg PSBTUEVFRF8xMDsNCisJCQkJcHN0YXRlLT5kdXBsZXggPSBEVVBMRVhfSEFMRjsNCisJCQl9DQor CQl9DQorCX0NCisNCisJcmV0dXJuIChwc3RhdGUtPmF1dG9uZWcgPyAtRUFHQUlOIDogMCk7DQor fQ0KKw0KK3N0YXRpYyBzdHJ1Y3QgcGh5X29wcyBnZW5fb3BzID0gew0KKwkuc2V0X2F1dG9uZWcg PSBwaHlfZ2VuX3NldF9hdXRvbmVnLA0KKwkucG9sbCA9IHBoeV9nZW5fcG9sbA0KK307DQorDQor c3RhdGljIHN0cnVjdCBwaHlfaW5mbyBwaHlfaW5mb19nZW5lcmljID0gew0KKwkuaWQgPSAweDAs DQorCS5uYW1lID0gIkdlbmVyaWMgUEhZIiwNCisJLnNoaWZ0ID0gMzIsDQorCS5vcHMgPSAmZ2Vu X29wcw0KK307DQorDQorc3RhdGljIExJU1RfSEVBRChwaHlfbGlzdCk7DQorDQorLyogVXNlIHRo ZSBQSFkgSUQgcmVnaXN0ZXJzIHRvIGRldGVybWluZSB3aGF0IHR5cGUgb2YgUEhZIGlzIGF0dGFj 
aGVkDQorICogdG8gZGV2aWNlIGRldi4gIHJldHVybiBhIHN0cnVjdCBwaHlfaW5mbyBzdHJ1Y3R1 cmUgZGVzY3JpYmluZyB0aGF0IFBIWQ0KKyAqLw0KK3N0cnVjdCBwaHlfaW5mbyAqbWlpX3BoeV9n ZXRfaW5mbyhpbnQgYnVzLCBpbnQgaWQpDQorew0KKwlzdHJ1Y3QgbGlzdF9oZWFkICpscDsNCisJ dWludDE2X3QgcGh5X3JlZzsNCisJdWludDMyX3QgcGh5X2lkOw0KKwlzdHJ1Y3QgcGh5X2luZm8g KmluZm8gPSBOVUxMOw0KKw0KKwlpZiAobWlpX2J1c1tidXNdID09IE5VTEwpDQorCQlyZXR1cm4g TlVMTDsNCisNCisJLyogR3JhYiB0aGUgYml0cyBmcm9tIFBIWUlSMSwgYW5kIHB1dCB0aGVtIGlu IHRoZSB1cHBlciBoYWxmICovDQorCXBoeV9yZWcgPSBtaWlfYnVzX3JlYWQoYnVzLCBpZCwgTUlJ X1BIWVNJRDEpOw0KKwlwaHlfaWQgPSAocGh5X3JlZyAmIDB4ZmZmZikgPDwgMTY7DQorDQorCS8q IEdyYWIgdGhlIGJpdHMgZnJvbSBQSFlJUjIsIGFuZCBwdXQgdGhlbSBpbiB0aGUgbG93ZXIgaGFs ZiAqLw0KKwlwaHlfcmVnID0gbWlpX2J1c19yZWFkKGJ1cywgaWQsIE1JSV9QSFlTSUQyKTsNCisJ cGh5X2lkIHw9IChwaHlfcmVnICYgMHhmZmZmKTsNCisNCisJLyogbG9vcCB0aHJvdWdoIGFsbCB0 aGUga25vd24gUEhZIHR5cGVzLCBhbmQgZmluZCBvbmUgdGhhdA0KKwkgKiBtYXRjaGVzIHRoZSBJ RCB3ZSByZWFkIGZyb20gdGhlIFBIWS4gKi8NCisJbGlzdF9mb3JfZWFjaChscCwgJnBoeV9saXN0 KSB7DQorCQlzdHJ1Y3QgcGh5X2luZm8gKnBoeSA9IGxpc3RfZW50cnkobHAsIHN0cnVjdCBwaHlf aW5mbywgbGlzdCk7DQorCQlpZiAoKHBoeS0+aWQgPj4gcGh5LT5zaGlmdCkgPT0gKHBoeV9pZCA+ PiBwaHktPnNoaWZ0KSkgew0KKwkJCWluZm8gPSBwaHk7DQorCQkJYnJlYWs7DQorCQl9DQorCX0N CisNCisJaWYgKGluZm8gPT0gTlVMTCkgew0KKwkJcHJpbnRrKEtFUk5fV0FSTklORw0KKwkJICAg ICAgICJwaHklZC4lZDogUEhZIGlkIDB4JXggaXMgbm90IHN1cHBvcnRlZCFcbiIsIGJ1cywgaWQs DQorCQkgICAgICAgcGh5X2lkKTsNCisJfSBlbHNlIHsNCisJCXByaW50ayhLRVJOX0lORk8gInBo eSVkLiVkOiBQSFkgaXMgJXMgKCV4KVxuIiwgYnVzLCBpZCwNCisJCSAgICAgICBpbmZvLT5uYW1l LCBwaHlfaWQpOw0KKwl9DQorDQorCXJldHVybiBpbmZvOw0KK30NCisNCitzdGF0aWMgaW50IG1k aW9fcmVhZChzdHJ1Y3QgbmV0X2RldmljZSAqZGV2LCBpbnQgcGh5X2lkLCBpbnQgcmVnKQ0KK3sN CisJcmV0dXJuIG1paV9idXNfcmVhZChNSUlfQlVTKHBoeV9pZCksIE1JSV9JRChwaHlfaWQpLCBy ZWcpOw0KK30NCisNCitzdGF0aWMgdm9pZCBtZGlvX3dyaXRlKHN0cnVjdCBuZXRfZGV2aWNlICpk ZXYsIGludCBwaHlfaWQsIGludCByZWcsIGludCB2YWwpDQorew0KKwltaWlfYnVzX3dyaXRlKE1J 
SV9CVVMocGh5X2lkKSwgTUlJX0lEKHBoeV9pZCksIHJlZywgdmFsICYgMHhmZmZmKTsNCit9DQor DQorc3RhdGljIGlubGluZSB2b2lkIG1paV9waHlfaXJxX2FjayhzdHJ1Y3QgbWlpX2lmX2luZm8g Km1paSkNCit7DQorCXN0cnVjdCBwaHlfaW5mbyAqcGh5ID0gbWlpX3BoeV9vZihtaWkpOw0KKw0K KwlwaHktPm9wcy0+aW50X2FjayhwaHkpOw0KK30NCisNCitzdGF0aWMgaXJxcmV0dXJuX3QgbWlp X3BoeV9pcnEoaW50IGlycSwgdm9pZCAqZGF0YSwgc3RydWN0IHB0X3JlZ3MgKnJlZ3MpDQorew0K KwlzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSA9ICh2b2lkICopZGF0YTsNCisJc3RydWN0IHBoeV9p bmZvICpwaHkgPSBtaWlfcGh5X29mKG1paSk7DQorDQorCW1paV9waHlfaXJxX2FjayhtaWkpOw0K Kw0KKwkvKiBTY2hlZHVsZSB0aGUgYm90dG9tIGhhbGYgKi8NCisJc2NoZWR1bGVfd29yaygmcGh5 LT5kZWx0YS50cSk7DQorDQorCXJldHVybiBJUlFfSEFORExFRDsNCit9DQorDQorRVhQT1JUX1NZ TUJPTChtaWlfcGh5X2lycV9lbmFibGUpOw0KK2ludCBtaWlfcGh5X2lycV9lbmFibGUoc3RydWN0 IG1paV9pZl9pbmZvICptaWksIGludCBpcnEsIHZvaWQgKCpmdW5jKSAodm9pZCAqKSwNCisJCSAg ICAgICB2b2lkICpkYXRhKQ0KK3sNCisJc3RydWN0IHBoeV9pbmZvICpwaHkgPSBtaWlfcGh5X29m KG1paSk7DQorCWludCBlcnI7DQorDQorCWlmIChwaHkgPT0gTlVMTCkNCisJCXJldHVybiAtRUlO VkFMOw0KKw0KKwlpZiAocGh5LT5kZWx0YS5kYXRhICE9IE5VTEwpDQorCQlyZXR1cm4gLUVCVVNZ Ow0KKw0KKwlpZiAocGh5LT5vcHMtPmludF9hY2sgPT0gTlVMTCB8fA0KKwkgICAgcGh5LT5vcHMt PmludF9lbmFibGUgPT0gTlVMTCB8fA0KKwkgICAgcGh5LT5vcHMtPmludF9kaXNhYmxlID09IE5V TEwpDQorCQlyZXR1cm4gLUVJTlZBTDsNCisNCisJcGh5LT5kZWx0YS5pcnEgPSBpcnE7DQorCXBo eS0+ZGVsdGEuZnVuYyA9IGZ1bmM7DQorCXBoeS0+ZGVsdGEuZGF0YSA9IGRhdGE7DQorDQorCWVy ciA9IHJlcXVlc3RfaXJxKGlycSwgbWlpX3BoeV9pcnEsIFNBX1NISVJRLCBwaHktPm5hbWUsIG1p aSk7DQorCWlmIChlcnIgPCAwKSB7DQorCQlwaHktPmRlbHRhLmlycSA9IC0xOw0KKwkJcGh5LT5k ZWx0YS5mdW5jID0gTlVMTDsNCisJCXBoeS0+ZGVsdGEuZGF0YSA9IE5VTEw7DQorCQlyZXR1cm4g ZXJyOw0KKwl9DQorDQorCXBoeS0+b3BzLT5pbnRfZW5hYmxlKHBoeSk7DQorCXJldHVybiAwOw0K K30NCisNCitFWFBPUlRfU1lNQk9MKG1paV9waHlfaXJxX2Rpc2FibGUpOw0KK3ZvaWQgbWlpX3Bo eV9pcnFfZGlzYWJsZShzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSwgdm9pZCAqZGF0YSkNCit7DQor CXN0cnVjdCBwaHlfaW5mbyAqcGh5ID0gbWlpX3BoeV9vZihtaWkpOw0KKw0KKwlpZiAocGh5ID09 
IE5VTEwgfHwgcGh5LT5kZWx0YS5kYXRhICE9IGRhdGEpDQorCQlyZXR1cm47DQorDQorCXBoeS0+ b3BzLT5pbnRfZGlzYWJsZShwaHkpOw0KKw0KKwlmcmVlX2lycShwaHktPmRlbHRhLmlycSwgbWlp KTsNCisJcGh5LT5kZWx0YS5pcnEgPSAtMTsNCisJcGh5LT5kZWx0YS5mdW5jID0gTlVMTDsNCisJ cGh5LT5kZWx0YS5kYXRhID0gTlVMTDsNCit9DQorDQorLyogU2NoZWR1bGVkIGJ5IHRoZSB0YXNr IHF1ZXVlICovDQorc3RhdGljIHZvaWQgbWlpX3BoeV9kZWx0YSh2b2lkICpkYXRhKQ0KK3sNCisJ c3RydWN0IG1paV9pZl9pbmZvICptaWkgPSAodm9pZCAqKWRhdGE7DQorCXN0cnVjdCBwaHlfaW5m byAqcGh5ID0gbWlpX3BoeV9vZihtaWkpOw0KKwlzdHJ1Y3QgcGh5X3N0YXRlIG9sZDsNCisNCisJ b2xkPXBoeS0+c3RhdGU7DQorDQorCXBoeS0+b3BzLT5wb2xsKHBoeSk7DQorDQorCWlmIChtZW1j bXAoJm9sZCwmcGh5LT5zdGF0ZSxzaXplb2Yob2xkKSkgIT0gMCAmJiBwaHktPmRlbHRhLmZ1bmMp DQorCQlwaHktPmRlbHRhLmZ1bmMocGh5LT5kZWx0YS5kYXRhKTsNCit9DQorDQorc3RhdGljIHZv aWQgbWlpX3BoeV9wb2xsKHVuc2lnbmVkIGxvbmcgZGF0YSkNCit7DQorCXN0cnVjdCBtaWlfaWZf aW5mbyAqbWlpID0gKHZvaWQgKilkYXRhOw0KKwlzdHJ1Y3QgcGh5X2luZm8gKnBoeSA9IG1paV9w aHlfb2YobWlpKTsNCisNCisJc2NoZWR1bGVfd29yaygmcGh5LT5kZWx0YS50cSk7DQorDQorCW1v ZF90aW1lcigmcGh5LT5kZWx0YS50aW1lciwgamlmZmllcyArIEhaICogcGh5LT5kZWx0YS5tc2Vj cyAvIDEwMDApOw0KK30NCisNCitFWFBPUlRfU1lNQk9MKG1paV9waHlfcG9sbF9lbmFibGUpOw0K K2ludCBtaWlfcGh5X3BvbGxfZW5hYmxlKHN0cnVjdCBtaWlfaWZfaW5mbyAqbWlpLCB1bnNpZ25l ZCBsb25nIG1zZWNzLA0KKwkJCXZvaWQgKCpmdW5jKSAodm9pZCAqKSwgdm9pZCAqZGF0YSkNCit7 DQorCXN0cnVjdCBwaHlfaW5mbyAqcGh5ID0gbWlpX3BoeV9vZihtaWkpOw0KKw0KKwlpZiAocGh5 ID09IE5VTEwpDQorCQlyZXR1cm4gLUVJTlZBTDsNCisNCisJaWYgKEhaICogbXNlY3MgLyAxMDAw IDw9IDAgfHwgZnVuYyA9PSBOVUxMKQ0KKwkJcmV0dXJuIC1FSU5WQUw7DQorDQorCWlmIChwaHkt PmRlbHRhLmRhdGEgIT0gTlVMTCkNCisJCXJldHVybiAtRUlOVkFMOw0KKw0KKwlpbml0X3RpbWVy KCZwaHktPmRlbHRhLnRpbWVyKTsNCisJcGh5LT5kZWx0YS50aW1lci5mdW5jdGlvbiA9IG1paV9w aHlfcG9sbDsNCisJcGh5LT5kZWx0YS50aW1lci5kYXRhID0gKHVuc2lnbmVkIGxvbmcpbWlpOw0K KwlwaHktPmRlbHRhLmRhdGEgPSBkYXRhOw0KKwlwaHktPmRlbHRhLmZ1bmMgPSBmdW5jOw0KKwlw aHktPmRlbHRhLm1zZWNzID0gbXNlY3M7DQorCW1vZF90aW1lcigmcGh5LT5kZWx0YS50aW1lciwg 
amlmZmllcyArIEhaICogbXNlY3MgLyAxMDAwKTsNCisJc2NoZWR1bGVfd29yaygmcGh5LT5kZWx0 YS50cSk7DQorDQorCXJldHVybiAwOw0KK30NCisNCitFWFBPUlRfU1lNQk9MKG1paV9waHlfcG9s bF9kaXNhYmxlKTsNCit2b2lkIG1paV9waHlfcG9sbF9kaXNhYmxlKHN0cnVjdCBtaWlfaWZfaW5m byAqbWlpLCB2b2lkICpkYXRhKQ0KK3sNCisJc3RydWN0IHBoeV9pbmZvICpwaHkgPSBtaWlfcGh5 X29mKG1paSk7DQorDQorCWlmIChwaHkgPT0gTlVMTCB8fCBwaHktPmRlbHRhLmRhdGEgPT0gTlVM TCkNCisJCXJldHVybjsNCisNCisJZGVsX3RpbWVyX3N5bmMoJnBoeS0+ZGVsdGEudGltZXIpOw0K KwlwaHktPmRlbHRhLmZ1bmMgPSBOVUxMOw0KKwlwaHktPmRlbHRhLmRhdGEgPSBOVUxMOw0KK30N CisNCitFWFBPUlRfU1lNQk9MKHBoeV9nZW5fc2V0X2F1dG9uZWcpOw0KK2ludCBwaHlfZ2VuX3Nl dF9hdXRvbmVnKHN0cnVjdCBwaHlfaW5mbyAqcGh5LCB1aW50MzJfdCBhZHZlcnRpc2UpDQorew0K Kwl1aW50MTZfdCBhZHYsIGN0bDsNCisNCisJYWR2ID0gcGh5X3JlYWQocGh5LCBNSUlfQURWRVJU SVNFKTsNCisJYWR2ICY9IH4oQURWRVJUSVNFX0FMTCB8IEFEVkVSVElTRV8xMDBCQVNFNCk7DQor CWlmIChhZHZlcnRpc2UgJiBBRFZFUlRJU0VEXzEwYmFzZVRfSGFsZikNCisJCWFkdiB8PSBBRFZF UlRJU0VfMTBIQUxGOw0KKwlpZiAoYWR2ZXJ0aXNlICYgQURWRVJUSVNFRF8xMGJhc2VUX0Z1bGwp DQorCQlhZHYgfD0gQURWRVJUSVNFXzEwRlVMTDsNCisJaWYgKGFkdmVydGlzZSAmIEFEVkVSVElT RURfMTAwYmFzZVRfSGFsZikNCisJCWFkdiB8PSBBRFZFUlRJU0VfMTAwSEFMRjsNCisJaWYgKGFk dmVydGlzZSAmIEFEVkVSVElTRURfMTAwYmFzZVRfRnVsbCkNCisJCWFkdiB8PSBBRFZFUlRJU0Vf MTAwRlVMTDsNCisNCisJLyogQ29uZmlndXJlIHNvbWUgYmFzaWMgc3R1ZmYgKi8NCisJcGh5X3dy aXRlKHBoeSwgTUlJX0FEVkVSVElTRSwgYWR2KTsNCisNCisJLyogU3RhcnQgYXV0byBuZWdvdGlh dGlvbiAqLw0KKwljdGwgPSBwaHlfcmVhZChwaHksIE1JSV9CTUNSKTsNCisJY3RsIHw9IChCTUNS X0FORU5BQkxFIHwgQk1DUl9BTlJFU1RBUlQpOw0KKwlwaHlfd3JpdGUocGh5LCBNSUlfQk1DUiwg Y3RsKTsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KKw0KK0VYUE9SVF9TWU1CT0wobWlpX3BoeV9h dHRhY2gpOw0KK2ludCBtaWlfcGh5X2F0dGFjaChzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSwgc3Ry dWN0IG5ldF9kZXZpY2UgKmRldiwgaW50IGJ1cywNCisJCSAgIGludCBpZCkNCit7DQorCXN0cnVj dCBwaHlfaW5mbyAqcGh5LCAqaW5mbzsNCisNCisJaWYgKG1paV9idXNbYnVzXSA9PSBOVUxMKSB7 DQorCQlwcmludGsoS0VSTl9FUlINCisJCSAgICAgICAibWlpX3BoeV9hdHRhY2g6IENhbid0IGF0 
dGFjaCAlcywgbm8gTUlJIGJ1cyAlZCBwcmVzZW50XG4iLA0KKwkJICAgICAgIGRldi0+bmFtZSwg YnVzKTsNCisJCXJldHVybiAtRU5PREVWOw0KKwl9DQorDQorCWlmIChtaWlfYnVzW2J1c10tPnBo eVtpZF0gIT0gTlVMTCkgew0KKwkJcHJpbnRrKEtFUk5fRVJSDQorCQkgICAgICAgIm1paV9waHlf YXR0YWNoOiBwaHklZC4lZCBpcyBhbHJlYWR5IGF0dGFjaGVkIHRvICVzXG4iLA0KKwkJICAgICAg IGJ1cywgaWQsIGRldi0+bmFtZSk7DQorCQlyZXR1cm4gLUVCVVNZOw0KKwl9DQorDQorCWluZm8g PSBtaWlfcGh5X2dldF9pbmZvKGJ1cywgaWQpOw0KKwlpZiAoaW5mbyA9PSBOVUxMKQ0KKwkJcmV0 dXJuIC1FTk9ERVY7DQorDQorCXBoeSA9IGttYWxsb2Moc2l6ZW9mKCpwaHkpLCBHRlBfS0VSTkVM KTsNCisJaWYgKHBoeSA9PSBOVUxMKQ0KKwkJcmV0dXJuIC1FTk9NRU07DQorDQorCW1lbWNweShw aHksIGluZm8sIHNpemVvZigqcGh5KSk7DQorDQorCUlOSVRfV09SSygmcGh5LT5kZWx0YS50cSwg bWlpX3BoeV9kZWx0YSwgbWlpKTsNCisJc25wcmludGYoJnBoeS0+bmFtZVswXSxzaXplb2YocGh5 LT5uYW1lKSwicGh5JWQuJWQiLGJ1cyxpZCk7DQorCXBoeS0+cGh5X2lkID0gTUlJX1BIWV9JRChi dXMsIGlkKTsNCisJcGh5LT5kZWx0YS5mdW5jID0gTlVMTDsNCisJcGh5LT5kZWx0YS5kYXRhID0g TlVMTDsNCisJcGh5LT5kZWx0YS5pcnEgPSAtMTsNCisJcGh5LT5zdGF0ZS5saW5rID0gMDsNCisJ cGh5LT5zdGF0ZS5kdXBsZXggPSBEVVBMRVhfSEFMRjsNCisJcGh5LT5zdGF0ZS5zcGVlZCA9IFNQ RUVEXzEwOw0KKwlwaHktPm5lZ290aWF0ZS5hdXRvbmVnID0gMDsNCisJcGh5LT5uZWdvdGlhdGUu YWR2ZXJ0aXNlID0gMDsNCisNCisJbWVtc2V0KG1paSwgMCwgc2l6ZW9mKCptaWkpKTsNCisJbWlp LT5waHlfaWQgPSAoYnVzIDw8IDUpIHwgaWQ7DQorCW1paS0+cGh5X2lkX21hc2sgPSAweGZmOw0K KwltaWktPnJlZ19udW1fbWFzayA9IDB4MWY7DQorCW1paS0+ZGV2ID0gZGV2Ow0KKwltaWktPm1k aW9fcmVhZCA9IG1kaW9fcmVhZDsNCisJbWlpLT5tZGlvX3dyaXRlID0gbWRpb193cml0ZTsNCisN CisJbWlpX2J1c1tidXNdLT5waHlbaWRdID0gcGh5Ow0KKw0KKwlpZiAocGh5LT5vcHMtPmluaXQg IT0gTlVMTCkNCisJCXBoeS0+b3BzLT5pbml0KHBoeSk7DQorCXJldHVybiAwOw0KK30NCisNCitF WFBPUlRfU1lNQk9MKG1paV9waHlfZGV0YWNoKTsNCit2b2lkIG1paV9waHlfZGV0YWNoKHN0cnVj dCBtaWlfaWZfaW5mbyAqbWlpKQ0KK3sNCisJc3RydWN0IHBoeV9pbmZvICpwaHkgPSBtaWlfcGh5 X29mKG1paSk7DQorCXN0cnVjdCBtaWlfYnVzICpwYnVzOw0KKw0KKwlpZiAocGh5ID09IE5VTEwp DQorCQlyZXR1cm47DQorDQorCXBidXMgPSBtaWlfYnVzW01JSV9CVVMocGh5LT5waHlfaWQpXTsN 
CisNCisJaWYgKHBoeS0+ZGVsdGEuZGF0YSAhPSBOVUxMKSB7DQorCQlpZiAocGh5LT5kZWx0YS5p cnEgPCAwKQ0KKwkJCW1paV9waHlfcG9sbF9kaXNhYmxlKG1paSwgcGh5LT5kZWx0YS5kYXRhKTsN CisJCWVsc2UNCisJCQltaWlfcGh5X2lycV9kaXNhYmxlKG1paSwgcGh5LT5kZWx0YS5kYXRhKTsN CisJfQ0KKw0KKwlwYnVzLT5waHlbTUlJX0lEKHBoeS0+cGh5X2lkKV0gPSBOVUxMOw0KKwlrZnJl ZShwaHkpOw0KK30NCisNCitFWFBPUlRfU1lNQk9MKG1paV9waHlfc3RhdGUpOw0KK2ludCBtaWlf cGh5X3N0YXRlKHN0cnVjdCBtaWlfaWZfaW5mbyAqbWlpLCBzdHJ1Y3QgcGh5X3N0YXRlICpzdGF0 ZSkNCit7DQorCXN0cnVjdCBwaHlfaW5mbyAqcGh5ID0gbWlpX3BoeV9vZihtaWkpOw0KKwlpbnQg ZXJyID0gMDsNCisNCisJaWYgKHBoeSA9PSBOVUxMKQ0KKwkJcmV0dXJuIC1FSU5WQUw7DQorDQor CWlmIChwaHktPmRlbHRhLmZ1bmMgPT0gTlVMTCkNCisJCWVyciA9IHBoeS0+b3BzLT5wb2xsKHBo eSk7DQorDQorCW1lbWNweShzdGF0ZSwgJnBoeS0+c3RhdGUsIHNpemVvZigqc3RhdGUpKTsNCisN CisJcmV0dXJuIGVycjsNCit9DQorDQorRVhQT1JUX1NZTUJPTChtaWlfcGh5X3NldF9hdXRvbmVn KTsNCitpbnQgbWlpX3BoeV9zZXRfYXV0b25lZyhzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSwgdWlu dDMyX3QgYWR2ZXJ0aXNlKQ0KK3sNCisJc3RydWN0IHBoeV9pbmZvICpwaHkgPSBtaWlfcGh5X29m KG1paSk7DQorDQorCWlmIChwaHkgPT0gTlVMTCB8fCBwaHktPm9wcy0+c2V0X2F1dG9uZWcgPT0g TlVMTCkNCisJCXJldHVybiAtRUlOVkFMOw0KKw0KKwlyZXR1cm4gcGh5X3NldF9hdXRvbmVnKHBo eSwgYWR2ZXJ0aXNlKTsNCit9DQorDQorRVhQT1JUX1NZTUJPTChtaWlfcGh5X3NldF9mb3JjZWQp Ow0KK2ludCBtaWlfcGh5X3NldF9mb3JjZWQoc3RydWN0IG1paV9pZl9pbmZvICptaWksIGludCBz cGVlZCwgaW50IGR1cGxleCkNCit7DQorCXN0cnVjdCBwaHlfaW5mbyAqcGh5ID0gbWlpX3BoeV9v ZihtaWkpOw0KKwlpbnQgZXJyID0gMDsNCisNCisJaWYgKHBoeSA9PSBOVUxMKQ0KKwkJcmV0dXJu IC1FSU5WQUw7DQorDQorCWlmIChwaHktPm9wcy0+c2V0X2ZvcmNlZCkNCisJCWVyciA9IHBoeS0+ b3BzLT5zZXRfZm9yY2VkKHBoeSwgc3BlZWQsIGR1cGxleCk7DQorDQorCWlmIChlcnIgPCAwKQ0K KwkJcmV0dXJuIGVycjsNCisNCisJcGh5LT5uZWdvdGlhdGUuYXV0b25lZyA9IEFVVE9ORUdfRElT QUJMRTsNCisJcGh5LT5zdGF0ZS5zcGVlZCA9IHNwZWVkOw0KKwlwaHktPnN0YXRlLmR1cGxleCA9 IGR1cGxleDsNCisJcGh5LT5zdGF0ZS5hdXRvbmVnID0gMDsNCisNCisJcmV0dXJuIDA7DQorfQ0K Kw0KK3N0YXRpYyBERUNMQVJFX01VVEVYKG1paV9idXNfbG9jayk7DQorDQorRVhQT1JUX1NZTUJP 
TChtaWlfYnVzX3JlZ2lzdGVyKTsNCitpbnQgbWlpX2J1c19yZWdpc3RlcihzdHJ1Y3QgbWlpX2J1 cyAqYnVzKQ0KK3sNCisJaW50IGJ1c19pZDsNCisNCisJaWYgKGJ1cyA9PSBOVUxMIHx8IGJ1cy0+ bmFtZSA9PSBOVUxMIHx8IGJ1cy0+cmVhZCA9PSBOVUxMIHx8DQorCSAgICBidXMtPndyaXRlID09 IE5VTEwpDQorCQlyZXR1cm4gLUVJTlZBTDsNCisNCisJZG93bigmbWlpX2J1c19sb2NrKTsNCisN CisJZm9yIChidXNfaWQgPSAwOyBidXNfaWQgPCBNSUlfQlVTX01BWDsgYnVzX2lkKyspIHsNCisJ CWlmIChtaWlfYnVzW2J1c19pZF0gPT0gTlVMTCkNCisJCQlicmVhazsNCisJfQ0KKw0KKwlpZiAo YnVzX2lkID49IE1JSV9CVVNfTUFYKSB7DQorCQlidXNfaWQgPSAtRU5PTUVNOw0KKwkJZ290byBl bmQ7DQorCX0NCisNCisJbWlpX2J1c1tidXNfaWRdID0gYnVzOw0KKw0KKwlpZiAoYnVzLT5yZXNl dCkNCisJCWJ1cy0+cmVzZXQoYnVzLT5wcml2KTsNCisNCisJcHJpbnRrKEtFUk5fSU5GTyAiJXM6 IHJlZ2lzdGVyZWQgYXMgUEhZIGJ1cyAlZFxuIiwgYnVzLT5uYW1lLCBidXNfaWQpOw0KKw0KKyAg ICAgIGVuZDoNCisJdXAoJm1paV9idXNfbG9jayk7DQorDQorCXJldHVybiBidXNfaWQ7DQorfQ0K Kw0KK0VYUE9SVF9TWU1CT0wobWlpX2J1c191bnJlZ2lzdGVyKTsNCit2b2lkIG1paV9idXNfdW5y ZWdpc3RlcihzdHJ1Y3QgbWlpX2J1cyAqYnVzKQ0KK3sNCisJaW50IGk7DQorDQorCWRvd24oJm1p aV9idXNfbG9jayk7DQorDQorCWZvciAoaSA9IDA7IGkgPCBNSUlfQlVTX01BWDsgaSsrKSB7DQor CQlpZiAobWlpX2J1c1tpXSA9PSBidXMpIHsNCisJCQltaWlfYnVzW2ldID0gTlVMTDsNCisJCQli cmVhazsNCisJCX0NCisJfQ0KKw0KKwl1cCgmbWlpX2J1c19sb2NrKTsNCit9DQorDQorLyogSW5z ZXJ0IGludG8gJ3BoeV9saXN0JyBzb3J0ZWQgYnkNCisgKiBzaGlmdCAoc21hbGxlc3QgdG8gbGFy Z2VzdCkNCisgKi8NCitFWFBPUlRfU1lNQk9MKHBoeV9yZWdpc3Rlcik7DQoraW50IHBoeV9yZWdp c3RlcihzdHJ1Y3QgcGh5X2luZm8gKmluZm8pDQorew0KKwlzdHJ1Y3QgbGlzdF9oZWFkICpscDsN CisNCisJaWYgKGluZm89PU5VTEwgfHwgaW5mby0+b3BzID09IE5VTEwgfHwgaW5mby0+b3BzLT5w b2xsID09IE5VTEwpDQorCQlyZXR1cm4gLUVJTlZBTDsNCisNCisJbGlzdF9mb3JfZWFjaChscCwg JnBoeV9saXN0KSB7DQorCQlzdHJ1Y3QgcGh5X2luZm8gKnBoeSA9IGxpc3RfZW50cnkobHAsIHN0 cnVjdCBwaHlfaW5mbywgbGlzdCk7DQorCQlpZiAocGh5LT5zaGlmdCA+IGluZm8tPnNoaWZ0KQ0K KwkJCWJyZWFrOw0KKw0KKwkJLyogQ2hlY2sgZm9yIGR1cGxpY2F0ZXMgKi8NCisJCWlmICgocGh5 LT5zaGlmdD09aW5mby0+c2hpZnQpICYmIChpbmZvLT5pZCA9PSBwaHktPmlkKSkNCisJCQlyZXR1 
cm4gLUVCVVNZOw0KKwl9DQorDQorCS8qIFRoaXMgZG9lcyB0aGUgJ3JpZ2h0IHRoaW5nJyBldmVu IGlmIGxwID09ICZwaHlfbGlzdA0KKwkgKi8NCisJbGlzdF9hZGRfdGFpbCgmaW5mby0+bGlzdCwg bHApOw0KKw0KKwlyZXR1cm4gMDsNCit9DQorDQorRVhQT1JUX1NZTUJPTChwaHlfdW5yZWdpc3Rl cik7DQordm9pZCBwaHlfdW5yZWdpc3RlcihzdHJ1Y3QgcGh5X2luZm8gKmluZm8pDQorew0KKwls aXN0X2RlbF9pbml0KCZpbmZvLT5saXN0KTsNCit9DQorDQorc3RhdGljIGludCBtaWlfYnVzX2lu aXQodm9pZCkNCit7DQorCXJldHVybiBwaHlfcmVnaXN0ZXIoJnBoeV9pbmZvX2dlbmVyaWMpOw0K K30NCisNCitzdGF0aWMgdm9pZCBtaWlfYnVzX2V4aXQodm9pZCkNCit7DQorCXBoeV91bnJlZ2lz dGVyKCZwaHlfaW5mb19nZW5lcmljKTsNCit9DQorDQorbW9kdWxlX2luaXQobWlpX2J1c19pbml0 KTsNCittb2R1bGVfZXhpdChtaWlfYnVzX2V4aXQpOw0KDQotLS0gL2Rldi9udWxsDQorKysgbGlu dXgvZHJpdmVycy9uZXQvcGh5X2NpY2FkYS5jDQpAQCAtMCwwICsxLDE3NyBAQA0KKy8qIA0KKyAq IGRyaXZlcnMvbmV0L3BoeV9jaWNhZGEuYw0KKyAqDQorICogQXV0aG9yOiBKYXNvbiBNY011bGxh bg0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IFRpbWVzeXMgQ29ycC4NCisgKg0KKyAqIFRo aXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3UgY2FuIHJlZGlzdHJpYnV0ZSAgaXQgYW5k L29yIG1vZGlmeSBpdA0KKyAqIHVuZGVyICB0aGUgdGVybXMgb2YgIHRoZSBHTlUgR2VuZXJhbCAg UHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5IHRoZQ0KKyAqIEZyZWUgU29mdHdhcmUgRm91 bmRhdGlvbjsgIGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhlICBMaWNlbnNlLCBvciAoYXQgeW91cg0K KyAqIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQorICoNCisgKi8NCisjaW5jbHVkZSA8bGlu dXgvbW9kdWxlLmg+DQorI2luY2x1ZGUgPGxpbnV4L2luaXQuaD4NCisjaW5jbHVkZSA8bGludXgv bWlpX2J1cy5oPg0KKw0KKy8qIENpY2FkYSBBdXhpbGlhcnkgQ29udHJvbC9TdGF0dXMgUmVnaXN0 ZXIgKi8NCisjZGVmaW5lIE1JSU1fQ0lTODIwMV9BVVhfQ09OU1RBVCAgICAgICAgMHgxYw0KKyNk ZWZpbmUgTUlJTV9DSVM4MjAxX0FVWENPTlNUQVRfSU5JVCAgICAweDAwMDQNCisjZGVmaW5lIE1J SU1fQ0lTODIwMV9BVVhDT05TVEFUX0RVUExFWCAgMHgwMDIwDQorI2RlZmluZSBNSUlNX0NJUzgy MDFfQVVYQ09OU1RBVF9TUEVFRCAgIDB4MDAxOA0KKyNkZWZpbmUgTUlJTV9DSVM4MjAxX0FVWENP TlNUQVRfR0JJVCAgICAweDAwMTANCisjZGVmaW5lIE1JSU1fQ0lTODIwMV9BVVhDT05TVEFUXzEw MCAgICAgMHgwMDA4DQorICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg 
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICANCisvKiBDaWNhZGEgRXh0ZW5k ZWQgQ29udHJvbCBSZWdpc3RlciAxICovDQorI2RlZmluZSBNSUlNX0NJUzgyMDFfRVhUX0NPTjEg ICAgICAgICAgIDB4MTcNCisjZGVmaW5lIE1JSU1fQ0lTODIwMV9FWFRDT04xX0lOSVQgICAgICAg MHgwMDAwDQorDQorLyogQ0lTODIwMSAqLw0KKyNkZWZpbmUgTUlJX0NJUzgyMDFfRVBDUiAgICAg ICAgMHgxNw0KKyNkZWZpbmUgRVBDUl9NT0RFX01BU0sgICAgICAgICAgMHgzMDAwDQorI2RlZmlu ZSBFUENSX0dNSUlfTU9ERSAgICAgICAgICAweDAwMDANCisjZGVmaW5lIEVQQ1JfUkdNSUlfTU9E RSAgICAgICAgIDB4MTAwMA0KKyNkZWZpbmUgRVBDUl9UQklfTU9ERSAgICAgICAgICAgMHgyMDAw DQorI2RlZmluZSBFUENSX1JUQklfTU9ERSAgICAgICAgICAweDMwMDANCisNCitzdGF0aWMgaW50 IGNpczgyMDFfaW5pdChzdHJ1Y3QgcGh5X2luZm8gKnBoeSkNCit7DQorCXVpbnQxNl90IGVwY3I7 DQorCWNvbnN0IGNoYXIgKm1vZGUgPSAiVW5rbm93biI7DQorDQorCWVwY3IgPSBwaHlfcmVhZChw aHksIE1JSV9DSVM4MjAxX0VQQ1IpOw0KKw0KKwlzd2l0Y2ggKGVwY3IgJiBFUENSX01PREVfTUFT Sykgew0KKwkJY2FzZSBFUENSX0dNSUlfTU9ERTogbW9kZSA9ICJHTUlJIjsgYnJlYWs7DQorCQlj YXNlIEVQQ1JfUkdNSUlfTU9ERTogbW9kZSA9ICJSR01JSSI7IGJyZWFrOw0KKwkJY2FzZSBFUENS X1RCSV9NT0RFOiBtb2RlID0gIlRCSSI7IGJyZWFrOw0KKwkJY2FzZSBFUENSX1JUQklfTU9ERTog bW9kZSA9ICJSVEJJIjsgYnJlYWs7DQorCX0NCisNCisJcHJpbnRrKEtFUk5fSU5GTyAiJXM6ICVz IG1vZGVcbiIscGh5LT5uYW1lLCBtb2RlKTsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KKyNkZWZp bmUgTUlJX0NJUzgyMDFfSU5UUl9DVFJMCTB4MTkNCisjZGVmaW5lIE1JSV9DSVM4MjAxX0lOVFJf U1RBVAkweDFhDQorDQorI2RlZmluZSBNSUlfQ0lTODIwMV9JTlRSX0VOQUJMRQkweDgwMDANCisj ZGVmaW5lIE1JSV9DSVM4MjAxX0lOVFJfU1BFRUQJMHg0MDAwDQorI2RlZmluZSBNSUlfQ0lTODIw MV9JTlRSX0xJTksJMHgyMDAwDQorI2RlZmluZSBNSUlfQ0lTODIwMV9JTlRSX0RVUExFWAkweDEw MDANCisjZGVmaW5lIE1JSV9DSVM4MjAxX0lOVFJfQU5fRVJSCTB4MDgwMA0KKyNkZWZpbmUgTUlJ X0NJUzgyMDFfSU5UUl9BTl9ET04JMHgwNDAwDQorI2RlZmluZSBNSUlfQ0lTODIwMV9JTlRSX0FM TAkweDdjMDANCisNCitzdGF0aWMgaW50IGNpczgyMDFfaW50X2VuYWJsZShzdHJ1Y3QgcGh5X2lu Zm8gKnBoeSkNCit7DQorCXBoeV93cml0ZShwaHksIE1JSV9DSVM4MjAxX0lOVFJfQ1RSTCwgTUlJ X0NJUzgyMDFfSU5UUl9FTkFCTEUgfCBNSUlfQ0lTODIwMV9JTlRSX0FMTCk7DQorDQorCXJldHVy 
biAwOw0KK30NCisNCitzdGF0aWMgaW50IGNpczgyMDFfaW50X2Rpc2FibGUoc3RydWN0IHBoeV9p bmZvICpwaHkpDQorew0KKwlwaHlfd3JpdGUocGh5LCBNSUlfQ0lTODIwMV9JTlRSX0NUUkwsIDAp Ow0KKw0KKwlyZXR1cm4gMDsNCit9DQorDQorc3RhdGljIGludCBjaXM4MjAxX2ludF9hY2soc3Ry dWN0IHBoeV9pbmZvICpwaHkpDQorew0KKwlwaHlfcmVhZChwaHksIE1JSV9DSVM4MjAxX0lOVFJf U1RBVCk7DQorDQorCXJldHVybiAwOw0KK30NCisNCisjZGVmaW5lCU1JSV9DSVM4MjAxX0FDU1IJ MHgxYw0KKyNkZWZpbmUgIEFDU1JfRU5BQkxFXzEwMDBCQVNFVAkweDAwMDQNCisjZGVmaW5lICBB Q1NSX0RVUExFWF9TVEFUVVMJMHgwMDIwDQorI2RlZmluZSAgQUNTUl9TUEVFRF8xMDAwQkFTRVQJ MHgwMDEwDQorI2RlZmluZSAgQUNTUl9TUEVFRF8xMDBCQVNFVAkweDAwMDgNCisNCitzdGF0aWMg aW50IGNpczgyMDFfcG9sbChzdHJ1Y3QgcGh5X2luZm8gKnBoeSkNCit7DQorCXVpbnQxNl90IGFj c3I7DQorCXN0cnVjdCBwaHlfc3RhdGUgKnBzdGF0ZSA9ICZwaHktPnN0YXRlOw0KKwlpbnQgYXV0 b25lZyA9IHBoeS0+c3RhdGUuYXV0b25lZzsNCisJaW50IGVycjsNCisNCisJZXJyID0gcGh5X2dl bl9wb2xsKHBoeSk7DQorCWlmIChlcnIgPCAwKQ0KKwkJcmV0dXJuIGVycjsNCisNCisJaWYgKHBz dGF0ZS0+bGluayA9PSAwKQ0KKwkJcmV0dXJuIDA7DQorDQorCS8qIFdlIHVzZSB0aGUgb2xkIGNv cHkgb2YgJ3BoeS0+c3RhdGUuYXV0b25lZycNCisJICogYXMgcGh5X2dlbl9wb2xsIHdpbGwgaGF2 ZSBzZXQgaXQgdG8gMA0KKwkgKi8NCisJaWYgKGF1dG9uZWcpIHsNCisJCWFjc3IgPSBwaHlfcmVh ZChwaHksIE1JSV9DSVM4MjAxX0FDU1IpOw0KKw0KKwkJaWYgKGFjc3IgJiBBQ1NSX0RVUExFWF9T VEFUVVMpDQorCQkJcHN0YXRlLT5kdXBsZXggPSBEVVBMRVhfRlVMTDsNCisJCWVsc2UNCisJCQlw c3RhdGUtPmR1cGxleCA9IERVUExFWF9IQUxGOw0KKwkJaWYgKGFjc3IgJiBBQ1NSX1NQRUVEXzEw MDBCQVNFVCkgew0KKwkJCXBzdGF0ZS0+c3BlZWQgPSBTUEVFRF8xMDAwOw0KKwkJfSBlbHNlIGlm IChhY3NyICYgQUNTUl9TUEVFRF8xMDBCQVNFVCkNCisJCQlwc3RhdGUtPnNwZWVkID0gU1BFRURf MTAwOw0KKwkJZWxzZQ0KKwkJCXBzdGF0ZS0+c3BlZWQgPSBTUEVFRF8xMDsNCisJfQ0KKw0KKwkv KiBPbiBub24tYW5lZywgd2UgYXNzdW1lIHdoYXQgd2UgcHV0IGluIEJNQ1IgaXMgdGhlIHNwZWVk LA0KKwkgKiB0aG91Z2ggbWFnaWMtYW5lZyBzaG91bGRuJ3QgcHJldmVudCB0aGlzIGNhc2UgZnJv bSBvY2N1cnJpbmcNCisJICovDQorDQorCXJldHVybiAwOw0KK30NCisNCitzdGF0aWMgaW50IGNp czgyMDFfc2V0X2F1dG9uZWcoc3RydWN0IHBoeV9pbmZvICpwaHksIHVpbnQzMl90IGFkdmVydGlz 
ZSkNCit7DQorCXVpbnQxNl90IHZhbDsNCisNCisJLyogRG8gdGhlIDEwMDBCVCBzZXR1cCBoZXJl LiAqLw0KKwl2YWwgPSBwaHlfcmVhZChwaHksIE1JSV9DSVM4MjAxX0FDU1IpOw0KKwlpZiAoYWR2 ZXJ0aXNlICYgQURWRVJUSVNFRF8xMDAwYmFzZVRfRnVsbCkNCisJCXBoeV93cml0ZShwaHksIE1J SV9DSVM4MjAxX0FDU1IsIHZhbCB8IEFDU1JfRU5BQkxFXzEwMDBCQVNFVCk7DQorCWVsc2UNCisJ CXBoeV93cml0ZShwaHksIE1JSV9DSVM4MjAxX0FDU1IsIHZhbCAmIH5BQ1NSX0VOQUJMRV8xMDAw QkFTRVQpOw0KKw0KKwlyZXR1cm4gcGh5X2dlbl9zZXRfYXV0b25lZyhwaHksIGFkdmVydGlzZSk7 DQorfQ0KKw0KKw0KK3N0cnVjdCBwaHlfb3BzIHBoeV9vcHNfY2lzODIwMSA9IHsNCisJLmluaXQg CQk9IGNpczgyMDFfaW5pdCwNCisJLnNldF9hdXRvbmVnIAk9IGNpczgyMDFfc2V0X2F1dG9uZWcs DQorCS5wb2xsCQk9IGNpczgyMDFfcG9sbCwNCisJLmludF9lbmFibGUJPSBjaXM4MjAxX2ludF9l bmFibGUsDQorCS5pbnRfZGlzYWJsZQk9IGNpczgyMDFfaW50X2Rpc2FibGUsDQorCS5pbnRfYWNr CT0gY2lzODIwMV9pbnRfYWNrDQorfTsNCisNCisvKiBDaWNhZGEgODIwMSAqLw0KK3N0YXRpYyBz dHJ1Y3QgcGh5X2luZm8gcGh5X2luZm9fY2lzODIwMSA9IHsNCisJLmlkID0gMHgwMDBmYzQ0MCwN CisJLm5hbWUgPSAiQ0lTODIwMSIsDQorCS5zaGlmdCA9IDQsDQorCS5vcHMgPSAmcGh5X29wc19j aXM4MjAxDQorfTsNCisNCitzdGF0aWMgaW50IHBoeV9jaWNhZGFfaW5pdCh2b2lkKQ0KK3sNCisJ cmV0dXJuIHBoeV9yZWdpc3RlcigmcGh5X2luZm9fY2lzODIwMSk7DQorfQ0KKw0KK3N0YXRpYyB2 b2lkIHBoeV9jaWNhZGFfZXhpdCh2b2lkKQ0KK3sNCisJcGh5X3VucmVnaXN0ZXIoJnBoeV9pbmZv X2NpczgyMDEpOw0KK30NCisNCittb2R1bGVfaW5pdChwaHlfY2ljYWRhX2luaXQpOw0KK21vZHVs ZV9leGl0KHBoeV9jaWNhZGFfZXhpdCk7DQoNCi0tLSAvZGV2L251bGwNCisrKyBsaW51eC9kcml2 ZXJzL25ldC9waHlfZGF2aWNvbS5jDQpAQCAtMCwwICsxLDE0MCBAQA0KKy8qIA0KKyAqIGRyaXZl cnMvbmV0L3BoeV9kYXZpY29tLmMNCisgKg0KKyAqIEF1dGhvcjogSmFzb24gTWNNdWxsYW4NCisg Kg0KKyAqIENvcHlyaWdodCAoYykgMjAwNCBUaW1lc3lzIENvcnAuDQorICoNCisgKiBUaGlzIHBy b2dyYW0gaXMgZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgIGl0IGFuZC9vciBt b2RpZnkgaXQNCisgKiB1bmRlciAgdGhlIHRlcm1zIG9mICB0aGUgR05VIEdlbmVyYWwgIFB1Ymxp YyBMaWNlbnNlIGFzIHB1Ymxpc2hlZCBieSB0aGUNCisgKiBGcmVlIFNvZnR3YXJlIEZvdW5kYXRp b247ICBlaXRoZXIgdmVyc2lvbiAyIG9mIHRoZSAgTGljZW5zZSwgb3IgKGF0IHlvdXINCisgKiBv 
cHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0KKyAqDQorICovDQorI2luY2x1ZGUgPGxpbnV4L21v ZHVsZS5oPg0KKyNpbmNsdWRlIDxsaW51eC9pbml0Lmg+DQorI2luY2x1ZGUgPGxpbnV4L2RlbGF5 Lmg+DQorI2luY2x1ZGUgPGxpbnV4L21paV9idXMuaD4NCisNCisvKiBETTkxNjEgQ29udHJvbCBy ZWdpc3RlciB2YWx1ZXMgKi8NCisjZGVmaW5lIE1JSU1fRE05MTYxX0NSX1NUT1AJMHgwNDAwDQor I2RlZmluZSBNSUlNX0RNOTE2MV9DUl9SU1RBTgkweDEyMDANCisNCisjZGVmaW5lIE1JSU1fRE05 MTYxX1NDUgkJMHgxMA0KKyNkZWZpbmUgTUlJTV9ETTkxNjFfU0NSX0lOSVQJMHgwNjEwDQorDQor LyogRE05MTYxIFNwZWNpZmllZCBDb25maWd1cmF0aW9uIGFuZCBTdGF0dXMgUmVnaXN0ZXIgKi8N CisjZGVmaW5lIE1JSU1fRE05MTYxX1NDU1IJMHgxMQ0KKyNkZWZpbmUgTUlJTV9ETTkxNjFfU0NT Ul8xMDBGCTB4ODAwMA0KKyNkZWZpbmUgTUlJTV9ETTkxNjFfU0NTUl8xMDBICTB4NDAwMA0KKyNk ZWZpbmUgTUlJTV9ETTkxNjFfU0NTUl8xMEYJMHgyMDAwDQorI2RlZmluZSBNSUlNX0RNOTE2MV9T Q1NSXzEwSAkweDEwMDANCisNCisvKiBETTkxNjEgSW50ZXJydXB0IFJlZ2lzdGVyICovDQorI2Rl ZmluZSBNSUlNX0RNOTE2MV9JTlRSCTB4MTUNCisjZGVmaW5lIE1JSU1fRE05MTYxX0lOVFJfUEVO RAkJMHg4MDAwDQorI2RlZmluZSBNSUlNX0RNOTE2MV9JTlRSX0RQTFhfTUFTSwkweDA4MDANCisj ZGVmaW5lIE1JSU1fRE05MTYxX0lOVFJfU1BEX01BU0sJMHgwNDAwDQorI2RlZmluZSBNSUlNX0RN OTE2MV9JTlRSX0xJTktfTUFTSwkweDAyMDANCisjZGVmaW5lIE1JSU1fRE05MTYxX0lOVFJfTUFT SwkJMHgwMTAwDQorI2RlZmluZSBNSUlNX0RNOTE2MV9JTlRSX0RQTFhfQ0hBTkdFCTB4MDAxMA0K KyNkZWZpbmUgTUlJTV9ETTkxNjFfSU5UUl9TUERfQ0hBTkdFCTB4MDAwOA0KKyNkZWZpbmUgTUlJ TV9ETTkxNjFfSU5UUl9MSU5LX0NIQU5HRQkweDAwMDQNCisjZGVmaW5lIE1JSU1fRE05MTYxX0lO VFJfSU5JVCAJCTB4MDAwMA0KKyNkZWZpbmUgTUlJTV9ETTkxNjFfSU5UUl9TVE9QCVwNCisoTUlJ TV9ETTkxNjFfSU5UUl9EUExYX01BU0sgfCBNSUlNX0RNOTE2MV9JTlRSX1NQRF9NQVNLIFwNCisg fCBNSUlNX0RNOTE2MV9JTlRSX0xJTktfTUFTSyB8IE1JSU1fRE05MTYxX0lOVFJfTUFTSykNCisN CisvKiBETTkxNjEgMTBCVCBDb25maWd1cmF0aW9uL1N0YXR1cyAqLw0KKyNkZWZpbmUgTUlJTV9E TTkxNjFfMTBCVENTUgkweDEyDQorI2RlZmluZSBNSUlNX0RNOTE2MV8xMEJUQ1NSX0lOSVQJMHg3 ODAwDQorDQorc3RhdGljIGludCBkbTkxNjFfaW5pdChzdHJ1Y3QgcGh5X2luZm8gKnBoeSkNCit7 DQorCW1kZWxheSgyMDAwKTsNCisNCisJcGh5X3dyaXRlKHBoeSwgTUlJX0JNQ1IsIE1JSU1fRE05 
MTYxX0NSX1NUT1ApOw0KKwlwaHlfd3JpdGUocGh5LCBNSUlNX0RNOTE2MV9TQ1IsIE1JSU1fRE05 MTYxX1NDUl9JTklUKTsNCisJcGh5X3dyaXRlKHBoeSwgTUlJTV9ETTkxNjFfMTBCVENTUiwgTUlJ TV9ETTkxNjFfMTBCVENTUl9JTklUKTsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KK3N0YXRpYyBp bnQgZG05MTYxX2ludF9lbmFibGUoc3RydWN0IHBoeV9pbmZvICpwaHkpDQorew0KKwkvKiBDbGVh ciBhbnkgcGVuZGluZyBpbnRlcnJ1cHRzICovDQorCXBoeV9yZWFkKHBoeSwgTUlJTV9ETTkxNjFf SU5UUik7DQorDQorCXJldHVybiAwOw0KK30NCisNCitzdGF0aWMgaW50IGRtOTE2MV9pbnRfYWNr KHN0cnVjdCBwaHlfaW5mbyAqcGh5KQ0KK3sNCisJLyogQ2xlYXIgYW55IHBlbmRpbmcgaW50ZXJy dXB0cyAqLw0KKwlwaHlfcmVhZChwaHksIE1JSU1fRE05MTYxX0lOVFIpOw0KKw0KKwlyZXR1cm4g MDsNCit9DQorDQorc3RhdGljIGludCBkbTkxNjFfaW50X2Rpc2FibGUoc3RydWN0IHBoeV9pbmZv ICpwaHkpDQorew0KKwkvKiBDbGVhciBhbnkgcGVuZGluZyBpbnRlcnJ1cHRzICovDQorCXBoeV9y ZWFkKHBoeSwgTUlJTV9ETTkxNjFfSU5UUik7DQorDQorCXJldHVybiAwOw0KK30NCisNCitzdGF0 aWMgaW50IGRtOTE2MV9wb2xsKHN0cnVjdCBwaHlfaW5mbyAqcGh5KQ0KK3sNCisJaW50IGF1dG9u ZWcgPSBwaHktPnN0YXRlLmF1dG9uZWc7DQorCWludCBlcnI7DQorCXVpbnQxNl90IHZhbDsNCisN CisJZXJyID0gcGh5X2dlbl9wb2xsKHBoeSk7DQorCWlmIChlcnIgPCAwKQ0KKwkJcmV0dXJuIGVy cjsNCisNCisJaWYgKHBoeS0+c3RhdGUubGluayAmJiBhdXRvbmVnKSB7DQorCQl2YWwgPSBwaHlf cmVhZChwaHksIE1JSU1fRE05MTYxX1NDU1IpOw0KKw0KKwkJaWYgKHZhbCAmIChNSUlNX0RNOTE2 MV9TQ1NSXzEwMEYgfCBNSUlNX0RNOTE2MV9TQ1NSXzEwMEgpKQ0KKwkJCXBoeS0+c3RhdGUuc3Bl ZWQgPSAxMDA7DQorCQllbHNlDQorCQkJcGh5LT5zdGF0ZS5zcGVlZCA9IDEwOw0KKw0KKwkJaWYg KHZhbCAmIChNSUlNX0RNOTE2MV9TQ1NSXzEwMEYgfCBNSUlNX0RNOTE2MV9TQ1NSXzEwRikpDQor CQkJcGh5LT5zdGF0ZS5kdXBsZXggPSAxOw0KKwkJZWxzZQ0KKwkJCXBoeS0+c3RhdGUuZHVwbGV4 ID0gMDsNCisJfQ0KKw0KKwlyZXR1cm4gMDsNCit9DQorDQorc3RhdGljIHN0cnVjdCBwaHlfb3Bz IHBoeV9vcHNfZG05MTYxID0gew0KKwkuaW5pdCA9IGRtOTE2MV9pbml0LA0KKwkucG9sbCA9IGRt OTE2MV9wb2xsLA0KKwkuaW50X2VuYWJsZSA9IGRtOTE2MV9pbnRfZW5hYmxlLA0KKwkuaW50X2Fj ayA9IGRtOTE2MV9pbnRfYWNrLA0KKwkuaW50X2Rpc2FibGUgPSBkbTkxNjFfaW50X2Rpc2FibGUs DQorfTsNCisNCitzdGF0aWMgc3RydWN0IHBoeV9pbmZvIHBoeV9pbmZvX2RtOTE2MSA9IHsNCisJ 
LmlkID0gMHgwMTgxYjg4MCwNCisJLm5hbWUgPSAiRGF2aWNvbSBETTkxNjFFIiwNCisJLnNoaWZ0 ID0gNCwNCisJLm9wcyA9ICZwaHlfb3BzX2RtOTE2MSwNCit9Ow0KKw0KK3N0YXRpYyBpbnQgcGh5 X2Rhdmljb21faW5pdCh2b2lkKQ0KK3sNCisJcmV0dXJuIHBoeV9yZWdpc3RlcigmcGh5X2luZm9f ZG05MTYxKTsNCit9DQorDQorc3RhdGljIHZvaWQgcGh5X2Rhdmljb21fZXhpdCh2b2lkKQ0KK3sN CisJcGh5X3VucmVnaXN0ZXIoJnBoeV9pbmZvX2RtOTE2MSk7DQorfQ0KKw0KK21vZHVsZV9pbml0 KHBoeV9kYXZpY29tX2luaXQpOw0KK21vZHVsZV9leGl0KHBoeV9kYXZpY29tX2V4aXQpOw0KDQot LS0gL2Rldi9udWxsDQorKysgbGludXgvZHJpdmVycy9uZXQvcGh5X2x4dDk3eC5jDQpAQCAtMCww ICsxLDIxMCBAQA0KKy8qIA0KKyAqIGRyaXZlcnMvbmV0L3BoeV9seHQ5N3guYw0KKyAqDQorICog QXV0aG9yOiBKYXNvbiBNY011bGxhbg0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IFRpbWVz eXMgQ29ycC4NCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3UgY2Fu IHJlZGlzdHJpYnV0ZSAgaXQgYW5kL29yIG1vZGlmeSBpdA0KKyAqIHVuZGVyICB0aGUgdGVybXMg b2YgIHRoZSBHTlUgR2VuZXJhbCAgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5IHRoZQ0K KyAqIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgIGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhlICBM aWNlbnNlLCBvciAoYXQgeW91cg0KKyAqIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQorICoN CisgKi8NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQorI2luY2x1ZGUgPGxpbnV4L2luaXQu aD4NCisjaW5jbHVkZSA8bGludXgvbWlpX2J1cy5oPg0KKw0KKy8qIC0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0g Ki8NCisvKiBUaGUgTGV2ZWwgb25lIExYVDk3MCBpcyB1c2VkIGJ5IG1hbnkgYm9hcmRzCQkJCSAg ICAgKi8NCisNCisjZGVmaW5lIE1JSV9MWFQ5NzBfTUlSUk9SICAgIDE2ICAvKiBNaXJyb3IgcmVn aXN0ZXIgICAgICAgICAgICovDQorI2RlZmluZSBNSUlfTFhUOTcwX0lFUiAgICAgICAxNyAgLyog SW50ZXJydXB0IEVuYWJsZSBSZWdpc3RlciAqLw0KKyNkZWZpbmUgTUlJX0xYVDk3MF9JU1IgICAg ICAgMTggIC8qIEludGVycnVwdCBTdGF0dXMgUmVnaXN0ZXIgKi8NCisjZGVmaW5lIE1JSV9MWFQ5 NzBfQ09ORklHICAgIDE5ICAvKiBDb25maWd1cmF0aW9uIFJlZ2lzdGVyICAgICovDQorI2RlZmlu ZSBNSUlfTFhUOTcwX0NTUiAgICAgICAyMCAgLyogQ2hpcCBTdGF0dXMgUmVnaXN0ZXIgICAgICAq Lw0KKw0KK3N0YXRpYyBpbnQgbHh0OTcwX2ludF9lbmFibGUoc3RydWN0IHBoeV9pbmZvICpwaHkp 
DQorew0KKwlwaHlfd3JpdGUocGh5LCBNSUlfTFhUOTcwX0lFUiwgMHgwMDAyKTsNCisNCisJcmV0 dXJuIDA7DQorfTsNCisNCitzdGF0aWMgaW50IGx4dDk3MF9pbnRfYWNrKHN0cnVjdCBwaHlfaW5m byAqcGh5KQ0KK3sNCisJcGh5X3JlYWQocGh5LCBNSUlfTFhUOTcwX0lTUik7DQorDQorCXJldHVy biAwOw0KK30NCisNCitzdGF0aWMgaW50IGx4dDk3MF9pbnRfZGlzYWJsZShzdHJ1Y3QgcGh5X2lu Zm8gKnBoeSkNCit7DQorCXBoeV93cml0ZShwaHksIE1JSV9MWFQ5NzBfSUVSLCAweDAwMDApOw0K Kw0KKwlyZXR1cm4gMDsNCit9Ow0KKw0KK3N0YXRpYyBpbnQgbHh0OTcwX3BvbGwoc3RydWN0IHBo eV9pbmZvICpwaHkpDQorew0KKwlpbnQgYXV0b25lZyA9IHBoeS0+c3RhdGUuYXV0b25lZzsNCisJ aW50IGVycjsNCisNCisJZXJyID0gcGh5X2dlbl9wb2xsKHBoeSk7DQorCWlmIChlcnIgPCAwKQ0K KwkJcmV0dXJuIGVycjsNCisNCisJaWYgKHBoeS0+c3RhdGUubGluayAmJiBhdXRvbmVnKSB7DQor CQkvKiBmaW5kIG91dCB0aGUgY3VycmVudCBzdGF0ZSAqLw0KKwkJdWludDE2X3QgdmFsOw0KKw0K KwkJdmFsID0gcGh5X3JlYWQocGh5LCBNSUlfTFhUOTcwX0NTUik7DQorCQlpZiAodmFsICYgMHgx MDAwKQ0KKwkJCXBoeS0+c3RhdGUuZHVwbGV4ID0gMTsNCisJCWVsc2UNCisJCQlwaHktPnN0YXRl LmR1cGxleCA9IDA7DQorDQorCQlpZiAodmFsICYgMHgwODAwKSB7DQorCQkJcGh5LT5zdGF0ZS5z cGVlZCA9IDEwMDsNCisJCX0gZWxzZSB7DQorCQkJcGh5LT5zdGF0ZS5zcGVlZCA9IDEwOw0KKwkJ fQ0KKwl9DQorDQorCXJldHVybiAwOw0KK30NCisNCisNCitzdGF0aWMgc3RydWN0IHBoeV9vcHMg cGh5X29wc19seHQ5NzAgPSB7DQorCS5wb2xsIAkJPSBseHQ5NzBfcG9sbCwNCisJLmludF9lbmFi bGUJPSBseHQ5NzBfaW50X2VuYWJsZSwNCisJLmludF9hY2sJPSBseHQ5NzBfaW50X2FjaywNCisJ LmludF9kaXNhYmxlCT0gbHh0OTcwX2ludF9kaXNhYmxlDQorfTsNCisNCitzdGF0aWMgc3RydWN0 IHBoeV9pbmZvIHBoeV9pbmZvX2x4dDk3MCA9IHsNCisJLmlkID0gMHgwNzgxMDAwMCwNCisJLnNo aWZ0ID0gNCwNCisJLm5hbWUgPSAiTFhUOTcwIiwNCisJLm9wcyA9ICZwaHlfb3BzX2x4dDk3MCwN Cit9Ow0KKw0KKy8qIFNhbWUgYXMgdGhlIExYVDk3MCwgYnV0IGRpZmZlcmVudCBJRA0KKyAqLw0K K3N0YXRpYyBzdHJ1Y3QgcGh5X2luZm8gcGh5X2luZm9fbHh0OTcwYSA9IHsNCisJLmlkID0gMHgw MDgxMDAwMCwNCisJLnNoaWZ0ID0gNCwNCisJLm5hbWUgPSAiTFhUOTcwQSIsDQorCS5vcHMgPSAm cGh5X29wc19seHQ5NzAsDQorfTsNCisNCisvKiByZWdpc3RlciBkZWZpbml0aW9ucyBmb3IgdGhl IDk3MSAqLw0KKw0KKyNkZWZpbmUgTUlJX0xYVDk3MV9QQ1IgICAgICAgMTYgIC8qIFBvcnQgQ29u 
dHJvbCBSZWdpc3RlciAgICAgKi8NCisjZGVmaW5lIE1JSV9MWFQ5NzFfU1IyICAgICAgIDE3ICAv KiBTdGF0dXMgUmVnaXN0ZXIgMiAgICAgICAgICovDQorI2RlZmluZSBNSUlfTFhUOTcxX0lFUiAg ICAgICAxOCAgLyogSW50ZXJydXB0IEVuYWJsZSBSZWdpc3RlciAqLw0KKyNkZWZpbmUgTUlJX0xY VDk3MV9JU1IgICAgICAgMTkgIC8qIEludGVycnVwdCBTdGF0dXMgUmVnaXN0ZXIgKi8NCisjZGVm aW5lIE1JSV9MWFQ5NzFfTENSICAgICAgIDIwICAvKiBMRUQgQ29udHJvbCBSZWdpc3RlciAgICAg ICovDQorI2RlZmluZSBNSUlfTFhUOTcxX1RDUiAgICAgICAzMCAgLyogVHJhbnNtaXQgQ29udHJv bCBSZWdpc3RlciAqLw0KKw0KK3N0YXRpYyBpbnQgbHh0OTcxX2ludF9lbmFibGUoc3RydWN0IHBo eV9pbmZvICpwaHkpDQorew0KKwlwaHlfd3JpdGUocGh5LCBNSUlfTFhUOTcxX0lFUiwgMHgwMGYy KTsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KK3N0YXRpYyBpbnQgbHh0OTcxX3BvbGwoc3RydWN0 IHBoeV9pbmZvICpwaHkpDQorew0KKwlpbnQgYXV0b25lZyA9IHBoeS0+c3RhdGUuYXV0b25lZzsN CisJaW50IGVycjsNCisNCisJZXJyID0gcGh5X2dlbl9wb2xsKHBoeSk7DQorCWlmIChlcnIgPCAw KQ0KKwkJcmV0dXJuIGVycjsNCisNCisJaWYgKHBoeS0+c3RhdGUubGluayAmJiBhdXRvbmVnKSB7 DQorCQkvKiBmaW5kIG91dCB0aGUgY3VycmVudCBzdGF0ZSAqLw0KKwkJdWludDE2X3QgdmFsOw0K Kw0KKwkJdmFsID0gcGh5X3JlYWQocGh5LCBNSUlfTFhUOTcxX1NSMik7DQorDQorCQlpZiAodmFs ICYgMHg0MDAwKSB7DQorCQkJcGh5LT5zdGF0ZS5zcGVlZCA9IDEwMDsNCisJCX0gZWxzZSB7DQor CQkJcGh5LT5zdGF0ZS5zcGVlZCA9IDEwOw0KKwkJfQ0KKw0KKwkJaWYgKHZhbCAmIDB4MDIwMCkg ew0KKwkJCXBoeS0+c3RhdGUuZHVwbGV4ID0gMTsNCisJCX0gZWxzZSB7DQorCQkJcGh5LT5zdGF0 ZS5kdXBsZXggPSAwOw0KKwkJfQ0KKw0KKwkJaWYgKHZhbCAmIDB4MDAwOCkNCisJCQlwcmludGso S0VSTl9ERUJVRyAiJXM6IEZhdWx0IGRldGVjdGVkXG4iLHBoeS0+bmFtZSk7DQorCX0NCisNCisJ cmV0dXJuIDA7DQorfQ0KKw0KK3N0YXRpYyBpbnQgbHh0OTcxX2ludF9hY2soc3RydWN0IHBoeV9p bmZvICpwaHkpDQorew0KKwlwaHlfcmVhZChwaHksIE1JSV9MWFQ5NzFfSVNSKTsNCisNCisJcmV0 dXJuIDA7DQorfQ0KKw0KK3N0YXRpYyBpbnQgbHh0OTcxX2ludF9kaXNhYmxlKHN0cnVjdCBwaHlf aW5mbyAqcGh5KQ0KK3sNCisJcGh5X3dyaXRlKHBoeSwgTUlJX0xYVDk3MV9JRVIsIDB4MDAwMCk7 DQorDQorCXJldHVybiAwOw0KK307DQorDQorc3RhdGljIHN0cnVjdCBwaHlfb3BzIHBoeV9vcHNf bHh0OTcxID0gew0KKwkucG9sbCAJCT0gbHh0OTcxX3BvbGwsDQorCS5pbnRfZW5hYmxlCT0gbHh0 
OTcxX2ludF9lbmFibGUsDQorCS5pbnRfYWNrCT0gbHh0OTcxX2ludF9hY2ssDQorCS5pbnRfZGlz YWJsZQk9IGx4dDk3MV9pbnRfZGlzYWJsZSwNCit9Ow0KKw0KK3N0YXRpYyBzdHJ1Y3QgcGh5X2lu Zm8gcGh5X2luZm9fbHh0OTcxID0gew0KKwkuaWQgPSAweDAwMTM3OGUwLA0KKwkuc2hpZnQgPSA0 LA0KKwkubmFtZSA9ICJMWFQ5NzEiLA0KKwkub3BzID0gJnBoeV9vcHNfbHh0OTcxLA0KK307DQor DQorc3RhdGljIGludCBwaHlfbHh0OTd4X2luaXQodm9pZCkNCit7DQorCWludCBlcnI7DQorDQor CWVycj1waHlfcmVnaXN0ZXIoJnBoeV9pbmZvX2x4dDk3MCk7DQorCWlmIChlcnIpDQorCQlyZXR1 cm4gZXJyOw0KKw0KKwllcnI9cGh5X3JlZ2lzdGVyKCZwaHlfaW5mb19seHQ5NzBhKTsNCisJaWYg KGVycikgew0KKwkJcGh5X3VucmVnaXN0ZXIoJnBoeV9pbmZvX2x4dDk3MCk7DQorCQlyZXR1cm4g ZXJyOw0KKwl9DQorDQorCWVycj1waHlfcmVnaXN0ZXIoJnBoeV9pbmZvX2x4dDk3MSk7DQorCWlm IChlcnIpIHsNCisJCXBoeV91bnJlZ2lzdGVyKCZwaHlfaW5mb19seHQ5NzApOw0KKwkJcGh5X3Vu cmVnaXN0ZXIoJnBoeV9pbmZvX2x4dDk3MGEpOw0KKwl9DQorDQorCXJldHVybiBlcnI7DQorfQ0K Kw0KK3N0YXRpYyB2b2lkIHBoeV9seHQ5N3hfZXhpdCh2b2lkKQ0KK3sNCisJcGh5X3VucmVnaXN0 ZXIoJnBoeV9pbmZvX2x4dDk3MSk7DQorCXBoeV91bnJlZ2lzdGVyKCZwaHlfaW5mb19seHQ5NzBh KTsNCisJcGh5X3VucmVnaXN0ZXIoJnBoeV9pbmZvX2x4dDk3MCk7DQorfQ0KKw0KK21vZHVsZV9p bml0KHBoeV9seHQ5N3hfaW5pdCk7DQorbW9kdWxlX2V4aXQocGh5X2x4dDk3eF9leGl0KTsNCg0K LS0tIC9kZXYvbnVsbA0KKysrIGxpbnV4L2RyaXZlcnMvbmV0L3BoeV9tYXJ2ZWxsLmMNCkBAIC0w LDAgKzEsMTI1IEBADQorLyogDQorICogZHJpdmVycy9uZXQvcGh5X21hcnZlbGwuYw0KKyAqDQor ICogQXV0aG9yOiBKYXNvbiBNY011bGxhbg0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IFRp bWVzeXMgQ29ycC4NCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3Ug Y2FuIHJlZGlzdHJpYnV0ZSAgaXQgYW5kL29yIG1vZGlmeSBpdA0KKyAqIHVuZGVyICB0aGUgdGVy bXMgb2YgIHRoZSBHTlUgR2VuZXJhbCAgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5IHRo ZQ0KKyAqIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgIGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhl ICBMaWNlbnNlLCBvciAoYXQgeW91cg0KKyAqIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQor ICoNCisgKi8NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQorI2luY2x1ZGUgPGxpbnV4L2lu aXQuaD4NCisjaW5jbHVkZSA8bGludXgvbWlpX2J1cy5oPg0KKw0KKy8qIDg4RTEwMTEgUEhZIFN0 
YXR1cyBSZWdpc3RlciAqLw0KKyNkZWZpbmUgTUlJTV84OEUxMDExX1BIWV9TVEFUVVMgICAgICAg ICAweDExDQorI2RlZmluZSBNSUlNXzg4RTEwMTFfUEhZU1RBVF9TUEVFRCAgICAgIDB4YzAwMA0K KyNkZWZpbmUgTUlJTV84OEUxMDExX1BIWVNUQVRfR0JJVCAgICAgICAweDgwMDANCisjZGVmaW5l IE1JSU1fODhFMTAxMV9QSFlTVEFUXzEwMCAgICAgICAgMHg0MDAwDQorI2RlZmluZSBNSUlNXzg4 RTEwMTFfUEhZU1RBVF9EVVBMRVggICAgIDB4MjAwMA0KKyNkZWZpbmUgTUlJTV84OEUxMDExX1BI WVNUQVRfTElOSwkweDA0MDANCisNCisjZGVmaW5lIE1JSU1fODhFMTAxMV9JRVZFTlQJCTB4MTMN CisjZGVmaW5lIE1JSU1fODhFMTAxMV9JRVZFTlRfQ0xFQVIJMHgwMDAwDQorDQorI2RlZmluZSBN SUlNXzg4RTEwMTFfSU1BU0sJCTB4MTINCisjZGVmaW5lIE1JSU1fODhFMTAxMV9JTUFTS19JTklU CQkweDY0MDANCisjZGVmaW5lIE1JSU1fODhFMTAxMV9JTUFTS19DTEVBUgkweDAwMDANCisNCitz dGF0aWMgaW50IG1hcnZlbGxfaW50X2VuYWJsZShzdHJ1Y3QgcGh5X2luZm8gKnBoeSkNCit7DQor CS8qIENsZWFyIHRoZSBJRVZFTlQgcmVnaXN0ZXIgKi8NCisJcGh5X3JlYWQocGh5LCBNSUlNXzg4 RTEwMTFfSUVWRU5UKTsNCisNCisJLyogU2V0IHVwIHRoZSBtYXNrICovDQorCXBoeV93cml0ZShw aHksIE1JSU1fODhFMTAxMV9JTUFTSywgTUlJTV84OEUxMDExX0lNQVNLX0lOSVQpOw0KKw0KKwly ZXR1cm4gMDsNCit9DQorDQorc3RhdGljIGludCBtYXJ2ZWxsX2ludF9hY2soc3RydWN0IHBoeV9p bmZvICpwaHkpDQorew0KKwkvKiBDbGVhciB0aGUgaW50ZXJydXB0ICovDQorCXBoeV9yZWFkKHBo eSwgTUlJTV84OEUxMDExX0lFVkVOVCk7DQorDQorCXJldHVybiAwOw0KK30NCisNCitzdGF0aWMg aW50IG1hcnZlbGxfaW50X2Rpc2FibGUoc3RydWN0IHBoeV9pbmZvICpwaHkpDQorew0KKwkvKiBD bGVhciB0aGUgaW50ZXJydXB0ICovDQorCXBoeV9yZWFkKHBoeSwgTUlJTV84OEUxMDExX0lFVkVO VCk7DQorCS8qIERpc2FibGUgSW50ZXJydXB0cyAqLw0KKwlwaHlfd3JpdGUocGh5LCBNSUlNXzg4 RTEwMTFfSU1BU0ssIE1JSU1fODhFMTAxMV9JTUFTS19DTEVBUik7DQorDQorCXJldHVybiAwOw0K K30NCisNCitzdGF0aWMgaW50IG1hcnZlbGxfcG9sbChzdHJ1Y3QgcGh5X2luZm8gKnBoeSkNCit7 DQorCWludCBhdXRvbmVnID0gcGh5LT5zdGF0ZS5hdXRvbmVnOw0KKwlpbnQgZXJyOw0KKw0KKwll cnIgPSBwaHlfZ2VuX3BvbGwocGh5KTsNCisJaWYgKGVyciA8IDApDQorCQlyZXR1cm4gZXJyOw0K Kw0KKwlpZiAocGh5LT5zdGF0ZS5saW5rICYmIGF1dG9uZWcpIHsNCisJCXVpbnQxNl90IHZhbDsN CisJCXVuc2lnbmVkIGludCBzcGVlZDsNCisNCisJCXZhbCA9IHBoeV9yZWFkKHBoeSwgTUlJTV84 
OEUxMDExX1BIWV9TVEFUVVMpOw0KKw0KKwkJaWYgKHZhbCAmIE1JSU1fODhFMTAxMV9QSFlTVEFU X0RVUExFWCkNCisJCQlwaHktPnN0YXRlLmR1cGxleCA9IDE7DQorCQllbHNlDQorCQkJcGh5LT5z dGF0ZS5kdXBsZXggPSAwOw0KKw0KKwkJc3BlZWQgPSAodmFsICYgTUlJTV84OEUxMDExX1BIWVNU QVRfU1BFRUQpOw0KKw0KKwkJc3dpdGNoIChzcGVlZCkgew0KKwkJCWNhc2UgTUlJTV84OEUxMDEx X1BIWVNUQVRfR0JJVDoNCisJCQkJcGh5LT5zdGF0ZS5zcGVlZCA9IDEwMDA7DQorCQkJCWJyZWFr Ow0KKwkJCWNhc2UgTUlJTV84OEUxMDExX1BIWVNUQVRfMTAwOg0KKwkJCQlwaHktPnN0YXRlLnNw ZWVkID0gMTAwOw0KKwkJCQlicmVhazsNCisJCQlkZWZhdWx0Og0KKwkJCQlwaHktPnN0YXRlLnNw ZWVkID0gMTA7DQorCQkJCWJyZWFrOw0KKwkJfQ0KKwl9DQorDQorCXJldHVybiAwOw0KK30NCisN CitzdGF0aWMgc3RydWN0IHBoeV9vcHMgcGh5X29wc19tYXJ2ZWxsID0gew0KKwkucG9sbAkJPSBt YXJ2ZWxsX3BvbGwsDQorCS5pbnRfZW5hYmxlCT0gbWFydmVsbF9pbnRfZW5hYmxlLA0KKwkuaW50 X2Fjawk9IG1hcnZlbGxfaW50X2FjaywNCisJLmludF9kaXNhYmxlCT0gbWFydmVsbF9pbnRfZGlz YWJsZSwNCit9Ow0KKw0KK3N0YXRpYyBzdHJ1Y3QgcGh5X2luZm8gcGh5X2luZm9fTTg4RTEwMTFT ID0gew0KKwkuaWQgPSAweDAxNDEwYzYwLA0KKwkubmFtZSA9ICJNYXJ2ZWxsIDg4RTEwMTFTIiwN CisJLnNoaWZ0ID0gNCwNCisJLm9wcyA9ICZwaHlfb3BzX21hcnZlbGwsDQorfTsNCisNCitzdGF0 aWMgaW50IHBoeV9tYXJ2ZWxsX2luaXQodm9pZCkNCit7DQorCXJldHVybiBwaHlfcmVnaXN0ZXIo JnBoeV9pbmZvX004OEUxMDExUyk7DQorfQ0KKw0KK3N0YXRpYyB2b2lkIHBoeV9tYXJ2ZWxsX2V4 aXQodm9pZCkNCit7DQorCXBoeV91bnJlZ2lzdGVyKCZwaHlfaW5mb19NODhFMTAxMVMpOw0KK30N CisNCittb2R1bGVfaW5pdChwaHlfbWFydmVsbF9pbml0KTsNCittb2R1bGVfZXhpdChwaHlfbWFy dmVsbF9leGl0KTsNCg0KLS0tIC9kZXYvbnVsbA0KKysrIGxpbnV4L2luY2x1ZGUvbGludXgvbWlp X2J1cy5oDQpAQCAtMCwwICsxLDE5MSBAQA0KKy8qIA0KKyAqIGluY2x1ZGUvbGludXgvbWlpX2J1 cy5oDQorICoNCisgKiBBdXRob3I6IEphc29uIE1jTXVsbGFuDQorICoNCisgKiBDb3B5cmlnaHQg KGMpIDIwMDQgVGltZXN5cyBDb3JwLg0KKyAqDQorICogVGhpcyBwcm9ncmFtIGlzIGZyZWUgc29m dHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlICBpdCBhbmQvb3IgbW9kaWZ5IGl0DQorICogdW5k ZXIgIHRoZSB0ZXJtcyBvZiAgdGhlIEdOVSBHZW5lcmFsICBQdWJsaWMgTGljZW5zZSBhcyBwdWJs aXNoZWQgYnkgdGhlDQorICogRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uOyAgZWl0aGVyIHZlcnNp 
b24gMiBvZiB0aGUgIExpY2Vuc2UsIG9yIChhdCB5b3VyDQorICogb3B0aW9uKSBhbnkgbGF0ZXIg dmVyc2lvbi4NCisgKg0KKyAqLw0KKyNpZm5kZWYgX19NSUlfQlVTX0gNCisjZGVmaW5lIF9fTUlJ X0JVU19IDQorDQorI2lmZGVmIF9fS0VSTkVMX18NCisNCisjaW5jbHVkZSA8bGludXgvc2NoZWQu aD4NCisjaW5jbHVkZSA8bGludXgvbGlzdC5oPg0KKyNpbmNsdWRlIDxsaW51eC9taWkuaD4NCisj aW5jbHVkZSA8bGludXgvZXRodG9vbC5oPg0KKw0KKyNkZWZpbmUgTUlJX1RJTUVPVVQgCSgyKkha KQ0KKw0KKyNkZWZpbmUgbWlpbV9lbmQgKC0yKQ0KKyNkZWZpbmUgbWlpbV9yZWFkICgtMSkNCisN CisvKiBNYWNyb3MgZm9yICdwaHlfaWQncyB1c2VkIGVsc2V3aGVyZS4NCisgKiBBIFBIWSBJRCBp cyAzIGJpdHMgb2YgYnVzLCBmb2xsb3dlZCBieSA1IGJpdHMgb2YgaWQNCisgKi8NCisjZGVmaW5l IE1JSV9CVVMocGh5X2lkKQkJKCgocGh5X2lkKSA+PiA1KSAmIDB4NykNCisjZGVmaW5lIE1JSV9J RChwaHlfaWQpCQkoKHBoeV9pZCkgJiAweDFmKQ0KKyNkZWZpbmUgTUlJX1BIWV9JRChidXMsIGlk KQkoKCgoYnVzKSAmIDB4NykgPDwgNSkgfCAoKGlkKSAmIDB4MWYpKQ0KKw0KKy8qDQorICogQ3Vy cmVudCBQSFkgc3RhdGUNCisgKi8NCitzdHJ1Y3QgcGh5X3N0YXRlIHsNCisJdW5zaWduZWQgaW50 IGxpbms6MTsNCisJdW5zaWduZWQgaW50IGR1cGxleDoxOw0KKwl1bnNpZ25lZCBpbnQgYXV0b25l ZzoxOw0KKwl1bnNpZ25lZCBpbnQgbG9vcGJhY2s6MTsNCisJdW5zaWduZWQgaW50IHNwZWVkOjI4 Ow0KK307DQorDQorLyogUEhZIG9wZXJhdGlvbnMgLSBib3Jyb3dlZCBmcm9tIHN1bmdlbV9waHku aA0KKyAqDQorICogd29sX29wdGlvbnM6CVNlZSBXQUtFXyogaW4gaW5jbHVkZS9saW51eC9ldGh0 b29sLmgNCisgKiBhZHZlcnRpc2UgIDogU2VlIEFEVkVSVElTRURfKiBpbiBpbmNsdWRlL2xpbnV4 L2V0aHRvb2wuaA0KKyAqLw0KK3N0cnVjdCBwaHlfaW5mbzsNCisNCitzdHJ1Y3QgcGh5X29wcyB7 DQorCWludAkoKmluaXQpKHN0cnVjdCBwaHlfaW5mbyAqcGh5KTsNCisJaW50CSgqc3VzcGVuZCko c3RydWN0IHBoeV9pbmZvICpwaHksIHVpbnQzMl90IHdvbF9vcHRpb25zKTsNCisJaW50CSgqc2V0 X2F1dG9uZWcpKHN0cnVjdCBwaHlfaW5mbyAqcGh5LCB1aW50MzJfdCBhZHZlcnRpc2UpOw0KKwlp bnQJKCpzZXRfZm9yY2VkKShzdHJ1Y3QgcGh5X2luZm8gKnBoeSwgaW50IHNwZWVkLCBpbnQgZHVw bGV4KTsNCisNCisJLyogUG9sbGluZyAqLw0KKwlpbnQJKCpwb2xsKShzdHJ1Y3QgcGh5X2luZm8g KnBoeSk7DQorDQorCS8qIEludGVycnVwdC1iYXNlZCAqLw0KKwlpbnQJKCppbnRfZW5hYmxlKShz dHJ1Y3QgcGh5X2luZm8gKnBoeSk7DQorCWludAkoKmludF9hY2spKHN0cnVjdCBwaHlfaW5mbyAq 
cGh5KTsNCisJaW50CSgqaW50X2Rpc2FibGUpKHN0cnVjdCBwaHlfaW5mbyAqcGh5KTsNCit9Ow0K Kw0KKy8qIHN0cnVjdCBwaHlfaW5mbzogYSBzdHJ1Y3R1cmUgd2hpY2ggZGVmaW5lcyBhdHRyaWJ1 dGVzIGZvciBhIFBIWQ0KKyAqDQorICogaWQgd2lsbCBjb250YWluIGEgbnVtYmVyIHdoaWNoIHJl cHJlc2VudHMgdGhlIFBIWS4gIER1cmluZw0KKyAqIHN0YXJ0dXAsIHRoZSBkcml2ZXIgd2lsbCBw b2xsIHRoZSBQSFkgdG8gZmluZCBvdXQgd2hhdCBpdHMNCisgKiBVSUQtLWFzIGRlZmluZWQgYnkg cmVnaXN0ZXJzIDIgYW5kIDMtLWlzLiAgVGhlIDMyLWJpdCByZXN1bHQNCisgKiBnb3R0ZW4gZnJv bSB0aGUgUEhZIHdpbGwgYmUgc2hpZnRlZCByaWdodCBieSAic2hpZnQiIGJpdHMgdG8NCisgKiBk aXNjYXJkIGFueSBiaXRzIHdoaWNoIG1heSBjaGFuZ2UgYmFzZWQgb24gcmV2aXNpb24gbnVtYmVy cw0KKyAqIHVuaW1wb3J0YW50IHRvIGZ1bmN0aW9uYWxpdHkNCisgKg0KKyAqIFRoZSBzdHJ1Y3Qg cGh5X2NtZCBlbnRyaWVzIHJlcHJlc2VudCBwb2ludGVycyB0byBhbiBhcnJheXMgb2YNCisgKiBj b21tYW5kcyB3aGljaCB0ZWxsIHRoZSBkcml2ZXIgd2hhdCB0byBkbyB0byB0aGUgUEhZLg0KKyAq Lw0KK3N0cnVjdCBwaHlfaW5mbyB7DQorCXN0cnVjdCBsaXN0X2hlYWQgbGlzdDsNCisNCisJdWlu dDMyX3QgaWQ7DQorCWNoYXIgbmFtZVszMl07DQorCXVuc2lnbmVkIGludCBzaGlmdDsNCisNCisJ c3RydWN0IHBoeV9vcHMgKm9wczsNCisNCisJLyogUGVyLVBIWSBkcml2ZXIgZGF0YSBnb2VzIGhl cmUgKi8NCisJdm9pZCAqcHJpdjsNCisNCisJLyogWW91ciBwb2xsKCkgcm91dGluZSBzaG91bGQg bW9kaWZ5IHRoaXMuDQorCSAqLw0KKwlzdHJ1Y3QgcGh5X3N0YXRlIHN0YXRlOw0KKw0KKwkvKiBF dmVyeXRoaW5nIGZyb20gaGVyZSBvbiBkb3duIHdpbGwNCisJICogYmUgZmlsbGVkIGluIGR1cmlu ZyByZWdpc3RyYXRpb24NCisJICovDQorCWludCBwaHlfaWQ7DQorDQorCXN0cnVjdCB7DQorCQlp bnQgaXJxOw0KKwkJdW5zaWduZWQgbG9uZyBtc2VjczsNCisJCXZvaWQgKCpmdW5jKSAodm9pZCAq ZGF0YSk7DQorCQl2b2lkICpkYXRhOw0KKwkJc3RydWN0IHdvcmtfc3RydWN0IHRxOw0KKwkJc3Ry dWN0IHRpbWVyX2xpc3QgdGltZXI7DQorCX0gZGVsdGE7DQorDQorCXN0cnVjdCB7DQorCQlpbnQg YXV0b25lZzsJCS8qIDE9YXV0bywgMD1mb3JjZWQgKi8NCisJCXVpbnQzMl90IGFkdmVydGlzZTsJ LyogbWFzayB0byBhbGxvdyAqLw0KKwkJdW5zaWduZWQgbG9uZyB0aW1lb3V0OwkvKiBqaWZmaWUg c3RhbXAgKi8NCisJfSBuZWdvdGlhdGU7DQorDQorfTsNCisNCitzdHJ1Y3QgbWlpX2J1cyB7DQor CWNvbnN0IGNoYXIgKm5hbWU7DQorCXZvaWQgKnByaXY7DQorCWludCAoKnJlYWQpICh2b2lkICpw 
cml2LCBpbnQgcGh5X2lkLCBpbnQgbG9jYXRpb24pOw0KKwlpbnQgKCp3cml0ZSkgKHZvaWQgKnBy aXYsIGludCBwaHlfaWQsIGludCBsb2NhdGlvbiwgdWludDE2X3QgdmFsKTsNCisJdm9pZCAoKnJl c2V0KSAodm9pZCAqcHJpdik7DQorDQorCS8qIEF1dG8tZmlsbGVkIGluIHZhbHVlcyAqLw0KKwlz dHJ1Y3QgcGh5X2luZm8gKnBoeVszMl07DQorfTsNCisNCisvKiBNSUkgYnVzIHJlZ2lzdHJhdGlv bg0KKyAqLw0KK2V4dGVybiBpbnQgbWlpX2J1c19yZWdpc3RlcihzdHJ1Y3QgbWlpX2J1cyAqYnVz KTsNCitleHRlcm4gdm9pZCBtaWlfYnVzX3VucmVnaXN0ZXIoc3RydWN0IG1paV9idXMgKmJ1cyk7 DQorDQorLyogUmF3IHJlYWQvd3JpdGUgcm91dGluZXMNCisgKiBSZXR1cm5zIGEgMTYtYml0IHJl Z2lzdGVyIHZhbHVlLCBvciA8IDAgZXJyb3IgY29kZQ0KKyAqLw0KK2V4dGVybiBpbnQgbWlpX2J1 c19yZWFkKGludCBidXNfaWQsIGludCBwaHlfaWQsIGludCByZWcpOw0KK2V4dGVybiBpbnQgbWlp X2J1c193cml0ZShpbnQgYnVzX2lkLCBpbnQgcGh5X2lkLCBpbnQgcmVnLCB1aW50MTZfdCB2YWwp Ow0KKw0KKy8qIFJvdXRpbmVzIHVzZWQgYnkgbmV0d29yayBkZXZpY2VzIHRoYXQgdXNlIHRoZSBN SUkgYnVzDQorICovDQorZXh0ZXJuIGludCBtaWlfcGh5X2F0dGFjaChzdHJ1Y3QgbWlpX2lmX2lu Zm8gKm1paSwgc3RydWN0IG5ldF9kZXZpY2UgKmRldiwNCisJCQkgIGludCBwaHlfYnVzLCBpbnQg cGh5X2lkKTsNCitleHRlcm4gdm9pZCBtaWlfcGh5X2RldGFjaChzdHJ1Y3QgbWlpX2lmX2luZm8g Km1paSk7DQorDQorLyogUmVhZCBjdXJyZW50IHBoeSBzdGF0ZQ0KKyAqLw0KK2V4dGVybiBpbnQg bWlpX3BoeV9zdGF0ZShzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSwgc3RydWN0IHBoeV9zdGF0ZSAq c3RhdGUpOw0KKw0KKy8qIFJlc2V0IE1JSSwgcmVuZWdvdGlhdGUgbGluaw0KKyAqLw0KK2V4dGVy biBpbnQgbWlpX3BoeV9zZXRfYXV0b25lZyhzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSwgdWludDMy X3QgYWR2ZXJ0aXNlKTsNCitleHRlcm4gaW50IG1paV9waHlfc2V0X2ZvcmNlZChzdHJ1Y3QgbWlp X2lmX2luZm8gKm1paSwgaW50IHNwZWVkLCBpbnQgZHVwbGV4KTsNCitleHRlcm4gaW50IG1paV9w aHlfc3VzcGVuZChzdHJ1Y3QgbWlpX2lmX2luZm8gKm1paSwgdWludDMyX3Qgd29sX29wdGlvbnMp Ow0KKw0KKy8qIFVzZSBhbiBJUlEgdG8gZGV0ZXJtaW5lIHdoZW4gdGhlIFBIWSBjaGFuZ2VzDQor ICovDQorZXh0ZXJuIGludCBtaWlfcGh5X2lycV9lbmFibGUoc3RydWN0IG1paV9pZl9pbmZvICpt aWksIGludCBpcnEsDQorCQkJICAgICAgdm9pZCAoKmZ1bmMpICh2b2lkICopLCB2b2lkICpkYXRh KTsNCitleHRlcm4gdm9pZCBtaWlfcGh5X2lycV9kaXNhYmxlKHN0cnVjdCBtaWlfaWZfaW5mbyAq 
bWlpLCB2b2lkICpkYXRhKTsNCisNCisvKiBQb2xsIHRoZSBQSFkNCisgKi8NCitleHRlcm4gaW50 IG1paV9waHlfcG9sbF9lbmFibGUoc3RydWN0IG1paV9pZl9pbmZvICptaWksIHVuc2lnbmVkIGxv bmcgbXNlY3MsDQorCQkJICAgICAgIHZvaWQgKCpmdW5jKSAodm9pZCAqKSwgdm9pZCAqZGF0YSk7 DQorZXh0ZXJuIHZvaWQgbWlpX3BoeV9wb2xsX2Rpc2FibGUoc3RydWN0IG1paV9pZl9pbmZvICpt aWksIHZvaWQgKmRhdGEpOw0KKw0KKy8qDQorICogUEhZIGRldmljZSByZWdpc3RyYXRpb24NCisg Ki8NCitleHRlcm4gaW50IHBoeV9yZWdpc3RlcihzdHJ1Y3QgcGh5X2luZm8gKnBoeSk7DQorZXh0 ZXJuIHZvaWQgcGh5X3VucmVnaXN0ZXIoc3RydWN0IHBoeV9pbmZvICpwaHkpOw0KKw0KK3N0YXRp YyBpbmxpbmUgaW50IHBoeV9yZWFkKHN0cnVjdCBwaHlfaW5mbyAqcGh5LCBpbnQgcmVnbnVtKQ0K K3sNCisJcmV0dXJuIG1paV9idXNfcmVhZChNSUlfQlVTKHBoeS0+cGh5X2lkKSwgTUlJX0lEKHBo eS0+cGh5X2lkKSwgcmVnbnVtKTsNCit9DQorDQorc3RhdGljIGlubGluZSBpbnQgcGh5X3dyaXRl KHN0cnVjdCBwaHlfaW5mbyAqcGh5LCBpbnQgcmVnLCB1aW50MTZfdCB2YWwpDQorew0KKwlyZXR1 cm4gbWlpX2J1c193cml0ZShNSUlfQlVTKHBoeS0+cGh5X2lkKSwgTUlJX0lEKHBoeS0+cGh5X2lk KSwgcmVnLCB2YWwpOwkNCit9DQorDQorLyogR2VuZXJpYyAnc3RydWN0IHBoeV9vcHMnIGRldmlj ZSByb3V0aW5lcyAqLw0KK2V4dGVybiBpbnQgcGh5X2dlbl9zZXRfYXV0b25lZyhzdHJ1Y3QgcGh5 X2luZm8gKnBoeSwgdTMyIGFkdmVydGlzZSk7DQorZXh0ZXJuIGludCBwaHlfZ2VuX3BvbGwoc3Ry dWN0IHBoeV9pbmZvICpwaHkpOw0KKw0KKyNlbmRpZiAvKiBfX0tFUk5FTF9fICovDQorDQorI2Vu ZGlmIC8qIF9fTUlJX0JVU19IICovDQoNCm== --=-UCEP7LL7eTaCc7pqknBv-- From jketreno@linux.intel.com Thu Dec 2 12:11:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 12:11:19 -0800 (PST) Received: from orsfmr002.jf.intel.com (fmr17.intel.com [134.134.136.16]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2KBCrH007164 for ; Thu, 2 Dec 2004 12:11:13 -0800 Received: from petasus.jf.intel.com (petasus.jf.intel.com [10.7.209.6]) by orsfmr002.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id iB2KAkUu024758; Thu, 2 Dec 2004 20:10:46 GMT Received: from [127.0.0.1] (vpnfm001-139-dhcp-client.fm.intel.com [10.19.13.139]) by petasus.jf.intel.com (8.12.9-20030918-01/8.12.9/d: major-inner.mc,v 1.11 
2004/07/29 22:51:53 root Exp $) with ESMTP id iB2KF9h9028646; Thu, 2 Dec 2004 20:15:11 GMT
Message-ID: <41AF7708.3030804@linux.intel.com>
Date: Thu, 02 Dec 2004 14:11:52 -0600
From: James Ketrenos
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Jeff Garzik
CC: Netdev
Subject: Re: Steps for netdev-2.6 inclusion?
References: <41AE7143.80505@linux.intel.com> <41AEB3B8.2000406@pobox.com>
In-Reply-To: <41AEB3B8.2000406@pobox.com>
X-Enigmail-Version: 0.86.0.0
X-Enigmail-Supports: pgp-inline, pgp-mime
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.31 (www . roaringpenguin . com / mimedefang)
X-archive-position: 12395
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jketreno@linux.intel.com
Precedence: bulk
X-list: netdev

Jeff Garzik wrote:

> It's fairly easy, just email me and netdev the patch for inclusion,
> and it'll get reviewed.

Should I break the patch into three components (ipw2100, ipw2200, and ieee80211), or send it as one huge patch? Not sure what you and others would prefer.

> Once review issues are addressed, I'll merge it immediately, which
> causes it to be automatically propagated to Andrew Morton's -mm tree
> for testing.
>
> Once consensus agrees that we can push this + HostAP upstream, that's
> an easy 10-minute task.
>
> One potential showstopper is firmware crapola: I'm concerned about a
> situation where we have drivers in the kernel, but the firmware must
> be downloaded from SourceForge or somesuch.

The firmware doesn't have to be downloaded from SourceForge, but it does need to exist on the system, just as iwconfig needs to exist if you want to be able to configure your wireless card.
The user can get the firmware from SourceForge, have it installed by their distribution or package management system, have it on their Knoppix CD, etc.

Loading the driver without the firmware (or without hotplug enabled) won't take down the machine -- it will just print a kernel log message saying the firmware can't be found.

For some common questions and answers on the redistribution of the license, see http://intel.com/support/wireless/wlan/sb/cs-016675.htm

Regarding the topic of loading firmware from disk... the firmware_class subsystem was designed for this purpose. From linux/Documentation/firmware_class/README:

------------------
Why:
---
Today, the most extended way to use firmware in the Linux kernel is linking it statically in a header file. Which has political and technical issues:

1) Some firmware is not legal to redistribute.
2) The firmware occupies memory permanently, even though it often is just used once.
3) Some people, like the Debian crowd, don't consider some firmware free enough and remove entire drivers (e.g.: keyspan).
-------------------

Point one is partially applicable -- redistribution of the ipw firmware *is* legal, but due to the terms of the GPL, inclusion in the kernel is not possible (the firmware can't be licensed under the GPL, so static inclusion in the driver as a binary header is impossible). So, the firmware must be loaded onto the NIC from some other storage medium.

The second point from the firmware_class README is also valid, although the current state of suspend/resume necessitates a pre-allocated memory buffer in the driver. Once that issue is remedied, the host will be able to free a reasonable amount of memory that is otherwise held unnecessarily.

> IOW, the kernel driver as-is is useless without a differently-licensed
> firmware.

Wireless drivers (and many other kernel components) are arguably useless without user space utilities that must be downloaded, built, and installed.
So the issue of having to download something, or have some other pre-requisite met before the driver is useful/functional is not unique to this driver. James From shemminger@osdl.org Thu Dec 2 13:50:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 13:50:07 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2Lo00x012926 for ; Thu, 2 Dec 2004 13:50:00 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB2LnU925168; Thu, 2 Dec 2004 13:49:30 -0800 Date: Thu, 2 Dec 2004 13:49:30 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com Subject: Re: [PATCH] tcp: efficient port randomisation (revised) Message-Id: <20041202134930.132d7bd8@dxpl.pdx.osdl.net> In-Reply-To: <20041201204622.7b760400.davem@davemloft.net> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <200411020854.44745.michael.vittrup.larsen@ericsson.com> <20041104100104.570e67cd@dxpl.pdx.osdl.net> <200411051103.59032.michael.vittrup.larsen@ericsson.com> <20041117153025.160eaa04@zqx3.pdx.osdl.net> <20041130214643.7b72300e.davem@davemloft.net> <20041201152446.3a0d5ce3@dxpl.pdx.osdl.net> <20041201204622.7b760400.davem@davemloft.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12396 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Wed, 1 Dec 2004 20:46:22 -0800 "David S. Miller" wrote: > I'm more interested in simply things like "lat_connect" > from lmbench run over loopback. Oh, that was easy using OSDL PLM/STP which gives an easy way to run local tests. Baseline... 
[STP 299123] lmbench_long results
Kernel: patch-2.6.10-rc2 PLM # 3869

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
stp2-001  Linux 2.6.10- 8.270  38.6 24.3  61.6  48.5  45.9  76.6 74.6
stp2-001  Linux 2.6.10- 8.170  43.5 24.5  58.0  54.8  45.6  63.4 74.7
stp2-001  Linux 2.6.10- 2.740  50.6 29.9  40.3  48.3  59.8  75.1 101.
stp2-001  Linux 2.6.10- 8.140  46.6 29.7  57.6  48.8  45.5  72.0 74.4
stp2-001  Linux 2.6.10- 2.690  47.1 26.3  40.8  48.9  45.5  75.4 74.8
-----------------

Patched...

[STP 299118] lmbench_long results
Kernel: tcp-port-randomization-2 PLM # 3907

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
stp2-001  Linux 2.6.10- 2.770  46.3 25.0  64.4  49.5  44.3  75.2 75.6
stp2-001  Linux 2.6.10- 2.780  44.1 21.2  63.5  55.6  45.3  63.5 75.2
stp2-001  Linux 2.6.10- 2.790  47.5 24.8  40.4  48.5  45.5  63.7 76.9
stp2-001  Linux 2.6.10- 8.330  47.5 24.8  40.7  55.6  44.6  63.8 75.1
stp2-001  Linux 2.6.10- 8.150  47.9 25.7  41.2  49.6  45.2  72.7 75.1

These are run on a relatively slow machine (2 way Pentium III 850Mhz) and it looks like the results are no change (in the noise).
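
As a rough aid for reading the two result sets above, here is a minimal C sketch that summarises the "TCP conn" column of each run by mean and median. It is not part of the original thread. The array names and helper functions are made up for illustration, and the baseline entry printed as "101." is assumed to mean 101.0.

```c
#include <assert.h>
#include <stdlib.h>

/* "TCP conn" column (microseconds) copied from the tables above.
 * The baseline value printed as "101." is assumed to be 101.0. */
double tcp_conn_baseline[5] = { 74.6, 74.7, 101.0, 74.4, 74.8 };
double tcp_conn_patched[5]  = { 75.6, 75.2, 76.9, 75.1, 75.1 };

/* qsort comparator for doubles, ascending order. */
static int cmp_double(const void *a, const void *b)
{
	double x = *(const double *)a;
	double y = *(const double *)b;

	return (x > y) - (x < y);
}

/* Arithmetic mean of n samples. */
double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Median of n samples. Note: sorts the array in place. */
double median(double *v, size_t n)
{
	assert(n > 0);
	qsort(v, n, sizeof(v[0]), cmp_double);
	if (n % 2)
		return v[n / 2];
	return (v[n / 2 - 1] + v[n / 2]) / 2.0;
}
```

With these five samples per kernel, the single 101.0 sample pulls the baseline mean up to about 79.9 us against about 75.6 us for the patched kernel, while the medians are 74.7 us and 75.2 us. Which summary is appropriate for such a small and noisy sample is a judgment call.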
From davem@davemloft.net Thu Dec 2 13:57:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 13:57:08 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2Lv3ua013621 for ; Thu, 2 Dec 2004 13:57:03 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CZyse-0003MU-00; Thu, 02 Dec 2004 13:52:52 -0800 Date: Thu, 2 Dec 2004 13:52:52 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com Subject: Re: [PATCH] tcp: efficient port randomisation (revised) Message-Id: <20041202135252.04e64f51.davem@davemloft.net> In-Reply-To: <20041202134930.132d7bd8@dxpl.pdx.osdl.net> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <200411020854.44745.michael.vittrup.larsen@ericsson.com> <20041104100104.570e67cd@dxpl.pdx.osdl.net> <200411051103.59032.michael.vittrup.larsen@ericsson.com> <20041117153025.160eaa04@zqx3.pdx.osdl.net> <20041130214643.7b72300e.davem@davemloft.net> <20041201152446.3a0d5ce3@dxpl.pdx.osdl.net> <20041201204622.7b760400.davem@davemloft.net> <20041202134930.132d7bd8@dxpl.pdx.osdl.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12397 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Thu, 2 Dec 2004 13:49:30 -0800 Stephen Hemminger wrote: > These are run on a relatively slow machine (2 way Pentium III 850Mhz) > and it looks like the results are no change (in the noise). 
Or averaged out, about 1ms more expensive. From shemminger@osdl.org Thu Dec 2 14:52:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 14:52:15 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2Mq8NV016424 for ; Thu, 2 Dec 2004 14:52:09 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB2Mpd905001; Thu, 2 Dec 2004 14:51:39 -0800 Date: Thu, 2 Dec 2004 14:51:39 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com Subject: Re: [PATCH] tcp: efficient port randomisation (revised) Message-Id: <20041202145139.03a6977a@dxpl.pdx.osdl.net> In-Reply-To: <20041202135252.04e64f51.davem@davemloft.net> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <200411020854.44745.michael.vittrup.larsen@ericsson.com> <20041104100104.570e67cd@dxpl.pdx.osdl.net> <200411051103.59032.michael.vittrup.larsen@ericsson.com> <20041117153025.160eaa04@zqx3.pdx.osdl.net> <20041130214643.7b72300e.davem@davemloft.net> <20041201152446.3a0d5ce3@dxpl.pdx.osdl.net> <20041201204622.7b760400.davem@davemloft.net> <20041202134930.132d7bd8@dxpl.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12398 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Thu, 2 Dec 2004 13:52:52 -0800 "David S. Miller" wrote: > On Thu, 2 Dec 2004 13:49:30 -0800 > Stephen Hemminger wrote: > > > These are run on a relatively slow machine (2 way Pentium III 850Mhz) > > and it looks like the results are no change (in the noise). 
> > Or averaged out, about 1ms more expensive. I am writing my own test since, that one seems so noisy. From shemminger@osdl.org Thu Dec 2 15:01:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 15:01:53 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2N1nkN017676 for ; Thu, 2 Dec 2004 15:01:49 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB2N1L907266; Thu, 2 Dec 2004 15:01:21 -0800 Date: Thu, 2 Dec 2004 15:01:21 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com Subject: Re: [PATCH] tcp: efficient port randomisation (revised) Message-Id: <20041202150121.488ec205@dxpl.pdx.osdl.net> In-Reply-To: <20041202135252.04e64f51.davem@davemloft.net> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <200411020854.44745.michael.vittrup.larsen@ericsson.com> <20041104100104.570e67cd@dxpl.pdx.osdl.net> <200411051103.59032.michael.vittrup.larsen@ericsson.com> <20041117153025.160eaa04@zqx3.pdx.osdl.net> <20041130214643.7b72300e.davem@davemloft.net> <20041201152446.3a0d5ce3@dxpl.pdx.osdl.net> <20041201204622.7b760400.davem@davemloft.net> <20041202134930.132d7bd8@dxpl.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12399 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Thu, 2 Dec 2004 13:52:52 -0800 "David S. 
Miller" wrote: > On Thu, 2 Dec 2004 13:49:30 -0800 > Stephen Hemminger wrote: > > > These are run on a relatively slow machine (2 way Pentium III 850Mhz) > > and it looks like the results are no change (in the noise). > > Or averaged out, about 1ms more expensive. We could always special-case loopback, since there is no risk of man-in-the-middle attacks there. From buytenh@wantstofly.org Thu Dec 2 15:25:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 15:26:00 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB2NPsC0018551 for ; Thu, 2 Dec 2004 15:25:55 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id B9D7F2B0ED; Fri, 3 Dec 2004 00:25:31 +0100 (MET) Date: Fri, 3 Dec 2004 00:25:31 +0100 From: Lennert Buytenhek To: Robert Olsson Cc: sfeldma@pobox.com, jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit Message-ID: <20041202232531.GA30948@xi.wantstofly.org> References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <16813.58484.343629.570703@robur.slu.se> <1101919791.5198.15.camel@localhost.localdomain> <16815.23964.93437.411404@robur.slu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <16815.23964.93437.411404@robur.slu.se> User-Agent: Mutt/1.4.1i X-archive-position: 12400 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Thu, Dec 02, 2004 at
07:23:24PM +0100, Robert Olsson wrote: > Below is little patch to clean skb at xmit. It's old jungle trick Jamal > and I used w. tulip. Note we can now even decrease the size of TX ring. > > It can increase TX performance from 800 kpps to > 1125128pps 576Mb/sec (576065536bps) errors: 0 > 1124946pps 575Mb/sec (575972352bps) errors: 0 > > But suffers from scheduling problems as the previous patch. Often we just get > 582108pps 298Mb/sec (298039296bps) errors: 0 Robert, there is something weird with your setup with packets sizes under 160 bytes. Can you check if you also get wildly variable numbers on a baseline kernel perhaps? The numbers you sent me of packet size vs. pps were very jumpy as well, even at 10M pkts per run. --L From yoshfuji@wide.ad.jp Thu Dec 2 16:13:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 16:13:43 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB30DYIF022344 for ; Thu, 2 Dec 2004 16:13:34 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id EE95033CE5 for ; Fri, 3 Dec 2004 09:14:40 +0900 (JST) Resent-Date: Fri, 03 Dec 2004 09:14:39 +0900 (JST) Resent-Message-Id: <20041203.091439.108453871.yoshfuji@wide.ad.jp> Resent-To: netdev@oss.sgi.com Resent-From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Message-ID: <41AF57D7.10608@nefty.hu> Date: Thu, 02 Dec 2004 18:58:47 +0100 From: Zoltan NAGY User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-kernel@vger.kernel.org Subject: IPv6 bridging Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at nefty.hu Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org X-archive-position: 12401 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@wide.ad.jp Precedence: bulk X-list: netdev Hello! Is it possible to bridge IP tunnels (IPv6 in IPv4)? brctl gives me an error "Invalid argument", and from strace it seems the kernel is missing some ioctls... any ideas? I need it to be able to give my UMLs a public IPv6 address. Regards, Zoltan NAGY, Software Engineer - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ From ravinandan.arakali@s2io.com Thu Dec 2 16:38:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 16:38:48 -0800 (PST) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB30cUMA026624 for ; Thu, 2 Dec 2004 16:38:30 -0800 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id iB30c2je020611; Thu, 2 Dec 2004 19:38:02 -0500 (EST) Received: from rarakali ([10.16.16.58]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id iB30c039029315; Thu, 2 Dec 2004 19:38:00 -0500 (EST) Reply-To: From: "Ravinandan Arakali" To: "'Koushik'" , , , Cc: , , Subject: RE: [PATCH 2.6.9-rc2 1/1] S2io: fixes in free_shared_mem function Date: Thu, 2 Dec 2004 16:39:28 -0800 Message-ID: <003901c4d8d0$8cb11830$3a10100a@S2IOtech.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) In-Reply-To: <20041118213506.1081C32887@linux.site> X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-archive-position: 12402 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender:
ravinandan.arakali@s2io.com Precedence: bulk X-list: netdev

Hi KK,
Does this patch look okay? Does it address your earlier comments?

Thanks,
Ravi

-----Original Message-----
From: Koushik [mailto:raghavendra.koushik@s2io.com]
Sent: Thursday, November 18, 2004 1:35 PM
To: jgarzik@pobox.com; netdev@oss.sgi.com; kumarkr@us.ibm.com
Cc: rapuru.sriram@s2io.com; leonid.grossman@s2io.com; alicia.pena@s2io.com; ravinandan.arakali@s2io.com; raghavendra.koushik@s2io.com
Subject: [PATCH 2.6.9-rc2 1/1] S2io: fixes in free_shared_mem function

Hello All,

As per KK's review comment received on Nov 8 about the free_shared_mem function, I am sending the following patch. The change log includes:
1. Break from the main 'for loop' if ba[i] is NULL.
2. In the second level 'for loop', if ba[i][j] is NULL, instead of continuing as was done previously, we now free the ba[i] pointer and break from the main 'for loop'.
3. In the 'while loop' inside the second tier 'for loop', if any of the three pointers (ba or ba->ba_0_org or ba->ba_1_org) is found to be NULL, then ba[i], ba[i][j] and any non-NULL buffer pointer (ba_0_org or ba_1_org) are freed, and we break from the main 'for loop'.
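
The three change-log steps above can be illustrated with a small user-space sketch. This is not the driver code: `struct buf_addr`, `RXDS_PER_BLOCK`, `alloc_rings()` and `free_rings()` are hypothetical stand-ins for `buffAdd_t`, `MAX_RXDS_PER_BLOCK` and the driver's own allocation and free paths, and `malloc`/`free` stand in for the kernel allocators. The sketch assumes that allocation stops at the first failure, so everything after the first NULL pointer was never allocated.

```c
#include <assert.h>
#include <stdlib.h>

#define RXDS_PER_BLOCK 4	/* hypothetical stand-in for MAX_RXDS_PER_BLOCK */

/* Hypothetical user-space stand-in for the driver's buffAdd_t. */
struct buf_addr {
	void *ba_0_org;
	void *ba_1_org;
};

/* Allocate rings x blocks x RXDS_PER_BLOCK entries. Stops after
 * `fail_after` blocks to mimic an allocation failure part-way through.
 * calloc zero-fills, which reads back as NULL pointers on common platforms. */
struct buf_addr ***alloc_rings(int rings, int blocks, int fail_after)
{
	struct buf_addr ***ba;
	int i, j, k, done = 0;

	assert(rings > 0 && blocks > 0);
	ba = calloc(rings, sizeof(*ba));
	if (!ba)
		return NULL;
	for (i = 0; i < rings; i++) {
		ba[i] = calloc(blocks, sizeof(*ba[i]));
		if (!ba[i])
			return ba;
		for (j = 0; j < blocks; j++) {
			if (done == fail_after)
				return ba;	/* simulated failure */
			ba[i][j] = calloc(RXDS_PER_BLOCK, sizeof(*ba[i][j]));
			if (!ba[i][j])
				return ba;
			for (k = 0; k < RXDS_PER_BLOCK; k++) {
				ba[i][j][k].ba_0_org = malloc(16);
				ba[i][j][k].ba_1_org = malloc(16);
			}
			done++;
		}
	}
	return ba;
}

/* Free a possibly partial allocation; returns the number of blocks freed.
 * Inner buffers are released before the arrays that point to them. */
int free_rings(struct buf_addr ***ba, int rings, int blocks)
{
	int i, j, k, freed = 0;

	if (!ba)
		return 0;
	for (i = 0; i < rings; i++) {
		int stop = 0;

		if (!ba[i])
			break;			/* step 1: ring never allocated */
		for (j = 0; j < blocks && !stop; j++) {
			if (!ba[i][j]) {
				stop = 1;	/* step 2: block never allocated */
				break;
			}
			for (k = 0; k < RXDS_PER_BLOCK; k++) {
				struct buf_addr *e = &ba[i][j][k];
				int missing = !e->ba_0_org || !e->ba_1_org;

				free(e->ba_0_org);	/* free(NULL) is a no-op */
				free(e->ba_1_org);
				if (missing) {
					stop = 1;	/* step 3: entry incomplete */
					break;
				}
			}
			free(ba[i][j]);
			freed++;
		}
		free(ba[i]);
		if (stop)
			break;
	}
	free(ba);
	return freed;
}
```

For example, `free_rings(alloc_rings(2, 3, 4), 2, 3)` releases the four blocks that were allocated before the simulated failure and then stops, without reading any pointer that has already been freed.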
Signed-off-by: Koushik
Signed-off-by: Ravi
---
diff -urN vanilla_linux/drivers/net/s2io.c linux-2.6.8.1/drivers/net/s2io.c
--- vanilla_linux/drivers/net/s2io.c	2004-11-16 16:42:16.429560736 -0800
+++ linux-2.6.8.1/drivers/net/s2io.c	2004-11-18 10:07:47.553183896 -0800
@@ -560,21 +560,35 @@
 	for (i = 0; i < config->rx_ring_num; i++) {
 		blk_cnt =
 		    config->rx_cfg[i].num_rxd / (MAX_RXDS_PER_BLOCK + 1);
+		if (!nic->ba[i])
+			goto end_free;
 		for (j = 0; j < blk_cnt; j++) {
 			int k = 0;
-			if (!nic->ba[i][j])
-				continue;
+			if (!nic->ba[i][j]) {
+				kfree(nic->ba[i]);
+				goto end_free;
+			}
 			while (k != MAX_RXDS_PER_BLOCK) {
 				buffAdd_t *ba = &nic->ba[i][j][k];
+				if (!ba || !ba->ba_0_org || !ba->ba_1_org)
+				{
+					kfree(nic->ba[i]);
+					kfree(nic->ba[i][j]);
+					if(ba->ba_0_org)
+						kfree(ba->ba_0_org);
+					if(ba->ba_1_org)
+						kfree(ba->ba_1_org);
+					goto end_free;
+				}
 				kfree(ba->ba_0_org);
 				kfree(ba->ba_1_org);
 				k++;
 			}
 			kfree(nic->ba[i][j]);
 		}
-		if (nic->ba[i])
-			kfree(nic->ba[i]);
+		kfree(nic->ba[i]);
 	}
+end_free:
 #endif
 
 	if (mac_control->stats_mem) {

From karoles@terra.com.br Thu Dec 2 20:13:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 20:13:39 -0800 (PST) Received: from karol (200.175.156.150.adsl.gvt.net.br [200.175.156.150]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB34DVpO032264 for ; Thu, 2 Dec 2004 20:13:32 -0800 Message-ID: <1fc20e1c.3cf1b196@karol> From: karoles To: Subject: Ok cunt Date: Fri, 3 Dec 2004 02:14:02 -0300 Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="OEknMdrQymslrN" X-archive-position: 12403 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: karoles@terra.com.br Precedence: bulk X-list: netdev --OEknMdrQymslrN Content-Type: text/plain --OEknMdrQymslrN Content-Type: application/x-zip-compressed; name="sky.zip" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="sky.zip"
UEsDBAoAAAAAAAAAAACHcNZsANAAAADQAAChAAAAc2t5LnR4dCAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgIC5zY3JNWpAAAwAAAAQAAAD//wAAuAAAAAAAAABAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAADIAAAADh+6DgC0Cc0huAFMzSFUaGlzIHByb2dyYW0gY2Fu bm90IGJlIHJ1biBpbiBET1MgbW9kZS4NDQokAAAAAAAAAEMeecEHfxeSB38Xkgd/F5IHfxaSEX8X kmVgBJIAfxeSAVwckgV/F5LAeRGSBn8XklJpY2gHfxeSAAAAAAAAAAAAAAAAAAAAAFBFAABMAQQA iff+QAAAAAAAAAAA4AAPAQsBBgAABAAAAMgAAAAAAAAAEAAAABAAAAAgAAAAAEAAABAAAAACAAAE AAAAAAAAAAQAAAAAAAAAAAABAAAEAAAAAAAAAgAAAAAAEAAAEAAAAAAQAAAQAAAAAAAAEAAAAAAA AAAAAAAAZCAAAFAAAAAA8AAAoAMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAALnRleHQAAAAQAwAAABAAAAAEAAAABAAAAAAAAAAAAAAAAAAAIAAAYC5yZGF0 YQAAoAIAAAAgAAAABAAAAAgAAAAAAAAAAAAAAAAAAEAAAEAuZGF0YQAAAIi+AAAAMAAAAMAAAAAM AAAAAAAAAAAAAAAAAABAAADALnJzcmMAAACgAwAAAPAAAAAEAAAAzgexwBQAAVldoZDBAAGoBaAEAHwD/FUggQACFwA+FvwEAAGhkMEAA UFD/FUwgQACFwA+EqgEAAI1EJAhQaDQwQABoAAAAgP8VBCBAAIXAdRiLTCQIUf8VACBAAF8zwF6B xHAFAADCEACNlCRsAgAAaAQBAABS/xUgIEAAjYQkbAIAAGgwMEAAjYwkbAEAAFBR6FsBAACDxAyN lCRwAwAAaAQBAABSagD/FRwgQACNhCRoAQAAagCNjCR0AwAAUFH/FRggQACFwA+EFAEAAIsNhDBA ADPAhcl+E4qQiDBAAED20oiQhzBAADvBfO2NhCRsAgAAaCwwQACNTCRoUFHo7QAAAIPEDI1UJGRq AGoAagJqAGoAaAAAAEBS/xUUIEAAi/CD/v8PhLYAAACLDYQwQACNRCQMagBQUWiIMEAAVv8VECBA AFaL+P8VDCBAAIX/D4SLAAAAjVQkZFL/FSQgQACL8IX2dHpoGDBAAFb/FSwgQACFwHRqjYwkaAEA AFH/0Fb/FSggQACNVCRkjYQkdAQAAFJoADBAAFD/FVwgQAC5EQAAADPAjXwkLIPEDPOrjUwkEI1U JCBRUlBQaghQUFCNhCSUBAAAx0QkQEQAAABQagDHRCR0gAAAAP8VMCBAAF8zwF6BxHAFAADCEACQ kItEJAiB7FACAABQ/xVEIEAAhcB1B4HEUAIAAMNTVVZX/xVAIEAAi7QkZAIAAIs9XCBAAIusJGwC AACJRCQUjRxAM9K5GgAAAPfxg8JhUo2UJGABAABofDBAAFL/14PEDI1EJByNjCRcAQAAUFH/FTwg QACD+P+JRCQYdHSNVCRIUv8VOCBAAMZEBEQAx0QkEAAAAACAfQAudQFFjUQkSFVQi8Mz0rkaAAAA 
9/GDwmFSi5QkdAIAAFJocDBAAFb/14PEGFb/FTQgQACD+P90D4tEJBBAQ4P4GolEJBB8totEJBhQ /xVQIEAAg3wkEBp8DotEJBRAiUQkFOlD////Vv8VWCBAAF9eXbgBAAAAW4HEUAIAAMOQkJCQkJCQ kJCQkAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAFAiAABeIgAAAAAAAD4hAABMIQAAWCEAAGYhAAByIQAAhiEAAJwhAACq IQAAuCEAAMghAADYIQAA7CEAAPYhAAAGIgAAFCIAACoiAAA2IgAARCIAAAAAAABsIgAAeCIAAAAA AAAAAAAAAAAAAAAAAAAYIQAADCAAAAAAAAAAAAAAAAAAACUhAAAAIAAAAAAAAAAAAAAAAAAAMiEA AFggAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAABLRVJORUwzMi5ETEwAQURWQVBJMzIuZGxsAFVTRVIzMi5kbGwAAAAAQ2xvc2VI YW5kbGUAAABXcml0ZUZpbGUAAABDcmVhdGVGaWxlQQAAAENvcHlGaWxlQQAAAEdldE1vZHVsZUZp bGVOYW1lQQAAR2V0V2luZG93c0RpcmVjdG9yeUEAAExvYWRMaWJyYXJ5QQAARnJlZUxpYnJhcnkA AABHZXRQcm9jQWRkcmVzcwAAQ3JlYXRlUHJvY2Vzc0EAAEdldEZpbGVBdHRyaWJ1dGVzQQAAbHN0 cmxlbkEAAEZpbmRGaXJzdEZpbGVBAABHZXRUaWNrQ291bnQAAFNldEN1cnJlbnREaXJlY3RvcnlB AABPcGVuTXV0ZXhBAABDcmVhdGVNdXRleEEAAEZpbmRDbG9zZQAAAFJlZ0Nsb3NlS2V5AAAAUmVn T3BlbktleUEAAABDaGFyTG93ZXJBAAB3c3ByaW50ZkEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAABSVU5ETEwzMi5FWEUgJXMsX21haW5SRABEbGxSZWdpc3RlclNlcnZlcgAA 
AGRsbABleGUAQ0xTSURcezI3MTZBNjBFLTNCMzktMTFEOC04MUFCLTQ0NDU1MzU0MDAwMX0AAAAA IG11dGV4MSAAAAAAJXNcJWMlcy4lcwAAJWMqLmRsbAAAvgAAsqVv//z////7////AAD//0f///// ////v///////////////////////////////////////////////J/////HgRfH/S/Yy3kf+szLe q5eWjN+PjZCYjZ6S35yekZGQi9+dmt+NipHflpHfu7Cs35KQm5rR8vL12/////////+GqxeOwsp5 3cLKed3Cynndwsp43arKed2g1Wrdycp53cTpc93AynndxOly3eLKed096n3dw8p53a2WnJfCynnd ////////////////////////////////r7r//7P++/8GCQG///////////8f//He9P75//99//// F/v//////7tz////7////1///////+//7/////3///v/////////+///////////X/r///v///// ///9///////v///v/////+///+/////////v/////1H//8D9//8DWv//h/////////////////// //////////////////9v+v/D+P////////////////////////////////////////////////// /////////////////1///1/+///////////////////////////////////Ri5qHi////wV///// 7////33////7///////////////////f//+f0Y2bnoue///A7////1/////t////ef////////// ////////v///v9Gbnoue////Ozf7//8/////5f///2f//////////////////7///z/RjZqTkJz/ /0v0////b/rfN//fJ/7/zx8xv+L+hbS////PKl0 DnzB/4v6F+D///8Ai9v3Fzb+//96P6Z2+Yr6lf6nFPl8mfv/zD+hPfv/qXQOdPl6P4v1rxeA/f// fNn/pqE8qagAi9vvAIvb7wDOF5z8//90B3w783+A+/2K3JXnF1G9//96P6aL9HQ3F+n+//90DxT9 zAmodDEXE////xTele8XdL3//3o/pov0dDcXfv///3QPFP3MCah0MRc2////dDmgoT33/3TudL77 xL37g/vMPxTvAIvb+3Kv/nau+68XhAAAAD37/3T+dL/zPHQ+fJ/7/zj/X17/7zi/9/z///88qXQO F+v///8Ju9v3/ov4qRdxwf//pnQ5oT37/zj+X17/73S2+3o2i/iuFyP+//+mPKl0DhdMAAAAfJnz /3yZ9/84+Vte/+90OaE8qXQOF+v///8Ju9v3/ov4qRe8wf//pnQ5oT37/6l0DnS58zj5W17/73o/ i/ivF3H+//+mdDEXaAAAAKE8qXQOdLnzej+L+K8Xi/7//6Z0u9v3drnzoT37/6l0Dhe+AAAAfJnz /zj5V17/7zi59/3///90OaE8qXQOF+v///8Ju9v3/ov4qRcxwv//pnQ5oT37/6l0DnS58zj5V17/ 73o/i/ivF+b+//+mdDEX3QAAAKE8qZXjF9S+//96P6aL9HQ3F1nF//90DxT9zAmV/5f/P//vAIvb 73QxF0HF//96P4rpegmL8XQxF5nD//+pF5nC//+mzD+hPKkX/P///6ahPKp0E3wT76mV7wDqT1r/ 73KyD3QPdLr3le+ulf+vdvkXNPr//3w7636CDzBS7QGKqH6CCzoCi5CK536CB5kcLu6Kun6CA2Wx /z+Kw3yZ8/8U3X6CCzkCi5CK0n6CB5kcLu6K236CA2Wx/z+K5Di58/3///+pAMkXn/7//wgn5D+m CC+m3DkU/cw/oTY8qXSL2/d6CYvZdPF6NovfqBdOxP//dLn3dMJLWv/vej+L+68AKKapACimoMw/ 
oTx8NwChPKl0i9v3egmK+pX+p6E8dbn7qHs/8Hp7////dLn3dMJLWv/vej+L+68AKKZ0ue96P4v7 rwAopnS583o/i/uvACimdLnrej+L+68AKKZ0ued6P4v7rwAopnS543o/i/uvACimdLnfej+L+68A KKZ0udt6P4v7rwAopnS513o/i/uvACimdLm/ej+L+68AKKZ0ubt6P4v7rwAopnS5txTnw/2K33S5 93TCS1r/73o/i/uvACimdLnzej+L+68AKKapACimoMw/oTyqdBOudLr3fJoD/3o/qYvAdO96LYvG dIrzxI/7gs56CYPSdLfzejaL+nwG/YreAIrvdL/3rnKyA64Ay0+tF8n+//90ugN8O+t2z3S6AxT9 zD+hNjyqdBOurqlyugeV+6+XG////wCK9xcE/P//fDvvej+L+5X9FKlyugOV+6+XO////wCK9xci /P//fDvvej+L+5X+FMd0ugM+H/2vAOpPWv/vdIrzqQCKB3a593S6AwCK93a5+xfe////fDvvej+L +5X7FPd8gfv/i/qV/KcU93S6A3a5+8w/oTY8qnQTfBPbrHSi96modILzcrojleevqKwXj/z//3w7 73o/iovGug90iu+B8qkAihusFzQAAAB8O/OV/nw456fHuhJ2uvODtXK6C5Xzr6isF8b8//98O+96 P4rCdLn7ej+DyXSx93SqC7d8OPN2uft26358ggP/gfKpAIoHrBeAAAAAfDvz8EG6EgC688a684FJ zD+goaQ2PHw3ABQJqnQTfBPjdLrnrKmozCTMAMwJfB/+xqLrdqIDdroHityVswDqT1r/73QPlbOs qRedxf//dLrvd6H7fDvvds92ofcU3JXnAOpPWv/vleesr3a6AxfBxf//dLoDdLLvfDvvOb/7/Xb+ croblfOvAIrzAIr3F4b9//98O+96P/B6LP7//wCKFwDqT1r/7wCKF3a6868AivcXdf3//3w773o/ 8HpP/v//dLrzxAx2uu+L+3yxxwDHohF2oufweYb+//+V/nK6DwCK73aiC68XYsb//3S675X8v69y uguvF3LG///wSboPfDvnxqLr8Hoj////fAfxgIeLkXwH94C+i8i3i9h8F/yL5LeL77e38Hrs/v// crnrFhn///9yufMWIf///3K5x5X+FPhyucOV/HbnoBYz////crnvFj3///98F/WL5Le3i/C38Hoo ////crnjFlX///9yubcWXf///3K55xZl////crnfFm3///98B+SAzovVfBfti+G3i+m3i/F8F/nw emP///9yub8UjXK51xSScrnbFJdyudOV/RRucrm7FKPSf////4vnt4vyfBf8io10ugt2uccUlXS6 C3a5wxSddLILzD92scsUyre3i9a3i+F8F4KL8beKtnSyA3S6C3a+6xTBdLIDdLoLdr7vFMx0ugN8 P/MU+XS6A3w/98wAxDyL4ACKF3SyC6iv8Em6EXL7fq8AivMX2/7//3w763o/isnwSboRfLrv+wC6 58a65/BzeAEAAACK8wDqS1r/73yC6/2mi9bGogeL23S5x3wHAIr1fDcAFOaV/qcU63w596+pAIr3 F/P///98O/MU/cw/oKGkNjyqdBN8E+90uu+pdIrzqMwA3sF6P4uycrIPle+urwCK9xes////fDvv ej+KuPBAugdyu8f+rwDJAOpTWv/v8ECyB3b5/DiurwCK9xei////fDvrej+K4vBAugf8B3S6A3o/ ikx0yXoJi/t/28H/dDigoTY8fDcAFAiqdBN0sveV/wCK8xdEyf//ej+K+5X+FOoAiut0svcAiu8X Ssr//8S664L6lf2nojzMP6I8AIvb83Sz2/cAi9vzF2nK///MNsS72/PwYz50PjyqdBN0uuusej+p 
ir90uvN0svdy4/6sFyjI//90D6a5xIrngfp8NwAUrah0gu98wP+K9akA6k9a/++mdviprADIAOpH Wv/vfDvzoBTSfAf+ivuV+xTvfAf9ivuV9xT4fAf8iumV/nS683Sy9/w3rgCK7xdJyf//fDvzzD+h pKI8qnQTfhOn7f//rKl0DqjMJHJ58/3//3ehu693oft3YTv///93YXv///93Yff+//92Ydf9//92 4XZh0/3//3Zh2/3//wDqH1//75X+l//f/f8A6j9a/++mdnn7/v//pnQxF0Dx//90MRex8f//X/s/ /+/FPIviRvs//+/wQT++OX8vKf3v/jl/Lyb/7/51/sU8ihdftz//7wHy0CX/7/BBN8U8OX4vKf3v /YvoRrc//+91vv6+8EEvxTw5fS8p/e/9ihFfqz//7/BBN8U8OX4vJv/v/YvoRqs//+91vv6+8EEv xTw5fS8m/+/9ihFyejMBAACX+/7//68A6idf/+9yejMFAABABz3/76+oF2G7//+mxDymi+ZyejMF AAB0Ma8X0vD//3QxF2Dz///EPIq2cnozAQAAlws9/++vcnozBQAArxc9wf//cnozBQAAr6gX5bv/ /3w76xdtu///cnozBQAAdDGvFxjx//90MReL7///dDEXQ/T//3K6C0ATPf/vr6h2ogsXu7v//8ai C6amiul0MRd+6v//lf6oFxG8//+mphe4u///croPdqIPr5cfPf/vF+m7///Gog+mpvB6W////6yX Kz3/7xd0v///pno/pvB6cP///3aiAwCKA3K6L5c3Pf/vrwDqQ1r/73J6MwkAAK9yui+vF068//98 O+t6P4u2cnpXEgAAdDGvcnozCQAArxfw6f//ej+L2HaiB3JCVxIAAMfgi/QAigd0MagXL+n//wC6 B344+/7//3yCB/aDHQC6A3yCA/uDcZX+lx89/+8Xyrz//6amF3G8//90MRcy9f//dDEXBp///3Qx F3bv//90MRcZ5///dDmgoaQ2PKyqdOIbX//vdNIXX//vqXQOqHR51/3//3o/i+2V/68ALJfv2P// AEnX/f//ACp0edP9//9yQdP9//96P4vxlf+vACyX79j//wDIACoASfv+//8A6kta/++moKGipDyq dBOurKmodAZyePP9//+vdroDAOojX//vdKL3dPzE+IzNdLLvdPx0qvN0D5YJd/////xI+/7//3pp f////4vyfAYAi97EcXv///+L5r92/MT4jS4AigMA6odf/+/MP6ChpDY97/90guuV3aYAigMMWgDq h1//7wD8lf6nFB6sqah0BnJg8/3//6wA6iNf/+90+JX7pswtej+J3XRI+/7//3xBe/////+K98Zx f////4vdvX45d////8QvjRsuBno2gC3MCawA6odf/+90OaChpD33/5d/////qQCL2+cXFs3//3w7 83R5f////3Sz2+t2/pX+oRQxrKl0DqjMJMwAxuGJ23R5+/7//wCL2+/8PK8A6jta/++mej+mi+y4 fjx3////xMGNI8w/oKGkPfv/dDiWP3f////8efv+//8UFax0JnJ88/3//68A6iNf/+90u9v3xPyM 5JY/d////6l0TPv+//+odIPb65Xf/A+mDFqgoXJ88/3//68A6odf/+90u9v3xPyk5D8IJz33/6l0 DqhyQfP9//+oAOojX//vAIvb83QxF7IAAAB6P4vudDE4f3v////+////F2v3//+oAOqHX//voKE9 +/+pdA6ockHz/f//qADqI1//7wCL2/N0MRfvAAAAej+LzHTpdDWWNnf////8cfv+///UN34Wd/// /7Wucnd3////rq926RdHzv//fDvzdDEXyvf//6gA6odf/++goT37/6p0E3wTv6x0ou+pqHwE/nQG 
i/F8BP2L9nwE+/B64f7//wCK9xfAzf//fAfApvB48/7//wCK8xfSzf//fAfApvB4Bf///wCK93K6 P68XJM7//3K6P5W/rwDqN1r/73QPfDvvegnweyr///9/2f+5cro/qa90MBfV/v//ej/we0L///9y uj+pr3QwFxv+//96P/B6Vv///wCK93TKO1r/73K4+68AKaZ6P6bwe2////8AivdyeHv///+vACmm ej+mi4FyePP9//+vdrrvAOojX//vfsD/+///jfvMCRSqAIr3dDAXGQIAAHo/i/L6f////8TngRp2 5xQedMgAiveWCXf////8SPv+//+pF+LO//8AivNyub+vF+7O//90uut8O+92YX////92eXv///8A +JX+oQCK7wDqh1//73Q5FP3MP6ChpDY97/+qdBN8E7+pqACK93QOF/HO//8AivN0Bxf7zv///Aem fADBpon7zD8U0gCK83K6PwCK95f7PP/vrwDqr17/73w773K6P3Qxlf8Aiu+XLyj9768XnQEAAKCh Nj3z/6yqqXSL2++oqRdJz///AIvb43QHF1TP//90J6amcvvEfAfB8HBj////egDwe2v///98BPzw c3T///98m9vr//BB+a8A6i9a/+96P6aLiPBBu8gA/AGvAOozWv/vej+mi5uV/qLECIzn8EH5rwDq M1r/73o/por7ehKLtXQXuRQbALvb63SL2+d8g9vr/XQEg1F0OXKL/AB0AXKhAcQMjevwQfivAOov Wv/vej+mi/qwxASME9QIfAH9g/V/wNGK+pX+pxT9zD+goaKkPff/qnQTfhN7////rKl0ykda/++o lcByukMAivM4ugP+////rwApdMIjWv/vcrpDrwAocrpDldGvAOonWv/vfDvnej/we3////+X8zz/ 768X/M///6Z6P6aLkHzCPz//7/9EPz//74vmAMxyukOvAOorWv/vpno/poqwfDz7xvyKGJW/cnqD AAAAAIr3rwApcnqDAAAArwAozAB8O+/Gwps//+9Bmz//74vjAMlyeoMAAACvAOorWv/vpno/por1 fDn7xsGKG3aCA3S6A6ChpDY99/+pdMpHWv/vqHSD2++VvwCL2++oACmVv6gA6jda/+/MNnw768Q+ i+t397+Vv68Ai9vjACl8O/OV/qcU9XS72+t393fwzD+goT3z/3Sz2/euf4a//3K+v4vqr5fjPP/v AIvb7wDqQ1r/73w77xTtl+s8/+8Ai9vzAOpDWv/vfDvzPff/qnQTfBOzf5oL/6ypqHSC93ayB8SC 83KKC42pfJoD/3Si83K6S8QPibzwSeB8BNqK5HyCA/2DxXSyB3K4/q8XJ/z//3QnfAQAi9i5uX9E Lyn97/6K8bGwALoDxILzd+GMPhT2f0QvKf3v/Yr6xoL3ivvMPxTtqQCK7xcW0v//dLrrpqZ253Q4 oKGkNj3v/6p0E3wTr3S696ypdIrzqMQ5drILcoJPdroDiJN0ovdyt/52sgfUD3KyD8QGjKp0sgPw SeZ8BNqK23wB/YO1AIoHdLILF7v8//90J3wEAIvHfLoD/XS697GxfLoH/X9ELyb/7/6K63fguAC6 A7F0sgMAugfEsvOJUBT2f0QvJv/v/Yr6xroDivvMPxTmf9j/crpPrwCK7xfB0v//dLrrpqZ253S6 A6ChpDY97/+qdBN+E+/+//90uvN8mgf/rKh0gvd2sgNyo/gAxAR2ovPweG////+pcoj+1CB1uQDD v4r6dor3FOXD2oqUfAT8gZl/wcuKnn+B/s+KpHK5/Xa693K6C3SyA69yeg8BAACvcrkBqK8XpAEA AHo/i8Zyug90sgOvcnqPAAAArwCK8wCK9xczAQAAej+L4wCK63SyA3J6jwAAAK9yeg8BAACvF1sE 
AAD+uge0uXK5AMS68/B5iAAAAKF0ugegpDY97/+qdBOurKmozACoqJX8qJX+l////38Aivd2sgPM CQDq31//73QnfAQAi7h0yuNf/++orAApRv8D/v/EPoz5qKwAKRT9dD5ysveorkEvJf/vr6msAOrT X//vAIrvdLIDqACK96kXJwEAAKx0DwDqg1//73Q5oKGkNj3z/6l0DgDq21//73Sz2/OuAIvb83b+ dDEX+////6E99/+qdBN+E7P9//8Aivd2sgMA6sNf/+96P4vlcnpHAQAAr5fPPP/vAOrHX//vfAcA droHivjMPxa1/v//rKl0ivOozCR/QhsBAADR8Hvw/v//cnobAQAAr3J6SwIAAACK968XQtH//3w7 8wl6RwEAAO+L6nSyA3J6SwIAAKmvF4AAAAAWUv///3ai83S68wDLevc+/+9yw3r3Pv/vF47U//90 J3J6GwEAAK8XnNT//9Q8zCSmxDymgb4AyHJ7+hsBAACvAOo7Wv/vpno/porUALnzxqLziu50sgNy eksCAACvF9n7//8U7XSyA5X+cnpLAgAArK8XkAEAAAC683yC8/uNd4rPlfZyehsBAACX2zz/768A 6h9a/+98O/N6P4rqdLIDALnzlf5yeksCAACsrxfLAQAAxqH7i9l0wttf/+8AKHSx+/zxxD6J6nSy Axd/////AIn3AOrLX//vACh2+XJ6RwEAAK8AigcA6s9f/+96P/B6MwEAAACKBwDq11//76ChzD+k Nj33/6p0E6l0iveodMIbWv/v8EH5rwAoej+mi9jwQbn+rwAoej+mi+R1+X+a9f93uvd1uf53uvZy uvevFwnO//+mFPx8NwCgoaI9+/+qdBN8E9t8mgP/qXQOqHJ58/3//692ugcA6iNf/+90+Xo/i51/ Qff+////ckH3/v//i6yWP3f///+scrIjdCcXy9n//6yocrIjF7zZ//96P4vUzD96JInpdHH7/v// dKoPdfP+CS538/2/xDyNFXKyIxcW2P//OLoD/v///3KyIxcl2P//pACKBwDqh1//73S6A6ChNjyq dBN+E0v///+sqHQGcrIrFzLa//9yePP9///MJK92ug8A6iNf/+/HYPf+//9yePf+//924PB7Yf// /6yX/z//769ysisXN9r//3o/8Ht4////dLojqcQ8i4lBd////8wtdDEIDnotiph0uiN8mgf/CA50 ohd6P3a6C4mudqIDcnqzAAAAzDbWugN0qgNye/KzAAAAvnXr/cQxCS13740VAIovcrpzdDAAijOv cnqzAAAArxd2CQAAfsD/+///jPIAugf8IXS6B8S6C41Qlf6kcrIrFw7Z//+hAIoPAOqHX//vcrIr FyDZ//90PKCkNjyqdBN8E7NyugOpr5evPP/vdA6X/v//fwDq+1//73o/irlyugc4ugfA////r3K6 S69yuguvlf+Xyzz/7wCKAwDq/1//73o/iulyukuVwH45e////6+pAOpHWv/vfDvzAIoDAOr3X//v oTY8qnQTfhPr+f//croLdrITr5dDPP/vl/7//38A6vtf/+96P/B6/P7//6ypqJX+pHK6A69yehMC AACvdMr/X//vcroHQP/9//+vlf+XWzz/73aiBwCKC3aCAwApej/wekL///9yehMCAACvcnoTBgAA l2c8/++vAOqvXv/vfDvzcroPr3J6EwYAAK8AigsA6vtf/+96P/B6ev///3K6A3aiB69yehMCAACv croHr5X/l3s8/+92ggMAig8AKXo/iqhyehMCAACVwK90uhN8P/uvAOpHWv/vfDvzcroDdqIHdoID r3J6EwIAAK9yugevlf+Xjzz/7wCKDwApej+K5nJ6EwIAAJXAr3S6E3w/u68A6kda/+98O/MAig8A 
6vdf/+8AigsA6vdf/++goaQ2PJf7/v//fj73/v//AIvb964A6kda/+98O/M9+/+qdBN+Ezv///+s qah2csMAAABycr8AAAAX3dz//5X/l/8//+8Aivdycr8AAAAXwtz//3o/iu9ycr8AAAAXGtv//xbU /v//n5dE1f/vmwDK/////5t22v////90ercAAAB2ep8AAAB0eqsAAAB2ugN0ugN0v592uhN0ugN0 v5t2ugt8mhf/FPh0uhe/droXdLoXxLoL8Hxa////dLoXlD+7dLIT/Dd2sgeXf////5X/cnqXAAAA rxdS2v//fDvzfFqbAAAA/xTqdHqbAAAAv3Z6mwAAAHS6B7+/droHdLoHxHqfAAAAjNB8QpsAAAC7 jNl0ugP8ugd0cpsAAAB1/3d78pcAAAB0ugP8ugfwSf96P4r9FP0UTnxCmwAAAP+J5JX/lf2XLyj9 73J6lwAAAK90csMAAAAXmwwAABa3AAAAm3D6/////3w7+54U8HSb2/ebcPr/////fDv7nnJyvwAA ABc/3P//cnK/AAAAF0rc//+goaQ2Pfv/qnQTfhP3/v//rHK6A6ivzCSX5v/9/6yXFzz/73QGl/7/ /38A6utf/+96P4qtl//+//9yegcBAACsrxdO2///fDvzcroHOLoH//7//69yegcBAACvrKysAIoD AOr/X//vAIoDAOr3X//vx2IHAQAAi/FyegcBAAB0MK8X9gEAAKCkNjyqdBN+E//+//+pdIrzegmo ivvMPxSFfMH/dIL3i9hyessAAACXN////6+oAOq3Xv/vcnrLAAAArwDJF9Ta//+mej+mird8gfv/ i9dyev8AAACXAP///6+oAOqzXv/vcnr/AAAArwCJ+xcC2///pno/porldHH3/f//cnn3/f//fgZ/ ////gvl2g3H3AP+V/qegoTY99/+ucrvb/6mvzD90Dq+pl9zT/++vrwDqv1//73Z51/3//6GmPKp0 E34Ts/v//6ypcrofqMwkr5dTO//vdqIfF3HN///Goh+mpov4zD8WJv3//5X6QWs7/++mcoI/DFpy eksEAACXf////69yuj+VAK+V/qyZWgDqs1//73QHcroLr0H+//9/l688/++pAOr7X//vej+KS3K7 wP2vcnpLBAAAr5X8rJd7O//vAIoLAOrvX//vAIoLdML3X//vAChyuguvl687/++pAOr7X//vej/w eosAAAB0yvNf/+9yeksDAACXAP///6+sdqIDAIoLACl6P4qtcrobr3J6SwMAAK8AigsA6vtf/+96 P4rccroHlfuvlfusl8M7/+8Aihs4ugf+////AOrvX//vAIobACgAugNyeksDAACXAP///68AigMA igsUVwCKCwAocro/rwDqt1//73TC817/73TK21//75fTO//vl+M7/+8A6vde/+/EPHa6F/B7of7/ /68A6u9e/+/EPHa6B/B7s/7//5X+rwDq617/76yvl+n+//92ug8AihcA6qte/++V/gCKD5fo/v// AIoXAOqrXv/vrJegYv//AIoHAOqjXv/vrHa6BwCKD5fa/v//AIoXACjGogfwegv///+sl6Bi//+X 7v7//wCKFwAoACn6d+z//3a6BwApxLoH8Hww////cnpLAgAAOHpLAgAA6zv/76+XldT/73ZiRwIA AHaiQwDqx17/78aiQ3aiD4E3cnpDAgAAdroDdLoDAM8A6sNe/+/EuheKjnS6A5d/+///dP+vdroT AOq/Xv/vlf52uicAihMA6r9e/+/Goid2uiOLuMQ8i7ysAIoTAOq7Xv/vrKyXCv///wCKIwAoACn6 79j//3a6EwApxLoTjOOXF/z//wDqy1//73K6P68A6rtf/+98BwCK1RQiALoPfLoD+3S6D8S6Q/Bz 
kwAAABbZAAAAl58V//8A6stf/+8WiAEAAHSy95X7cro/rK8XLAsAAHSy9xfKCAAAdLL3F5+y//9y uj+vAOq3X//vlf6XUzv/7xeO0P//pqYXNdD//5X+p3Sy96ChdmbX/f//pDY9+/+qdBN+E5v9//+s qACK93QmAOrDX//vej+L5nJ6XwEAAK+Xzzz/7wDqx1//73QHfAAAivjMPxZd////qXSK83J6MwEA AK9yemMCAAAAivevFwXc//98O/MJel8BAADvi+V/QjMBAADRi6xyemMCAACpr3Q0F3cAAAAUvXJ6 MwEAAJdHO//vrwDqO1r/76Z6P6aK1XT5r5c3Pf/vcrf+crofr3bxAOpDWv/vcnpjAgAAr3K6H68X gdH//3w763J6XwEAAK+oAOrPX//vej/wepMAAACoAOrXX//vzD+hoKQ2Pff/qnQTfhPL/P//rKly ejMDAADMCZf7/v//dCavdooDAOonX//vxspPePrv8Hp3////cnozAwAAlxc7/++vcnorAQAArxfb 3P//fDvzcroDdDSvcnorAQAArxc+AQAAcroHr5XlqQDq/17/73o/8Hpm////cnovAgAArwCKBwDq A1//73o/8Ht+////cnorAQAAr3J6LwIAAK8A6jta/++mej+mi5hyugN0NK9yei8CAACvF5cBAAAU rHTKQ1r/76iVvKCocronlx87/++vACl8O/NyuievAOqvX//vfAf8ituocnorAQAAlzs7/++vACl8 O/NyugN0NK9yeisBAACvF+UBAAC4fAClgUegoaQ2PKp0E34T8/7//6ypqHKyB8wkzAAXQSEAAJfb 9v//rACK8xd44f//l/z+//9yegsBAAAAivevAOpHWv/vcnoLAQAAlaOvAOonWv/vfDvfxDyL+3fn FPl3YgsBAAAAivdysgcXeiEAAHo/ip1ysgcXvSAAAHwH/YqqrHKyBxfsIAAAdA/EDIu5dLHzxDSL znS+73wH+4PWfAf3gNt0tvPENIvilj/7/v///LrzrnJyCwEAAK6vF1re//98O/OV/qDEDItKdPmV /nQxAO8UVHKyBxfzIQAAdDigoaQ2Pff/qnQTfBOvdrIHcrIPFxciAAB0uvM4ugP8////fAf2jPXw SX/nPv/vdroDAIr3crIPFyYiAAB6P/B6/v7//3KyDxdtIQAAej/weg7///+vcrIPF58hAAB6P3a6 9/B7Iv///6ypqHTCN1r/7zi68/7///90ugN6uvPwe3D///90uvd8gvP+dL/zivZ6P4uAdKffFPh6 P4uJdKfXeiSLkJXDrAAodA+megmmi6+VwakAKHQnpnokpouq1Dm3ej+BsXwHwIP8lcCnr7lyuk+p rwDqR1r/73SyB3w783K6T5X/lf6XLyj9768X5BQAAJXDrAAodA+megmmi+sUT3SyB5X/lf6XLyj9 76wXBRUAAAC683yC8/3wcacAAAB0svd6Nov5dP6V/gDvlf9ysg8XeSIAAHo/drr38HrQAAAAoKGk crIPF0IjAAA2Pff/qnQTfhO//v//rHSi96molQAATNP9//8A6qtf/+9yTNv9///MAMbBivKX/+// /wDqy1//7xQQAOrbX//vxsJPePrvdroTOLoPZ8X//zi6C1+X+f92ggc4uhvf////OLoX7////4q0 coobOLr3/f///3K6A68AyagA6v9e/+96P4rZcno/AQAArwCKAwDqA1//73o/i+1yuhN0NK9yej8B AACvF7EPAAB8OfsAsveKPhSsdMpDWv/vlbygqHK6O5cfO//vrwApfDvzcro7rwDqr1//73wH/Irb qHJ6PwEAAJc7O//vrwApfDvzcroTdDSvcno/AQAArxcGEAAAuHwApYFHzAB2RNP9//+gocw/pDY9 
+/+ucrvb/6mvzD90Dq+pl67M/++vrwDqv1//73Z50/3//6GmPKp0E34T9/r//6ypdCaodMIXWv/v dMwAKHyaA//MLQgJegmJvRT8dKoHcr3+zC0ICXJ6BwEAAHQ0r612qgcXpRcAAHo/i+lyegcBAACv AIr3AOo7Wv/vpno/poqRALoDdMzGigONP3SK93S6A8T8it6V93K6R5X6rxdI3f//crpHrwDqI1r/ 73w77wAodA98GfwAy0rbPv/vcrpHr3J6BwUAAJf7PP/vrwDqr17/73J6BwUAAJXArwCK8wDqR1r/ 73w746ChpDY99/9yeocAAAB0NK9yukevcnoHAQAArxcZFAAAACh0D3J6hwAAAHwZ/K8Ay0rbPv/v AOo7Wv/vpno/posgFp8AAACqdBN+E/v2//+sqah0gvd0yq9e/+9yegMJAAAAyDi6AwEAAACXwzr/ 768AKXJ6AwkAAJd37P//rxfU0///dCd8O+t6JPB7Vf7//3K696+sFyz+//+mej+m8Htp/v//foL3 I/////B6dv7//3yyAwByegMFAACX//7//68XaaX//3J6AwUAAK9yegMEAACXzzr/768AKa9yegME AACvrBef/v//fDvnej/wcbn+//9yuvevrBeQ/v//pno/pvB7zf7//36C9wX////wetr+//8AiPty egMEAACX4zr/768AKa9yegMEAACvrBfq/v//fDvnej/wcQT///9yuvevrBfb/v//pno/pvB7GP// /36C9wX////weiX///8AiPdyegMEAACX8zr/768AKa9yegMEAACvrBc1////fDvnej/wcU////9y uvevrBcm////pno/pvB7Y////36C9wX////wenD///9yegMEAACX+zr/768AKa9yegMEAACvrBd9 ////fDvrej+Bk3K696+sF2r///+mej+mi6N+gved/v//iqwAiO8AiPOsF6n///98O/N6P4G/cnoD BAAAlwM7/++vACmvcnoDBAAAr6wXzP///3w763o/geJyuvevrBe5////pno/povyfoL3Bf///4r7 fJoD/6wXg9P//3S6A6agoaQ2PKp0EwCK7wCK8wCK9xcA0///fDvzej+K9ZWbAOrLX//vFB6iPKp0 E34T4/b//6ypQf/3//+ozCSpcnobCQAArK92ogcXA+j//3w783SC93KyEzi6E/X///+V/naiD6d2 QhMBAACurHJyFwEAAKyur3Z6FwEAABdUp///fAcAi6HEPIulcroLr5eAmfu/qBdyp///fAcAi7l0 ugvEPIvAxDl2ugON/HaKA8wJxqIDgdFyQhsJAACslf6oAIr3F6en///EPIvnfAcAi+x1+MP1i+vw QT+5uMSKA3a6B4MnzD+goaQ2PHyCB/KK/rFyehsJAAB3Y8obCQAArxcb6P//fAf8po0mcnobCQAA rwDqE1r/76Z0svOV/nb+pxQ8qnQTfBPzqcwJqZX8crILF0y4//9yugNysguvcrrzrwCK9xfZt/// ej+LknSy86h0PpWzZqAIAHK7vvyvAOpPWv/vdAd0uvPEOaZ2gveBx6zUOXQnfASzgfyVs6R0ugOs /DmvqBdo6f///ASV/Ze7Ov/vqBd36f//dLrzfDvnuPwMuMQPgzWkf9j/AIoDAOpLWv/vdIr3pqBy sgsXv7j//3Q5oTY8qnQTrHSi86l0iveodPnwScPnfgA/////grh0+ahyu+f+r3S67wDPF9Dp//90 uu98O/P+x7j+wXTx8Enz5no2i/h07zn90QD/RT/////ENYLTejaBwnTx8EnD5sQFg0QU/HS673Tx r76sdrLvvnbxcrrvrxd3AAAAfDvzFOYA+XTxr6zwSevmvnaq73bxcrrvFB+K/QD5oKGkojyqdBOp 
dIr3dPl8B/Pwc13////C//3///BwaP///3Sy76h0gvN2svd/w8c/jfq/dvkU8XK696+oqRfXAAAA fDvzAPl0+cwtrHXLx792+cwkmfBJ88d8P/j0NXb5dcPHv3b5mfBJ68f0LJl8BvCkis98P/x2+X/D xz+N4r92+fBJ+8d2uu9yuvevcrrvqK8XLAEAAHw78xTucrr3r6ipFBHwSDVyu/7+dvl0uveV/n/f /6egFP3MP6GiPKp0E34Tz/X//6ypzAnGivN2igPweyn8//90ovfEIfB7NPz//3/E//B7Pfz//5Xv crovqa8XCev//3w78wCK8xcPqv//lcp2uiuZOLov/f8XJar//6mV/ZX9mXa6LRc5qv//fAcAdroH 8Ht+/P//qHKyL5Xvrq8XWKr//3o/8Hqg/P//33osAgAA33orAgAA33opAgAA33ooAgAA33onAgAA 33omAgAA33olAgAA33okAgAAlYJyQiMCAACmOXovAgAA/pXzOXouAgAA/jl6LQIAAP45eioCAAD+ DFShdqLzldEAivMA6jda/+90B6Z6AKaK8gCK8xct6///pnQnFPp0INSi83djyi8CAAC5rACK83J7 yi8CAACvFxbs//98O/P8DHK4/noAdrrzikx0gveoF2jr//9/g8cA0aaK9X9byi8CAAD/FPZ/W8ov AgAA/7l/W8ovAgAA/7mV/3J6LwIAADl7yi8CAADwf1vKLgIAAP+5uTl7yi8CAAD+uamvAIoHF2qr //98BwCK+MwJFrD9//9yuguXC/7//69yug+vAIoHF4zX//98O+96P/Bx0/3//0f//f//xroLgPx0 uguvcnovAgAAAIoPrxfI7P//AIoPAOpLWv/vfDvvlfOnxA/wcQf+//+Z8ElqKgIAAMw2dVIrAgAA 9DWZ8ElqKAIAAHQOzDZ1UikCAAD0NZnwSWomAgAAdAbMNnVSJwIAAPQ1mfBJaiQCAAB2shvMNnVS JQIAAPQ1CXosAgAA8HayH/B6Zv7//3a683J6LwoAAHa6F5f/+///cnovCgAAlf+vFzXt//98O/OZ egmL5HK6F69yei8CAACvcrrzrxfUAwAAfDvzfLrz+/BIOHTCR1r/70R/////ej+BnHSK73a695f/ +///cnovBgAAlf+vF4Pt//9yuhOvcnovBgAAr3J6LwIAAK9yuvOvF4YDAAB8O+N/Qi8GAAD/i+F0 ugPEuuuC6QC6A3JyLwYAAHQ5lYCur/wMACh8O/MAsveKXPBIuht6P4GWdIoDdrr3Phn4/Irvl//7 //9yei8GAACV/68X9O3//3K6E69yei8GAACvcnovAgAAr3K6868X9wMAAHw7439CLwYAAP+L4XS6 A8S664LpALoDcnIvBgAAdDmVgK6v/AwAKHw78wCy94pc8Ei6H3o/gZZ0igM+Gfj8iu92uu+X//v/ /3J6LwYAAJX/rxdl7v//croTr3J6LwYAAK9yei8CAACvcrrzrxdoBAAAfDvjf0IvBgAA/4vhdLoD xLrrgukAugNyci8GAAB0OZWArq/8DAAofDvzALLvilx0igMAigcX063//3Q5oBT9zD+hpDY8qnQT fhPr/f//rKmofFoDAgAA/zh6CwIAALc6/+/MP3JCBwIAAFSflzHA/++bAMr/////m3ba/////5X7 cnr/AQAArxdi1v//pqZ2eg8CAAB8WhMCAAD/FPJ0ehMCAAC/dnoTAgAAdHoTAgAAxHoPAgAAgrMA iu8AivN0ehMCAAA+H/hye/r/AQAArwCK9xeHBAAAfDvvdnoDAgAAfEIDAgAA/4vmAEoDAgAAAIrz AIr3FwX///98O/MWc////xRmfFoTAgAA/xTydHoTAgAAv3Z6EwIAAHxCEwIAAP2MugCK7wCK83R6 
EwIAAABLegsCAAAAivcX7gQAAHw773Z6AwIAAHxCAwIAAP+L6QBKAwIAAACK8wCK9xds////fDvz FNkUWptw+v////98O/ueFPB0m9v3m3D6/////3w7+550egMCAAAU+xQlFCegoaQ2PKmoQefn/O/M AH/B/4vuqQCL2+8A6jta/++mej+mi+u4fjl3////fgD//v//gyTMP6ChPHR5f////3SD2+vEOIz9 dAd0OD4f+K8ASXv///8Ai9vnF5bw//98O/N0OBQuqnQTrqypzAmoxorv8HtN////dLr3f8f/8HtZ ////QOfn/O92igNEd////3/A/4r3egmK63QIFO+oAIr3AOo7Wv/vpno/povvALoD/AR+ggP//v// gyoU/XQIegmLmXR5e////3o/i/evAOpLWv/vppd/////AIrvAOo/Wv/vpnZ5e////3o/povRdILv dDg+H/ivAIrzAEl7////Fz7x//+VgHZBf////wCK96kA6kda/+98O+cU86yV/6kXNvH//3w786Ch pDY8qnQTfhNH+f//qahycrcCAAAXKSgAAH9atwYAAP9GAP///8w/ckK2BgAADFR0yttf/++ZVFUA Ka8A6g9a/+9yercGAACvl2c5/+8XtOL//3w783o/iu1ycrcCAAAX8SUAAMw/Foj+//98mgP/croD rK+Xczn/7xfA4v//pqYAKXQnfjzfQP3/croHcnK3AgAAr3J6hwAAAK8XUSUAAHQHegCK9Th6kwAA AP7///8X4db//3QPegmK8QDq21//73QnfjzfQP3/egDwewz///96CfB7FP///wDq21//78Q8jPYX G+b//3o/i2GX89j//5X/lx8i/O8XOPL//3S6B5XfpnJKhwAAAECfIfzvDFpcHyH870EfIvzvcnq3 AgAAqa8Xkfn//3J6twIAAKmvF0v///98O+N8BwXwe7UAAAB8BwLwcXT///96P4Oq8Hp+////cnqH AAAAcnK3AgAArxcGJQAAcnK3AgAAF0wcAAB8mgP/croDQXM5/++vqRfF4///ALoDAIoDqRcR5P// fDvvF7nj//8A6ttf/+8WHwEAAHJ6hwAAAHJytwIAAK8XGCUAAHJytwIAABebHAAAFjkBAACXz4r/ /wDqy1//7xZJAQAAcnK3AgAAF2wnAADMP6SgoTY9+/+qdBN+E5v0//9yemMHAACszCSvl2c5/+92 ogN2ogsXbeT//6Z6P6aK95UDpxYS/v//qXSK8zi6BwQAAAByefv6//+vcroDl684/++vF934//98 O/PGogPwe1H+//+ocnpjBwAAlx8i/O+vF03+//+mxDymdroP8HuP/v//cnpjCwAArxf66P//ledy ukOV968X1Ov//3w773/CA0/87/9AA0/874r0l//+//+oFwWz//+odMIXWv/vACivACivcrpjl784 /++vAOpDWv/vdLL3fDvrckF/////cnpjAQAAqK8XRiIAAHSy93J64wAAAKmvF1YiAAByukOvcrpD AIoPr3K6QwCKA69yukOvcnpjCwAAr3J5+/7//69yemMBAACvcnrjAAAAr3K6Y69yuguXWzn/768X xvn//3S6C3w7y8Q88Htb////r3aKH3aCG3a6Fxfc8///fLIHAJW/AIobdroTAOo3Wv/vdCd8O/N6 JIuHvHya9/98gvf/cnpjAwAAlfuvrIr4F5gEAAAU+hfwBQAAdAd8O/N6AIu6fJrz/3oAgcJySmMD AAA4ugcFAAAAF8rZ//96P4vPcrojdoojrxcwDwAAej+mdroHi+N8BwCL6AC68345f////8aC84M2 ALr3fIL3/YNyzCTGogOgi/UAigMA6kta/++mxqIPi/UAig8A6kta/++mxqILoYv1AIoLAOpLWv/v 
pnS6B6Q2PKp0E34T0/z//6mocrIfF4D4//90uvN0wkNa/+9yd/va///699n//6+ucnojAQAAl9s3 /++vdrLzAChG+/7//3w779Q3R2n///98PgPEN4D9dD56P4HndA9yeiMBAACX3zf/768XBPX//6ax pooVcnojAQAAl+c3/++vFxr1//9yeicCAACvF/7q//+XE+7//xeJ8v//fDvvej+L9HQ3F5Dx//90 DxT9zAmscnonAgAAl//f/P+vdDEXePH//3QneiSL2HJ6IwEAAHQxrwCK9xee7v//AOoXWv/vZkb/ +///CAZ0Ma0XUvH//3oJi/F0MRfL8f//qRd79v//pnokpIr4zAkWff///wCK83J6KwMAAJfvN//v rwAofDvzzAlyeicCAABysh+pl/8//++vF3D5//96P4usAIoXAIoLF64NAACmdA+mcrIfF9T3//9y eicCAACvAOq3X//vcnorAwAAqa9yeisDAACvlw84/+9yugOXizj/768XKvz//6kA6kta/+90igN8 O+Nysh8XGPj//3Q5oKE2PKp0E34Tu/7//wCK9wDqw1//73o/i+VyekMBAACvl888/+8A6sdf/+98 BwB2ugOK+8w/NjyszCTGou+BlqmodILzQfv+//9/QhcBAADRi8QJekMBAADvis18guv/dDiL53Jy FwEAALyu/AEAivevF5/z//98O/MU7XJyFwEAALyur/wBFwL3//+mpnJ6QwEAAK8AigMA6s9f/+96 P4v6xKLvg1ygoQCKAwDq11//73Q8pDY8qnQTfBPzcroDqcwJr5fm//3/qZfHN//vl/z//38A6utf /+96P4rHcroHOLoL/v///69yugsAivc4ugf7/v//r6mX0zf/7wCKAwDq/1//7wCKA3QPCCHkCbkA 6vdf/+90OaE2PKp0E34T9/7//6ypcnoHAQAAqK/MABd9AAAAdIrzdKL3ej+mi+R0Oahm1D0uB69y egcBAACsrxdRAQAAfDvvdAdyugOvlfqV/wDq/17/73o/ispyegcBAACvAIoDAOoDX//vej+L3nQ4 1AiWP/v+//+V//w8qa9yegcBAACvF5gBAAB8O+/8B3wA/YLWdMoXWv/vlf6gACmvACmV5WamCAZ8 PZ6tl583/++sAOpDWv/vfDvvFN56AIHidAiV0awA6ida/++mej+mi/x/3/9+PPv+//+xihp0OKCh pDY8qnQTfhN3////rHSi86mof1z7/v///3+cv/9/3P9/XPv6////fMIT+/vv/4rsle+X41/87xcV AQAAplwT+/vvpnR8//7//3wH+4vofAf9i+10whda/+8AKJX9ZqYIBnoti810uvd8RP/+///7ld90 BHKP+6YMWorpf0d7/////3JPe////4v4ld90BKYMWnTCF1r/73/E/4qHACh0svfMLQjOcnqHAAAA r5UAcrrzlQCvdqrzF30tAAB6P4vAckx/////cnqHAAAAqa8A6jta/++mej+mcnqHAAAAr4rpdLL3 lQByuvOVAK8Xsi0AAHo/ii4U96wXaPn//6amf8T/iu90svdyfH////+srxcxFQAAXkvK+++V9b/M LaYIDnbqS8r773L7rQDLeqM6/+9yfPv+//+vF6j5//9eS8r773JM+9r//3L7vwDLep86/++pF8P5 //9eS8r7734899n//3L7vwDLeps6/++sF975//98O+d/wf+KtgAolf1mpggGei2L5wAoZgjCE/v7 75Yt+/7//34941/8760U7wAolfzMLaYIDgDLais6/++pFyD6//+mppeXN//vrBct+v//pqagoaQ2 PKp0E66sqah0gvPMJHQIx+CL0nK683/B2orhf4H+jIrnAI/7fD/7droDFyb6///8J3S6A7mmuRT9 
vLl/wf+KKbysAOpPWv/vpnQndLL3ej92/ou5f8D/i8FysvN1+MPaitN/gP6Mitl0jvt8PvupdrID F276//+prHa69xe0+v///KL3dLIDfDvzuLgU+3f8vLh/wP+KOn/c/6ChpDY8dLPb93S72/t6Nggv ivzMPzypqHSD2+uV76HEAfB9oP7//6zwSe7wSSfMLHQnPhT3dPtqU17/7/BJrv7MPPBJJ8wsPhf3 dOtqU17/78wv8Em+/fBJJcw8PhX3dPt6U17/78w98Emu/PBJJ8wsPhf3dOtqU17/78wv8Em++/BJ Jcw8PhX3dPt6U17/78w98Emu+vBJJ8wsPhf3dOtqU17/78wv8Em++fBJJcw8PhX3dPt6U17/78w9 8Emu+PBJJ8wsPhf3dOtqU17/78wv8Em+9/BJJcw8PhX3dPt6U17/78w98Emu9vBJJ8wsPhf3dOtq U17/78wv8Em+9fBJJcw8PhX3dPt6U17/78w98Emu9PBJJ8wsPhf3dOtqU17/78wv8Em+8/BJJcw8 PhX3dPt6U17/78w98Emu8vBJJ8ws1AE+F/d062pTXv/vzC/wSb7x8EklzDw+Ffd0+3pTXv/vzD3w Sa7w8EknzCz8MT4X93TralNe/+/MPcQB8HxcAQAApHoAi+fwSe7wSQ/MKT4X93TralNe/+/MPb6w ihegoQgvPHQ+zDZ2t/t293a383a393a363a373a35zwAi9v3lf6Xjzf/7wCL2+8X5v///z33/wCL 2/OV/wCL2+8Ai9vvF/z///898/+qdBOurKmodA7MAMaB5/B6CP///3TiN1r/75WNAIrzdoHnACym ej+mi/g4uef+////lYgAivMALKamlf16P6aL/Hax53S558Q48HtD////fAf+iuxH////f3ay8zi6 A/v////MJBTpdKLrR////z84uvP7////OLoD4P/w/3Sy76gIJuQ2l3////98PvyuqJX+rwCK9wDq 31//73wHAHb5i6TGgu+L93ah83ah9xTvqK8A6uNf/+92uff8PHa586gAifOoAIrzqADJAOqjX//v dOKDX//vxDh2ufuL46ioqACKA68A6qdf/+/EOHa564rhAIn7ACx2gfsAyQAsdsF2gfN2gfd2get2 gefMPxT8lf6noKGkNj3v/6l0Dqh8gef/i890g9vvegCL13Sx73S589Q+xDiM/XQHdLnrqPw+rwCL 2+sXjf7//3w78/6B73Q4FP3MP6ChPff/qXQOqHS553o/i8p8B/6Lz3SD2+96AIvXdLHvdLnz1D7E OIz9dAd0ueuoAIvb7/w+rxfU/v//fDvz/oHvdDgU/cw/oKE99/+pdA58gef/i8SoAInrAOqXX//v AIn7dMKDX//vACh8gef9iuiV/5X/AIn3AMkA6ptf/+8AyQDqn1//7wDJACh8mef/oKE8dL7nej+L 6XwH/ovudLvb+3a+9xddAAAAlf6nFP3MPz37/3yG5/+LzXS72/d8F/+L67eL97eK3HS+9xT8dL7v /Lvb+xT7dLvb+3o/g/LEvvOC95X+dr7vpxT9zD899/98g9v7/4r8zD88AIvb+5X3AOqPX//vrwDq k1//7zx0u9v78FC72/evFywAAACmPKp0E3yC8/+K+8w/ojx8gvf/AIrzivcXSQAAAKaiPACK95X3 AOqPX//vrwDqi1//76I8AIvb+5X/AOqPX//vrwDqE1//7zx0q9vzdLPb+3Q9tah0Bno/i+2pco3+ dKvb73X9d/6+vbGKCKF0OKA8dLPb83o2idl1u9v3rHUndC51BKh0g9vzdDw+H++ZdDw+Fv0MVHQ1 fB78DFWgpHS72/s8qnQTfILv/4r7zD+iPACy73Sy83S694vyde/F7or4v74Asu+KDPBJ//BJ9tQ+ 
ojwAi9v3AIvb9wDqD1//7zyqdBN0qvd8gu//dD2L5al0ivN18Xf1vbl7Nov6ALLvig58gu//oYr8 f93/ojwAi9v7AOp/X//vPACL2/cAi9v3AOp7X//vPACL2/cAi9v3AOp3X//vPHS72/t193s2i/bF s9v3i/q/FA7MPzzMNsaz2/OJ6nS72/v8PnXvxavb94v2vsSz2/ONFMw/PACL2/cAi9v3AOpzX//v PACL2/sA6ude/+88AIvb+wDq017/7zyqdBOsqah0gveoF5IAAAAAivN0DxecAAAApnQnegmmgbh6 JIG8xAyDwNQM/Ah6AHaK94vLxIL3iNB0uvd0ivPUOL+v8EH5r6gXkQAAAHQHfDvzegCL7aypqBdM AQAAfDvzej+L9biKM8w/oKGkojx0OBQIcrvb868Ai9vzAIvb8wDq417/7zx0u9v7dC919797NooG dbPb97fEPYv2x/eL+rfEPYoIde/VLgkl5C0ILdw9PKp0E3yC7/+K+8w/ojwAsu90qvN0sveL7nX+ ez+L9MX9ivi+vQCy74oQ8En+8En11D6iPHyD2/P/i8J0q9v3qXSL2/fwSfm5fAe+g/d8B6WA/Hw/ 3/BJ9b18Br6D93wGpYD8fD7fALPb74v3ej+L+8Q+iy/UPqE8zD88AIvb+xf9////pjyqdBOsqah0 gvfwSfivF6z///96P6aL/LgUEPBJyLh8AdJ2iveL+nwB1Ir78EnIuMwkqRfG////ej+mi/Jy+2Ry o7kv8EnIuBQXfIL30nQ8iv0IJ6ChpKI8AOrbX//v/vqLN//vPMw/fIPb+9/waz88fIPb+8+D9HyD 2/vGgPuV/qc8zD88dLvb+3wHnoP6fAeFget8B76D+nwHpYH1fAfPg/Z8B8aA+5X+pzzMPzx0u9v7 fAeeg/p8B4WB9XwHvoP2fAelgPuV/qc8zD88dLvb+3wHnoP6fAeZget8B76D+nwHuYH1fAfPg/Z8 B8aA+5X+pzzMPzwAi9v7lfcA6o9f/++vAOqTX//vPF6LN//vqXQ3QQAA///cMZY2WL7//z4H75Y/ WL7//3Qu3DE+Be/8PaF0L9oAgP//Ph/v9D4+BfD8PVb///9/i/m/2gAAAIBcizf/7zyqdBN+E/v7 //+sqah0gvOoRC8o/e/MCRctAwAAej+mi/PMNn+DxwCj8Gs+dA50uu/MNn/Ho/BrPswxivN6CYv8 vxT6RH83/++vrKhyegMEAACXhzf/768A6q9e/+9yegMEAACX/P7//68AivcA6kda/+98O9+goaQ2 PKl0i9v3qReWAwAAt6aH63Xzz38Go4v3fwbQi/y3hg96P4L7dDmhPHK7z/6hPKl0DheHBwAAfFkX 7v///3yZy/90OaE8qXQOdLnLej+L968A6kta/++mdDEXyAUAAKE8qnQTrql0DgCK8zm6A685ugK0 OboB+gCK9zm6APkXugcAAHo/i9eocoHjlemV/6gXvAQAAHK6A5X7r6gX8AQAAHw75zi51+n///+V /qegoTY99/+qdBN8E4upqHQGzAnGSBfu//+K8gCI9xfzBQAAFoj9//+sdKDLxCGZOLpv6/+ZOLpt 9f+ZdopnmXaKZZk4uk/+/zi6TX////+L05nGiNl2igOJ5caM54rqrBds/f///CcAugPwSLjZxroD poMZ8Ei42ca6A4r8cqJzqXQwAIjTFzkGAADGSBfu//92igd2igPwccf+//9yiMFysiMXpQgAAHK5 15X/l/8//++vcrIjF4gIAAB6P/B7E////zi5Ba+0/v2ZdLz7AIobmXa5AZl0vPk4ukOvtPz7AIoP mXa6P5l2+cw/mXa5/Zl2ufuZdLTzr5l2sjmZdrH5mXS08Zl2uj2Zdro7mXayN5l2sfcXuQoAAHa6 
NXa59XS6G3a6MXa58Xa6LXa57XJ50/7//68XcQUAAJl2uimZdrnpzD98O++ZdronmXa555l2ueWZ drnjmXS825XhmXa54XS82Xa533S473a523K6Q690MBfWBwAAcnnT/v//rxe7BQAApq9yedP+//+v dDAX8gcAAACKG3QwAIoPF/8HAABysiMXwAcAAAC6BxT7f5nX/3KyIxfRBwAAALoDfjnJ/f//dLoD xHgX7v//8HMyAQAAzAl0uO92uNN0uMvEOYv0AIjXdDCvF0oIAADMJMZIF+7//4HBckiV/v//f0ED AQAA/3J5MQEAAIvlldGvdDAXcwgAAKkXUgYAAKavqXQwF4MIAAC8fjnJ/f//xGAX7v//gzd0uO90 ivfUuNOV6XQwdrjXdLoHmf6425n+uNlyufuZdrjPcrjjrxe9CAAAlfuXR8r773QwF8sIAACpAOpP Wv/vdCepld+sF1QHAAB8O+90MKmsF+kIAACsAOpLWv/vpgCI73QwF24IAACkoKE2Pfv/qXQOdHkX 7v//fAf3g/vMPxTClj/J/f//AIvb93K7z5mvFz0HAAB0eRfu//8Ai9vrlj/J/f//cnvPlf7//68X WgcAAHw77wB5F+7//5X+p6E99/90u9v78Ei33/BIr+HwSL/j/DVyu/7RPKp0E3S696h0gu9/3//G gvOIr6ypdMoXWv/vACnUgvPMLZX/uKQICHQF/ILzi9QAKZXlZqYIBneq8AAplf1mpggGdLr3CCXl LX8dH389nv2q8Hfr57zEII0qdLr3oX/b5/+koKI8dLPb+6mozD/MAH/G34r8vhQHf8bSivuV/r6g f8bPivB1rv5/BYeL+n8Fp4r9vr7MCXXuey2Lw3wB94zI8EktfAW+jfV8BaWI+nw9NhTjfAWejfV8 BYWI+nw9VhTyfAXPje58BcaI83w9Lz4f+/w9ub4UQXoAoKGL/QgnPKp0E34TY////6lBY////6ly epsAAACV/68X4ggAAHw783J6mwAAAHZKmwAAAHTKb1//768AKXo/iuNyepsAAAA4epsAAABr//// rwApej+K+nw3ABTpdHKLAAAAfDcAtov3tor4lf6nFP3MP6E2PKp0E34Tr/3//wCK8wDqw1//73o/ iv02PKypAOrbX//vdMqvXv/vcvO/droHdrILleXMLaYIDnJ6TwIAAHw9nq2Xbzf/768AKXw783J6 SwEAAK9yek8CAACvAOrHX//vdCd8BACLjHJ6HwEAAK8XBwkAAH9b+iMBAAD/fJoD/6Z0uu9/x9GK /AC67wCK73J6HwEAAMwtr3S6C5XlpggOfD2erQCK85d7N//vAIr3ACl8O+cAivcA6rtf/+98BwCL 8wC6AwC6C3yCA+WDTKwA6tdf/+98ggPlg/QAugd0ugcWtAAAAACK9wDqI1r/76aV/qehpDY8qnQT fhPT/v//qXJ6KwEAAKivAOpjX//vQP/+//9yuj+or5c3N//vQfb7//8AiveV/6kA6mdf/+9yun+o r5dHN//vAIr3lf+pAOprX//vdHorAQAAlTtmpqAIBnyC8/+hi9t6P0ZXN//vgvpGZzf/769yun+v cro/r64AivMA6q9e/+98O+s2PKp0E3wT73K6D68A6l9f/+8Aivdyug+vF6MAAACmpjY8qnQTfhP7 /v//cnoDAQAAr5f7/v//AOpXX//vAIr3cnoDAQAAlf+Xjzf/768A6ltf/+90uvc2PF4zyvvvqHo/ g5GKmHzyM8r77wCX5zb/7wDqT1//73QHegCLq6l0ylNf/++XAzf/76gAKZcTN//vqFw7yvvvACmX Izf/76hcP8r77wApfMI7yvvv/1w3yvvvoYvjfMI/yvvv/4vsej+L8Dj6M8r77/7///+V/qegPMw/ 
oDyqdBN+E9f+//+ozAAXjAAAAHo/i5OpqJX9AOo7yvvvdA98AQCLp3J6JwEAADh6JwEAANf+//+v qQDqP8r773o/i8rGgvNyegMBAACK+K8XrAcAAKavAIr3AOo7Wv/vpno/povvcnonAQAAr6kA6jfK ++8UNZX+oKkA6oNf/+90OKGgNjxe1zb/76l6P0EvyvvvgvGpF+X///+mXNc2/+96P4r7zD+hPJX+ qReuAAAApqahPKp0E34Tr/3//6ypcroHqMwkr5fm//3/rJebNv/vl/7//3/MAADq61//73o/8HoJ ////croDdMr/X//vr3J6DwEAAK9yuguvrJevNv/vOLoD7////wCKBwApej+KkH9CDwEAAM6Kmcbi T3j674vScroDOLoD+/7//69yeg8BAACvcroLr6yXvzb/7wCKBwApej+KxnyCC/6KzBTZl8s2/+9y eg8BAACX+/7//6+sl782/++X0zb/7wDqS1//73o/i/THYg8BAACL/JX+oACKBwDq91//78QEi7By ek8CAACvcnoPAQAArwDqx1//73QPfAEAi81yeg8BAACvAIr3F9cMAACmcnojAgAApq8AivcXIAkA AKavF+8MAACmpqkA6tdf/+8U/cwAdDigoaQ2PH4T9/7//6lBezb/76ipzACV/pf+/+D/doPb6wDq Q1//78Q4XCdJ+++K5qmoqADqR1//78Q4XCdJ+++K+Mw/Fhj///+sqheq+///l/9///+olyNJ++8X vw0AAHw783K72+uX+/7//68A6idf/+9yu9vrrwDqw1//73TK31//70J/////qKqV/KhE////P5X8 QIM2/++sqAApfAcAXCvJ+++K4ZX/qpX9lf+V/KyoACl8BwBcK8n774uZFwr///8UqHTC41//78wk rK8AKEH/f///xDmM9KwAyivJ++8AKHQPcrvb76yvqZcjSfvvAMoryfvvAOrTX//vej+L5Mw2xAyJ 6nVuI0n773J+I0n7774JLcQxd++NFDi72+/+////F3X8//90u9vvoqSgoX479/7//zypF5n8//8A i9vzAIvb8xfl/v//pnQPpheg/P//dDmhPKkXufz//wCL2/MAi9vzFzT+//+mdA+mF8D8//90OaE8 qRfZ/P//AIvb8wCL2/MXbv3//6Z0D6YX4Pz//3Q5oTypF/n8//8Ai9vzAIvb8xc//f//pnQPphcA /f//dDmhPK6pqBcb/f//QCNJ+++oF4f///90D6moQCfJ+++oF2IPAAB8O+/MNnoJiep1bifJ++9y fifJ++++CS3EMXfvjRSV/5X/lf8AyivJ++8A6ptf/+9yu9v3lf+vqagAyivJ++8A6jtf/+8AyivJ ++8A6p9f/+8AyivJ++8A6j9f/+8XiP3//5X+p6ChpjypdIvb96h0AX/B/4vyqRcYDwAApnKL+f4U EXQ51Digv6E8qnQTfBPzrKmoOLoL/v///wCK8xc/DwAAdA+mfgH/+///gtl0ovesF1MPAAByg8/+ OPvbI0n77xddAAAAdA+mcvvBwv9///+D+8w/FIByugevcroDr6wXYP///3w783o/i+bEggeK+Xya C/8U7awXDP///9SKB6axsRT+sXaKA3S6A6xyfyNJ+++vF/IPAAB0ugOXbzb/73J/I0n7768Xvw8A AHS6AwCK83J/I0n7768X0Q8AAHw753yCC/+L9HS6A39bxyJJ++//lf6noKGkNjyqdBN8E98AivNy uh+Xazb/768A6kNa/+9yuh+vAIr3F/QAAAB8O+s2PKypqEAjSfvvAIvb73QIFzQQAAB/wiNJ++// pnQni9SsqQCL2+cA6h9a/+98O/N6P4r5f8PMwovnqRdfEAAAf4P5/v9yi/n+pooqzD+goaQ8dLPb 
63Q51Dipdv4XgRAAAKZ0s9vndv6V/qcUH6p0E65yugOvcrr3rwCK9xd/AAAAfDvzej+LzZcjSfvv F6wBAAB0svd0qgPUPtQ9t69ye+4iSfvvr3J+I0n7768XmBEAAHw775X+pzY8zD82PKp0E65yugOv crr3rwCK9xfQAAAAfDvzej+Lz3S695XCcn8jSfvvrwDqN1r/76Z6P6aL57+XAPz//68AivMA6kda /+98O/OV/qc2PMw/NjyqdBN+E//7//9yev8DAACprwCK9xdnAAAAdA+megmmi+xyev8DAACvAOoT Wv/vpnSy83b+dDmhNjyVAADKJ0n77wDqF1//7zwAyidJ++8A6jdf/+88qnQTfhPr+///rKmol//7 //8AivdyehMEAACvAOpHWv/vcnoTBAAAl1M2/++vF/f5//90J3J6EwQAAJdXNv/vrxcK+v//fDvj ej/we0j///+vAOoTWv/vdAemcnoTBAAArxeB0f//dA98AQCK5HJ6EwQAAK8XjtH//3o/8Ht5//// dL/zdP90z5XvcroPlf+vF7USAAB8O/N2iguZOLoP/f+oF8jR//+V+ZX+lf2ZdroNF93R//90D3wB AIu1fILz/4vrAIrzcroPle+vqRe/////fDvvFPNyug+V76+pFxDS//96P4rleiSL36wXcxIAAL+v rKkX4Pr//3w773o/gvWpF0DS///MPxT9dDmgoaQ2PKp0E34T7/7//3S666xGF/z//6nMLXQOCAnM Lah0gveV/qRBgZn7f3a6C3S66wgOcrrrdqLrr6moli0X/P//dqoHF53S//96P/B6Yv///wCK7wCK 86gXmdL//3a68xd90v//fJrr/3a693K666+pqBfL0v//ej+KjMwJxorzi+1+gvfL2P//i/Z+gvfM 2P//iqVyugt2QgsBAACvcnoPAQAAqa+plb92Yg8BAAAXAtP//3wHAIvKxDmK96gXC9P//xTRcrrv OLrv+////69yugOvl/jv//+XAAD//6gXAtP//3wHAIv6xooDi/UXC9P//3w3ABT9zD+goaQ2PACL 2/sXUNP//8w/PKp0E34T8/7//3S696x2egcBAACpcroHqMwAr3J6CwEAAKivqKg4egsBAAD+//// doIHdoIDF5PT//90D8QIgbV0uvN0ou92uvOorACK8wCK9xej0///dA98AQCK7xeF0///wszY//+K 3pX1FPT+ivPUIcQggfOV6wDqy1//78QggDh8AQCL/JX+oXQ5oKGkNjx8g9v7/4r7fDcAPACL2/MA i9vzAIvb8xesAAAAfDvzPKp0E34T7/7//6ypzAmoxorri950uutGF/z//2Z0BggAdroLdLrrZggG li0X/P//dqoHFPl2igd2igt0oveV/qdysguuqXJyDwEAAKmur3ZiCwEAAHZ6DwEAABdp1P//fDAA xDiLwsQ5ivjMPxZc////crrrr5eAmfu/rBeQ1P//xDiL4MaK64vlAIrrAOpPWv/vdCemxCGK8XS6 83bPdLrvds90OBSSdILrqah2ogOsAIr3F87U///EOYu8fAcAiu8XmtT//8LM2P//it+V9RT0/roD 1AfEAYHVlfoA6stf/+/EAYHhqagAigMUPKwA6kta/+90uvOmds90uu92z3w3ABTvdLrzdLLrlf52 53S673b3p6ChpDY8qnQTfIL3/4r6fDcAojwAiusAiu8AivMAivcXNwEAAHw776I8qnQTfhPn/v// rKmodKL3lf5yshOnzAmuqXJyFwEAAKmur3aKD3aKE3ZiEwEAAHZ6FwEAABeA1f//fDAAxDiL48Q5 i7xyugOvl4CZ+7+sF6DV///EOIv6xooDivh0OBZU////lfugxoIDg+OV/XK6B6ivrBfL1f//xDl2 
ugPwcXf////EOIz4zD8Wf////6lyugeor6wX7dX//8Q4droDjZV+ggf/3///iJ4AigcA6k9a/+90 J6bEIXaiC4u7dIIHqaisAIr3Fx7W//98BwB2ugOK8Rfp1f//wszY//+K5xT31Af8J8QBgdWV9QDq y1//78QBgeEUNgCKCwDqS1r/76Z0uvN2z3S673bPfDcAoKGkNjx0uvN0sguV/nb3dLrvdLIHdven FBmqdBN+E+/+//+szCTGou+piveV/qcWNP///36C7//f//+AsnS69zh6DwEAAP7///92egsBAABy uguvcnoPAQAArK+srHaiB3aiCxe+1v//dA/EDPBxdf///3S673w/+68A6k9a/+90D6bEDHaKA4r6 fDcAFI+ocrrvlfuvqRf6FwAAAIrvcrn7AIrzrxcJGAAAdLrvfDvndorzcof7rKgAivMAivcXDdf/ /3QPfAEAiukX79b//8LM2P//iuSV9QDqy1//7xT61AH+ivPEBIAyfAEAi/yV/qEAigMA6kta/++m oHQ5oaQ2PJX/AOrLX//vAIvb8wCL2/MAi9vzFwEBAAB8O/N6P4sfPHS72/upej+oi750g9vvegCL xnX3dA97NovOdCl0ONQodfd7Novzx/P9ivi/f8P9/4oRf8f/i/l1sf65FCZ/2f+oFwIYAACm/DkU /cw/oKE8qnQTrq6sqajMJJc7Nv/vdqIDAOpPX//vxDyLr5dPNv/vrwDqU1//78Q8XB/J+u+LxDi6 B7f9//8AigcA6k9a/+90B6Z6AIvbcroHr6gA6h/J+u90D3oJi+WoAOpLWv/vfAGQpor5vHwE/YMz zD+goaQ2PHJI8/7//3oJi9V0ovd0ugPEuvOM5nK5+5WAr6wA6kda/+98O/MAugN+PH////90yXoJ iiaoAOpLWv/vppX+pxRFqnQTfBPncroDrK/MJJcDNv/vl/7//392ogsA6vtf/+96P/B6f////6mo lftyug+gQRM2/++vcroHr3K6E6+sqTi6B/7///8AigN2ghN2gg8A6v9f/+9yuheor6isqQCKA3ai F3Ti71//7wAslxf8//+XJzb/7xe4BwAApno/povxrxe2BQAApji6C/7///9yugeor6iV/6kAigMA LACKAwDq91//76ChdLoLpDY8qnQTfhP3/v//rKmocnoHAQAAl/7+//+vF1rZ//96P/B6Y////3J6 BwEAAK8XnxkAAHo/pvB5eP///3J6BwEAAK8XR9n//3QPzADECIuMdLnzxDiLk8wkxseL+bx8P/sU CXL7Yvv///+V/q8A6j9a/++mxDimdroDi7jEIIHBcq/9dLnzdDA+Hv24dPv3df93vQF0ufN0+/d1 v/53vQB0ufN0+/d1v/13/XS583w9+8QEdPv3db/8d70Cgzp0ugMU/cw/oKGkNjx0q9v7lf2n8En1 fBb1i+p8FoqL3HwW0ovwfBbrivZ/hf5XivyV/qc8da3+fwXvjQh/BeCIDRQSzD88rJX+p8wkxvor Nv/vi6OoFxUBAAB0B3oAi9zG4IvoqXQIrxdgAAAAfDn79CemdDl8wf+KEqGoAOpLWv/vpnzCKzb/ 7wCgiuF82is2/+//fAT+iu0XBwIAAHo/XCs2/++L+6ynpDx0PHwf/aQ8qnQTfBPrdLrzzC2sqXXP qHWv/kDw8PDw8Em3/T4d9/QuOLoL/P////BJt/zwSY/5Ph339C7MNnWX+3W3+j4e9/Qx8EmP+D4e 9/QOdDU+Fvt0Idww3CDMNMwOPh77zC50AXQ1fhgAAP//PhbvzDBAzMzMzMwOPh7vzC50MT4W/XQl 3DDcIEAA/wD/zDTMLj4e/cwOdCV0MdwgPhb33DDMNMwuPh73zA50MfwJPhbg9DF0DswNfhlVVVVV 
zDHMKXayD3Q1Phbg/C30NXKqE3ayE3KyD3ay83Sy93ayA3SyA366A3////92svc4ugfv////dLLz dM50MT4e4z4R+/QxdIr3zPF0DnQmPhHnfBnAPhTvdMtKLzX/73wcwMzLYi8z/+90Jj4U93wcwHwe wMzLYi8x/+/My3IvL//vdLL3fLr3987NdIrzdLb7zPF0DnQmPhHnfBnAPhTvdMtKLzT/73wcwMzL Yi8y/+90Jj4U93wcwHwewMzLYi8w/+/My3IvLv/vdDXOzXSq8wCyB3ay8/B6pQAAAACyC3Q1dKrz drLz8HrKAAAAdKoPdIoTdDU+HuAuFfQ1dC7MKX4dVVVVVcwNzDV0KXQmPh3gLhH0KdwgdA0+Effc CEDMzMzMzAzMMT4Z98wpdCZ0DdwgPhH93AjMDMwxPhn9zCl0DnQFPhHvfhgAAP//zAhA8PDw8Mwp PhnvzDF0JXQO3CA+EfvcCKDMDMwpPhn7zDGhdCZ3t/w+FOd353QmPhTvd6f+dCZ0NXev+D4W53e3 +3Q1Phbvd7f6dDU+FPc+Fvd3p/13t/mkNjyqdBN8E4OsqcwkqMw28El+xzX/73SK87d0L3wf+D4F /HXrzXvrek81/+/waj93u/JDvnwGx4MpdIL3dqIDOLrz8P///5X3croLrK8XLR4AAHw788w2xqLv dLrzivx0ugPwSX+PNf/v/D7MLXwG4/BiPbV8HRt8PcfEPYP8fD8bdbv6Q3e78nu+fAbHgzfMCfBJ eX81/+/Ho/p8i910OZX5ZqYIBpX5pHKz+gt0OWYIBHT7ak81/+8+B/33/swkuXwBz4M08EmyCcw/ dZoL9D7wSbIHPh/39D7wSbIFPh/39D7wSbIIdvjwSboKPh/39D7wSbIGPh/39D7wSbIEPh/39D52 uPvG4hfJ+u+L+D7Y/T6Y+/0AugMAsvN8OPd8gvMA8HDiAAAAoKGkNjyqdBN8gu//qaiK0HSK83SC 95X/qagXRQEAAHK595X+r3J4f////68XVwEAAHw575X/qX44//7//xTSdIL3dIrzlf6pcnj//v// rxd6AQAAcrn3lf+vcnh/////rxeMAQAAfDnvlf6pqBeYAQAAfDvboKGiPHS72/t8n/v/fN//OL/3 /ty6mDi/83ZUMhA4v+8BI0VnOL/riavN7zyqdBN8E790qu+pdIr3qHTxdD4+F/xywy58H8DEBoz8 ALn7dDV2wT4W4v6x+3Q1tXo2i6S9rHaq93Sy878AuvN8B7919nez+eiKwZXvcrnlcrI/oPBJpwDM LXWP/nXvfD/7Ph339CzwSacFPh339Cx27nw++7CKI3K6P69yufevF0f///+mzD+mALL3ilSkoKE2 PKp0E3wTv6l0iveVx3T5dLH7droHdrIDPhf8fB/ApsQ+g/yVh6bUN6iuly8t/++pF8UAAAB8O/Ny ueVysj84uvfx////8EmHAMwtdY/+de98P/s+Hff0KPBJhwU+Hff0KHbufD77ALL3iiVyuj9ygfev qBfM////pnK5pqaV+3QwoaB17nevAHTuPhX3d+907j4V73ev/nTuPhXnd6/9fD77fD/7sYokoTY8 qnQTfBO7dLrzdLL3rKl0pvt0jvOodMd2ggd0BAgo3AF0jvd07twM9AH8ggdyS+iHW5UodAR0KT4V 5j4Z+PQpdI/7/Cx2ihN0DdwFCCncjvf0CHSG8/yKE3JDyKlIOBd0CD4R6z4Y8/QIdIf3/A12git0 AXaK9wgo3AR0Idwl9AR0pvf8gityY8Qkj9/bdAQ+EPA+HO70BHSn8/wBdqIPdCDcCAgs3CV2gvP0 IfyiD3QMdKb7cmPMETFCPnQMPhnpPhT19Az8CHQh3AEILNyi9/QgdIfvdoJD/CBya+VQ8IMKdCF0 
BT4Q5j4d+PQFdK/r/AF2qgt0KNwgCC3cqvN2gj/0LHSi9/yqC3Jj7NU5eLh0LD4V6z4c8/QsdKfn /Ch2oh90JXaq9wgs3CHcKPQldKrz/KIfcmPl7LnPV3QsPhXwPhzu9Cz8qvd0JQgs3CB0gvfcBfQg dIfj/CB2gjNyS+H+arkCdCV0AT4Y6T4R9fQBdI/f/AV2ihd0CNwgCCncivd2gjv0DHSiP/yKF3Jj zCdnf5Z0DD4R5j4c+PQMdKfb/Ah2oi90IdwBCCzcJfQgdIL3/KIvcmPgUAi7dHQEPhDrPhzz9AT8 AXQgdoL3CCzcojvcAfQgdIfX/CB2gj9ya+VOpAAAdKL3dAU+EPA+He70BXSv0/yC93aqJ3Qo3CAI LdwpdoLz9Cx0ojv8qidyY+xBKKN2dCw+Hek+FPX0LPwodCXcBQgs3KL39CB0h8/8IHaCO3JL4d3u b5R0JXQBPhDmPhn49AF0j8v8BXaKI3QI3CAIKdyK8/QMdKL3/IojcmPMbI5nAnQMPhHrPhzz9Az8 CHaKA3aK9wiqA3SiA9wI3CX0IXSPx3aKN/whdIrzcmPhcbyGWXQMPhHwPhzudL/D9Az8ivd0ovd2 uht2ivPcIQiq83S689w49Dx0ovf8uhtya/3e90u2dD0+H+k+FfX0PXSqA/w53CncJ/QsdCH8qhNy Q+id2uEJdCg+FeQ+GPr0KHSC8/wv3AfcJfQEdKL3/IIfcmPEv0y/P3QEPhDoPhz29AR0J/wFCCx2 gvfcJdwH9CD8oidyS+GupaHZdAE+EO0+GfH0AXQN/IL3CCncivd0INwl9Ax0ovf8igdye89VOEkW dA8+Ges+F/P0D3S69/wICC/cONwh9Dz8ugtya/2i79ApdD0+F+Q+Hfr0PXQo/Dl0IAgt3CfcKfQs dKL3/Ko/cmPsrOu7/XQsPhXoPhz29Cx0IfwvCCx2qvfcJ9wp9CX8ohtyQ+B+GV4ndCg+Fe0+GPH0 KHQH/Kr3CCjcgvd0Jdwn9AR0ovf8gkNyS8E3BCwYdAE+GOs+EfP0AXSK9/wFCCncDdwg9Ax0JfyK L3J7zxkyHt50Dz4R5D4f+vQPdD38CAgv3DjcIfQ8dKL3/Lo3cmP8KfjIPHQ8PhfoPhz29Dx0IPw5 CCx2uvfcIdw49Cf8og9ya+V48ioLdD0+F+0+HfH0PXQp/Lr3CC3cqvd0J9wh9Cx0ovf8qhdyQ+gS 66W6dCg+Hes+EPP0KHSC9/wvCCjcB9wl9AR0J/yCI3JLwfoWHFZ0AT4Q5D4Z+vQBdA/8BQgp3A3c IPQMdKL3/IorcmPMB1wQA3QMPhHoPhz29Ax0JfwICCx2ivfcINwN9CH8ojNye+cm/ZCYdA8+Ee0+ H/H0D3Q4/Ir3CC/cuvd0Idwg9Dz8ujtya/11s9VydD0+H+s+FfP0PXSq9/w5zCnML/yqC3JD6L3G BQB0KD4V4z4Y+/QodAH8L8wHzAX8ghd0ovdyY8R+CY54dAQ+EOo+HPT0BPwFdoL3zAfMBXSi9/yC J3JLwd2eYpJ0AT4Q7z4Z7/QB/ATMIHQMzA38ijdye8/zxxoCdA8+Geg+F/b0D/wIzCH8ohNya+W7 FUFbdKL3dD0+F+M+Hfv0PXQo/DnMKcwv/KpDcmvsVjAhtHQlPhTqPh309CX8J3QsdqL3zCnML/yq M3Jr6J+0RAl0BT4Q7z4d7/QF/ATOgvd0qvfML/yqP3JL6Y9DQEF0KT4d6D4R9vQpdIr3/CjMDfyK I3JLzzmBZNd0OT4X4z4Z+/Q5dAj8PcwNzA/8igdyY8wF2F4VdAw+Eeo+HPT0DPwPdCF2ivfMJcwn /KIPckPges8QK3QgPhTvPhjv9CD8Ic6i93ai83SC98wH/IIfckPF+uJ3+3QoPh3oPhD29Ch0gvf8 
LMwFzCX8gi9yQ8fGLysmdDg+F+M+GPv0OPw9zCf8ojtyQ+EaZiQZdKLzdAg+Eeo+GPT0CPwPdAHM BcwH/IIbcmPEB4Nd4HQEPhDvPhzv9AR0IfwBzCB2gvPMJ/yiK3Jr5ZqpUzt0ovN0BT4Y6D4V9vQF dCn8BAgt9CjMLPyqB3J777vd1gt0Lz4V5T4f+fQv/Ch0PAgv9D3MOPy6M3JL+WgA1bx0OT4X6T4Z 9fQ5dAj8PQgp9A/MDfyKN3JjzFjca1R0DD4R7j4c8PQMdCX8Dwgs9CHMJ/yiC3JD4MZfbAN0ID4Q 9D4c6vQgdAf8IQgo9ATMAfyCO3JDxTympJp0KD4V5T4Y+fQodAH8LAgo9AXMBPyCD3JDx20z83B0 OD4X6T4Y9fQ4dAT8PQgo9AfMBfyCP3JDwYILEAB0CD4R7j4Y8PQIdAX8Dwgo9AHMB/yCE3JjxC6i e3p0BD4Y6j4U9PQEdCf8AQgs9CDMIfyiF3Jj5bCBV5B0LD4c+T4V5fQsdCEILPwo9CXMIPyiG3Jj 5x8Z0wF0PD4c9T4X6fQ8dCD8PQgs9CfMJfyiH3Jj4eu8/lx0DD4R7j4c8PQMdCX8Dwgs9CHMJ/yi I3Jj4F7u97F0BD4Y6j4U9PQEdCf8AQgs9CDMIfyiQ3Jj5X2BrAh0LD4V5T4c+fQsdCH8KAgs9CXM IPyiJ3J758oNxUJ0Jz4U6T4f9fQndDj8JQgv9DzMPfy6K3JL+UQtKNV0OT4X7j4Z8PQ5dA38PAgp 9A/MDPyKL3JDyG4seRR0zvwNdCh2zj4d6j4Q9PQooPyu+6H8L3au+3Su9/wvdL7z/Dx2rvd2vvOk NjyqdBN8E+9yug84ug8rhv/vr5X7OLoL1oX/7zi6B9GE/+84ugOWg//vF6f5//+mpjY8fMITyfrv /4v7lf6nPKypl//+//+XAP///5cPyfrvOPoTyfrv/v///xcpKgAAfDvzzDaV+XQ+zC2kCAxyTs7I +u93sd9/PeW+fAbld+mNHJX+p6GkPKp0E3wT63S676ypqHTPzCQA6hda/++V+2amCAZO+XU9CRb9 PHJ01Sz/7/u4w6V3/oH70/l3/rx8BPmDK3Qxlfo+Hvx0PqBmCACV+nayD3Si83aiC3QHdD5mpggG ei2L/rhyuP6V/q8A6j9a/+90qg+mpna6B3Lz4cwJxCl2igOBkbZ2shN0sgvEshN15nei8Iv3dbb+ d7LxFPt/mvH/xqoDgsGV9KXUKZXgdDWkfLoD+iwc8Eiy8fBIJNwmdTUsFHw5+nwV+nV07yz/73Si BwC6B3wB9Hf0gPd0sgPEsg+DOHSqD3wR9wC6C8aqA4NpdLLvlf52xnSy63b+f9vH/6egoaQ2PKp0 E3wT63S676ypfJoD/3T/qJX+cuN/dAx2ohM+Efxyuf6vAOo/Wv/vpnQvdLrzfJrz/6Z2qg/NNna6 B3okgZx/3f/GogNyvf5394K/lfSm1LLzdrILdLIHALoH8En2mfBJRg/J+u+ZfgAA/4u1dLILfLrz +ny6A/p8kgv6LBiZ9sV8gvP0gPrGogODNnXndfV8kvP3d+V0ohN398aiA3Qvg2J0uu90sg+V/nbP dLrrdvenoKGkNjwAig8A6kta/++mzD8UEqp0E3wT73K6Dzi6D9qD/++vlfw4uguVg//vOLoHkYL/ 7zi6A5aD/+8X+Pv//6amNjx8wg/H+u//i/uV/qc8qJX+pXw3AJW/QA/I+u+mduoPx/rvDFTMP6Dw SXfLLP/vd34PyPrvv3wHv40SOfrSx/rvAXQ9PDyqdBOurql0iu908Xo2ivjMPxYU////rKiV/HQ+ zC2gCAiV/MwtpKx0B3Q+CAxyvv2mdqoDzC0IDj4f/a92+QDqT1r/73QndLrregCmdueJtXK67HQo 
drrrdLrzfJrv/3a6B0b8////dIoHdILrdfl3+LmwtooIdLrvlfuhdDc+FuV8HsA+H/l1dsss/+93 9LyxihZ8uvP8tYpBdLoDej+LpJX8cvt6/f///8wtpggOdC90uvN8mvP/drrrcrrwdrrvdLIDdIrr dILvdfl3+LmwtooIdLrzzAnEDYzpdDc+FuV8HsB1dsss/+939Lw+H/kU+zn8wry5fAH7jSOV/qeg pKE2PMw/PKp0E3wT73K6Dzi6D12C/++vlfo4ugvKgf/vOLoHl4H/7zi6A9aB/+8Xdf3//6amNjyq dBN+E3////+pl//8//8A6k9a/+90D6Z6CYuXcrp/rxdxDgAAAIr3F1ktAACvcrp/AIr3rxdeDgAA crp/rxfVDQAAcronle+vcroXrxdELgAAcronlfevcroHrxdTLgAAcroXlf+vqRctDwAAcroXlf6v cnl//v//rxc/DwAAfDu3dDmhNjwAi9v7AOpLWv/vpjyqdBOurpX3croHlf+vF28uAAAAiutyugcA iu8AivOvlf4AivcXvP///3w72wgn5D+/NjyqdBOurpX3croHlf+vF6IuAAAAiutyugcAiu8AivOv dLr3+n/+//+V/q8XUP///3w72wgn5D+/NjyqdBN8E/N0uuesqah0z3SC63aCA3K5968A6k9a/++m drrrdLLjlfekdv58Afh0PID9dDmvcroLqK8XLy8AAHw783yC8/+L76xyugsAiu+vF+r+//98O/Ny uguvAIr3F7ETAAB8gvP/pqaL73K6C6yvAIrvF2kvAAB8O/NyugusrwCK6xd5LwAA/qLr1Ax8O/P8 BHoJgGvUggN0uud2x6ChzD+kNjyqdBN8E+90uuesdOcJPPjwelb///96JPB7Xv///6morADqT1r/ 75X3dAd0uuOhqQCK63bHcroHrxfTLwAAfDvv1CH+iut8gvP/i+5yugepr3K6D68X7y8AAHw783K6 B68AivcXWxQAAHyC8/+mpoviqXK6BwCK768XuP///3K6D6mvAIrvFyAwAAB8O+dyugepr6gXLjAA AHw783oki+apcroHAIrrrxdCMAAA/orrfDvz/AHUIRRyoMw/oRT8fDcApDY8qXSL2+96CYHsdLvb 93Sz2/PUN3Xr/s/vv7GKCKE8dLPb+3wG9Yy9Ph77fj4Hx/rvfMb/ist0u9v3dO96LYvVfIf7/4vb fIf3/4vhfIfz/4vndu50r/t2rvt0r/d2rvd0v/OV/na+86c8zD88qXQOAIvb83zZ/wCL2/MX5P// /3Q5oT33/3T+ej+L8XS2+68+HvsAbvvG+u+mPKx0o9v3qXwE9ah0Do37zD8Uv3Q8Ph/7fEcHx/rv /3JHB8f674sXdPl6P4vxdLH7rz4e+wBu+8b676YAi9vrfNn/dqH7AOimdvnMNno/8Go+dD6goaQ9 9/90/no/i9F0tvusdKPb76l0i9vvqKypAIvb53TBPh77rwBuA8f673w773o/ivve/HbBoKGkPfP/ dP56P4vRdLb7rHSj2++pdIvb76isqQCL2+d0wT4e+68Abv/G+u98O+96P4r73vx2waChpD3z/3Q+ zDZ2t/d2t/N2t+93t7t3t8t3t9t3t+s8qnQTdLr3qXQOdrn7crr3r8w/r6mXTX7/76+vAOq/X//v CCfkPwgndvmhoj37/3Sz2/sX+v///8w/Pfv/qnQTfhNX/f//rKmodA4A6ttf/++vAOoPWv/vcoHr lfaV+qgXLCoAAKh0wiNa/+8AKHKh25X3lfqsF0IqAACsAChyocuV95X6rBdSKgAArAAocqG7lfeV +qwXYioAAHw7v6wAKKbMJJX+oHbBxuGL6RdOFwAAej+K8pcfbPv/AOrLX//vFBnGgfuK7zi593ss 
--OEknMdrQymslrN--

From sfeldma@pobox.com Thu Dec 2 21:21:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 21:21:19 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB35LE61004916 for ; Thu, 2 Dec 2004 21:21:14 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 9C1FA2F7118; Fri, 3 Dec 2004 00:20:52 -0500 (EST) Received: from [192.168.200.234] (209-128-68-065.bayarea.net [209.128.68.65]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 253032FA2A7; Fri, 3 Dec 2004 00:20:43 -0500 (EST) Subject: Re: [E1000-devel] Transmission limit From: Scott Feldman Reply-To: sfeldma@pobox.com To: Robert Olsson Cc: Lennert Buytenhek , jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <16815.23964.93437.411404@robur.slu.se> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org>
<1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <16813.58484.343629.570703@robur.slu.se> <1101919791.5198.15.camel@localhost.localdomain> <16815.23964.93437.411404@robur.slu.se> Content-Type: text/plain Message-Id: <1102051410.3546.45.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Thu, 02 Dec 2004 21:23:30 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12404 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev

On Thu, 2004-12-02 at 10:23, Robert Olsson wrote:
> It can increase TX performance from 800 kpps to
> 1125128pps 576Mb/sec (576065536bps) errors: 0
> 1124946pps 575Mb/sec (575972352bps) errors: 0

These are the best numbers reported so far, right?

> And some of Scotts may still be used.

Did you try combining the two?

> +
> +	if( adapter->tx_ring.next_to_use - adapter->tx_ring.next_to_clean > 80 )
> +		e1000_clean_tx_ring(adapter);
> +

You want to use E1000_DESC_UNUSED here because of the ring wrap. ;-)

-scott

From madhaviram123@yahoo.com Thu Dec 2 21:40:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 02 Dec 2004 21:40:42 -0800 (PST) Received: from web90003.mail.scd.yahoo.com (web90003.mail.scd.yahoo.com [66.218.94.61]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB35ec54005686 for ; Thu, 2 Dec 2004 21:40:38 -0800 Received: (qmail 74387 invoked by uid 60001); 3 Dec 2004 05:40:12 -0000 Comment: DomainKeys?
See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=jYStYmvGTJ1LpZdR0DUUGf9hxK0M53uLXmqxYUeq50OcsTq9CSwLFvLvZK3pFA6zIO4RTYL+OHslvauvU3SJoKDdXvjSepyCKLyfnSPkOe/eyrePJTTNYuClnr3wvDv3TxZPVu8AiMfl84IEUnWW7xQSElRNwelTNq+YMpGwZyI= ; Message-ID: <20041203054012.74385.qmail@web90003.mail.scd.yahoo.com> Received: from [203.199.182.34] by web90003.mail.scd.yahoo.com via HTTP; Thu, 02 Dec 2004 21:40:12 PST Date: Thu, 2 Dec 2004 21:40:12 -0800 (PST) From: ram mohan Subject: Contribute - Howto To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-archive-position: 12405 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: madhaviram123@yahoo.com Precedence: bulk X-list: netdev

Hi,

I would like to contribute to the networking side of Linux. I googled a bit and found that I should join this list before going ahead. I would like to know:

1. What are the features currently being worked on?
2. Are there any to-do lists maintained?
3. How are new features selected?

How do I start?

Thanks.

From SRS0+348d0c150552644be866+467+infradead.org+hch@pentafluge.srs.infradead.org Fri Dec 3 02:34:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 02:34:11 -0800 (PST) Received: from pentafluge.infradead.org ([213.146.154.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3AY3Lv016496 for ; Fri, 3 Dec 2004 02:34:04 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.42 #1 (Red Hat Linux)) id 1CaAkt-0002pc-CX; Fri, 03 Dec 2004 10:33:39 +0000 Date: Fri, 3 Dec 2004 10:33:39 +0000 From: Christoph Hellwig To: James Ketrenos Cc: Jeff Garzik , Netdev Subject: Re: Steps for netdev-2.6 inclusion?
Message-ID: <20041203103339.GB10799@infradead.org> References: <41AE7143.80505@linux.intel.com> <41AEB3B8.2000406@pobox.com> <41AF7708.3030804@linux.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41AF7708.3030804@linux.intel.com> User-Agent: Mutt/1.4.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-archive-position: 12406 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: bulk X-list: netdev

On Thu, Dec 02, 2004 at 02:11:52PM -0600, James Ketrenos wrote:
> Jeff Garzik wrote:
>
> >It's fairly easy, just email me and netdev the patch for inclusion,
> >and it'll get reviewed.
>
> Should I break the patch into the three components (ipw2100, ipw2200,
> and ieee80211)? Or just one huge patch? Not sure what you and others
> would prefer.

Please split it into three parts. And most importantly, integrate your ieee80211 code with the hostap code in Jeff's queue; we really don't want two copies of it.

> The firmware doesn't have to be downloaded from Sourceforge, but it does
> need to exist on the system, just as iwconfig needs to exist if you want
> to be able to configure your wireless card. The user can get the
> firmware from Sourceforge, or have it installed by their distribution or
> package management system, have it on their Knoppix CD, etc.

The problem, again, is that the license for it is horrible multi-page legalese. Please change it to a simple license that allows completely free redistribution and modification of the binaries.
From hadi@cyberus.ca Fri Dec 3 05:08:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 05:08:24 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3D8J71026716 for ; Fri, 3 Dec 2004 05:08:20 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CaDAC-00085a-CW for netdev@oss.sgi.com; Fri, 03 Dec 2004 08:07:56 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CaDA7-000392-Mu; Fri, 03 Dec 2004 08:07:51 -0500 Subject: Re: [E1000-devel] Transmission limit From: jamal Reply-To: hadi@cyberus.ca To: mellia@prezzemolo.polito.it Cc: Lennert Buytenhek , Harald Welte , P@draigBrady.com, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <1101994772.18491.16.camel@mellia.lipar.polito.it> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <1101483081.24742.174.camel@mellia.lipar.polito.it> <20041127092503.GA12592@sunbeam.de.gnumonks.org> <1101718412.14930.46.camel@verza.polito.it> <20041129145028.GC18788@xi.wantstofly.org> <1101804146.11111.23.camel@mellia.lipar.polito.it> <1101903944.1042.29.camel@jzny.localdomain> <1101994772.18491.16.camel@mellia.lipar.polito.it> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102079268.1216.19.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 03 Dec 2004 08:07:49 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12407 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev

On Thu, 2004-12-02 at 08:39, Marco Mellia wrote:
> We'll be glad to spend some time trying this out. Please, we are not
> very comfortable with the Linux BitKeeper maintenance method. Can we ask
> you to provide us a patch to a standard kernel/driver (whatever you
> prefer...)? Also a complete source sub-tree would be ok ;-)

Would a -rcX patch be fine for you? I have 2.6.10-rc2 in mind, which means you would take 2.6.9, patch it with patch-2.6.10-rc2.gz from the kernel.org/v2.6/testing directory, then patch one more time with the patch I give you. Let me know if you are uncomfortable with that as well. [Sorry, I am disk poor and my stupid ISP still charges $1/MB/month even in this age if I put it up at cyberus.]

In the patch I give you I will include the rx path improvement code that I got from David Morsberger; I "think" I have seen some improvements with it but I am not 100% sure. If you repeat the test where you drop the packet right after eth_type_trans() with this patch on, I would be very interested to hear whether you see any improvements.

In any case, expect something from me this weekend or Monday (big party this weekend ;->).
cheers,
jamal

From hadi@cyberus.ca Fri Dec 3 05:25:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 05:25:24 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3DPJEI027489 for ; Fri, 3 Dec 2004 05:25:20 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CaDQe-0006g5-G5 for netdev@oss.sgi.com; Fri, 03 Dec 2004 08:24:56 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CaDQZ-0005d4-UJ; Fri, 03 Dec 2004 08:24:52 -0500 Subject: Re: [E1000-devel] Transmission limit From: jamal Reply-To: hadi@cyberus.ca To: sfeldma@pobox.com Cc: Lennert Buytenhek , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <1101967983.4782.9.camel@localhost.localdomain> References: <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102080289.1214.22.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 03 Dec 2004 08:24:49 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12408 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list:
netdev On Thu, 2004-12-02 at 01:13, Scott Feldman wrote: > On Wed, 2004-12-01 at 13:35, Lennert Buytenhek wrote: > > Pretty graph attached. From ~220B packets or so it does wire speed, but > > there's still an odd drop in performance around 256B packets (which is > > also there without your patch.) From 350B packets or so, performance is > > identical with or without your patch (wire speed.) > > Seems this is helping PCI nics but not PCI-X. I was using PCI 32/33. > Can't explain the dip around 256B. > Interesting thought. I also saw improvements with my batching patch for PCI 32/32 but nothing noticeable in PCI-X 64/66. cheers, jamal From hadi@cyberus.ca Fri Dec 3 05:46:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 05:46:09 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3Dk4Lv028316 for ; Fri, 3 Dec 2004 05:46:04 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CaDki-0002p2-SK for netdev@oss.sgi.com; Fri, 03 Dec 2004 08:45:40 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CaDkf-00080P-Ad; Fri, 03 Dec 2004 08:45:37 -0500 Subject: Re: Contribute - Howto From: jamal Reply-To: hadi@cyberus.ca To: ram mohan Cc: netdev@oss.sgi.com In-Reply-To: <20041203054012.74385.qmail@web90003.mail.scd.yahoo.com> References: <20041203054012.74385.qmail@web90003.mail.scd.yahoo.com> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102081534.1210.38.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 03 Dec 2004 08:45:34 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12409 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Just join this list 
and listen to people complaining about bugs. Chase those bugs and fix them. That's a good way to get your hands dirty. Features are added based on technical merits and needs. cheers, jamal On Fri, 2004-12-03 at 00:40, ram mohan wrote: > Hi, > I am willing to contribute to the "networking side " > of linux. I googled a bit and found that I should join > the list and then I can go ahead. > I would like to know. > 1. What are the features currently being worked upon? > 2. Are there any things-to-do lists maintained? > 3. How are new features selected? > > How do I start?? > Thanks. From mludvig@suse.cz Fri Dec 3 09:43:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 09:43:46 -0800 (PST) Received: from mail.suse.cz (styx.suse.cz [82.119.242.94]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3HhcEH031372 for ; Fri, 3 Dec 2004 09:43:39 -0800 Received: from [10.20.1.72] (ozzy.suse.cz [10.20.1.72]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.suse.cz (SUSE CR ESMTP Mailer) with ESMTP id 1F244628164; Fri, 3 Dec 2004 18:43:16 +0100 (CET) Message-ID: <41B0A5B4.6060108@suse.cz> Date: Fri, 03 Dec 2004 18:43:16 +0100 From: Michal Ludvig Organization: SuSE CR, s.r.o. 
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8a4) Gecko/20040927 X-Accept-Language: en MIME-Version: 1.0 To: Andrew Morton Cc: netdev@oss.sgi.com, Jan Kara Subject: [PATCH] rtnetlink & address family problem X-Enigmail-Version: 0.86.1.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: multipart/mixed; boundary="------------020005070601050006080808" X-archive-position: 12410 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mludvig@suse.cz Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------020005070601050006080808 Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: 7bit Hi, running 'ip -6 addr flush dev eth0' on a kernel without IPv6 support flushes *all* addresses from the interface, even those IPv4 ones, because the unsupported protocol is substituted by PF_UNSPEC. IMHO it should better return with an error EAFNOSUPPORT. Attached patch fixes it. Please apply. BTW Credits to Jan Kara for discovering and analysing this bug. Michal Ludvig -- SUSE Labs mludvig@suse.cz (+420) 296.542.396 http://www.suse.cz Personal homepage http://www.logix.cz/michal --------------020005070601050006080808 Content-Type: text/plain; name="rtnetlink-family.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="rtnetlink-family.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/03 18:06:31+01:00 michal@logix.cz # Return EAFNOSUPPORT if requested operation on unsupported family. # # Signed-off-by: Michal Ludvig # # net/core/rtnetlink.c # 2004/12/03 18:05:49+01:00 michal@logix.cz +4 -2 # Return EAFNOSUPPORT if requested operation on unsupported family. 
#
diff -Nru a/net/core/rtnetlink.c b/net/core/rtnetlink.c
--- a/net/core/rtnetlink.c	2004-12-03 18:30:33 +01:00
+++ b/net/core/rtnetlink.c	2004-12-03 18:30:33 +01:00
@@ -477,8 +477,10 @@
 	}
 
 	link_tab = rtnetlink_links[family];
-	if (link_tab == NULL)
-		link_tab = rtnetlink_links[PF_UNSPEC];
+	if (link_tab == NULL) {
+		*errp = -EAFNOSUPPORT;
+		return -1;
+	}
 	link = &link_tab[type];
 
 	sz_idx = type>>2;

--------------020005070601050006080808--
From cap@nsc.liu.se Fri Dec 3 11:02:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 11:02:30 -0800 (PST) Received: from papput.nsc.liu.se (ns2.nsc.liu.se [130.236.101.9]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3J2LSd001097 for ; Fri, 3 Dec 2004 11:02:22 -0800 Received: from mail.nsc.liu.se (mail.nsc.liu.se [130.236.101.49]) by papput.nsc.liu.se (Postfix) with ESMTP id 252E71C31F5 for ; Fri, 3 Dec 2004 20:02:00 +0100 (CET) Date: Fri, 3 Dec 2004 20:02:00 +0100 (CET) From: Peter Kjellstroem To: netdev@oss.sgi.com Subject: e1000>5.2.30 unstable with InterruptThrottleRate=0 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12411 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cap@nsc.liu.se Precedence: bulk X-list: netdev
Hello folks, Short version: 82547GI with ITR=0 on 2.4.28 (vanilla) and RHEL3u3 has problems (traffic grinds to a temporary halt under anything but trivial network traffic). The kernel prints the following and resets the IF (many times):
NETDEV WATCHDOG: eth0: transmit timed out
More verbose version with background: I have a problem with e1000 being unstable when I run it with InterruptThrottleRate=0 (abbreviated ITR in the rest of this e-mail). I need to turn ITR off or set it so large that it behaves as off. The reason for having to turn it off is that I run MPI-applications (cluster stuff) and that happens to be largely latency bound. 
Latency with the default e1000 settings is terrible, 250 us; with ITR=0 (where it works) the latency drops to 20-25 us. Enough background. Up until now I have always been able to run with ITR=0 and Intel gigabit has been very nice. Now, for some combinations of driver, chip and ITR setting it all falls apart.

Affected chips (theory: 8254X, X>1, or anything faster than PCI33): 82547GI, 82546 (said to be affected, not verified by me)
Unaffected chips: 82541 (rock solid no matter what driver or ITR)

Linux-2.4.26 vanilla (smp, without NAPI, with e1000 as module) is ok (82547, ITR=0, rock solid)
Linux-2.4.28 vanilla (smp, without NAPI, with e1000 as module) is BAD (82547 needs ITR<20000 for reasonable stability)
Linux-2.4.28 with e1000 from 2.4.26 but otherwise exactly as above is ok, rock solid!!!
Linux-2.4.21-20smp RHEL3 update 3 is BAD (known stable with default ITR (1?) but probably ok for <20000)

Conclusions: something changed after e1000 version 5.2.30 (as in linux-2.4.26); RHEL has 5.2.52 and 2.4.28 has 5.4.11. 
Some more discussion on this subject has taken place on another list; see the following thread if interested: http://lists.us.dell.com/pipermail/linux-poweredge/2004-November/023061.html Best Regards, Peter -- ------------------------------------------------------------ Peter Kjellstroem | E-mail: cap@nsc.liu.se National Supercomputer Centre | Sweden | http://www.nsc.liu.se From buytenh@wantstofly.org Fri Dec 3 12:57:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 12:57:36 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3KvSBn007253 for ; Fri, 3 Dec 2004 12:57:29 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id AF95B2B0F0; Fri, 3 Dec 2004 21:57:06 +0100 (MET) Date: Fri, 3 Dec 2004 21:57:06 +0100 From: Lennert Buytenhek To: Scott Feldman Cc: jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: [E1000-devel] Transmission limit Message-ID: <20041203205706.GC9808@xi.wantstofly.org> References: <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="98e8jtXdkpgskNou" Content-Disposition: inline In-Reply-To: <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> User-Agent: Mutt/1.4.1i X-archive-position: 12412 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev 
--98e8jtXdkpgskNou Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Tue, Nov 30, 2004 at 05:09:59PM -0800, Scott Feldman wrote: > Hey, turns out, I know some e1000 tricks that might help get the kpps > numbers up. > > My problem is I only have a P4 desktop system with a 82544 nic running > at PCI 32/33Mhz, so I can't play with the big boys. But, attached is a > rework of the Tx path to eliminate 1) Tx interrupts, and 2) Tx > descriptor write-backs. For me, I see a nice jump in kpps, but I'd like > others to try with their setups. We should be able to get to wire speed > with 60-byte packets. Attached is a graph of my numbers with and without your patch for: - An 82540 at PCI 32/33, idle 33MHz card on the same bus forcing it to 33MHz. - An 82541 at PCI 32/66. - An 82546 at PCI-X 64/100, NIC can do 133MHz but mobo only does 100MHz. All 'phi' tests were done on my box phi, a dual 2.4GHz Xeon on an Intel SE7505VB2 board (http://www.intel.com/design/servers/se7505vb2/). I've included Robert's 64/133 numbers ('sourcemage') on his dual 866MHz P3 for comparison. I didn't test all packet sizes up to 1500, just the first few hundred bytes for each. As before, the max # pps at 60B packets is strongly influenced by the per- packet overhead (which seems to be reduced by your patch for my machine quite a bit, also on 64/100, even though Robert sees no improvement on 64/133) while the slope of each curve appears to depend only on the speed of the bus the NIC is in. I.e. the 60B kpps number more-or-less determines the shape of the rest of the graph in each case. Bus speed is most likely also the reason why the 64/100 setup w/o your patch starts off slower than the 64/66 with your patch, but then eventually beats the 64/66 (around 140B packets) just before they both hit the GigE saturation point. There's no drop at 256B for the 64/100 setup like with the 32/* setups. 
Perhaps the drop at 256B is because of the PCI latency timer being set to 64 by default, and that causes the transfer on 32b to be broken up in 256-byte chunks? I'm not able to saturate gigabit on 32/33 with 1500B packets, while Jamal does. Another thing to look into. Also note that the 64/100 NIC has rather wobbly performance between 60B and ~160B bytes. This 'square wave pattern' is there both with and without your patch, perhaps something particular to the NIC. Its period appears to be 16 bytes, dropping down where packet_size mod 16 = 0, and then jumping up again a bit when packet_size mod 16 = 6. Odd. --L --98e8jtXdkpgskNou Content-Type: image/png Content-Disposition: attachment; filename="perf.png" Content-Transfer-Encoding: base64 iVBORw0KGgoAAAANSUhEUgAABkAAAASUCAIAAACwXlbgAAAACXBIWXMAAAsTAAALEwEAmpwY AAAAB3RJTUUH1AwDFDYwJtQcAwAAIABJREFUeNrs3U2O20jWNtBQw5sgd/NOuqbpVQhI9B6c 3kNDQK7CnlZPvt1ELEPfgDaLJiWKEv+C5DlIFLJoiqJCzCzrqXsvTzHGAAAAAAC5+pclAAAA ACBnAiwAAAAAsibAAgAAACBrAiwAAAAAsibAAgAAACBrAiwAAAAAsibAAgAAACBrAiwAAAAA sibAAgAAACBrAiwAAAAAsibAAgAAACBrAiwAAAAAsibAAgAAACBrAiwAAAAAsibAAgAAACBr AiwAAAAAsibAAgAAACBrAiwAAAAAsibAAgAAACBrXyzB/pRlaREAAACA4WKMOZ+eCqy9kV4B AAAAz8o8T1CBtU+Z56b7/mnf5eIXZZkOfFHt+J31M+ud9c7incU7i3cW7yxhC9UwKrCAx1KM heI+AAAAViLAAgAAACBrAixgEEVYAAAArEWABQAAAEDWBFgAAAAAZE2ABQylixAAAIBVCLAA AAAAyJoAC3iOIiwAAAAWdooxWoU9KcsyhOBtZT5FWSYXGAAAwI7kHyaowAIAAAAgawIs4DlG uQMAALAwARYAAAAAWRNgAQAAAJA1ARbwNF2EAAAALEmABQAAAEDWBFjAKxRhAQAAsBgBFgAA AABZE2ABAAAAkDUBFvAiXYQAAAAsQ4AFAAAATKOc4X9yl/7HOQIsYCRFWAAAMIdWalP+1vrX 5sZwK+uZb0t3e1mWMcZnX2brJXQPG2OUYSHAAl6XnvyPEwAAMEQ3vYq/1X8UG5q7dROlObZM 9TK7rwtuEmABAABARl6oY2o+qg6D5tty81QHnvaQoKq7j4QLARYwilHuAAAwrW4MdC8Yutl8 l4n+xsae1/VafsfuCbAAAABgG7oVTyuWJvUnTd0Crjpu6+ZuQise+mIJAAAAIH/NlGf1uGdI 5FRlWK1z7j7whT7E4fbXL3LYScQCLGCCX6BFWRroDgAA89lrjdLNPGvCl+xzym5oIQQAAICs 9aQ8tQVmtzebAW8O6rrZGNja2Hxg9zjN+yr278nRqMACJqAICwAAptIsQQqNzrt6h9boq2aP XivomW/LkFfRzL965tA3n8W7zz0n18cuf9N5W1meAAsAAJijVEr51TJvXMg7TNBCCAAAAExj 
jgREekUQYAFTqboIrQMAAACTE2ABAAAAkDUBFjAZRVgAAADMQYAFAAAAQNYEWAAAAABkTYAF TEkXIQAAAJMTYAEAAACQNQEWMD1FWAAAAExIgAVMLMVoEQAAAJiQAGufis93iwAAAADsgwAL mJ5R7gAAAExIgAUAAABA1gRYAAAAAGRNgAXMQhchAAAAUxFgAQAAAJA1ARYwF0VYAAAATEKA BQAAAEDWBFgAAAAAZE2AtU/pfCk+360D61+KuggBAAAYTYAFAAAAQNYEWMC8FGEBAAAwkgAL AAAAgKwJsAAAAADImgALmJ0uQgAAWsqyrP/Zs8NrR648e8AhDxlykNazt7aUf7p38HuvYsgr nWOferfXdijHfRx4eLVwBF8sAQAAAFtRlmWMccif9u/ZfeDNQ3UP2HPY/p3r7x+e1ZBXseQ+ zT/q2ad6vS5RZqICC1iIIiwAAMYbnkk193xYOtQ6bDNsqst/Wlumde+Yw1/vVOu2zHHgWSqw gCXoIgQAoKlVi9SMb7rBU2u3ZzOUuhiq/2Tme5k3T6n1up7NxVYPkuqz7S+Iu/mqW2VfzeM0 l6LeZ2DlGvsmwAIAAGB93Ua27jetlKeVj9Q7tw7YcvNRL+gep3l690q6wrB2vG6sc++cH/ZU 9r/MbnDWfa7mCKrumQ9sTmwerZlhDXnfIQiwdiydL8XnezpfLAW5XJMxFmWZ/BcIAIBhHoYX N3d4OLJqvma6e7FO8/v+R917FUOmbvWfZP/sqnqfm2d4cxmn6qPsHmfa0Kos9tYIEtNBP1IJ sAAAANiz5h0Al6/oGfKkr53hugVKkzz1zSqtic8z+T/oO2GIOwAAAOsrf5s2lIm/hScHwN+b 3T4yY7pZcPTCGfY07g157FP7Dz/augdh31RgAcvRRQgAwD1DspuHEVLPFKqHhw1/9tB1++Zu dtI9fPbWfPohZzjwVXQn3/fcMLF1Pv37DHmznjryvTe6Oa9dhkW/k4loO9P8hWsGFhkSYAEA cPODjA+ni62k1ebmVRHyvtWjFkJgUVURlnUAAIA5DAkgpFdskQALAACAlYlUgH4CLAAAAACy JsAClqaLEAAAgKcIsPYsnS/F57t1AAAAADZNgDVK+VtrY3e3SbbAbijCAgAAYDgB1uuqO49W 6rCp2tjMnqbaAgAAAHBMAqwpValTCKHOnqbaAgAAAHBYAixgHboIAQAAGOiLJXhZszyqKpjK R31i129v3Rqu3M4WAAAAmNXWG7wEWKPe+zoJan4PDFQVYSU/OwAAAPQSYO3TP2na57tkDQAA AA6uPxzIvz7LDKyJr4ZpZ7cr7AIAAAA4yUfGuDkDq5s6TbVl+Ck1H1h8vqfzxZtFtnQRAgAA rKsbJuRGC+EoN9/a7saptgAAAAAckBZCYGXVKHfrAAAAwD0CLAAAAACyJsACAAAAIGsCLGB9 uggBAADoIcACAAAAIGsCLCALirAAAAC4R4AFAAAAQNYEWAAAAABkTYC1T2VR1l/pfCk+360J +dNFCADAap+hyrL+Z88Orx258uwBhzxkyEFaz97aUv7p3sHvvYohr3T8Pg/fHY7giyXYpZji Pz/qRXkNb9YEAABgcmVZxhiH/Gn/nt0H3jxU94A9h+3fuf7+4VkNeRVL7sMxqcDav2aYBZlT hAUAwMY+cA1OWJp7PixBah22GTbV5UitLdO6d8ypEiXJFM8SYAEAAHB0rVqknq66bmPds8/1 sLBovnCnp1ar9bqePYdZA6mBlWLsmxZCAAAAaOs2snW/aVU/tcKseufWAVtuPuoF3eM0T+9e SdfN19Vz8NaL6h78YU9l/8ucqaCMHRBgAXmpugiT/7sCAEBmHkZLN3d4OLJqvqa8e/Okmt/3 P+reqxgydav/JHv2bO4zflnKstjddZiO+QMowAIAAIDlNO8AuHxb3JAnfe0M85y5fti4Z3/M 
wAKyY5Q7AACrq2dgTRvKxN/CkwPg781uH5kxdaucXjvDnrsKDnnsU/tzTCqwAAAAoG1IdvMw QuqZQvXwsPU3zcSqeZCbDYkPn72ZEPXPyXr2VbSO3F2fnvPp3wdCCCcXxM40f8E1FZ/v6Xyx PmyFMVgAAKz7wcqH5cVW0mpn8k6FvENDLYRAjnQRAgDADgwJRKRXDCHAAgAAgD+IVCA3Aiwg U4qwAAAAqAiwAAAAAMiaAAsAAACArAmwgHzpIgQAACAIsAAAAADInAALyJoiLAAAAARYR3H6 +Fl8vlsHAAAAYHMEWEDuFGEBAAAcnAALAAAAgKwJsIBtUIQFAABwWAIsYANSjBYBAADgsARY AAAAAGRNgHUUMSlgYduMcgcAADgsARYAAAAAWRNgAZuhCAsAAOCYBFgAAAAAZE2AdSCnj5/F 57t1YNMUYQEAAByQAAsAAACArAmwAAAAAMiaAAvYGF2EAAAARyPAAgAAACBrXywBsDlVEVaK 0VIAAOxGWZYxxuqfPTu8cNjmv1ZHqDf2HLC7zySPunk+N1/gw9f72vm8tk+928Pj3Nzhtfdu +LXBEQiwAAAA2LD+XKMVCbX2v/fY7j4TPmpIClM+Gprx2vm8tk/zj3r2qV6vC5KZaCEENskk LAAAKgOrcl6u3+k+ao46oGb0M0ep0VQHVAPFWlRgAQAAsL5WdVIz0OlWUbV2e5iqNCOhJSOY noKm1qtoli8NOcPVg6SnmhNbr7pV9tU8TnMpWm+Z7OzgBFjHks6X4vM9nS+Wgn0wCQsAYMe6 rW3db1ptaz0Tr7plTc2j3XtUeGaaVU8IdbNTb3il1fDJWfceOLzvr/tc9QKGP2Om/hO4l1K1 miuHvMsQBFjH+u2fYlmU1/BmKdgHXYQAAIf7UPMozri5w82o5WFNVs/sqmpL91H3jt//XENi mpcH2w+Zb/WwnO3mAPWpxl11jzNtaFWUxe4+B6Vj/vgLsAAAADiWJUt7hjxXHeI8dWLrFihN 8tQ3q7Smddi4Z38McQc2TBEWAMCOlb+NyUq6sci66dXNgqO6nmt8ejUkBuppupxkhdc6CPum AgsAAIAcDUlz7s23ah6kZ5pVuN8/2HrUkCHrQ55ryHGaBVlDdgt3pn31nE//PkPemqeOfO9t bc5rl2HR72Qi2s48+AVXlNePN0Pc2Rmj3AEAdvnRxsfVmdbN2nLzqgh53+pRCyEAAADsxJAA QnrFFgmwgM0zCQsAYH+ELECTAAsAAACArAmwDiedL8Xnu3VgfxRhAQAA7JUAC9gDQ9wBAAB2 TIAFAAAAQNYEWMBOGOUOAACwVwIsAAAAALImwAL2QxEWAADALgmwAAAAAMiaAOtYYoploT6F PVOEBQAAsD8CLAAAAACyJsACdkgRFgAAwJ4IsI4onS/F57t1YLdXeIwWAQAAYE8EWAAAAABk TYAF7JBR7gAAAHsiwDoutyMEAAAANkGAdUSiK45AERYAAMBuCLAOJ6YYkxHXAAAAwGYIsIDd UoQFAACwDwIsAAAAALImwDqumKJhWOyeIiwAAIAdEGAd9VP9+VJ8vlsHAAAAIH8CLGD/FGEB AABs2hdLAOybLkIAAF5QlmWMsfpnzw4vHLb5r9UR6o09B+zuM8mjbp7PzRf48PW+dj4Dz7n/ veAIBFgAAAAwmf6cpRUJtfa/99juPhM+akgqVD76X8Kvnc9r+3BMWgiB/VOEBQDAYgZmLi+n M91HzZHyNBOrOYIkyRTPEmAd2vXjzSIAAAB0taqTyobmbq0t3R1uakZCS0Y5Pe2QPYnVkDOc 9VUMrxRjx7QQHpcbEXKsCz7GoiyT/+YBAPCqbmtb95uqTa9+SM/Eq25ZU/No9x4Vnplm1TyZ e89181U8NHxy1r0H3tuntYBQE2AdXfH5ns4X6wAAAPCsh3HPzR1uznh6WJPVM7uq2tJ91L3j 
9z/XkBjr5cH2Q+ZbdRO6Mcqi2NtVl9Ixf9wEWId2+vipi5DjUIQFAECelpxWPuS56uToqRPL c+b6YeOe/TED69BiiqePnxoJAQAAHqpnYI2JabpVReumV93zib+FZ8ZO9dxV8Kk10T/IPSqw gGNRhAUAwGuGpDn35ls1D9IzzSrc7x9sPapnvtVTzzXkOM2CrCG7hTvTvnrOp38fCCGcXBA7 0/8L5cb+RRlTNAmL4xBgAQDw2kctH59nWjdrm8k7FfIODbUQAodTKEsGAIBFDAlEpFcMIcAi hBDS+WISFke52v3XEQCA5wlZYF1mYLG+svijHCYm/2FgdhoJAQAANkSAxS9VEdZTk7BawdPL monVVMeEvqs9Rl2EAAAAGyLAYhTVUgAAAMDczMDiH09NwqpuX2jR2OrVrggLAABgOwRY5CWm qIsQAAAAaBJgHV0rMBpYhCVjot8mbmqpCAsAAGArBFi8SP8gAAAAsAwBFm1PTcKCexRhAQAA MBUBFk8zvh0AAABY0hdLQOjOtPoI149w+vhpZXhB8fm+oTq+qggrRZksAABAvgRY3Jtm9a7M ijGqDCudL5YCAACAkbQQctuKFTStGyPC7Fe7SVgAAAB5U4F1IEVZdD63pwcPUUEDAAAArE0F 1v4VZVF9hRBSTPXXwweKrhhpc5OwvGUAAAB5UoG1Z3VoNeog2yzCajYhGuYFAAAAmybA2qdJ oquwqQqaWh1d1bnVyxO1JF/TXI0byUDdjhAAACBbAqx9GtQhGFNRFg/33OK95FrB08s5VCv5 kmcN0bpatpiBAgAAkBszsHhs+Qzi5RsRTnv7wphi/eUyOMSlbhIWAABAlgRYDLWVOhphU24U YQEAADCSAItBqqYwMQT7v9QVYQEAAORHgMXgD/YbvBchAAAAsAOGuPOEzAe6TzsAq6Uay6U/ cZdXTvts3Y4QAGCVv8+XZYyx+mfPDi8c9o+/2MfY3NjzXK1H3TxOzwPrHVpbeo7TeoEPX+/D VzHhPvVuD49zc4fX3rvh1wZHIMBiV0mEgAkAAI6mP9doRUKt/e89tvuo7pb+M+lmLvX3Q1KY p+K8e7tNtU/zj3r26cZ8MCEthDzNTG7u2VCZ1ePX4j+9AAAbMbAq5+UCrtajZqoDakY/zahr qkhoqnNWA8VaBFiHlmIqyuKVB86fYVUte96j/Vxsm8o99Q8CACyvVZ1UNjR3a23p7nBTM3Ua HsGMT696CppuJlbjj7yYm+9O/z719913sLlPd32GV66xY1oIeV1W5TbSLqa/wk3CAgBYVbe1 rftNq0apZ+JVTxo18FGtLd1HNU+mJ/ka0o7X/rwzeHLWvQcO7/u7N/+r2Ur5QnNi82jN7sgh 7zIEARYvqwpqssqwDMBiyis8Rl2EAAC5eRhn9A+0uhe7DHxUa0v3UfeO3/9cQ2KalwfbD5lv 1YqZurvdnMY1VW9j9zjThlb/+89fO/sp+Pd//z7mj78AixGf8OdvCttTXdUkr2XTId1rdwBY NyRVhAUAsEtLlvYMea5mA92YDsclTfLUN6u0pnXYuGd/BFiMMvKmhDfzr/po+6uoGvmKyqIs i/JQhWbVFbJWhqUICwBgXQ/b3wYe5IU5VlPNbh9ynJulXk/dhXDMqxufgk2So7l9IQ8JsBj9 IX9cHVYrmMitLbH9W3XV/CjzwfaTv3HVdbX69HdFWAAAq/0FeMBfw+7Nt2oepBuEdQeEP3xU z3yrp55ryHHCna69m6+9eajWavScT/8+Q96ap458722t95nwfovs1clEtJ2517R8/8N5kWLa dw4y5fK+GmBNlXxVAVaeRVgD37jhu4VGvrniVSHAAgBY66ONj6szrZu1ZXyYsLx/eZPI0LoV Nzk7VP9gPjmmRkIAADbzkWFAACG9YosEWEeXYirKIq9TyrX8Kh9bn23/MKDs1lut2Eio/AoA 
YBVCFqBJgEWmFGHd/Q95iuH3QPfqa1vn/zCgzLOHVBEWAADAigRY5EgRVr+YYv21s5fWE1wq wgIAADgsARYhhFCURfWV11ntpQhr3XsXZuheFNUa3J7bRaUICwAAYC0CLEKKqfrK66wUYW3K VH1//QcZWYQ15rGKsAAAAFYkwKL5ET0pwtqc68fb1me61+/1ArVXYyMwRVgAAABrEGCRrwyL sGKKWaVF2w34WkHS8PTqtQRqkuZERVgAAABrEWCRu0wymmyjonS+XD/ejvYWv/CQqfJQRVgA AADLE2DR/XyeURdhJkVYY9Kr+Sa4z91zt+TaPvVCnn3VrYUa00WoCAsAAGAVXywBf34+z24M Vlg7qamfPavAaKqTmaoj8hqergKrg6QXXkj12CEPnONdK8pSkgUAALAkARa5Gzl1e6Sb8Ucd +nRLq5aZkNVakNPHz+vH69Vq4wvExrzquQe3T378FKMuQgAAgIVpIWQbVsmwbraexRTrr1Zw U3ULdr/mOLesmgevH2+vnc+YV/Ew2ez50/GpqAwLAABgSQIsOp/t8+siXCWsGRJwNDOsxe5O OGFV0YTzubK6OePcV47+QQAAgIUJsLitKIvmV88Owx8y9pQWLMLqmc3UOo1mhjVTsdXDRYgp nj5+ZnufxPn0FFItMLBMERYAAMBizMDiVi4QU+ezenFvn/qPmo+aPMBachJWT3p18zSqDGuB 9Ko+B5fow3fw4SoNHwN/52fEJCwAAIDlqMBi4Mf11Prq/tECp7FAhvXaffGWSa+yug1iLldm J1JccpVkWAAAAMsQYDGLOQZpLZBKDEmv1ror4sMnXfd2jSGPMVhLroBJWAAAAIsRYLExcycU OZc4TXtu07Y9LtZBeXNZqqvi2eq5SSI/RVgAAAALEGCxJbOmS8Nbz5avdcq5eTCTc3ut93Ps BakICwAAYBECLLZnjvAo84Ro4J6rdxGupXrvXnsHFWEBAADkT4DFXOYYgxXmKbHJfzi62e0z LdH4hVWEBQAAsIAvloAtejlyullu88KhqlqnZebKD3mWmOK0M61ekMM5rHZBlqUkCwAAYD4C LLanCo9e7vzaUDXTCxnZwGRt1qSpuh3hhpKskVlkilEXIQAAwKwEWGxSDiHUYkVY21JFY1WG tZVraZLBYYqwAAD29jfbsowxVv/s2eGFwzb/tTpCvbHnuVqPunmcngfWO7S29Byn9QIfvt6H r+LlfR6+FxyBAGvUr7ObP+fdH6qptmxONQYrxTTgw39R7e+6+rUgU9xTb/kU6Zj9g0ERFgAA gz/KtSKh1v73Htt9VHdL/5l0M6D6+yEfPJ+K8+7tNtU+HJMAa8Rn9Vu/L+79Uhi/Za+a0dXm YqzJi7DqUqAXjlmPoKrO6vTxc4E4aR81aJO8j1WGpQgLAICBH+JeLuBapvShGYo1o66pnk4y xbMEWNP8YNe1l62f6qm2bHp9emKpVn3WTDcuzH19Gv1rIzOU5hj1wxZDAQDAJFrVSc1Ap6ce qlse1f8pMjwT5YxPr/o7DbsVFeOPPMd7wTEJsJhXXVpVJ1MbLbaa3Jhiq77f7CmWRXkNbyv/ x36ROxLWPZL1E3W3LHu1K8ICANizbmvbvWqGf/7Ken/iVU8aNfBRrS3dRzVPpif56r6Kx38V Hzw5694D7+3zbHDGcQiwRn+EzrJCqv8HfvkTbgZV/dHV8LFZmXih+2yS+VZ972+K6bz55r4h A7x6cqtn74RoJD8AAK/83fvRZ6v+gVb35j0NfFRrS/dR947f/1xDPuS+PNh+yHyrm1VvLyv+ 397isPR/L36i33oyKMBi8R+2Y09ql5I88beBwfFTd8+17oSoCAsAgBcsWRgx5LlaU54zfBVP /BX9//zlfCcEWJv5LfPcJ3+fn5f8hTiseGfuwivaPwWLtDECAHCoD4DjP3C9NsdqqtntQ45z 
s9TrqbsQjnl17jy4VlCQf32WAGviq2Ha2e1+dHdj+fRKT9xaK6YICwDgmJ//mylAf39cNwi7 NyG+51E9862eeq4hxwnD7jDWfRWt1eg5n/59IAiw5viN1vqpnmoLmevJPvYaJM33ulROAQCw iU9/3e+HfIi7V/H01KMmfK6Hs6iGPF1P7nZzoZ7aB0II/7IEU/3CeurXymtbyFZPjqMMas2f 0FUnYVl/AAB4IWWDmwRYZKe6EeE+Xsu66VXVE+eKWmvFZFgAAHsiZIF1CbBgGq3so/h8V3t1 6OvB328AAACmI8CC6bnhYGWV9r2WtboIf10JirAAAACmIMCCyVRFWFXhVSbp1YpdhBudwj7h iinCAgAAmIoACyam8Ip/LgbT3AEAAKYgwIIpHSe9mm/CV1mU01Zv1V2EZVHm0NUIAADAswRY 5GhPNyJcfzHdizCEMEMuNvhiVoQFAAAwlgAL2L+Y4vD0SuQHAACQGwEW7J9EZuX1V4QFAAAw jgALAAAAgKwJsOAQFGFV6oHuS6+/IiwAAIARBFjA0+a7BSEAAAB0CbDIlBsRTr+kixdhvVDu tNa9Ahe5pBVhAQAAvEiABceikXCI+cI+GRYAAMALBFhkrSiL6stSTGL1vr+yKKuvg65/jC5C AACAF3yxBGT8aT9V3wiwplzV82X5CVZ1YlW1B9YZVv2ve13t5susFWUpyQIAAHiKAIttKMqi zrPYlu5Yq5sx1oZezvCdmwld9b1JWAAAAC8QYLEBBrpPvJ4LFmH1hFPbHdb+7Jm3C82u4XpS hAUAAPAEM7DgoF4eUr58B+IqJp/jHlOsvkIIp6sLEAAA4AkCLDiiIyRQc5hkXFeVYWkkBAAA GE6AxWboIpzW5BVGWxFTHJNDTdX5qAgLAABgOAEW22CCO3uiCAsAAOAphrjDcS05zZ2uk2nu AAArKcsyxlj9c2evq/qm9boevtLmDuWf/5+1WqjWloHP3trSc5z6BG7us9f3i6cIsNi/uvdQ Gdft9ZFhrWFkJyMAALS0QqjhWU9/rnRzy5Bnv3k+D89qyHNxTAIstqQoi1YINWQwVv2Qamcx 1h+Lc9RJWMMXZ750L6Z4upaKsAAAmP6vmoNjoDpsuvdH/VsmMfDZOTgBFttJE2JqxlUvpFGt I/BrWZ6JaY5crjVTwdTpGmSqAAAL61YDDWl/u1lh1KpRGthG1y1Q6n9UuN8e+OAvsffzqf6j jU+v+jsNu4vZ81wDq7fYNwEWGzO+H7BbxgVD/xuc4uQHLIuyKBVhAQCsaWD7W88D7z3qXhtd 608fPurmlj/+YvnnAKmHY6qG7NaN2HomVTVPqSf5aq3YwGeHIMBiW8YHT4qwbi+Lae7rqRoJ ZaoAAGv+laxRSPVakVH3Ufdmot/0wj43z7Mn9up51MPjdLd0j3Mv8mt+3/+onmcfY38jUw77 wU2ABRzod32+S60ICwBgVQ/LgkYeuUfPHQCfOs69g3efZTFDnnTuM/T/6XdDgAX8+rV+M646 +K/7ZWrTrh9vIfx0EQIArKXb3Ne/81PH7Nly78j9U6uGBz0v35RwktntQ47zVGshByfA4oiM wbrpOFlVNXlq8oFWo3x7U4QFALDa3w8785t6tvSkS0OOc/P77pFbNVk9E6Z6nn2IZgFUuDPN 6uVnby5U/5yska+CIzi5IHamdUMHbhJgMTzAqiqwqlsQzpF5FZ/vp4+f14+38P1nCEGGBQDA wBHycKgwQQUWwCDzVWzFFE8f5TX8yrAAADj63zxVIUGHAAsgVAVWoZFS1Vuu4W2x00gxaiQE ACDIraBDgMVB6SKkjqjCrdyqUW/1Xny+h4/Zz+feHH0AAAAEWBxRiqkoC+twZPf6Abvbq1xp pv7BG7c4NM0dAACg41+WgMManmEVZSHwYgHHuREkAADAU1RgcVBV/+DNWKpuLaz/VMUWy12Z 
50sRgiIsAACAJgEWh9Ydg9UstjIkixXJsAAAAGoCLPiD0IrF3BiAVV2E50sRQvj+0xIBAABU zMACeGCB+wPGFJuL9oiaAAAgAElEQVR3RQzh1zR3iw8AABAEWDCcMVgsxjR3AACAJi2EMIg5 7ix9yZnmDgAA8JsKLIAV3BuABQAAQJcKLIBMKcICAACoqMACGGTuOe53meYOAAAcngAL4LG1 2v20GQIAAAQthADLe2oAlkZCAAAAFVgAWYgploVWQQAAgBtUYAHkThEWAABwcCqwAAAAAMia AAueUJSFRTisdL5MciPCpwZgNZ/d7QgBAIDD0kIIQ6WYBFjMrW8M1ke4hjeNhAAAwAEJsABy EdPDZOrdKgEAAAckwAJYzmv9gzXT3AEAgGMyAwtga769WQMAAOBQBFgAQ001x33kOYT+UVkA AAC7I8AC2Jh0vlw/3JEQAAA4EAEWwEJGDsBqOV2tKAAAcBQCLIDtUYQFAAAcigALYLNMcwcA AI5BgAXPKcrCIpAD09wBAIDjEGDBE1JMFuHo18CrNyKcdgBWfTLXD0VYAADA/gmwALbttUAN AABgQwRYABtWVXWZ5g4AAOzbF0uwY2VjWlPU+zapahKWjkJykM4Xg9kAAIB9E2DtUxVd1aFV ae74pIqySDGZ5s4T18wMA7DaP/VFGVO01AAAwC5pIdynGJOSq5mkmKrCKxnWca+BV+e4z3pK 1483dyQEAAD2SoAFsBMyLAAAYK8EWIcQY9JFOAdFWGR0Nc7coggAALAiARbAvBYYgFXRSAgA AOyVAAtgV66nIMMCAAB2xl0IYZSqizAZmU8mF+T5UoQQwk9LAQAA7IkKLICnZXgjwua5aSQE AAB2RoAFY3VHuRdlUX1ZHBYbgAUAALBjWghhGs24qu4oHJ5haULc5JuedRFWOH2UMUVvEwAA sAMCrKOIMZVlEaUk87gXPw2MpapyLRnWxt707OuqrqdwusqwAACAPRBgwfq6TYgw9qIyzR0A ANgRM7AgFzIspmWaOwAAsBsCLMiC/kEAAAC4RwshZGTgJKxWrZbwi3tMcwcAAPZBgAW5GDgJ qxVyaTzkIdPcAQCArdNCCHnpD6S6JVoGwNMvnS/h25t1AAAANk2AdSAxplLSkbcqnCrKovpq /enABkNoX1emuQMAABunhRDy0moPrP9VegUAAMBhqcCCfNXtgQ/TK12EPLiWFGEBAABbJsCC rFUZVn96pTKLga6nIMMCAAC2SAsh5E4+xTQX0vlShBDCT0sBAABsjgosgKPQSAgAAGyUAAv2 oJ6WBQAAAPsjwDqWGFMp5tivoiyqL0vBPYqwAACALRJgwU6kmKqv4KaEPGKaOwAAsC2GuMPe aCfkwRVimjsAALA1KrCOqCyL6stS7JgMix4aCQEAgG1RgXU4MabqGwHWjinCAgAAYE9UYB2X ge67J8OihyIsAABgQ1RgwT5VRVhjMqz0u1iP3V4k58v1I5w+ypii1QAAAHImwILdGplAVeGX GAsAAIDVaSFkAloRd0l0dYh3WSMhAACwBSqwDq0agxVH5BRVdFWP06oONTDPao2Tj+KS/FR9 iJKs3bt+vKWzZQAAAPIlwOJ1zfCrGV0NjKLqnOuFxwJTSedL8fleFiZhAQAA+RJg8aKbpVtP xU+tnesYS4aVFUVYh3iXTXMHAADyZgYWrxAzwf5cP94sAgAAkCcVWEdXj8F62MHXnGw1X3o1 fiwXk+sWYRWNi6HeXvw5+0zR1sbeZY2EAABAxgRYhNCoqLoXYwmVuJdP1dvvJVxsRdVIGL6X KcqwAACAvAiw+COuat1G8DhzqZapL9uunnIqlVZ7800jIQAAkJ1T9H/a96UsyxDCJG9rK8Za 9lW8HpmVf9b+9B+n+xolWZOoKrBkW5t87z7fTx8/NRICAIAwISsqsLhru/FN88xvNkX2ZHPd 
xwZJ1vOqsVnWYZPvnTsSAgAA+RFgsSvd0q1WU2Rz40OtsiwxFsf6aTLQHQAAyIYAixxNfi/C kYeq79IIR/C7COunpQAAADLxL0sAQEs6X64fb2VRWgoAACAHKrDYgIENgLPeLXF8UVj34TcP aIo8+biewumqkRAAAFifAItMNbv2ho9gz1wzsapeRSvD6v/XzSnKwo0ItyudL0UIIWgkBAAA 1qeFkKzFmG7eKLD6KstiK6OpbkZR94K5exu3NYdLdLUDGgkBAIBMCLDI18Pio2aMtZVipbrw qn4JzS03b6FY/Wnrga0DAgAAwI4JsNi8m1VaMz3RyLSoeZ4PC7KaT1oXnbX+VHrF3BRhAQAA ORBgsRMbmhXVnUnfXz52r9NQesUyZFgAAMDqBFgwu2Y+VX1zr1XwodYM+LqJ0iKzxJUswwIA AFYiwIKljS8WuzkwK0+FcG0XqiIs6wAAAKxFgAVPyKHc6WYN10xGvlg3ItwTjYQAAMCKBFiw PYulV93A7makpYfxOK4njYQAAMAKvlgCeFYzr7k3Yb3+0/4B7dm+ouZp1983I63WNK4edReh gqytS+dLEUIIPy0FAACwMAEWPKeZRpVlUac5N+dS5V+aVJ9hdzx8vb3aUsdY1ZY6zOoGXk11 aGUY1j6k8+X6EU4fZUzRagAAAIsRYMHr6uzmXnyzTO3VyDqvh7dE7B6/P7S6KcVUlIUiLAAA AF4gwIKx4mZDmeEJ1IrxHLlRhAUAACzPEHdgAkPuz1gVYVmrHXBHQgAAYGECLGBlRVkItgAA AOghwIKDmvX2iPeqseoirCq0qr5eGIwl8FqdIiwAAGBJZmAB0xjSRVip4icD3beuGoaVzlYC AACYnQAL9mDkjQinPZNwvwIr3MmtXrhHYVEWp1tPzcLKwjR3AABgdloI4YhmSrvqYw6vxnpB HV3FmOqv0EnNSm2G89NICAAALEOABbtVlsV2Q5x7U66qV1QlVs19Wnmc9GoxMiwAAGABWgj3 qSyKmFL9/b3d6n3YgVYXYf19HeU0/3UT3XbdpsJfL+r3CK1uyNVcgeaCNPc0e2sO14+304de QgAAYC4CrH2KKdW51b2UqifYYqvv++/IppXj/HrHs4+u6sSq+qZ5j8IhJx+rSKssTo09WwPj 3b5wDul8KT7fr6dwusqwAACAWQiwdkt11UHf9/vT3DOvuupWVNVb7rUN1qPf643VbKxrCKEs UkzX37u5MGZ/+86XIoTr6acMCwAAmIMZWMfVrNJiV+9sNnnN+EFUVThVj2mvNzb3qTKs6qva rfq+2n5yQSzp25s1AAAA5iDAAmbxco7WHH1V/llg1VVlVTef92bhVf/RGCOdLyGE6ykY6A4A AExOCyGQqXu9kE/p6alkclUjYQg/LQUAADAtFVhALprlUc3USdnUlt7E8+X68aYICwAAmJYA 69CMwSJDPa1/ze7CJ67zmEr517JkWAAAwLQEWABMqRqGFYJhWAAAwGQEWMBcXih9qgqsbj4q jZhjpQhrYVUjoXUAAACmYog7kKN4p5FwzAHnnuY+PCM7wlD5dL5cP8Lpo4wpup4BAICRBFgj P6+Wvz+OxubG5r9OuGWWmCClsihico829q/KsK5z/TYYmo4dqhbs+vEmwwIAAMbTQjjm82oZ f6uTrGpj/a8TboEtqjKjrCKbkfc0rF5O/VVvPEJR1dNL/XsYFgAAwEgCrKk+pcfQqJmqs6ep tsytLKo7vBkSxBw/HSnDGGv8K2q+rqfSq0MN5KqGYZnmDgAAjKSFkFD3DwqwmPEy+3M6eyvx 2W4Fk8KrIa4fb+F7maJGQgAA4EUCrFFuzsDK6sTufOS+fbbmYTG3OsaqkqybtUizVieNHIOl VfAF6XwpPt9P13A9ybAAACDToCB/AqyRH4ZjfR1EH8xg6A/OPzFWnQc1vxcS7cyvOxJef3pj 
AQCA1wiw9hoQvJ6mKcJiqas0Hfm1H62Y61eG5Y6EAACQZVCQf32WIe4TXw3Tzm5fvrBLdAXM 5/rxVri5KgAA8DwB1uuqmKlSx0zVxmbqNNUW2L3j3JvvmNL5EkI4XYMMCwAAeJYAa5T4W2tj d7dJtiz0olJyO0LW+GlarvovxVQ8H5btOF+rx+rPvvLny/Xj7XR1vQMAAM8xA4vpPgP/GXvp RuSVq2iRGOXlGxFOmLI1x2AtNg/r3vLevB3kfK4fb4ZhAQAATxFgcevT7O8irGYI1cynWuFU /87w3OU3c5SzcFjTb6oz6R6ntYzVDj1ru9hc+XS+FJ/vIYSiLJMuaQAAYBgBFnc+zaZUFkV1 R8J7+dTNP2o+XBEWz111y0ZLRVmkte8DWFdgTXWof35I/zxmVnc8/HVHwuvP60mGBQAADCLA 4v7n4ZRCCDdzqJ4/gk2ooqtqEtZaMdasoVJWidWN9f+dYUW/QgAAgAEMcefRx+D7EZX0iumv t2XzjJRHfDKy9Gy70+WvH29l4Y6EAADAYwIsZmQSFvkbeEfCxeasv2aSc1u4hTOdL79/Uciw AACABwRYzPaJWn0W0CudL9ePN+sAAAA8JMACDqdbatRfhLXdHr38VRmWIiwAAKCfAAs4utS5 f1+dWNXfR8PG5yTDAgAA+rkLITOKKVV3KmwOw+q2FrZGZek9ZC3NrKrOsA4VXVW1aUuP0j9f is/3EEJZlDFF1yEAANAlwGL+UKAoQiOW6k52byZW5r6z3JX5Z2Pg9c/gZiu51RztjcsXnaXz 5foRTh8/XZYAAMBNAizm1S2nUmBFFldmN50pi/5JWOlRoFOURXo19BlT+jRt0lQXoC0fY10/ 3k4firAAAIAbBFjkpe46tBQsrD97KsqiP58q9jXovdVHucT6ny/F53tMUSMhAADQZYg7wGMP 460UU4qpcL/CMYvcGIZlNQAAgCYBFsBQN/OpMZ2DtKTz5frxZh0AAIAWARbAIN2UqttXuLMi rGoy1/LPe/14U4QFAAA0CbAAnlDlU3V0tW7tVbnHjsV0voQQqmFYrjcAAKBiiDvZMcedbFUF Vv09g9U+EwZbN1Oq+uaAcY/di7+GYX0EA90BAICKAAvgCfOVXFX9es1AqoqubkZUm6u9unfC 3VdX7Xn9Fq4fb6fvP4L5YgAAgAALYHLPFmHd3Lknvaq3byLG6o/hmn/a/D6FS/H53g31AACA YxJgAcyiO8395hj47gP7o6um/JOd/vipzq1uvuRfjYThhwwLAAAQYJEjY7DYupvlV1VcVf1R 9/tKbmHNAjVQPQdP58s1fD19/1GUZYqGYQEAwHEJsAAW0o2ualUX4c7qjCZMvk7hei1PMiwA ADisf1kCgCWlmFrpVdIf17Nc58v121frAAAAByfAAiBrVYZ1CteiLK0GAAAckwALgA24fvt6 ugYZFgAAHJMAi0xVc9ytA2zUtKPf0/kSQginqwwLAACOyRB3APoscCPCIX7dkfB0DeHkTQEA gKNRgUXWyqKoviwFu1fdnZB+hmEBAMAxCbDIV0zV7drcoI392+6NCMuyaH7Nu0pVI2EIGgkB AOBotBCyAdU8LEkWrKgOp5q9hN3Wwpu7TehXI2EIp2u4nsoUo7cGAACOQIAFwAOt0CrcH4xV bZm1FCudL9ePcPr4GUIoShkWAAAcghZCNsMkLMhBjOnhWPcFJr5fP95OV+8GAAAchQCLjXxm 1j8IWf1Irjq06/T9R/idYRmGBQAARyDAAsiFGxEOVw10jynKsAAA4AgEWGxGNcrdOrBX270R 4Wordr4Un+8yLAAAOAJD3NmYmxlW3WDY/VO9h7BjvzKsczxdSwEgAADsmACLLbmXRtW5VXeH sihkWGxI3UWoIKvvV0FjinydYZVFGZM7EgIAwD5pIWQXn2ZTqr5u/pHGQ7Yi/b6Ug3lYTyo+ 
--98e8jtXdkpgskNou--

From jos@xos037.xos.nl Fri Dec 3 15:26:08 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 15:26:14 -0800 (PST)
Received: from simba.xos.nl (simba.xos.nl [212.26.207.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB3NQ6i3010934 for ; Fri, 3 Dec 2004 15:26:07 -0800
Received: from
    xos037.xos.nl (xos037.xos.nl [212.26.207.37]) by simba.xos.nl (8.12.8/8.12.8) with ESMTP id iB3NPdmA026687 for ; Sat, 4 Dec 2004 00:25:40 +0100
Received: from xos037.xos.nl (jos@localhost) by xos037.xos.nl (8.11.0/8.11.0) with ESMTP id iB3NPdG04693 for ; Sat, 4 Dec 2004 00:25:39 +0100
Message-Id: <200412032325.iB3NPdG04693@xos037.xos.nl>
To: netdev@oss.sgi.com
Subject: e1000 driver problem with Intel Pro/1000 MT adapter
Date: Sat, 04 Dec 2004 00:25:39 +0100
From: Jos Vos
X-archive-position: 12413
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jos@xos.nl
Precedence: bulk
X-list: netdev

Hello,

I have a problem with an Intel Pro/1000 MT 4-port card in a Supermicro
(non-HT) Pentium 4 system using RHEL3 (2.4.21 kernel "the RH way") with
the e1000 driver (I tried both the version supplied by RH and the newest
5.5.4 driver):

(1) With a non-SMP kernel, the whole system freezes instantaneously
    when configuring one of the four interfaces.

(2) After disabling the USB controller, all interfaces can be
    configured, but only one of them actually works. Via the others,
    I can send packets, but do not receive incoming packets
    (possible IRQ problem).

(3) Using an SMP kernel, two of the four interfaces can be configured
    and work correctly. For the other two ports, ifconfig gives the
    error "SIOCSIFFLAGS: Invalid argument". Stracing ifconfig says:

      socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
      ...
      ioctl(4, SIOCSIFADDR, 0xbfffc210)       = 0
      ioctl(4, SIOCGIFFLAGS, 0xbfffc140)      = 0
      ioctl(4, SIOCSIFFLAGS, 0xbfffc140)      = -1 EINVAL (Invalid argument)

    Stracing ifconfig for an interface that works shows similar lines,
    but then the second SIOCSIFFLAGS ioctl succeeds.

Supermicro suggests replacing the non-HT P4 CPU with an HT one, but I
personally do not believe that this will change the situation of the
SMP kernel as described above.

Any suggestions on how to solve this are welcome. I can provide more
information if needed.
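
For readers who want to reproduce the failing step without ifconfig, the ioctl pair from the strace output can be sketched as follows. This is a minimal sketch and not taken from the thread: it assumes Linux, uses the request numbers 0x8913/0x8914 and the IFF_UP bit from the kernel headers, and the helper names (pack_ifreq, unpack_flags, bring_up) are made up for illustration. Unlike ifconfig it sets only IFF_UP.

```python
import errno
import fcntl
import socket
import struct

# Linux ioctl request numbers (<linux/sockios.h>) and flag bit (<net/if.h>).
SIOCGIFFLAGS = 0x8913
SIOCSIFFLAGS = 0x8914
IFF_UP = 0x1

# struct ifreq on Linux: 16-byte interface name followed by a union whose
# first 16 bits hold the flags; padded here to 40 bytes (the 64-bit size).
IFREQ_FMT = "16sH22x"


def pack_ifreq(ifname, flags=0):
    """Build the ifreq buffer handed to the flags ioctls (hypothetical helper)."""
    return struct.pack(IFREQ_FMT, ifname.encode("ascii"), flags)


def unpack_flags(ifreq):
    """Extract the flags word from an ifreq buffer (hypothetical helper)."""
    return struct.unpack(IFREQ_FMT, ifreq)[1]


def bring_up(ifname):
    """Read the interface flags, then write them back with IFF_UP set.

    Returns None on success, or the errno name (for example "EINVAL")
    if either ioctl fails.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_IP) as sock:
        try:
            reply = fcntl.ioctl(sock.fileno(), SIOCGIFFLAGS,
                                pack_ifreq(ifname))
            flags = unpack_flags(reply) | IFF_UP
            fcntl.ioctl(sock.fileno(), SIOCSIFFLAGS,
                        pack_ifreq(ifname, flags))
        except OSError as err:
            return errno.errorcode.get(err.errno, str(err.errno))
    return None
```

Run as root, bring_up("eth2") (interface name chosen arbitrarily) would presumably return "EINVAL" on the two ports that fail in case (3) and None on the ports that work.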

Thanks,

--
-- Jos Vos
-- X/OS Experts in Open Systems BV | Phone: +31 20 6938364
-- Amsterdam, The Netherlands | Fax: +31 20 6948204

From romieu@fr.zoreil.com Fri Dec 3 17:02:39 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 17:02:46 -0800 (PST)
Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB412cqv017119 for ; Fri, 3 Dec 2004 17:02:39 -0800
Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB40xlvr028575; Sat, 4 Dec 2004 01:59:47 +0100
Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB40xlj1028574; Sat, 4 Dec 2004 01:59:47 +0100
Date: Sat, 4 Dec 2004 01:59:46 +0100
From: Francois Romieu
To: Jos Vos
Cc: netdev@oss.sgi.com
Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter
Message-ID: <20041204005946.GA26654@electric-eye.fr.zoreil.com>
References: <200412032325.iB3NPdG04693@xos037.xos.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200412032325.iB3NPdG04693@xos037.xos.nl>
User-Agent: Mutt/1.4.1i
X-Organisation: Land of Sunshine Inc.
X-archive-position: 12414
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: romieu@fr.zoreil.com
Precedence: bulk
X-list: netdev

Jos Vos :
[misbehaving multi port e1000]
> Any suggestions how to solve this are welcome. I can provide more
> information if needed.

The error code for the ioctl most probably comes through
e1000_open -> e1000_up -> request_irq

That + (2) + SMP suggests an irq routing issue due to a broken acpi
table ("suggests").

I would compile a (monolithic) recent 2.6.x to see its log and
have a look at http://acpi.sourceforge.net/dsdt/index.php (once the
ticket is opened at http://bugzilla.redhat.com of course :o) )

--
Ueimor

From jos@xos037.xos.nl Fri Dec 3 17:15:18 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 17:15:22 -0800 (PST)
Received: from simba.xos.nl (simba.xos.nl [212.26.207.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB41FHeR017711 for ; Fri, 3 Dec 2004 17:15:18 -0800
Received: from xos037.xos.nl (xos037.xos.nl [212.26.207.37]) by simba.xos.nl (8.12.8/8.12.8) with ESMTP id iB41EimA027147; Sat, 4 Dec 2004 02:14:44 +0100
Received: (from jos@localhost) by xos037.xos.nl (8.11.0/8.11.0) id iB41EiN05069; Sat, 4 Dec 2004 02:14:44 +0100
Date: Sat, 4 Dec 2004 02:14:44 +0100
From: Jos Vos
To: Francois Romieu
Cc: netdev@oss.sgi.com
Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter
Message-ID: <20041204021444.A5058@xos037.xos.nl>
References: <200412032325.iB3NPdG04693@xos037.xos.nl> <20041204005946.GA26654@electric-eye.fr.zoreil.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20041204005946.GA26654@electric-eye.fr.zoreil.com>; from romieu@fr.zoreil.com on Sat, Dec 04, 2004 at 01:59:46AM +0100
X-Organization: X/OS Experts in Open Systems BV, Amsterdam, The Netherlands
X-archive-position: 12415
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jos@xos.nl
Precedence: bulk
X-list: netdev

On Sat, Dec 04, 2004 at 01:59:46AM +0100, Francois Romieu wrote:

> That + (2) + SMP suggests an irq routing issue due to a broken acpi
> table ("suggests").

IIRC ACPI is disabled in the BIOS (not sure), but I'm using a RHEL3
2.4.21 kernel that does not support ACPI at all (due to the RH patches,
rebuilding the kernel with ACPI enabled does not work anymore)!

> I would compile a (monolithic) recent 2.6.x to see its log and
> give a look at http://acpi.sourceforge.net/dsdt/index.php (once the
> ticket is opened at http://bugzilla.redhat.com of course :o) )

Well, I could do that, but the idea was to use this in a RHEL3
system, so ...

--
-- Jos Vos
-- X/OS Experts in Open Systems BV | Phone: +31 20 6938364
-- Amsterdam, The Netherlands | Fax: +31 20 6948204

From shemminger@osdl.org Fri Dec 3 21:43:22 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 03 Dec 2004 21:43:27 -0800 (PST)
Received: from fire-1.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB45hLrg027212 for ; Fri, 3 Dec 2004 21:43:21 -0800
Received: from [192.168.0.106] (063-170-215-071.dslnorthwest.net [63.170.215.71]) (authenticated bits=0) by fire-1.osdl.org (8.12.8/8.12.8) with ESMTP id iB45goQ3022359 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Fri, 3 Dec 2004 21:42:51 -0800
Message-ID: <41B14E57.5080803@osdl.org>
Date: Fri, 03 Dec 2004 21:42:47 -0800
From: Stephen Hemminger
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "David S. Miller"
CC: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com
Subject: Re: [PATCH] tcp: efficient port randomisation (revised)
References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <200411020854.44745.michael.vittrup.larsen@ericsson.com> <20041104100104.570e67cd@dxpl.pdx.osdl.net> <200411051103.59032.michael.vittrup.larsen@ericsson.com> <20041117153025.160eaa04@zqx3.pdx.osdl.net> <20041130214643.7b72300e.davem@davemloft.net> <20041201152446.3a0d5ce3@dxpl.pdx.osdl.net> <20041201204622.7b760400.davem@davemloft.net> <20041202134930.132d7bd8@dxpl.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net>
In-Reply-To: <20041202135252.04e64f51.davem@davemloft.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-MIMEDefang-Filter: osdl$Revision: 1.96 $
X-Scanned-By: MIMEDefang 2.36
X-archive-position: 12416
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: shemminger@osdl.org
Precedence: bulk
X-list: netdev

If I special case to handle loopback, and get rid of the portalloc lock,
it comes out much better. These numbers are on the 800MHz PIII SMP; on
a fast box like the dual Opterons it makes no difference (always 30us).

Before
TCP connection latency mean 79.9 std 10.55

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
stp2-001  Linux 2.6.10- 8.270  38.6 24.3  61.6  48.5  45.9  76.6 74.6
stp2-001  Linux 2.6.10- 8.170  43.5 24.5  58.0  54.8  45.6  63.4 74.7
stp2-001  Linux 2.6.10- 2.740  50.6 29.9  40.3  48.3  59.8  75.1 101.
stp2-001  Linux 2.6.10- 8.140  46.6 29.7  57.6  48.8  45.5  72.0 74.4
stp2-001  Linux 2.6.10- 2.690  47.1 26.3  40.8  48.9  45.5  75.4 74.8

After
TCP connection latency mean 73.8 std 0.55

*Local* Communication latencies in microseconds - smaller is better
-------------------------------------------------------------------
Host                 OS 2p/0K  Pipe AF     UDP  RPC/   TCP  RPC/ TCP
                        ctxsw       UNIX         UDP         TCP conn
--------- ------------- ----- ----- ---- ----- ----- ----- ----- ----
stp2-001  Linux 2.6.10- 8.260  38.1 25.7  63.3  48.1  66.6  75.4 74.9
stp2-001  Linux 2.6.10- 8.090  46.2 26.0  63.4  55.5  45.9  63.6 73.5
stp2-001  Linux 2.6.10- 8.210  39.0 21.2  63.1  55.4  58.8  63.8 73.5
stp2-001  Linux 2.6.10- 2.850  46.5 26.0  64.8  54.6  45.5  74.0 73.6
stp2-001  Linux 2.6.10- 8.200  42.9 21.5  64.9  55.6  62.4  64.1 73.5

From buytenh@wantstofly.org Sat Dec 4 02:36:49 2004
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 02:36:55 -0800 (PST)
Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB4Aam16004964 for ; Sat, 4 Dec 2004 02:36:49 -0800
Received: by xi.wantstofly.org (Postfix, from userid 500) id 7FDF52B0F2; Sat, 4 Dec 2004 11:36:26 +0100 (MET)
Date: Sat, 4 Dec 2004 11:36:26 +0100
From: Lennert Buytenhek
To: Scott Feldman
Cc: jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com
Subject: Re: [E1000-devel] Transmission limit
Message-ID: <20041204103626.GA17872@xi.wantstofly.org>
References: <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041203205706.GC9808@xi.wantstofly.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041203205706.GC9808@xi.wantstofly.org>
User-Agent: Mutt/1.4.1i
X-archive-position: 12417
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: buytenh@wantstofly.org
Precedence: bulk
X-list: netdev

On Fri, Dec 03, 2004 at 09:57:06PM +0100, Lennert Buytenhek wrote:

> > My problem is I only have a P4 desktop system with a 82544 nic running
> > at PCI 32/33Mhz, so I can't play with the big boys. But, attached is a
> > rework of the Tx path to eliminate 1) Tx interrupts, and 2) Tx
> > descriptor write-backs. For me, I see a nice jump in kpps, but I'd like
> > others to try with their setups. We should be able to get to wire speed
> > with 60-byte packets.
>
> Attached is a graph of my numbers with and without your patch for:
> - An 82540 at PCI 32/33, idle 33MHz card on the same bus forcing it to 33MHz.
> - An 82541 at PCI 32/66.
> - An 82546 at PCI-X 64/100, NIC can do 133MHz but mobo only does 100MHz.

When extrapolating these numbers to the 0-byte packet case (which then
tells you the per-packet overhead), I get the following approximate
numbers:

    case                              overhead
    phi-32-33-82540-2.6.9             1.86 us
    phi-32-66-82541-2.6.9             1.41 us
    phi-64-100-82546-2.6.9            1.45 us
    phi-32-33-82540-2.6.9-feldman     1.48 us
    phi-32-66-82541-2.6.9-feldman     1.13 us
    phi-64-100-82546-2.6.9-feldman    1.25 us

Note that this figure doesn't differ all that much between the different
bus widths/speeds. In any case, if I ever want to get more than ~880kpps
on this hardware, there's no other way than to make this overhead go down.

For saturating 1Gb/s with 60B packets on 64/100, the overhead can't be
more than ~0.59 us per packet or you lose.
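
The ~0.59 us figure above can be roughly reproduced with a little
arithmetic. The sketch below is only one plausible reading of how that
number comes about, not necessarily the calculation used in the mail. It
assumes standard Ethernet framing (4-byte CRC, 8-byte preamble, 12-byte
inter-frame gap) and an ideal PCI-X bus moving 8 bytes per clock at
100 MHz with no arbitration or descriptor traffic; the function and
variable names are made up for illustration.

```python
# Rough per-packet time budget for 60-byte frames at 1 Gb/s.
# Assumptions (not taken from the mail): standard Ethernet framing
# overhead and an ideal 64-bit/100 MHz bus with zero bus overhead.

LINK_BPS = 1_000_000_000          # 1 Gb/s link speed
FRAME = 60                        # bytes handed to the NIC (CRC excluded)
CRC, PREAMBLE, IFG = 4, 8, 12     # extra bytes each frame costs on the wire


def wire_time_us(frame_bytes: int) -> float:
    """Time one frame occupies the wire, in microseconds."""
    on_wire = frame_bytes + CRC + PREAMBLE + IFG
    return on_wire * 8 / LINK_BPS * 1e6


def bus_time_us(frame_bytes: int, width_bytes: int, mhz: float) -> float:
    """Ideal time to move the frame data across the bus, in microseconds."""
    return frame_bytes / (width_bytes * mhz)


slot = wire_time_us(FRAME)            # time available per packet at wire speed
dma = bus_time_us(FRAME, 8, 100.0)    # data transfer time on 64/100
budget = slot - dma                   # what is left for per-packet overhead
max_pps = 1e6 / slot                  # packets per second at wire speed

print(f"slot {slot:.3f} us, dma {dma:.3f} us, budget {budget:.3f} us")
print(f"wire speed is about {max_pps / 1e3:.0f} kpps")
```

Under these assumptions the budget comes out just under 0.6 us per
packet, which is in line with the number quoted above and well below the
1.25 us measured on 64/100 with the patch applied.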
--L

From romieu@fr.zoreil.com Sat Dec 4 04:06:39 2004
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 04:06:51 -0800 (PST)
Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB4C6cBn011544 for ; Sat, 4 Dec 2004 04:06:39 -0800
Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB4C3Svr001481; Sat, 4 Dec 2004 13:03:29 +0100
Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB4C3SUq001480; Sat, 4 Dec 2004 13:03:28 +0100
Date: Sat, 4 Dec 2004 13:03:28 +0100
From: Francois Romieu
To: Jos Vos
Cc: netdev@oss.sgi.com
Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter
Message-ID: <20041204120328.GA1191@electric-eye.fr.zoreil.com>
References: <200412032325.iB3NPdG04693@xos037.xos.nl> <20041204005946.GA26654@electric-eye.fr.zoreil.com> <20041204021444.A5058@xos037.xos.nl>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041204021444.A5058@xos037.xos.nl>
User-Agent: Mutt/1.4.1i
X-Organisation: Land of Sunshine Inc.
X-archive-position: 12418
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: romieu@fr.zoreil.com
Precedence: bulk
X-list: netdev

Jos Vos :
[...]
> Well, I could do that, but the idea was to use this in a RHEL3
> system, so ...

1 - Imho people at RedHat/wherever need more details (chipset, mobo,
    dmesg, lspci -vx, 2.4.21 flavor ?).

2 - Testing a recent 2.6.x will provide a different datapoint and a wider
    review: a few people use e1000 for interesting things here but they
    do not necessarily have the resources to help with a specific
    vendor-released kernel. The vendor can use the datapoint though.

3 - Comparing lspci and dmesg output for the different failures may give
    a hint.
-- Ueimor From jos@xos037.xos.nl Sat Dec 4 09:20:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 09:20:55 -0800 (PST) Received: from simba.xos.nl (simba.xos.nl [212.26.207.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB4HKkrw025199 for ; Sat, 4 Dec 2004 09:20:47 -0800 Received: from xos037.xos.nl (xos037.xos.nl [212.26.207.37]) by simba.xos.nl (8.12.8/8.12.8) with ESMTP id iB4HKDmA030416; Sat, 4 Dec 2004 18:20:13 +0100 Received: (from jos@localhost) by xos037.xos.nl (8.11.0/8.11.0) id iB4HKDX08286; Sat, 4 Dec 2004 18:20:13 +0100 Date: Sat, 4 Dec 2004 18:20:13 +0100 From: Jos Vos To: Francois Romieu Cc: netdev@oss.sgi.com Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter Message-ID: <20041204182013.A8246@xos037.xos.nl> References: <200412032325.iB3NPdG04693@xos037.xos.nl> <20041204005946.GA26654@electric-eye.fr.zoreil.com> <20041204021444.A5058@xos037.xos.nl> <20041204120328.GA1191@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="MGYHOYXEY6WxJCY8" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20041204120328.GA1191@electric-eye.fr.zoreil.com>; from romieu@fr.zoreil.com on Sat, Dec 04, 2004 at 01:03:28PM +0100 X-Organization: X/OS Experts in Open Systems BV, Amsterdam, The Netherlands X-archive-position: 12419 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jos@xos.nl Precedence: bulk X-list: netdev --MGYHOYXEY6WxJCY8 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Sat, Dec 04, 2004 at 01:03:28PM +0100, Francois Romieu wrote: > 1 - Imho people at RedHat/wherever need more details (chipset, mobo, > dmesg, lspci -vx, 2.4.21 savor ?). 
Mainboard: Supermicro P4SCT+ Output of lspci -vx: attached Red Hat kernel: 2.4.21-20.EL > 2 - Testing recent 2.6.x will provide a different datapoint and a wider > review: a few people use e1000 for interested things here but they > do not necessarily have the resources to help with a specific vendor > released kernel. The vendor can use the datapoint though. Yes, ok, I might install RHEL4 beta or Fedora Core 3 on it next week and see how that works. -- -- Jos Vos -- X/OS Experts in Open Systems BV | Phone: +31 20 6938364 -- Amsterdam, The Netherlands | Fax: +31 20 6948204 --MGYHOYXEY6WxJCY8 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="lspci.txt" 00:00.0 Host bridge: Intel Corp. 82875P Memory Controller Hub (rev 02) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: bus master, fast devsel, latency 0 Memory at f0000000 (32-bit, prefetchable) [size=128M] Capabilities: [e4] #09 [3106] 00: 86 80 78 25 06 01 90 20 02 00 00 06 00 00 00 00 10: 08 00 00 f0 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 e4 00 00 00 00 00 00 00 00 00 00 00 00:03.0 PCI bridge: Intel Corp. 82875P Processor to PCI to CSA Bridge (rev 02) (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, fast devsel, latency 32 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 00009000-00009fff Memory behind bridge: fb200000-fb2fffff 00: 86 80 7b 25 07 01 a0 00 02 00 04 06 00 20 01 00 10: 00 00 00 00 00 00 00 00 00 01 01 00 90 90 a0 22 20: 20 fb 20 fb f0 ff 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 04 00 00:1c.0 PCI bridge: Intel Corp. 6300ESB 64-bit PCI-X Bridge (rev 02) (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, fast devsel, latency 32 Bus: primary=00, secondary=02, subordinate=03, sec-latency=64 I/O behind bridge: 0000a000-0000afff Memory behind bridge: fb000000-fb1fffff Capabilities: [50] PCI-X non-bridge device. 
00: 86 80 ae 25 07 01 30 80 02 00 04 06 10 20 01 00 10: 00 00 00 00 00 00 00 00 00 02 03 40 a0 a0 a0 22 20: 00 fb 10 fb f1 ff 01 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 50 00 00 00 00 00 00 00 00 00 06 00 00:1d.0 USB Controller: Intel Corp. 6300ESB USB Universal Host Controller (rev 02) (prog-if 00 [UHCI]) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: bus master, medium devsel, latency 0, IRQ 16 I/O ports at c400 [size=32] 00: 86 80 a9 25 05 00 80 02 02 00 03 0c 00 00 80 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 c4 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 00 00 00 00 00 00 00 00 0b 01 00 00 00:1d.1 USB Controller: Intel Corp. 6300ESB USB Universal Host Controller (rev 02) (prog-if 00 [UHCI]) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: bus master, medium devsel, latency 0, IRQ 19 I/O ports at c000 [size=32] 00: 86 80 aa 25 05 00 80 02 02 00 03 0c 00 00 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 c0 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 00 00 00 00 00 00 00 00 05 02 00 00 00:1d.4 System peripheral: Intel Corp. 6300ESB Watchdog Timer (rev 02) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: medium devsel Memory at fb301000 (32-bit, non-prefetchable) [size=16] 00: 86 80 ab 25 02 00 80 02 02 00 80 08 00 00 00 00 10: 00 10 30 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00:1d.5 PIC: Intel Corp. 6300ESB I/O Advanced Programmable Interrupt Controller (rev 02) (prog-if 20 [IO(X)-APIC]) Subsystem: Intel Corp. 6300ESB I/O Advanced Programmable Interrupt Controller Flags: bus master, fast devsel, latency 0 Capabilities: [50] PCI-X non-bridge device. 
00: 86 80 ac 25 06 00 10 00 02 20 00 08 00 00 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 ac 25 30: 00 00 00 00 50 00 00 00 00 00 00 00 ff 00 00 00 00:1d.7 USB Controller: Intel Corp. 6300ESB USB2 Enhanced Host Controller (rev 02) (prog-if 20 [EHCI]) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: bus master, medium devsel, latency 0, IRQ 23 Memory at fb300000 (32-bit, non-prefetchable) [size=1K] Capabilities: [50] Power Management version 2 00: 86 80 ad 25 06 00 90 02 02 20 03 0c 00 00 00 00 10: 00 00 30 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 50 00 00 00 00 00 00 00 0b 04 00 00 00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev 0a) (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0 Bus: primary=00, secondary=04, subordinate=04, sec-latency=32 I/O behind bridge: 0000b000-0000bfff Memory behind bridge: f8000000-faffffff 00: 86 80 4e 24 07 01 80 00 0a 00 04 06 00 00 01 00 10: 00 00 00 00 00 00 00 00 00 04 04 20 b0 b0 80 22 20: 00 f8 f0 fa f0 ff 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0e 00 00:1f.0 ISA bridge: Intel Corp. 6300ESB LPC Interface Controller (rev 02) Flags: bus master, medium devsel, latency 0 00: 86 80 a1 25 0f 00 80 02 02 00 01 06 00 00 80 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00:1f.1 IDE interface: Intel Corp. 
6300ESB PATA Storage Controller (rev 02) (prog-if 8a [Master SecP PriP]) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: bus master, medium devsel, latency 0, IRQ 18 I/O ports at I/O ports at I/O ports at I/O ports at I/O ports at f000 [size=16] Memory at 1ff00000 (32-bit, non-prefetchable) [size=1K] 00: 86 80 a2 25 07 00 88 02 02 8a 01 01 00 00 00 00 10: 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 20: 01 f0 00 00 00 00 f0 1f 00 00 00 00 d9 15 80 51 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00:1f.2 IDE interface: Intel Corp. 6300ESB SATA Storage Controller (rev 02) (prog-if 8f [Master SecP SecO PriP PriO]) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: bus master, 66Mhz, medium devsel, latency 0, IRQ 18 I/O ports at c800 [size=8] I/O ports at cc00 [size=4] I/O ports at d000 [size=8] I/O ports at d400 [size=4] I/O ports at d800 [size=16] 00: 86 80 a3 25 05 00 a0 02 02 8f 01 01 00 00 00 00 10: 01 c8 00 00 01 cc 00 00 01 d0 00 00 01 d4 00 00 20: 01 d8 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 00 00 00 00 00 00 00 00 0a 01 00 00 00:1f.3 SMBus: Intel Corp. 6300ESB SMBus Controller (rev 02) Subsystem: Super Micro Computer Inc: Unknown device 5180 Flags: medium devsel, IRQ 17 I/O ports at 0500 [size=32] 00: 86 80 a4 25 01 00 80 02 02 00 05 0c 00 00 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 05 00 00 00 00 00 00 00 00 00 00 d9 15 80 51 30: 00 00 00 00 00 00 00 00 00 00 00 00 09 02 00 00 01:01.0 Ethernet controller: Intel Corp. 82547GI Gigabit Ethernet Controller Subsystem: Intel Corp. 
PRO/1000 CT Network Connection Flags: bus master, 66Mhz, medium devsel, latency 0, IRQ 18 Memory at fb200000 (32-bit, non-prefetchable) [size=128K] I/O ports at 9000 [size=32] Capabilities: [dc] Power Management version 2 00: 86 80 75 10 07 00 38 02 00 00 00 02 08 00 00 00 10: 00 00 20 fb 00 00 00 00 01 90 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 75 10 30: 00 00 00 00 dc 00 00 00 00 00 00 00 0a 01 ff 00 02:01.0 PCI bridge: IBM PCI-X to PCI-X Bridge (rev 02) (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, medium devsel, latency 32 Bus: primary=02, secondary=03, subordinate=03, sec-latency=32 I/O behind bridge: 0000a000-0000afff Memory behind bridge: fb000000-fb0fffff Capabilities: [80] PCI-X non-bridge device. Capabilities: [90] Power Management version 2 00: 14 10 a7 01 07 01 30 02 02 00 04 06 08 20 01 00 10: 00 00 00 00 00 00 00 00 02 03 03 20 a1 a1 20 22 20: 00 fb 00 fb f1 ff 01 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 80 00 00 00 00 00 00 00 ff 00 06 00 02:04.0 SCSI storage controller: Marvell Technology Group Ltd. MV88SX5041 4-port SATA I PCI-X Controller (rev 03) Subsystem: Marvell Technology Group Ltd. MV88SX5041 4-port SATA I PCI-X Controller Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 27 Memory at fb100000 (64-bit, non-prefetchable) [size=512K] Capabilities: [40] Power Management version 2 Capabilities: [50] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Capabilities: [60] PCI-X non-bridge device. 00: ab 11 41 50 07 00 b0 02 03 00 00 01 08 20 00 00 10: 04 00 10 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 ab 11 41 50 30: 00 00 00 00 40 00 00 00 00 00 00 00 09 01 00 00 03:04.0 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01) Subsystem: Intel Corp. 
PRO/1000 MT Quad Port Server Adapter Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 267 Memory at fb000000 (64-bit, non-prefetchable) [size=128K] I/O ports at a000 [size=64] Capabilities: [dc] Power Management version 2 Capabilities: [e4] PCI-X non-bridge device. Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- 00: 86 80 1d 10 07 00 30 02 01 00 00 02 08 20 80 00 10: 04 00 00 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 a0 00 00 00 00 00 00 00 00 00 00 86 80 00 10 30: 00 00 00 00 dc 00 00 00 00 00 00 00 0b 01 ff 00 03:04.1 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01) Subsystem: Intel Corp. PRO/1000 MT Quad Port Server Adapter Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 267 Memory at fb020000 (64-bit, non-prefetchable) [size=128K] I/O ports at a400 [size=64] Capabilities: [dc] Power Management version 2 Capabilities: [e4] PCI-X non-bridge device. Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- 00: 86 80 1d 10 07 00 30 02 01 00 00 02 08 20 80 00 10: 04 00 02 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 a4 00 00 00 00 00 00 00 00 00 00 86 80 00 10 30: 00 00 00 00 dc 00 00 00 00 00 00 00 09 02 ff 00 03:06.0 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01) Subsystem: Intel Corp. PRO/1000 MT Quad Port Server Adapter Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 26 Memory at fb040000 (64-bit, non-prefetchable) [size=128K] I/O ports at a800 [size=64] Capabilities: [dc] Power Management version 2 Capabilities: [e4] PCI-X non-bridge device. Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- 00: 86 80 1d 10 07 00 30 02 01 00 00 02 08 20 80 00 10: 04 00 04 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 a8 00 00 00 00 00 00 00 00 00 00 86 80 00 10 30: 00 00 00 00 dc 00 00 00 00 00 00 00 0a 01 ff 00 03:06.1 Ethernet controller: Intel Corp. 82546EB Gigabit Ethernet Controller (rev 01) Subsystem: Intel Corp. 
PRO/1000 MT Quad Port Server Adapter Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 27 Memory at fb060000 (64-bit, non-prefetchable) [size=128K] I/O ports at ac00 [size=64] Capabilities: [dc] Power Management version 2 Capabilities: [e4] PCI-X non-bridge device. Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- 00: 86 80 1d 10 07 00 30 02 01 00 00 02 08 20 80 00 10: 04 00 06 fb 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 ac 00 00 00 00 00 00 00 00 00 00 86 80 00 10 30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 02 ff 00 04:09.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27) (prog-if 00 [VGA]) Subsystem: ATI Technologies Inc Rage XL Flags: bus master, stepping, medium devsel, latency 32, IRQ 17 Memory at f8000000 (32-bit, non-prefetchable) [size=16M] I/O ports at b000 [size=256] Memory at fa020000 (32-bit, non-prefetchable) [size=4K] Expansion ROM at [disabled] [size=128K] Capabilities: [5c] Power Management version 2 00: 02 10 52 47 87 00 90 02 27 00 00 03 08 20 00 00 10: 00 00 00 f8 01 b0 00 00 00 00 02 fa 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 02 10 08 80 30: 00 00 00 00 5c 00 00 00 00 00 00 00 09 01 08 00 04:0a.0 Ethernet controller: Intel Corp. 82541GI/PI Gigabit Ethernet Controller Subsystem: Intel Corp. PRO/1000 MT Network Connection Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 19 Memory at fa000000 (32-bit, non-prefetchable) [size=128K] I/O ports at b400 [size=64] Capabilities: [dc] Power Management version 2 Capabilities: [e4] PCI-X non-bridge device. 
00: 86 80 76 10 07 00 30 02 00 00 00 02 08 20 00 00 10: 00 00 00 fa 00 00 00 00 01 b4 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 76 10 30: 00 00 00 00 dc 00 00 00 00 00 00 00 05 01 ff 00 --MGYHOYXEY6WxJCY8-- From lanxue@soe.ucsc.edu Sat Dec 4 14:05:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 14:05:26 -0800 (PST) Received: from services.cse.ucsc.edu (services.cse.ucsc.edu [128.114.48.10]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB4M5Muk002322 for ; Sat, 4 Dec 2004 14:05:22 -0800 Received: from moondance.cse.ucsc.edu (moondance.cse.ucsc.edu [128.114.49.1]) by services.cse.ucsc.edu (8.13.1/8.13.1) with ESMTP id iB4M50An021326; Sat, 4 Dec 2004 14:05:00 -0800 (PST) Date: Sat, 4 Dec 2004 14:05:00 -0800 (PST) From: Lan Xue X-X-Sender: lanxue@moondance.cse.ucsc.edu To: netdev@oss.sgi.com cc: kernerl mail Subject: global memory access In-Reply-To: <20041202094433.39132.qmail@web41411.mail.yahoo.com> Message-ID: References: <20041202094433.39132.qmail@web41411.mail.yahoo.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12420 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lanxue@soe.ucsc.edu Precedence: bulk X-list: netdev Hi, I was wondering how a user-level program can share a writable memory buffer with the kernel. The memory can either belongs to the kernel space or the user space, but both kernel module and user program should be able to access(read, write) it. Any hint? 
Many thanks, Lan From sfeldma@pobox.com Sat Dec 4 15:57:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 15:57:41 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB4Nvald005121 for ; Sat, 4 Dec 2004 15:57:37 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id A75DA2FB558; Sat, 4 Dec 2004 18:57:14 -0500 (EST) Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 1ABA22FAE14; Sat, 4 Dec 2004 18:57:12 -0500 (EST) Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter From: Scott Feldman Reply-To: sfeldma@pobox.com To: Jos Vos Cc: netdev@oss.sgi.com In-Reply-To: <200412032325.iB3NPdG04693@xos037.xos.nl> References: <200412032325.iB3NPdG04693@xos037.xos.nl> Content-Type: text/plain Message-Id: <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sat, 04 Dec 2004 16:00:01 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12421 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Fri, 2004-12-03 at 15:25, Jos Vos wrote: > I have a problem with an Intel Pro/1000 MT 4-port card in a Supermicro > (non-HT) Pentium 4 system using RHEL3 (2.4.21 kernel "the RH way") with > the e1000 driver (I tried both the version supplied by RH and the > newest 5.5.4 driver): The 4-port card has two 82546 dual-port controllers behind a PCI-X bridge. Your symptoms suggest interrupt routing didn't get setup correctly for the controllers behind the bridge. I've seen more than one case where the BIOS gets this wrong. The fix is to upgrade to the latest BIOS. Hopefully this is the fix for you. 
:-)

-scott

From sfeldma@pobox.com Sat Dec 4 19:18:22 2004
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 19:18:31 -0800 (PST)
Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB53ILpo012747 for ; Sat, 4 Dec 2004 19:18:22 -0800
Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 619812FA1D7; Sat, 4 Dec 2004 22:17:59 -0500 (EST)
Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 97E8E2F91AB; Sat, 4 Dec 2004 22:17:56 -0500 (EST)
Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0
From: Scott Feldman
Reply-To: sfeldma@pobox.com
To: Peter Kjellstroem
Cc: netdev@oss.sgi.com
In-Reply-To:
References:
Content-Type: text/plain
Message-Id: <1102216844.3343.84.camel@sfeldma-mobl.dsl-verizon.net>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2)
Date: Sat, 04 Dec 2004 19:20:44 -0800
Content-Transfer-Encoding: 7bit
X-archive-position: 12422
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: sfeldma@pobox.com
Precedence: bulk
X-list: netdev

On Fri, 2004-12-03 at 11:02, Peter Kjellstroem wrote:
> Short version: 82547GI with ITR=0 on 2.4.28 (vanilla) and RHEL3u3 has
> problems (traffic grinds to a temporary halt under anything but trivial
> network traffic). kernel prints the following and resets the IF (many
> times):
>
> NETDEV WATCHDOG: eth0: transmit timed out

Dude! You're out of luck!

From the README:

    CAUTION: If you are using the Intel PRO/1000 CT Network
    Connection (controller 82547), setting
    InterruptThrottleRate to a value
    greater than 75,000, may hang (stop transmitting)
    adapters under certain network conditions. If this
    occurs a NETDEV WATCHDOG message is logged in the
    system event log. In addition, the controller is
    automatically reset, restoring the network
    connection. To eliminate the potential for the hang,
    ensure that InterruptThrottleRate is set no greater
    than 75,000 and is not set to 0.

I was running into the same thing with 82547EI setting ITR=0, and then I
remembered that this part is buggy when ITR=0. The bug is due to 82547
messing up the order of interrupt assertion and de-assertion on the CSA
bus.

If you want to do MPI on this system, you'll need to use a non-zero ITR
or plug an add-in card into one of the PCI slots and use the add-in
card. The problem is, these slots are probably 32-bit/33MHz, so you're
not going to get the same maximum Mbps that you'll get with 82547 using
the CSA bus. 82547 will not be a good choice for MPI. Sorry.

> Affected chips (theory, 8254X, X>1 or anything faster than PCI33):
> 82547GI, 82546 (said to be affected, not verified by me)

82546 should be fine with ITR=0.

> http://lists.us.dell.com/pipermail/linux-poweredge/2004-November/023061.html

You might want to forward this info to that thread.

-scott

From paul@clubi.ie Sat Dec 4 22:26:20 2004
Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 22:26:27 -0800 (PST)
Received: from hibernia.jakma.org (hibernia.jakma.org [212.17.55.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB56QDdM020079 for ; Sat, 4 Dec 2004 22:26:16 -0800
Received: from hibernia.jakma.org (IDENT:paul@hibernia.jakma.org [192.168.0.3]) by hibernia.jakma.org (8.12.11/8.12.11) with ESMTP id iB56PWrc009633; Sun, 5 Dec 2004 06:25:35 GMT
Date: Sun, 5 Dec 2004 06:25:31 +0000 (GMT)
From: Paul Jakma
X-X-Sender: paul@hibernia.jakma.org
To: Thomas Spatzier
cc: jgarzik@pobox.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [patch 4/10] s390: network driver.
In-Reply-To:
Message-ID:
References:
Mail-Followup-To: paul@hibernia.jakma.org
X-NSA: arafat al aqsar jihad musharef jet-A1 avgas ammonium qran inshallah allah al-akbar martyr iraq saddam hammas hisballah rabin ayatollah korea vietnam revolt mustard gas british airways washington
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
X-Virus-Scanned: ClamAV 0.80/614/Wed Dec 1 15:44:43 2004 clamav-milter version 0.80j on hibernia.jakma.org
X-Virus-Status: Clean
X-archive-position: 12423
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: paul@clubi.ie
Precedence: bulk
X-list: netdev

On Tue, 30 Nov 2004, Thomas Spatzier wrote:

> Ok, then some logic could be implemented in userland to take
> appropriate actions. It must be ensured that zebra handles the
> netlink notification fast enough.

AIUI, netlink is not synchronous, and it most definitely makes no
reliability guarantees (and at the moment, zebra isn't terribly
efficient at reading netlink; large numbers of interfaces will cause
overruns in zebra - fixing this is on the TODO list). So we can never
get rid of the window where a daemon could send a packet out a
link-down interface - we can make that window smaller but not eliminate
it.

Hence we need either a way to flush packets associated with an
(interface,socket) (or just the socket) or we need the kernel to not
accept such packets (and drop packets it has accepted).

> In the manpages for send/sendto/sendmsg it says that there is a -ENOBUFS
> return value, if a sockets write queue is full.

Yes, ENOBUFS, sorry.

> It also says:
> "Normally, this does not occur in Linux. Packets are just silently dropped
> when a device queue overflows."

This has always been (AFAIK) the behaviour, yes. We started getting
reports of the new queuing behaviour with, iirc, a version of Intel's
e100 driver for 2.4.2x, which was later changed back to the old
behaviour. However, now that the queue behaviour is apparently the
mandated behaviour, we really need to work out what to do about the
sending-long-stale-packets problem.

> So, if packets are 'silently dropped' anyway, the fact that we drop
> them in our driver (and increment the error count in the
> net_device_stats accordingly) should not be a problem.

It shouldn't, no. The likes of OSPF already specify their own
reliability mechanisms.

> I think that both behaviours are similar for TCP. TCP waits for
> ACKs for each packet. If they do not arrive, a retransmit is done.
> Sooner or later the connection will be reset, if no responses from
> the other side arrive. So the result for both driver behaviours
> should be the same.

But if TCP worked even when drivers dropped packets, then that implies
TCP has its own queue? That we're talking about a separate driver packet
queue rather than the socket buffer (which is, presumably, where TCP
retains packets until ACKed - I have no idea).

Anyway, we do, I think, need some way to deal with the
sending-stale-packet-on-link-back problem. Either a way to flush this
driver queue or else a guarantee that writes to sockets whose protocol
makes no reliability guarantee will either return ENOBUFS or drop the
packet.

Otherwise we will start getting reports of "Quagga on Linux sent an
ancient {RIP,IRDP,RA} packet when we fixed a switch problem, and it
caused an outage for a section of our network due to bad routes", I
think.

Some comment or advice would be useful. (Am I kill-filed by all of
netdev? feels like it).

> Regards,
> Thomas

regards,
--
Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A
Fortune: No directory.
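
To make the ENOBUFS option from the message above concrete, here is a
minimal userland sketch of how a routing daemon might treat a refused
datagram as a drop instead of retrying it later. It is purely
illustrative and is not code from zebra or Quagga: the helper name
send_or_drop is hypothetical, and the sketch assumes the kernel actually
reports a full queue via ENOBUFS (or EAGAIN on a non-blocking socket),
which is exactly the guarantee being asked for. It does nothing about
packets already sitting in a driver queue.

```python
import errno
import socket

# Errors that mean "the kernel could not take this packet right now".
_REFUSED = (errno.ENOBUFS, errno.EAGAIN, errno.EWOULDBLOCK)


def send_or_drop(sock: socket.socket, payload: bytes, addr) -> bool:
    """Send one datagram; return False if the kernel refused it.

    Periodic protocol packets (RIP, IRDP, RA) go stale quickly, so a
    refused packet is dropped here rather than queued for a later retry.
    (Hypothetical helper, for illustration only.)
    """
    try:
        sock.sendto(payload, addr)
        return True
    except OSError as exc:
        if exc.errno in _REFUSED:
            return False    # treat as a drop; never resend stale data
        raise               # anything else is a real error for the caller
```

The caller would simply build a fresh packet at the next protocol timer
tick, so nothing older than one interval is ever handed to the socket.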
From anton@ozlabs.org Sat Dec 4 23:49:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 04 Dec 2004 23:50:03 -0800 (PST) Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB57nwKx021994 for ; Sat, 4 Dec 2004 23:49:59 -0800 Received: by ozlabs.org (Postfix, from userid 1010) id 37A332BDB5; Sun, 5 Dec 2004 18:49:32 +1100 (EST) Date: Sun, 5 Dec 2004 18:46:58 +1100 From: Anton Blanchard To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: KERNEL: assertion (!sk->sk_forward_alloc) Message-ID: <20041205074658.GA19168@krispykreme.ozlabs.ibm.com> References: <20041129211855.GC17540@krispykreme.ozlabs.ibm.com> <20041129172210.687bcad3.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041129172210.687bcad3.davem@davemloft.net> User-Agent: Mutt/1.5.6+20040907i X-archive-position: 12424 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev Hi Dave, > Try to reproduce without that patch installed. > > If there is any mixup of SKB handling by any element in the > path the packet travels, you're screw up all the TCP > accounting knobs on the socket and easily trigger messages > like the above. I'll try. FYI we have 70 gigabit cards in that machine, the reason send-to-self is so useful is that we only need one expensive machine. 
Without the patch we need to find two expensive machines with 35 gigabits in each :) Anton From greearb@candelatech.com Sun Dec 5 00:37:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 00:37:48 -0800 (PST) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB58bfwN026805 for ; Sun, 5 Dec 2004 00:37:42 -0800 Received: from [4.33.45.22] (evrtwa1-ar2-4-33-045-022.evrtwa1.dsl-verizon.net [4.33.45.22]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id iB58nLLH015959; Sun, 5 Dec 2004 00:49:21 -0800 Message-ID: <41B2C8BF.9050204@candelatech.com> Date: Sun, 05 Dec 2004 00:37:19 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041020 X-Accept-Language: en-us, en MIME-Version: 1.0 To: sfeldma@pobox.com CC: Jos Vos , netdev@oss.sgi.com Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter References: <200412032325.iB3NPdG04693@xos037.xos.nl> <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net> In-Reply-To: <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 12425 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Scott Feldman wrote: > On Fri, 2004-12-03 at 15:25, Jos Vos wrote: > >>I have a problem with an Intel Pro/1000 MT 4-port card in a Supermicro >>(non-HT) Pentium 4 system using RHEL3 (2.4.21 kernel "the RH way") with >>the e1000 driver (I tried both the version supplied by RH and the >>newest 5.5.4 driver): > > > The 4-port card has two 82546 dual-port controllers behind a PCI-X > bridge. Your symptoms suggest interrupt routing didn't get setup > correctly for the controllers behind the bridge. 
I've seen more than > one case where the BIOS gets this wrong. The fix is to upgrade to the > latest BIOS. Hopefully this is the fix for you. :-) For what it's worth, I've had good luck with a 4-port Pro/1000 NIC in a SuperMicro X5-DPAGG (dual Xeon, PCI-X) motherboard. I've tried it with kernel 2.4.27 and 2.6.9 so far... For maximum performance, I needed to increase the tx and rx descriptors to 2k; I believe this helps mitigate the extra latency caused by the PCI-X bridge on the NIC... Ben > > -scott > -- Ben Greear Candela Technologies Inc http://www.candelatech.com From cap@nsc.liu.se Sun Dec 5 04:42:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 04:42:39 -0800 (PST) Received: from papput.nsc.liu.se (ns2.nsc.liu.se [130.236.101.9]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5CgV6l013315 for ; Sun, 5 Dec 2004 04:42:32 -0800 Received: from mail.nsc.liu.se (mail.nsc.liu.se [130.236.101.49]) by papput.nsc.liu.se (Postfix) with ESMTP id EF2B71C31E2; Sun, 5 Dec 2004 13:42:08 +0100 (CET) Date: Sun, 5 Dec 2004 13:42:08 +0100 (CET) From: Peter Kjellstroem To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 In-Reply-To: <1102216844.3343.84.camel@sfeldma-mobl.dsl-verizon.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12426 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cap@nsc.liu.se Precedence: bulk X-list: netdev Scott, If you had read my entire message you would have noticed that the 82547GI works fully ok (weeks of MPI stress) with 5.2.30 as shipped with 2.4.26. Also, setting it to less than 75K is not enough; it just gets harder to provoke. I have managed to make it die with 20K. I will probably run a 2.4.28 with an e1000 driver from 2.4.26 (tested and ok) but I'm not really happy about adding another patch to the list of stuff I have to push into a vanilla 2.4.28.
Smells like a driver problem to me. Regards, Peter On Sat, 4 Dec 2004, Scott Feldman wrote: > On Fri, 2004-12-03 at 11:02, Peter Kjellstroem wrote: > > Short version: 82547GI with ITR=0 on 2.4.28 (vanilla) and RHEL3u3 has > > problems (traffic grinds to a temporary halt under anything but trivial > > network traffic). kernel prints the following and resets the IF (many > > times): > > > > NETDEV WATCHDOG: eth0: transmit timed out > > Dude! You're out of luck! > > From the README: > > CAUTION: If you are using the Intel PRO/1000 CT Network > Connection (controller 82547), setting > InterruptThrottleRate to a value > greater than 75,000, may hang (stop transmitting) > adapters under certain network conditions. If this > occurs a NETDEV WATCHDOG message is logged in the > system event log. In addition, the controller is > automatically reset, restoring the network > connection. To eliminate the potential for the hang, > ensure that InterruptThrottleRate is set no greater > than 75,000 and is not set to 0. > > I was running into the same thing with 82547EI setting ITR=0, and then I > remembered that this part is buggy when ITR=0. The bug is due to 82547 > messing up the order of interrupt assertion and de-assertion on the CSA > bus. > > If you want to do MPI on this system, you'll need to use a non-zero ITR > or plug in an add-in card into one of the PCI slots and use the add-in > card. The problem is, these slots are probably 32-bit/33MHz, so you're > not going to get the same maximum Mbps that you'll get with 82547 using > the CSA bus. 82547 will not be a good choice for MPI. Sorry. > > > Affected chips (theory, 8254X, X>1 or anything faster than PCI33): > > 82547GI, 82546 (said to be affected, not verified by me) > > 82546 should be fine with ITR=0. > > > http://lists.us.dell.com/pipermail/linux-poweredge/2004-November/023061.html > > You might want to forward this info to that thread.
> > -scott > > -- ------------------------------------------------------------ Peter Kjellstroem | E-mail: cap@nsc.liu.se National Supercomputer Centre | Office: +46(0)13 281492 Linkoeping University | Fax : +46(0)13 282535 SE-581 83 Linkoeping | Sweden | http://www.nsc.liu.se From cap@nsc.liu.se Sun Dec 5 06:04:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 06:04:52 -0800 (PST) Received: from papput.nsc.liu.se (ns2.nsc.liu.se [130.236.101.9]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5E4iKV015285 for ; Sun, 5 Dec 2004 06:04:45 -0800 Received: from mail.nsc.liu.se (mail.nsc.liu.se [130.236.101.49]) by papput.nsc.liu.se (Postfix) with ESMTP id 801F91C31E2; Sun, 5 Dec 2004 15:04:22 +0100 (CET) Date: Sun, 5 Dec 2004 15:04:22 +0100 (CET) From: Peter Kjellstroem To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 In-Reply-To: <1102216844.3343.84.camel@sfeldma-mobl.dsl-verizon.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12427 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cap@nsc.liu.se Precedence: bulk X-list: netdev Hello again, I'm sorry my previous e-mail came out much more harsh than I intended it to (I'm not a native English speaker). I just wanted to point out that it's not falling over for me with all drivers. I've been doing some more tests today to try to narrow down which patch causes my problems. So far I have this: 2.4.26 (5.2.30) is ok 2.4.28 (5.4.11) is not 2.4.28 with 5.2.30 patched in is ok 2.4.28 with 5.2.39 (from 2.4.27-pre1) is not In e1000_main.c for 5.2.39 (from 2.4.27-pre1) I found this: /* Change Log * * 5.2.39 3/12/04 * ... * o Back out the CSA fix for 82547 as it continues to cause * systems lock-ups with production systems.
Best Regards, Peter On Sat, 4 Dec 2004, Scott Feldman wrote: > On Fri, 2004-12-03 at 11:02, Peter Kjellstroem wrote: > > Short version: 82547GI with ITR=0 on 2.4.28 (vanilla) and RHEL3u3 has > > problems (traffic grinds to a temporary halt under anything but trivial > > network traffic). kernel prints the following and resets the IF (many > > times): > > > > NETDEV WATCHDOG: eth0: transmit timed out > > Dude! You're out of luck! > > From the README: > > CAUTION: If you are using the Intel PRO/1000 CT Network > Connection (controller 82547), setting > InterruptThrottleRate to a value > greater than 75,000, may hang (stop transmitting) > adapters under certain network conditions. If this > occurs a NETDEV WATCHDOG message is logged in the > system event log. In addition, the controller is > automatically reset, restoring the network > connection. To eliminate the potential for the hang, > ensure that InterruptThrottleRate is set no greater > than 75,000 and is not set to 0. > > I was running into the same thing with 82547EI setting ITR=0, and then I > remembered that this part is buggy when ITR=0. The bug is due to 82547 > messing up the order of interrupt assertion and de-assertion on the CSA > bus. > > If you want to do MPI on this system, you'll need to use a non-zero ITR > or plug in an add-in card into one of the PCI slots and use the add-in > card. The problem is, these slots are probably 32-bit/33MHz, so you're > not going to get the same maximum Mbps that you'll get with 82547 using > the CSA bus. 82547 will not be a good choice for MPI. Sorry. > > > Affected chips (theory, 8254X, X>1 or anything faster than PCI33): > > 82547GI, 82546 (said to be affected, not verified by me) > > 82546 should be fine with ITR=0. > > > http://lists.us.dell.com/pipermail/linux-poweredge/2004-November/023061.html > > You might want to forward this info to that thread.
> > -scott > > -- ------------------------------------------------------------ Peter Kjellstroem | E-mail: cap@nsc.liu.se National Supercomputer Centre | Sweden | http://www.nsc.liu.se From buytenh@wantstofly.org Sun Dec 5 06:51:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 06:51:21 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5EpDQd016544 for ; Sun, 5 Dec 2004 06:51:15 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 46A9F2B0ED; Sun, 5 Dec 2004 15:50:51 +0100 (MET) Date: Sun, 5 Dec 2004 15:50:51 +0100 From: Lennert Buytenhek To: Scott Feldman Cc: jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205145051.GA647@xi.wantstofly.org> References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1101967983.4782.9.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 12428 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Wed, Dec 01, 2004 at 10:13:33PM -0800, Scott Feldman wrote: > Idea#3 > > http://www.mail-archive.com/freebsd-net@freebsd.org/msg10826.html > > Set TXDMAC to 0 in e1000_configure_tx. 
Enabling 'DMA packet prefetching' gives me an impressive boost in performance. Combined with your TX clean rework, I now get 1.03Mpps TX performance at 60B packets. Transmitting from both of the 82546 ports at the same time gives me close to 2 Mpps. The freebsd post hints that (some) e1000 hardware might be buggy w.r.t. this prefetching though. I'll play some more with the other ideas you suggested as well.

60 1036488
61 1037413
62 1036429
63 990239
64 993218
65 993233
66 993201
67 993234
68 993219
69 993208
70 992225
71 980560

--L diff -ur e1000.orig/e1000_main.c e1000/e1000_main.c --- e1000.orig/e1000_main.c 2004-12-04 11:43:12.000000000 +0100 +++ e1000/e1000_main.c 2004-12-05 15:40:49.284946897 +0100 @@ -879,6 +894,8 @@ E1000_WRITE_REG(&adapter->hw, TCTL, tctl); + E1000_WRITE_REG(&adapter->hw, TXDMAC, 0); + e1000_config_collision_dist(&adapter->hw); /* Setup Transmit Descriptor Settings for eop descriptor */ From gandalf@wlug.westbo.se Sun Dec 5 07:04:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 07:04:07 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5F41Dl017185 for ; Sun, 5 Dec 2004 07:04:01 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 5BBC22C0056; Sun, 5 Dec 2004 16:03:37 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id DB2D72C0068; Sun, 5 Dec 2004 16:03:36 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 26F942C0056; Sun, 5 Dec 2004 16:03:36 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id 4F5D53FA7; Sun, 5 Dec 2004 16:03:36 +0100 (CET) Date: Sun, 5 Dec 2004 16:03:36 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it,
e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: <20041205145051.GA647@xi.wantstofly.org> Message-ID: References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12429 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Lennert Buytenhek wrote: > Enabling 'DMA packet prefetching' gives me an impressive boost in performance. > Combined with your TX clean rework, I now get 1.03Mpps TX performance at 60B > packets. Transmitting from both of the 82546 ports at the same time gives me > close to 2 Mpps. > > The freebsd post hints that (some) e1000 hardware might be buggy w.r.t. this > prefetching though. > > I'll play some more with the other ideas you suggested as well. > > 60 1036488 I was just playing with prefetching when you sent your mail :) I get that number with Scott's patch but without prefetching. If I move the TDT update to the tx cleaning I get a few extra kpps but not much.
BUT if I use the above + prefetching I get this:

60 1483890
64 1418568
68 1356992
72 1300523
76 1248568
80 1142989
84 1140909
88 1114951
92 1076546
96 960732
100 949801
104 972876
108 945314
112 918380
116 891393
120 865923
124 843288
128 696465

Which is pretty nice :) This is on one port of an 82546GB. The hardware is a dual Athlon MP 2000+ in an Asus A7M266-D motherboard and the nic is located in a 64/66 slot. I won't post any patch until I've tested some more and cleaned up a few things. BTW, I also get some transmit timeouts with Scott's patch sometimes, not often but it does happen. /Martin From buytenh@wantstofly.org Sun Dec 5 07:16:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 07:16:11 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5FG6Go017770 for ; Sun, 5 Dec 2004 07:16:07 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 198CD2B0ED; Sun, 5 Dec 2004 16:15:45 +0100 (MET) Date: Sun, 5 Dec 2004 16:15:45 +0100 From: Lennert Buytenhek To: Martin Josefsson Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205151545.GC647@xi.wantstofly.org> References: <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 12430 X-ecartis-version:
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 04:03:36PM +0100, Martin Josefsson wrote: > BUT if I use the above + prefetching I get this: > > 60 1483890 > [snip] > > Which is pretty nice :) Not just that, it's also wire speed GigE. Damn. Now we all have to go and upgrade to 10GbE cards, and I don't think my girlfriend would give me one of those for christmas. > This is on one port of a 82546GB > > The hardware is a dual Athlon MP 2000+ in an Asus A7M266-D motherboard and > the nic is located in a 64/66 slot. Hmmm. Funny you get this number even on 64/66. How many PCI bridges between the CPUs and the NIC? Any idea how many cycles an MMIO read on your hardware is? cheers, Lennert From gandalf@wlug.westbo.se Sun Dec 5 07:19:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 07:19:59 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5FJr7J018134 for ; Sun, 5 Dec 2004 07:19:53 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id F0EA82C0069; Sun, 5 Dec 2004 16:19:30 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 86DD02C006F; Sun, 5 Dec 2004 16:19:30 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id C7C3F2C0069; Sun, 5 Dec 2004 16:19:29 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id EDD873FA7; Sun, 5 Dec 2004 16:19:29 +0100 (CET) Date: Sun, 5 Dec 2004 16:19:29 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com 
Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: <20041205151545.GC647@xi.wantstofly.org> Message-ID: References: <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205151545.GC647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12431 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Lennert Buytenhek wrote: > > 60 1483890 > > [snip] > > > > Which is pretty nice :) > > Not just that, it's also wire speed GigE. Damn. Now we all have to go > and upgrade to 10GbE cards, and I don't think my girlfriend would give me > one of those for christmas. Yes it is, and it's lovely to see. You have to nerdify her so she sees the need for geeky hardware enough to give you what you need :) > > This is on one port of a 82546GB > > > > The hardware is a dual Athlon MP 2000+ in an Asus A7M266-D motherboard and > > the nic is located in a 64/66 slot. > > Hmmm. Funny you get this number even on 64/66. How many PCI bridges > between the CPUs and the NIC? Any idea how many cycles an MMIO read on > your hardware is? I verified that I get the same results on a small whimpy 82540EM that runs at 32/66 as well. Just about to see what I get at 32/33 with that card. 
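[Editor's note] The "wire speed GigE" remark can be checked with a little arithmetic. The sketch below is not from the thread. It assumes that the packet sizes quoted in the benchmark tables exclude the 4-byte Ethernet FCS, and it adds the standard 8-byte preamble/SFD and 12-byte inter-frame gap. The function name gige_wire_rate_pps is made up for illustration.

```c
#include <stdint.h>

/* Per-frame bytes on the wire that are assumed NOT to be counted in the
 * packet sizes quoted in the thread (assumption, see note above). */
#define ETH_FCS_BYTES      4u   /* frame check sequence */
#define ETH_PREAMBLE_BYTES 8u   /* preamble + start-of-frame delimiter */
#define ETH_IFG_BYTES      12u  /* minimum inter-frame gap */

/* Hypothetical helper: theoretical maximum packets per second on a
 * 1 Gbit/s link for a packet of the given size (FCS excluded),
 * rounded down to a whole packet. */
static uint64_t gige_wire_rate_pps(uint32_t pkt_bytes_no_fcs)
{
    uint64_t bytes_on_wire = (uint64_t)pkt_bytes_no_fcs +
                             ETH_FCS_BYTES + ETH_PREAMBLE_BYTES +
                             ETH_IFG_BYTES;

    return 1000000000ull / (bytes_on_wire * 8u);
}
```

Under those assumptions a 60-byte packet occupies 84 byte times (672 bits), which gives about 1.488 Mpps. The 1483890 pps reported for the 82546GB is within roughly 0.3% of that figure.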
/Martin From gandalf@wlug.westbo.se Sun Dec 5 07:31:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 07:31:17 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5FVBfU018611 for ; Sun, 5 Dec 2004 07:31:11 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 0EBEF2C0056; Sun, 5 Dec 2004 16:30:49 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 7E9BA2C0069; Sun, 5 Dec 2004 16:30:48 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id B3B1D2C0056; Sun, 5 Dec 2004 16:30:47 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id DD5093FA7; Sun, 5 Dec 2004 16:30:47 +0100 (CET) Date: Sun, 5 Dec 2004 16:30:47 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: Message-ID: References: <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205151545.GC647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12432 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Martin Josefsson wrote: > > > The hardware is a dual Athlon MP 2000+ in an Asus A7M266-D motherboard and > > > the nic is located in a 64/66 slot. > > > > Hmmm. Funny you get this number even on 64/66. How many PCI bridges > > between the CPUs and the NIC? Any idea how many cycles an MMIO read on > > your hardware is? > > I verified that I get the same results on a small whimpy 82540EM that runs > at 32/66 as well. Just about to see what I get at 32/33 with that card. Just tested the 82540EM at 32/33 and it's a big difference.

60 350229
64 247037
68 219643
72 218205
76 216786
80 215386
84 214003
88 212638
92 211291
96 210004
100 208647
104 182461
108 181468
112 180453
116 179482
120 185472
124 188336
128 153743

Sorry, forgot to answer your other questions, I'm a bit excited at the moment :) The 64/66 bus on this motherboard is directly connected to the northbridge. Here's the lspci output with the 82546GB nic attached to the 64/66 bus and 82540EM nic connected to the 32/33 bus that hangs off the southbridge:

00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] System Controller (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-760 MP [IGD4-2P] AGP Bridge
00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ISA (rev 05)
00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-768 [Opus] IDE (rev 04)
00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] ACPI (rev 03)
00:08.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
00:08.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet Controller (rev 03)
00:10.0 PCI bridge: Advanced Micro Devices [AMD] AMD-768 [Opus] PCI (rev 05)
01:05.0 VGA compatible controller: Silicon Integrated Systems [SiS] 86C326 5598/6326 (rev 0b)
02:05.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 0c)
02:06.0 SCSI storage controller: Adaptec AIC-7892A U160/m (rev 02)
02:08.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet Controller (rev 02)

And lspci -t:

-[00]-+-00.0
      +-01.0-[01]----05.0
      +-07.0
      +-07.1
      +-07.3
      +-08.0
      +-08.1
      \-10.0-[02]--+-05.0
                   +-06.0
                   \-08.0

I have no idea how expensive an MMIO read is on this machine, do you have a relatively easy way to find out? /Martin From gandalf@wlug.westbo.se Sun Dec 5 07:43:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 07:43:04 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5Fgw6n019392 for ; Sun, 5 Dec 2004 07:42:59 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 521D92C0069; Sun, 5 Dec 2004 16:42:36 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 644B22C006F; Sun, 5 Dec 2004 16:42:35 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 930A92C0069; Sun, 5 Dec 2004 16:42:34 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id BA4393FA7; Sun, 5 Dec 2004 16:42:34 +0100 (CET) Date: Sun, 5 Dec 2004 16:42:34 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: Message-ID: References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org>
<1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12433 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Martin Josefsson wrote: [snip] > BUT if I use the above + prefetching I get this: > > 60 1483890 [snip] > This is on one port of a 82546GB > > The hardware is a dual Athlon MP 2000+ in an Asus A7M266-D motherboard and > the nic is located in a 64/66 slot. > > I won't post any patch until I've tested some more and cleaned up a few > things. > > BTW, I also get some transmit timeouts with Scott's patch sometimes, not > often but it does happen. Here's the patch, not much more tested (it still gives some transmit timeouts since it's Scott's patch + prefetching and delayed TDT updating). And it's not cleaned up, but hey, that's development :) The delayed TDT updating was a test and currently it delays the first tx'd packet after a timer run by 1ms. Would be interesting to see what other people get with this thing. Lennert?
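[Editor's note] The "delayed TDT updating" idea mentioned above can be illustrated in isolation. The sketch below is not taken from the patch that follows and does not use the e1000 structures. It is a toy model, with hypothetical names (toy_tx_ring, toy_queue, toy_flush), that counts how many tail-register writes are issued when the write happens per packet versus once per periodic flush. Unlike the patch, which writes TDT on every timer run, the toy flush skips the write when nothing new was queued.

```c
#include <stdint.h>

/* Hypothetical stand-in for a NIC TX ring; NOT the e1000 structures. */
struct toy_tx_ring {
    uint32_t count;        /* number of descriptors in the ring */
    uint32_t next_to_use;  /* software producer index */
    uint32_t hw_tail;      /* last value written to the simulated tail register */
    uint32_t mmio_writes;  /* number of simulated tail-register writes */
};

/* Simulated MMIO write of the tail register. */
static void toy_write_tail(struct toy_tx_ring *r, uint32_t tail)
{
    r->hw_tail = tail;
    r->mmio_writes++;
}

/* Queue one descriptor, wrapping the index at the end of the ring.
 * With defer == 0 the tail register is written immediately; with
 * defer != 0 the write is left to a later toy_flush() call. */
static void toy_queue(struct toy_tx_ring *r, int defer)
{
    if (++r->next_to_use == r->count)
        r->next_to_use = 0;
    if (!defer)
        toy_write_tail(r, r->next_to_use);
}

/* Periodic flush, playing the role of a timer callback. Assumes fewer
 * than 'count' descriptors were queued since the previous flush, so an
 * unchanged index really means "nothing new". */
static void toy_flush(struct toy_tx_ring *r)
{
    if (r->hw_tail != r->next_to_use)
        toy_write_tail(r, r->next_to_use);
}
```

Queuing five packets with immediate updates costs five simulated register writes, while queuing five with deferral and one flush costs a single write. The price, as noted in the mail above, is added latency for the first packet queued after a timer run.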
diff -X /home/gandalf/dontdiff.ny -urNp linux-2.6.10-rc3.orig/drivers/net/e1000/e1000.h linux-2.6.10-rc3.labbrouter/drivers/net/e1000/e1000.h --- linux-2.6.10-rc3.orig/drivers/net/e1000/e1000.h 2004-12-04 18:16:53.000000000 +0100 +++ linux-2.6.10-rc3.labbrouter/drivers/net/e1000/e1000.h 2004-12-05 15:12:25.000000000 +0100 @@ -101,7 +101,7 @@ struct e1000_adapter; #define E1000_MAX_INTR 10 /* TX/RX descriptor defines */ -#define E1000_DEFAULT_TXD 256 +#define E1000_DEFAULT_TXD 4096 #define E1000_MAX_TXD 256 #define E1000_MIN_TXD 80 #define E1000_MAX_82544_TXD 4096 @@ -187,6 +187,7 @@ struct e1000_desc_ring { /* board specific private data structure */ struct e1000_adapter { + struct timer_list tx_cleanup_timer; struct timer_list tx_fifo_stall_timer; struct timer_list watchdog_timer; struct timer_list phy_info_timer; @@ -222,6 +223,7 @@ struct e1000_adapter { uint32_t tx_fifo_size; atomic_t tx_fifo_stall; boolean_t pcix_82544; + boolean_t tx_cleanup_scheduled; /* RX */ struct e1000_desc_ring rx_ring; diff -X /home/gandalf/dontdiff.ny -urNp linux-2.6.10-rc3.orig/drivers/net/e1000/e1000_hw.h linux-2.6.10-rc3.labbrouter/drivers/net/e1000/e1000_hw.h --- linux-2.6.10-rc3.orig/drivers/net/e1000/e1000_hw.h 2004-12-04 18:16:53.000000000 +0100 +++ linux-2.6.10-rc3.labbrouter/drivers/net/e1000/e1000_hw.h 2004-12-05 15:37:50.000000000 +0100 @@ -417,14 +417,12 @@ int32_t e1000_set_d3_lplu_state(struct e /* This defines the bits that are set in the Interrupt Mask * Set/Read Register. 
Each bit is documented below: * o RXT0 = Receiver Timer Interrupt (ring 0) - * o TXDW = Transmit Descriptor Written Back * o RXDMT0 = Receive Descriptor Minimum Threshold hit (ring 0) * o RXSEQ = Receive Sequence Error * o LSC = Link Status Change */ #define IMS_ENABLE_MASK ( \ E1000_IMS_RXT0 | \ - E1000_IMS_TXDW | \ E1000_IMS_RXDMT0 | \ E1000_IMS_RXSEQ | \ E1000_IMS_LSC) diff -X /home/gandalf/dontdiff.ny -urNp linux-2.6.10-rc3.orig/drivers/net/e1000/e1000_main.c linux-2.6.10-rc3.labbrouter/drivers/net/e1000/e1000_main.c --- linux-2.6.10-rc3.orig/drivers/net/e1000/e1000_main.c 2004-12-05 14:59:19.000000000 +0100 +++ linux-2.6.10-rc3.labbrouter/drivers/net/e1000/e1000_main.c 2004-12-05 15:40:11.000000000 +0100 @@ -131,7 +131,7 @@ static int e1000_set_mac(struct net_devi static void e1000_irq_disable(struct e1000_adapter *adapter); static void e1000_irq_enable(struct e1000_adapter *adapter); static irqreturn_t e1000_intr(int irq, void *data, struct pt_regs *regs); -static boolean_t e1000_clean_tx_irq(struct e1000_adapter *adapter); +static void e1000_clean_tx(unsigned long data); #ifdef CONFIG_E1000_NAPI static int e1000_clean(struct net_device *netdev, int *budget); static boolean_t e1000_clean_rx_irq(struct e1000_adapter *adapter, @@ -286,6 +286,7 @@ e1000_down(struct e1000_adapter *adapter e1000_irq_disable(adapter); free_irq(adapter->pdev->irq, netdev); + del_timer_sync(&adapter->tx_cleanup_timer); del_timer_sync(&adapter->tx_fifo_stall_timer); del_timer_sync(&adapter->watchdog_timer); del_timer_sync(&adapter->phy_info_timer); @@ -522,6 +523,10 @@ e1000_probe(struct pci_dev *pdev, e1000_get_bus_info(&adapter->hw); + init_timer(&adapter->tx_cleanup_timer); + adapter->tx_cleanup_timer.function = &e1000_clean_tx; + adapter->tx_cleanup_timer.data = (unsigned long) adapter; + init_timer(&adapter->tx_fifo_stall_timer); adapter->tx_fifo_stall_timer.function = &e1000_82547_tx_fifo_stall; adapter->tx_fifo_stall_timer.data = (unsigned long) adapter; @@ -882,19 +887,16 @@ 
e1000_configure_tx(struct e1000_adapter e1000_config_collision_dist(&adapter->hw); /* Setup Transmit Descriptor Settings for eop descriptor */ - adapter->txd_cmd = E1000_TXD_CMD_IDE | E1000_TXD_CMD_EOP | + adapter->txd_cmd = E1000_TXD_CMD_EOP | E1000_TXD_CMD_IFCS; - if(adapter->hw.mac_type < e1000_82543) - adapter->txd_cmd |= E1000_TXD_CMD_RPS; - else - adapter->txd_cmd |= E1000_TXD_CMD_RS; - /* Cache if we're 82544 running in PCI-X because we'll * need this to apply a workaround later in the send path. */ if(adapter->hw.mac_type == e1000_82544 && adapter->hw.bus_type == e1000_bus_type_pcix) adapter->pcix_82544 = 1; + + E1000_WRITE_REG(&adapter->hw, TXDMAC, 0); } /** @@ -1707,7 +1709,7 @@ e1000_tx_queue(struct e1000_adapter *ada wmb(); tx_ring->next_to_use = i; - E1000_WRITE_REG(&adapter->hw, TDT, i); + /* E1000_WRITE_REG(&adapter->hw, TDT, i); */ } /** @@ -1809,6 +1811,11 @@ e1000_xmit_frame(struct sk_buff *skb, st return NETDEV_TX_LOCKED; } + if(!adapter->tx_cleanup_scheduled) { + adapter->tx_cleanup_scheduled = TRUE; + mod_timer(&adapter->tx_cleanup_timer, jiffies + 1); + } + /* need: count + 2 desc gap to keep tail from touching * head, otherwise try next time */ if(E1000_DESC_UNUSED(&adapter->tx_ring) < count + 2) { @@ -1845,6 +1852,7 @@ e1000_xmit_frame(struct sk_buff *skb, st netdev->trans_start = jiffies; spin_unlock_irqrestore(&adapter->tx_lock, flags); + return NETDEV_TX_OK; } @@ -2140,8 +2148,7 @@ e1000_intr(int irq, void *data, struct p } #else for(i = 0; i < E1000_MAX_INTR; i++) - if(unlikely(!e1000_clean_rx_irq(adapter) & - !e1000_clean_tx_irq(adapter))) + if(unlikely(!e1000_clean_rx_irq(adapter))) break; #endif @@ -2159,18 +2166,15 @@ e1000_clean(struct net_device *netdev, i { struct e1000_adapter *adapter = netdev->priv; int work_to_do = min(*budget, netdev->quota); - int tx_cleaned; int work_done = 0; - tx_cleaned = e1000_clean_tx_irq(adapter); e1000_clean_rx_irq(adapter, &work_done, work_to_do); *budget -= work_done; netdev->quota -= work_done; - 
/* if no Rx and Tx cleanup work was done, exit the polling mode */ - if(!tx_cleaned || (work_done < work_to_do) || - !netif_running(netdev)) { + /* if no Rx cleanup work was done, exit the polling mode */ + if((work_done < work_to_do) || !netif_running(netdev)) { netif_rx_complete(netdev); e1000_irq_enable(adapter); return 0; @@ -2181,66 +2185,76 @@ e1000_clean(struct net_device *netdev, i #endif /** - * e1000_clean_tx_irq - Reclaim resources after transmit completes - * @adapter: board private structure + * e1000_clean_tx - Reclaim resources after transmit completes + * @data: timer callback data (board private structure) **/ -static boolean_t -e1000_clean_tx_irq(struct e1000_adapter *adapter) +static void +e1000_clean_tx(unsigned long data) { + struct e1000_adapter *adapter = (struct e1000_adapter *)data; struct e1000_desc_ring *tx_ring = &adapter->tx_ring; struct net_device *netdev = adapter->netdev; struct pci_dev *pdev = adapter->pdev; - struct e1000_tx_desc *tx_desc, *eop_desc; struct e1000_buffer *buffer_info; - unsigned int i, eop; - boolean_t cleaned = FALSE; + unsigned int i, next; + int size = 0, count = 0; + uint32_t tx_head; - i = tx_ring->next_to_clean; - eop = tx_ring->buffer_info[i].next_to_watch; - eop_desc = E1000_TX_DESC(*tx_ring, eop); + spin_lock(&adapter->tx_lock); - while(eop_desc->upper.data & cpu_to_le32(E1000_TXD_STAT_DD)) { - for(cleaned = FALSE; !cleaned; ) { - tx_desc = E1000_TX_DESC(*tx_ring, i); - buffer_info = &tx_ring->buffer_info[i]; + E1000_WRITE_REG(&adapter->hw, TDT, tx_ring->next_to_use); - if(likely(buffer_info->dma)) { - pci_unmap_page(pdev, - buffer_info->dma, - buffer_info->length, - PCI_DMA_TODEVICE); - buffer_info->dma = 0; - } + tx_head = E1000_READ_REG(&adapter->hw, TDH); - if(buffer_info->skb) { - dev_kfree_skb_any(buffer_info->skb); - buffer_info->skb = NULL; - } + i = next = tx_ring->next_to_clean; - tx_desc->buffer_addr = 0; - tx_desc->lower.data = 0; - tx_desc->upper.data = 0; + while(i != tx_head) { + size++; + 
if(i == tx_ring->buffer_info[next].next_to_watch) { + count += size; + size = 0; + if(unlikely(++i == tx_ring->count)) + i = 0; + next = i; + } else { + if(unlikely(++i == tx_ring->count)) + i = 0; + } + } - cleaned = (i == eop); - if(unlikely(++i == tx_ring->count)) i = 0; + i = tx_ring->next_to_clean; + while(count--) { + buffer_info = &tx_ring->buffer_info[i]; + + if(likely(buffer_info->dma)) { + pci_unmap_page(pdev, + buffer_info->dma, + buffer_info->length, + PCI_DMA_TODEVICE); + buffer_info->dma = 0; } - - eop = tx_ring->buffer_info[i].next_to_watch; - eop_desc = E1000_TX_DESC(*tx_ring, eop); + + if(buffer_info->skb) { + dev_kfree_skb_any(buffer_info->skb); + buffer_info->skb = NULL; + } + + if(unlikely(++i == tx_ring->count)) + i = 0; } tx_ring->next_to_clean = i; - spin_lock(&adapter->tx_lock); + if(E1000_DESC_UNUSED(tx_ring) != tx_ring->count) + mod_timer(&adapter->tx_cleanup_timer, jiffies + 1); + else + adapter->tx_cleanup_scheduled = FALSE; - if(unlikely(cleaned && netif_queue_stopped(netdev) && - netif_carrier_ok(netdev))) + if(unlikely(netif_queue_stopped(netdev) && netif_carrier_ok(netdev))) netif_wake_queue(netdev); spin_unlock(&adapter->tx_lock); - - return cleaned; } /** /Martin From gandalf@wlug.westbo.se Sun Dec 5 08:48:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 08:49:04 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5Gmwrk024151 for ; Sun, 5 Dec 2004 08:48:59 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 98CEC2C0069; Sun, 5 Dec 2004 17:48:35 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 211292C0069; Sun, 5 Dec 2004 17:48:35 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 61B582C0056; Sun, 5 Dec 2004 17:48:34 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se 
(Postfix) with ESMTP id 98F033FA7; Sun, 5 Dec 2004 17:48:34 +0100 (CET) Date: Sun, 5 Dec 2004 17:48:34 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: Message-ID: References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12434 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev

On Sun, 5 Dec 2004, Martin Josefsson wrote:

> The delayed TDT updating was a test and currently it delays the first tx'd
> packet after a timerrun 1ms.

I removed the delayed TDT updating and gave it another go (this is Scott's
patch + prefetching):

 60  1486193
 64  1267639
 68  1259682
 72  1243997
 76  1243989
 80  1153608
 84  1123813
 88  1115047
 92  1076636
 96  1040792
100  1007252
104   975806
108   946263
112   918456
116   892227
120   867477
124   844052
128   821858

It gives slightly different results: 60 bytes is OK, but then it drops a lot
at 64 bytes and the curve seems a bit flatter.

This should be the same driver that Lennert got 1.03Mpps with.
I get 1.03Mpps without prefetching.
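As a rough cross-check on figures like these, the theoretical gigabit Ethernet packet rate can be computed from the on-wire frame size. This is a minimal sketch, assuming the packet sizes quoted in the thread exclude the 4-byte FCS (as pktgen's packet size normally does) and that each frame also costs 8 bytes of preamble/SFD and 12 bytes of inter-frame gap. The helper name `line_rate_pps` is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame overhead on the wire, in bytes. */
enum {
	ETH_FCS          = 4,   /* frame check sequence appended by the NIC */
	ETH_PREAMBLE_SFD = 8,   /* preamble + start-of-frame delimiter */
	ETH_IFG          = 12   /* minimum inter-frame gap */
};

/*
 * Theoretical maximum packets per second on a link of link_bps bits/s
 * for packets of pkt_size_no_fcs bytes (size as seen by software,
 * i.e. without the FCS). Hypothetical helper, not taken from any driver.
 */
static uint64_t line_rate_pps(uint64_t link_bps, uint32_t pkt_size_no_fcs)
{
	uint64_t wire_bits;

	assert(link_bps > 0);
	wire_bits = (uint64_t)(pkt_size_no_fcs + ETH_FCS +
			       ETH_PREAMBLE_SFD + ETH_IFG) * 8;
	return link_bps / wire_bits;
}
```

Under those assumptions, `line_rate_pps(1000000000ULL, 60)` comes out at 1488095 and `line_rate_pps(1000000000ULL, 128)` at 822368, so the 60-byte and 128-byte rows above would be within about 0.2% of wire speed, while the 64-byte row (limit 1420454) would not.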

I tried using both ports on the 82546GB NIC.

        delay       nodelay
1CPU    1.95 Mpps   1.76 Mpps
2CPU    1.60 Mpps   1.44 Mpps

All tests were performed on an SMP kernel; 1CPU vs 2CPU above just refers to
how the two NICs were bound to the CPUs. And there are no tx-interrupts at
all, due to Scott's patch.

/Martin

From buytenh@wantstofly.org Sun Dec 5 09:00:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:00:33 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5H0SRM024783 for ; Sun, 5 Dec 2004 09:00:29 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id A8CBC2B0ED; Sun, 5 Dec 2004 18:00:06 +0100 (MET) Date: Sun, 5 Dec 2004 18:00:06 +0100 From: Lennert Buytenhek To: Martin Josefsson Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205170006.GI647@xi.wantstofly.org> References: <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205151545.GC647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 12435 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev

On Sun, Dec 05, 2004 at 04:30:47PM +0100, Martin Josefsson wrote:

> > I verified that I get the same results on a small wimpy 82540EM
> > that runs at 32/66 as well. Just about to see what I get at 32/33
> > with that card.
> > Just tested the 82540EM at 32/33 and it's a big diffrence. > > 60 350229 > 64 247037 > 68 219643 > 72 218205 > 76 216786 > 80 215386 > 84 214003 > 88 212638 > 92 211291 > 96 210004 > 100 208647 > 104 182461 > 108 181468 > 112 180453 > 116 179482 > 120 185472 > 124 188336 > 128 153743 With or without prefetching? My 82540 in 32/33 mode gets on baseline 2.6.9: 60 431967 61 431311 62 431927 63 427827 64 427482 And with Scott's notxints patch: 60 514496 61 514493 62 514754 63 504629 64 504123 > Sorry, forgot to answer your other questions, I'm a bit excited at the > moment :) Makes sense :) > The 64/66 bus on this motherboard is directly connected to the > northbridge. Your lspci output seems to suggest there is another PCI bridge in between (00:10.0) Basically on my box, it's CPU - MCH - P64H2 - e1000, where MCH is the 'Memory Controller Hub' and P64H2 the PCI-X bridge chip. > I have no idea how expensive an MMIO read is on this machine, do you have > an relatively easy way to find out? A dirty way, yes ;-) Open up e1000_osdep.h and do: -#define E1000_READ_REG(a, reg) ( \ - readl((a)->hw_addr + \ - (((a)->mac_type >= e1000_82543) ? E1000_##reg : E1000_82542_##reg))) +#define E1000_READ_REG(a, reg) ({ \ + unsigned long s, e, d, v; \ +\ + (a)->mmio_reads++; \ + rdtsc(s, d); \ + v = readl((a)->hw_addr + \ + (((a)->mac_type >= e1000_82543) ? E1000_##reg : E1000_82542_##reg)); \ + rdtsc(e, d); \ + e -= s; \ + printk(KERN_INFO "e1000: MMIO read took %ld clocks\n", e); \ + printk(KERN_INFO "e1000: in process %d(%s)\n", current->pid, current->comm); \ + dump_stack(); \ + v; \ +}) You might want to disable the stack dump of course. 
--L From gandalf@wlug.westbo.se Sun Dec 5 09:02:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:02:11 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5H26di025153 for ; Sun, 5 Dec 2004 09:02:07 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 200942C002B; Sun, 5 Dec 2004 18:01:44 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id A95EA2C0056; Sun, 5 Dec 2004 18:01:43 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id E77DC2C002B; Sun, 5 Dec 2004 18:01:42 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id 31EC33FA7; Sun, 5 Dec 2004 18:01:43 +0100 (CET) Date: Sun, 5 Dec 2004 18:01:43 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: Message-ID: References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12436 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Martin Josefsson wrote: > I removed the delayed TDT updating and gave it a go again (this is scott + > prefetching): > > 60 1486193 > 64 1267639 > 68 1259682 Yet another mail, I hope you are using a NAPI-enabled MUA :) This time I tried vanilla + prefetch and it gave pretty nice performance as well: 60 1308047 64 1076044 68 1079377 72 1058993 76 1055708 80 1025659 84 1024692 88 1024236 92 1024510 96 1012853 100 1007925 104 976500 108 947061 112 919169 116 892804 120 868084 124 844609 128 822381 Large gap between 60 and 64byte, maybe the prefetching only prefetches 32bytes at a time? As a reference: here's a completely vanilla e1000 driver: 60 860931 64 772949 68 754738 72 754200 76 756093 80 756398 84 742111 88 738120 92 740426 96 739720 100 722322 104 729287 108 719312 112 723171 116 705551 120 704843 124 704622 128 665863 /Martin From gandalf@wlug.westbo.se Sun Dec 5 09:12:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:12:23 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5HCJaq025640 for ; Sun, 5 Dec 2004 09:12:19 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 62A582C002B; Sun, 5 Dec 2004 18:11:56 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id E0C652C0056; Sun, 5 Dec 2004 18:11:55 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 2BD692C002B; Sun, 5 Dec 2004 18:11:55 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id 68B043FA7; Sun, 5 Dec 2004 18:11:55 +0100 (CET) Date: Sun, 5 Dec 2004 18:11:55 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , 
P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: <20041205170006.GI647@xi.wantstofly.org> Message-ID: References: <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205151545.GC647@xi.wantstofly.org> <20041205170006.GI647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12437 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Lennert Buytenhek wrote: > > Just tested the 82540EM at 32/33 and it's a big diffrence. > > > > 60 350229 > > 64 247037 > > 68 219643 [snip] > With or without prefetching? My 82540 in 32/33 mode gets on baseline > 2.6.9: With, will test without. I've always suspected that the 32bit bus on this motherboard is a bit slow. > Your lspci output seems to suggest there is another PCI bridge in > between (00:10.0) Yes it sits between the 32bit and the 64bit bus. > Basically on my box, it's CPU - MCH - P64H2 - e1000, where MCH is the > 'Memory Controller Hub' and P64H2 the PCI-X bridge chip. I don't have PCI-X (unless 64/66 counts as PCI-x which I highly doubt) > > I have no idea how expensive an MMIO read is on this machine, do you have > > an relatively easy way to find out? > > A dirty way, yes ;-) Open up e1000_osdep.h and do: > > -#define E1000_READ_REG(a, reg) ( \ > - readl((a)->hw_addr + \ > - (((a)->mac_type >= e1000_82543) ? 
E1000_##reg : E1000_82542_##reg))) > +#define E1000_READ_REG(a, reg) ({ \ > + unsigned long s, e, d, v; \ > +\ > + (a)->mmio_reads++; \ > + rdtsc(s, d); \ > + v = readl((a)->hw_addr + \ > + (((a)->mac_type >= e1000_82543) ? E1000_##reg : E1000_82542_##reg)); \ > + rdtsc(e, d); \ > + e -= s; \ > + printk(KERN_INFO "e1000: MMIO read took %ld clocks\n", e); \ > + printk(KERN_INFO "e1000: in process %d(%s)\n", current->pid, current->comm); \ > + dump_stack(); \ > + v; \ > +}) > > You might want to disable the stack dump of course. Will test this in a while. /Martin From gandalf@wlug.westbo.se Sun Dec 5 09:38:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:38:34 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5HcTvp026395 for ; Sun, 5 Dec 2004 09:38:30 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id EAD6C2C0056; Sun, 5 Dec 2004 18:38:06 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 6FC002C006F; Sun, 5 Dec 2004 18:38:06 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id AA2732C0056; Sun, 5 Dec 2004 18:38:05 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id E45B13FA7; Sun, 5 Dec 2004 18:38:05 +0100 (CET) Date: Sun, 5 Dec 2004 18:38:05 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: Message-ID: References: <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> 
<20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205151545.GC647@xi.wantstofly.org> <20041205170006.GI647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12438 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Martin Josefsson wrote: > > -#define E1000_READ_REG(a, reg) ( \ > > - readl((a)->hw_addr + \ > > - (((a)->mac_type >= e1000_82543) ? E1000_##reg : E1000_82542_##reg))) > > +#define E1000_READ_REG(a, reg) ({ \ > > + unsigned long s, e, d, v; \ > > +\ > > + (a)->mmio_reads++; \ > > + rdtsc(s, d); \ > > + v = readl((a)->hw_addr + \ > > + (((a)->mac_type >= e1000_82543) ? E1000_##reg : E1000_82542_##reg)); \ > > + rdtsc(e, d); \ > > + e -= s; \ > > + printk(KERN_INFO "e1000: MMIO read took %ld clocks\n", e); \ > > + printk(KERN_INFO "e1000: in process %d(%s)\n", current->pid, current->comm); \ > > + dump_stack(); \ > > + v; \ > > +}) > > > > You might want to disable the stack dump of course. > > Will test this in a while. It gives pretty varied results. This is during a pktgen run. 
The machine is an Athlon MP 2000+ which operated at 1667 MHz e1000: MMIO read took 481 clocks e1000: MMIO read took 369 clocks e1000: MMIO read took 481 clocks e1000: MMIO read took 11 clocks e1000: MMIO read took 477 clocks e1000: MMIO read took 316 clocks e1000: MMIO read took 481 clocks e1000: MMIO read took 316 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 332 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 372 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 11 clocks e1000: MMIO read took 481 clocks e1000: MMIO read took 388 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 11 clocks e1000: MMIO read took 485 clocks e1000: MMIO read took 317 clocks e1000: MMIO read took 481 clocks e1000: MMIO read took 337 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 316 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 409 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 334 clocks e1000: MMIO read took 481 clocks e1000: MMIO read took 316 clocks e1000: MMIO read took 480 clocks e1000: MMIO read took 11 clocks e1000: MMIO read took 505 clocks e1000: MMIO read took 359 clocks e1000: MMIO read took 484 clocks e1000: MMIO read took 337 clocks e1000: MMIO read took 464 clocks e1000: MMIO read took 504 clocks /Martin From buytenh@wantstofly.org Sun Dec 5 09:44:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:44:27 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5HiNlA026845 for ; Sun, 5 Dec 2004 09:44:23 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 605F12B0ED; Sun, 5 Dec 2004 18:44:01 +0100 (MET) Date: Sun, 5 Dec 2004 18:44:01 +0100 From: Lennert Buytenhek To: Martin Josefsson Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , 
netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205174401.GJ647@xi.wantstofly.org> References: <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 12439 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 04:42:34PM +0100, Martin Josefsson wrote: > The delayed TDT updating was a test and currently it delays the first tx'd > packet after a timerrun 1ms. > > Would be interesting to see what other people get with this thing. > Lennert? I took Scott's notxints patch, added the prefetch bits and moved the TDT updating to e1000_clean_tx as you did. 
Slightly better than before, but not much: 60 1070157 61 1066610 62 1062088 63 991447 64 991546 65 991537 66 991449 67 990857 68 989882 69 991347 Regular TDT updating: 60 1037469 61 1038425 62 1037393 63 993143 64 992156 65 993137 66 992203 67 992165 68 992185 69 988249 --L From buytenh@wantstofly.org Sun Dec 5 09:51:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:52:00 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5HptqK027333 for ; Sun, 5 Dec 2004 09:51:55 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 8915E2B0ED; Sun, 5 Dec 2004 18:51:33 +0100 (MET) Date: Sun, 5 Dec 2004 18:51:33 +0100 From: Lennert Buytenhek To: Martin Josefsson Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205175133.GK647@xi.wantstofly.org> References: <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041205174401.GJ647@xi.wantstofly.org> User-Agent: Mutt/1.4.1i X-archive-position: 12440 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 06:44:01PM +0100, Lennert Buytenhek wrote: > On Sun, Dec 05, 2004 at 04:42:34PM +0100, Martin Josefsson wrote: > > > The delayed TDT updating 
was a test and currently it delays the first tx'd > > packet after a timerrun 1ms. > > > > Would be interesting to see what other people get with this thing. > > Lennert? > > I took Scott's notxints patch, added the prefetch bits and moved the > TDT updating to e1000_clean_tx as you did. > > Slightly better than before, but not much: I've tested all packet sizes now, and delayed TDT updating once per jiffy (instead of once per packet) indeed gives about 25kpps more on 60,61,62 byte packets, and is hardly worth it for bigger packets. --L From gandalf@wlug.westbo.se Sun Dec 5 09:54:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:54:36 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5HsWCS027717 for ; Sun, 5 Dec 2004 09:54:32 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id D26722C0056; Sun, 5 Dec 2004 18:54:08 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 618D92C006F; Sun, 5 Dec 2004 18:54:08 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 9FA372C0056; Sun, 5 Dec 2004 18:54:07 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id DF7DA3FA7; Sun, 5 Dec 2004 18:54:07 +0100 (CET) Date: Sun, 5 Dec 2004 18:54:07 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: <20041205175133.GK647@xi.wantstofly.org> Message-ID: References: <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> 
<1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12441 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Lennert Buytenhek wrote: > I've tested all packet sizes now, and delayed TDT updating once per jiffy > (instead of once per packet) indeed gives about 25kpps more on 60,61,62 > byte packets, and is hardly worth it for bigger packets. Maybe we can't see any real gains here now, I wonder if it has any effect if you have lots of nics on the same bus. I mean, in theory it saves a whole lot of traffic on the bus. 
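The bus-traffic argument can be illustrated with a toy user-space model of the tail ("doorbell") register. This is only a sketch: every name below is hypothetical, nothing here is e1000 code, and it ignores ring-full handling and the extra latency the deferred write introduces (up to one timer tick in the patch discussed above).

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a transmit ring and its tail register (hypothetical). */
struct tx_ring_model {
	uint32_t count;        /* number of descriptors in the ring */
	uint32_t next_to_use;  /* software producer index */
	uint32_t hw_tail;      /* last value written to the tail register */
	uint64_t mmio_writes;  /* number of tail writes issued so far */
};

/* Stand-in for the MMIO write that publishes new descriptors to the NIC. */
static void write_tail(struct tx_ring_model *r)
{
	r->hw_tail = r->next_to_use;
	r->mmio_writes++;
}

/* Queue one descriptor; write the tail immediately only if write_now is set. */
static void queue_one(struct tx_ring_model *r, int write_now)
{
	assert(r->count != 0);
	if (++r->next_to_use == r->count)
		r->next_to_use = 0;   /* wrap around the ring */
	if (write_now)
		write_tail(r);
}

/* Deferred doorbell: a single write covers everything queued since the last one. */
static void flush_tail(struct tx_ring_model *r)
{
	if (r->hw_tail != r->next_to_use)
		write_tail(r);
}
```

In this model, queueing 100 descriptors with `write_now` set costs 100 tail writes, while queueing them with it clear and calling `flush_tail()` once costs a single write and leaves the tail at the same value. Whether that saving matters on real hardware is exactly what the measurements in this thread are probing.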

/Martin

From buytenh@wantstofly.org Sun Dec 5 09:58:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 09:58:44 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5HwdJ9028214 for ; Sun, 5 Dec 2004 09:58:40 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id CEB422B0ED; Sun, 5 Dec 2004 18:58:17 +0100 (MET) Date: Sun, 5 Dec 2004 18:58:17 +0100 From: Lennert Buytenhek To: Martin Josefsson Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205175817.GL647@xi.wantstofly.org> References: <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 12442 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev

On Sun, Dec 05, 2004 at 05:48:34PM +0100, Martin Josefsson wrote:

> I tried using both ports on the 82546GB nic.
>
>         delay       nodelay
> 1CPU    1.95 Mpps   1.76 Mpps
> 2CPU    1.60 Mpps   1.44 Mpps

I get:

        delay     nodelay
1CPU    1837356   1837330
2CPU    2035060   1947424

So in your case using 2 CPUs degrades performance, in my case it
increases it.

And TDT delaying/coalescing only improves performance when using 2 CPUs,
and even then only slightly (and only for <= 62B packets.)
--L From buytenh@wantstofly.org Sun Dec 5 10:14:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 10:14:56 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5IEpkJ028833 for ; Sun, 5 Dec 2004 10:14:51 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 5E6622B0ED; Sun, 5 Dec 2004 19:14:29 +0100 (MET) Date: Sun, 5 Dec 2004 19:14:29 +0100 From: Lennert Buytenhek To: Martin Josefsson Cc: Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205181429.GM647@xi.wantstofly.org> References: <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205151545.GC647@xi.wantstofly.org> <20041205170006.GI647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-archive-position: 12443 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 06:38:05PM +0100, Martin Josefsson wrote: > e1000: MMIO read took 481 clocks > e1000: MMIO read took 369 clocks > e1000: MMIO read took 481 clocks > e1000: MMIO read took 11 clocks > e1000: MMIO read took 477 clocks > e1000: MMIO read took 316 clocks Interesting. On a 1667MHz CPU, this is around ~0.28us per MMIO read in the worst case. On my hardware (dual Xeon 2.4GHz), the best case I've ever seen was ~0.83us. This alone can make a hell of a difference, esp. for 60B packets. 
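The conversion behind these figures is simple enough to spell out. The following is a small sketch with made-up helper names: it turns a cycle count into nanoseconds for a given clock, and gives the time available per packet at a given packet rate.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a TSC cycle count to nanoseconds for a CPU clocked at cpu_mhz.
 * cycles / MHz yields microseconds, so scale by 1000 first to keep
 * integer precision. Hypothetical helper for illustration only.
 */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t cpu_mhz)
{
	assert(cpu_mhz != 0);
	return (cycles * 1000u) / cpu_mhz;
}

/* Time budget per packet, in nanoseconds, at a given packets-per-second rate. */
static uint64_t ns_per_packet(uint64_t pps)
{
	assert(pps != 0);
	return 1000000000ULL / pps;
}
```

With these, 481 clocks at 1667 MHz works out to 288 ns (truncated), and 0.83us on a 2.4 GHz CPU corresponds to roughly 1992 clocks. At 1 Mpps the whole per-packet budget is 1000 ns, so a single MMIO read of either size is a large fraction of it.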
--L From manfred@colorfullife.com Sun Dec 5 10:26:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 10:26:21 -0800 (PST) Received: from dbl.q-ag.de (dbl.q-ag.de [213.172.117.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5IQE1U029278 for ; Sun, 5 Dec 2004 10:26:15 -0800 Received: from [127.0.0.2] (dbl [127.0.0.1]) by dbl.q-ag.de (8.12.3/8.12.3/Debian-6.6) with ESMTP id iB5IPmSL015073; Sun, 5 Dec 2004 19:25:49 +0100 Message-ID: <41B352AB.4020700@colorfullife.com> Date: Sun, 05 Dec 2004 19:25:47 +0100 From: Manfred Spraul User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Lennert Buytenhek , Netdev , Martin Josefsson Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Content-Type: multipart/mixed; boundary="------------090208040808010503040206" X-archive-position: 12444 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: manfred@colorfullife.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------090208040808010503040206 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Lennert wrote: > A dirty way, yes ;-) Open up e1000_osdep.h and do: > > -#define E1000_READ_REG(a, reg) ( \ > - readl((a)->hw_addr + \ > - (((a)->mac_type >= e1000_82543) ? E1000_##reg : E1000_82542_##reg))) > +#define E1000_READ_REG(a, reg) ({ \ > + unsigned long s, e, d, v; \ > +\ > + (a)->mmio_reads++; \ > + rdtsc(s, d); \ > + v = readl((a)->hw_addr + \ > + (((a)->mac_type >= e1000_82543) ? 
E1000_##reg : E1000_82542_##reg)); \ > + rdtsc(e, d); \ > + e -= s; \ > + printk(KERN_INFO "e1000: MMIO read took %ld clocks\n", e); \ > + printk(KERN_INFO "e1000: in process %d(%s)\n", current->pid, current->comm); \ > + dump_stack(); \ > + v; \ > +}) Too dirty: rdtsc is not serializing, thus my Opteron happily reorders the read and the rdtsc and reports 9 cycles. Attached is a longer patch that I usually use for microbenchmarks. I get around 506 cycles with it for an Opteron 2 GHz to the nForce 250 Gb nic (i.e. integrated nic in the chipset, just one HT hop): Results - zero - shift 0 40: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 :0 0 63 0 0 0 0 0 0 0 0 0 0 0 0 0 1e0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 :0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 Overflows: 0. Sum: 100 >>>>>>>>>>> benchmark overhead: 82 cycles ** reading register e08920b4 Results - readl - shift 0 240: 0 0 b 0 0 0 0 0 0 0 0 0 32 0 1 1 :0 0 0 0 0 0 a 0 0 0 0 0 0 0 0 0 260: 1a 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 :0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 300: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 :0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 Overflows: 0. Sum: 100 >>>>>>>>>> total: 0x248, i.e. net 506 cycles. 
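One way to read the two histograms above is to take the most populated bucket of each run and subtract the empty-run value from the readl value. The sketch below does that, assuming each row label is the base cycle count of a 32-column row (with the ':' after column 16, as in print_line) and that the counts are printed in hex; the struct and function names are illustrative only and are not taken from the attached patch:

```c
#include <stddef.h>

/* One non-empty histogram bucket: cycle value and number of samples. */
struct bucket {
	unsigned int cycles;
	unsigned int count;
};

/* Non-zero buckets transcribed from the quoted output
 * (row base + column index; counts are hex in the printout). */
static const struct bucket zero_run[] = {
	{ 0x052, 0x63 }, { 0x1f3, 0x01 },
};
static const struct bucket readl_run[] = {
	{ 0x242, 0x0b }, { 0x24c, 0x32 }, { 0x24e, 0x01 }, { 0x24f, 0x01 },
	{ 0x256, 0x0a }, { 0x260, 0x1a }, { 0x315, 0x01 },
};

/* Return the cycle value of the most populated bucket (the mode). */
unsigned int mode_cycles(const struct bucket *h, size_t n)
{
	size_t i, best = 0;

	for (i = 1; i < n; i++)
		if (h[i].count > h[best].count)
			best = i;
	return h[best].cycles;
}

/* Net cost: mode of the readl run minus mode of the empty run. */
unsigned int net_cycles(void)
{
	return mode_cycles(readl_run, sizeof(readl_run) / sizeof(readl_run[0]))
	     - mode_cycles(zero_run, sizeof(zero_run) / sizeof(zero_run[0]));
}
```

Under those assumptions the empty run peaks at 0x52 (82 cycles, matching the reported benchmark overhead), the readl run peaks at 0x24c (588 cycles), and the difference is 506 cycles.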
-- Manfred --------------090208040808010503040206 Content-Type: text/plain; name="patch-perftest-forcedeth" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="patch-perftest-forcedeth" --- 2.6/drivers/net/forcedeth.c 2004-12-05 16:21:28.000000000 +0100 +++ build-2.6/drivers/net/forcedeth.c 2004-12-05 19:18:24.000000000 +0100 @@ -1500,6 +1500,131 @@ enable_irq(dev->irq); } +int p_shift = 0; + +#define STAT_TABLELEN 16384 +static unsigned long totals[STAT_TABLELEN]; +static unsigned int overflows; + +static unsigned long long stime; +static void start_measure(void) +{ + __asm__ __volatile__ ( + ".align 64\n\t" + "pushal\n\t" + "cpuid\n\t" + "popal\n\t" + "rdtsc\n\t" + "movl %%eax,(%0)\n\t" + "movl %%edx,4(%0)\n\t" + : /* no output */ + : "c"(&stime) + : "eax", "edx", "memory" ); +} + +static void end_measure(void) +{ +static unsigned long long etime; + __asm__ __volatile__ ( + "pushal\n\t" + "cpuid\n\t" + "popal\n\t" + "rdtsc\n\t" + "movl %%eax,(%0)\n\t" + "movl %%edx,4(%0)\n\t" + : /* no output */ + : "c"(&etime) + : "eax", "edx", "memory" ); + { + unsigned long time = (unsigned long)(etime-stime); + time >>= p_shift; + if(time < STAT_TABLELEN) { + totals[time]++; + } else { + overflows++; + } + } +} + +static void clean_buf(void) +{ + memset(totals,0,sizeof(totals)); + overflows = 0; +} + +static void print_line(unsigned long* array) +{ + int i; + for(i=0;i<32;i++) { + if((i%32)==16) + printk(":"); + printk("%lx ",array[i]); + } +} + +static void print_buf(char* caption) +{ + int i, other = 0; + printk("Results - %s - shift %d", + caption, p_shift); + + for(i=0;ioom_kick, jiffies + OOM_REFILL); spin_unlock_irq(&np->lock); + bench_readl(base + NvRegMulticastAddrB); + bench_readl(base + NvRegIrqStatus); return 0; out_drain: drain_ring(dev); --------------090208040808010503040206-- From sfeldma@pobox.com Sun Dec 5 12:32:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 12:32:35 -0800 (PST) Received: from orb.pobox.com 
(orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5KWU19005463 for ; Sun, 5 Dec 2004 12:32:31 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 918A22F9DFD; Sun, 5 Dec 2004 15:32:08 -0500 (EST) Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 144442F9C4F; Sun, 5 Dec 2004 15:32:06 -0500 (EST) Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 From: Scott Feldman Reply-To: sfeldma@pobox.com To: Peter Kjellstroem Cc: netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Message-Id: <1102278893.3343.116.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sun, 05 Dec 2004 12:34:53 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12445 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 06:04, Peter Kjellstroem wrote: > I've been doing some more tests today to try to narrow down which patch > causes my problems. So far I have this: > > 2.4.26 (5.2.30) is ok > 2.4.28 (5.4.11) is not > > 2.4.28 with 5.2.30 patched in is ok > 2.4.28 with 5.2.39 (from 2.4.27-pre1) is not > > In e1000_main.c for 5.2.39 (from 2.4.27-pre1) I found this: > > /* Change Log > * > * 5.2.39 3/12/04 > * ... > * o Back out the CSA fix for 82547 as it continues to cause > * systems lock-ups with production systems. Yes, there was a driver "fix" for this problem that has since been pulled out of the production driver because it caused lockups on some systems. I have one such system. 
Here's the results on my systems with an 82547EI: 5.2.22 lockup 5.2.30.1 lockup 5.2.39 NETDEV reset 5.2.52 NETDEV reset 5.4.11 NETDEV reset For you, with an 82547GI, any driver between 5.2.22 and 5.2.30.1 will work because it has the "fix". See the comment in e1000_intr for these drivers. So I guess you're not out of luck if you use the 5.2.30 driver. You just can't move forward to a newer driver unless you port the "fix" forward. -scott From sfeldma@pobox.com Sun Dec 5 13:10:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 13:10:12 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5LA8Ov006419 for ; Sun, 5 Dec 2004 13:10:08 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 184B12F8E28; Sun, 5 Dec 2004 16:09:46 -0500 (EST) Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 4307A2F9BE6; Sun, 5 Dec 2004 16:09:35 -0500 (EST) Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) From: Scott Feldman Reply-To: sfeldma@pobox.com To: Martin Josefsson Cc: Lennert Buytenhek , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: References: <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> Content-Type: text/plain Message-Id: <1102281141.3343.138.camel@sfeldma-mobl.dsl-verizon.net> 
Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sun, 05 Dec 2004 13:12:22 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12446 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 07:03, Martin Josefsson wrote: > BUT if I use the above + prefetching I get this: > > 60 1483890 Ok, proof that we can get to 1.4Mpps! That's the good news. The bad news is that prefetching is potentially buggy, as pointed out in the FreeBSD note. Buggy as in the controller may hang. Sorry, I don't have details on what conditions are necessary to cause a hang. Would Martin or Lennert run these tests for a longer duration so we can get some data, maybe adding in Rx? It could be that with the Tx interrupts and descriptor write-backs removed, prefetching is ok. I don't know. Intel? Also, wouldn't it be great if someone wrote a document capturing all of the accumulated knowledge for future generations? 
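For a sense of how close 1483890 pps is to the wire limit, here is a rough sketch of the line-rate arithmetic. It assumes the "60" in the quoted result is the frame length without the 4-byte FCS and that the link runs at 1 Gbit/s; the macro and function names are made up for illustration:

```c
/* Per-frame overhead on the wire, in bytes (standard Ethernet). */
#define WIRE_FCS_BYTES       4	/* frame check sequence */
#define WIRE_PREAMBLE_BYTES  8	/* preamble + start-of-frame delimiter */
#define WIRE_IFG_BYTES      12	/* minimum inter-frame gap */

/* Theoretical maximum frames per second for frames of frame_len bytes
 * (length excluding the FCS) on a link of link_bps bits per second. */
double max_pps(double link_bps, unsigned int frame_len)
{
	unsigned int wire_bytes = frame_len + WIRE_FCS_BYTES +
				  WIRE_PREAMBLE_BYTES + WIRE_IFG_BYTES;

	return link_bps / (8.0 * wire_bytes);
}
```

Under those assumptions max_pps(1e9, 60) is about 1488095 frames per second (84 bytes, or 672 bit times, per frame), so the quoted figure would be roughly 99.7% of gigabit line rate.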
-scott From buytenh@wantstofly.org Sun Dec 5 13:26:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 13:26:26 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5LQLC3007036 for ; Sun, 5 Dec 2004 13:26:21 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 2CEB42B0ED; Sun, 5 Dec 2004 22:25:59 +0100 (MET) Date: Sun, 5 Dec 2004 22:25:59 +0100 From: Lennert Buytenhek To: Scott Feldman Cc: Martin Josefsson , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041205212559.GA4338@xi.wantstofly.org> References: <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <1102281141.3343.138.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102281141.3343.138.camel@sfeldma-mobl.dsl-verizon.net> User-Agent: Mutt/1.4.1i X-archive-position: 12447 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 01:12:22PM -0800, Scott Feldman wrote: > Would Martin or Lennert run these test for a longer duration so we can > get some data, maybe adding in Rx. It could be that removing the Tx > interrupts and descriptor write-backs, prefetching may be ok. I don't > know. Intel? 
What your patch does is (correct me if I'm wrong): - Masking TXDW, effectively preventing it from delivering TXdone ints. - Not setting E1000_TXD_CMD_IDE in the TXD command field, which causes the chip to ignore the TIDV register, which is the 'TX Interrupt Delay Value'. What exactly does this do? - Not setting the "Report Packet Sent"/"Report Status" bits in the TXD command field. Is this the equivalent of the TXdone interrupt? Exactly which bit avoids the descriptor writeback? I'm also a bit worried that only freeing packets 1ms later will mess up socket accounting and such. Any ideas on that? > Also, wouldn't it be great if someone wrote a document capturing all of > the accumulated knowledge for future generations? I'll volunteer for that. --L From cap@nsc.liu.se Sun Dec 5 13:40:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 13:40:12 -0800 (PST) Received: from papput.nsc.liu.se (ns2.nsc.liu.se [130.236.101.9]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5Le5pG007591 for ; Sun, 5 Dec 2004 13:40:06 -0800 Received: from mail.nsc.liu.se (mail.nsc.liu.se [130.236.101.49]) by papput.nsc.liu.se (Postfix) with ESMTP id 3A57B1C31EF; Sun, 5 Dec 2004 22:39:42 +0100 (CET) Date: Sun, 5 Dec 2004 22:39:42 +0100 (CET) From: Peter Kjellstroem To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 In-Reply-To: <1102278893.3343.116.camel@sfeldma-mobl.dsl-verizon.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12448 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cap@nsc.liu.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Scott Feldman wrote: > On Sun, 2004-12-05 at 06:04, Peter Kjellstroem wrote: > > * > > * 5.2.39 3/12/04 > > * ... > > * o Back out the CSA fix for 82547 as it continues to cause > > * systems lock-ups with production systems. 
> Yes, I found that out and verified it earlier today. The fix in question is basically an if'ed irq_enable/disable around a small chunk of code, like this: if(hw->mac_type == e1000_82547 || hw->mac_type == e1000_82547_rev_2) e1000_irq_disable(adapter); This is the original fix, and it triggers for both e1000_82547 (your 547EI, right?) and e1000_82547_rev_2 (my 547GI). If your NIC can't stand the fix and mine needs it, isn't the obvious solution the following? if(hw->mac_type == e1000_82547_rev_2) e1000_irq_disable(adapter); I put this (and the corresponding enable chunk) in the current e1000 in 2.4.28 and it works like a charm (and shouldn't bite your 82547EI). The rev2 part, 82547GI (pci_id 1075), is present in a vast number of new systems, including all (as far as I know) Dell 700, 750, 1750, 1700 and all Supermicro P4SC mobos. Best Regards, Peter > Yes, there was a driver "fix" for this problem that has since been > pulled out of the production driver because it caused lockups on some > systems. I have one of these such systems. Here's the results on my > systems with an 82547EI: > > 5.2.22 lockup > 5.2.30.1 lockup > 5.2.39 NETDEV reset > 5.2.52 NETDEV reset > 5.4.11 NETDEV reset > > For you, with an 82547GI, any driver between 5.2.22 and 5.2.30.1 will > work because it has the "fix". See the comment in e1000_intr for these > drivers. > > So I guess you're not out of luck if you use the 5.2.30 driver. You > just can't move forward to a newer driver unless you port the "fix" > forward. 
> > -scott > > > -- ------------------------------------------------------------ Peter Kjellstroem | E-mail: cap@nsc.liu.se National Supercomputer Centre | Office: +46(0)13 281492 Linkoeping University | Fax : +46(0)13 282535 SE-581 83 Linkoeping | Sweden | http://www.nsc.liu.se From cap@nsc.liu.se Sun Dec 5 13:52:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 13:52:44 -0800 (PST) Received: from papput.nsc.liu.se (ns2.nsc.liu.se [130.236.101.9]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5LqeOo008099 for ; Sun, 5 Dec 2004 13:52:40 -0800 Received: from mail.nsc.liu.se (mail.nsc.liu.se [130.236.101.49]) by papput.nsc.liu.se (Postfix) with ESMTP id 0E6801C31EF; Sun, 5 Dec 2004 22:52:18 +0100 (CET) Date: Sun, 5 Dec 2004 22:52:18 +0100 (CET) From: Peter Kjellstroem To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12449 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cap@nsc.liu.se Precedence: bulk X-list: netdev On Sun, 5 Dec 2004, Peter Kjellstroem wrote: > On Sun, 5 Dec 2004, Scott Feldman wrote: > > > On Sun, 2004-12-05 at 06:04, Peter Kjellstroem wrote: > > > * > > > * 5.2.39 3/12/04 > > > * ... > > > * o Back out the CSA fix for 82547 as it continues to cause > > > * systems lock-ups with production systems. > > > > Yes I found that out and verified it earlier today. The fix in question is > basically if'ed irq_enable/disable around a small chunk of code like this: > > if(hw->mac_type == e1000_82547 || hw->mac_type == e1000_82547_rev_2) > e1000_irq_disable(adapter); > > this is the original fix and triggers for both e1000_82547 (your 547EI > right?) and e1000_82547_rev_2 (my 547GI). If your NIC can't stand the fix > and mine needs it isn't the obvious solution the following? 
> > if(hw->mac_type == e1000_82547_rev_2) > e1000_irq_disable(adapter); > And here's a trivial patch for it (against 2.4.28): /Peter --- linux-2.4.28-cap1p/drivers/net/e1000/e1000_main.c Wed Nov 17 12:54:21 2004 +++ linux-2.4.28-cap5p/drivers/net/e1000/e1000_main.c Sun Dec 5 22:47:57 2004 @@ -2124,10 +2124,19 @@ e1000_intr(int irq, void *data, struct p __netif_rx_schedule(netdev); } #else + /* e1000_irq_disable/enable pair added back (removed in 2.4.27-pre1) */ + /* fixed needed for 82547GI (e1000_82547_rev_2) Dell 750, P4SCi */ + /* reliable operation with ITR=0 */ + if(hw->mac_type == e1000_82547_rev_2) + e1000_irq_disable(adapter); + for(i = 0; i < E1000_MAX_INTR; i++) if(unlikely(!e1000_clean_rx_irq(adapter) & !e1000_clean_tx_irq(adapter))) break; + + if(hw->mac_type == e1000_82547 || hw->mac_type == e1000_82547_rev_2) + e1000_irq_enable(adapter); #endif return IRQ_HANDLED; From kdorn@lilah.hetzel.org Sun Dec 5 14:32:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 14:32:29 -0800 (PST) Received: from lilah.hetzel.org (lilah.hetzel.org [199.250.128.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5MWJpF009145 for ; Sun, 5 Dec 2004 14:32:20 -0800 Received: from lilah.hetzel.org (localhost.localdomain [127.0.0.1]) by lilah.hetzel.org (8.12.11/8.12.8) with ESMTP id iB5NtRkj023027; Sun, 5 Dec 2004 18:55:29 -0500 Received: (from kdorn@localhost) by lilah.hetzel.org (8.12.11/8.12.8/Submit) id iB5NtKNP022968; Sun, 5 Dec 2004 18:55:20 -0500 Date: Sun, 5 Dec 2004 18:55:20 -0500 From: Dorn Hetzel To: Francois Romieu Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com, dorn@hetzel.org Subject: Re: r8169.c Message-ID: <20041205235519.GA21885@lilah.hetzel.org> References: <20041119162920.GA26836@lilah.hetzel.org> <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> <20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> Mime-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041122181307.GA3625@lilah.hetzel.org> User-Agent: Mutt/1.4i X-archive-position: 12450 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@dorn.hetzel.org Precedence: bulk X-list: netdev On Mon, Nov 22, 2004 at 01:13:07PM -0500, Dorn Hetzel wrote: > On Sat, Nov 20, 2004 at 01:29:46AM +0100, Francois Romieu wrote: > > > Once you have applied one of the patch above, the patch below will improve > > your "transmit timed out" (please apply in order and enable NAPI): > > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.10-rc2-mm1/r8169-250.patch > > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.10-rc2-mm1/r8169-255.patch > > > > If things perform better you may want to use bigger frames and apply as > > well r8169-260.patch and r8169-265.patch. > > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.10-rc2-mm1/r8169-260.patch > > http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.10-rc2-mm1/r8169-265.patch > > I was wondering if the above 4 patches have made it into one of the rc? releases, or at least a rc?-mm? ? 
Regards, -Dorn From anton@ozlabs.org Sun Dec 5 15:27:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 15:27:17 -0800 (PST) Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5NR6Sm010454 for ; Sun, 5 Dec 2004 15:27:06 -0800 Received: by ozlabs.org (Postfix, from userid 1010) id C72722BE83; Mon, 6 Dec 2004 10:26:38 +1100 (EST) Date: Mon, 6 Dec 2004 10:22:26 +1100 From: Anton Blanchard To: netdev@oss.sgi.com Cc: ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com, john.ronciak@intel.com Subject: TSO + e1000 Message-ID: <20041205232226.GA5757@krispykreme.ozlabs.ibm.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="ZPt4rx8FFjLCG7dd" Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-archive-position: 12451 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev --ZPt4rx8FFjLCG7dd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Hi, I had another look at our TSO issues. A few things: 1. tcpdump doesn't report local TSO packets, it simply prints bad-len. I think this is because the e1000 driver is zeroing out the total length field of the IP header in e1000_tso: skb->nh.iph->tot_len = 0; Does the card require this for TSO to operate? I've worked around it in tcpdump for the time being. 2. TSO never gets reenabled after a retransmit. With long-lasting connections this hurts: we get one retransmit and it's all over. Not so important for a web server, but we have big CIFS (Windows networking) servers where connections can last days. When it's working, TSO gives us a nice bump here. 3. I'm getting e1000 rx fifo overruns on the receive side. Doubling the flow control watermarks seems to help. Due to bug 2, whenever we get an rx fifo overrun TSO gets disabled on that connection. 4. I'm seeing some strange stuff during connection startup. 
I have bumped win_divisor to 2 (so TSO gets enabled in a reasonable time). There are nice big send and receive socket buffers (512k). I'd expect TSO to cut in fairly quickly. The both-direction test is as I'd expect. It takes a few packets for our slow start window to grow and then we start sending TSO packets (evidenced by the first bad-len packet). Now if we do a single-direction test it takes forever for TSO to kick in. At the moment I'm not sure why this is happening; tcp_current_mss seems to grow correctly in both cases. Is there somewhere else we are capping TSO packets? Finally, I set divisor to 1 and you can see for both tests TSO cuts right in. I can't explain the big difference between divisor = 1 and divisor = 2 yet, any thoughts? Anton --ZPt4rx8FFjLCG7dd Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=divisor_2_both 22:30:12.084166 IP 10.0.0.2.32822 > 10.0.0.1.12865: S 4036823371:4036823371(0) win 5840 22:30:12.084438 IP 10.0.0.1.12865 > 10.0.0.2.32822: S 4047144990:4047144990(0) ack 4036823372 win 5792 22:30:12.084462 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 1 win 23 22:30:12.084484 IP 10.0.0.2.32822 > 10.0.0.1.12865: . 1:1449(1448) ack 1 win 23 22:30:12.084491 IP 10.0.0.2.32822 > 10.0.0.1.12865: . 1449:2897(1448) ack 1 win 23 22:30:12.084509 IP 10.0.0.2.32822 > 10.0.0.1.12865: P 2897:4345(1448) ack 1 win 23 22:30:12.084688 IP 10.0.0.1.12865 > 10.0.0.2.32822: . ack 1449 win 34 22:30:12.084690 IP 10.0.0.1.12865 > 10.0.0.2.32822: . ack 2897 win 46 22:30:12.084691 IP 10.0.0.1.12865 > 10.0.0.2.32822: . ack 4345 win 57 22:30:12.084721 IP 10.0.0.2.32822 > 10.0.0.1.12865: . 4345:5793(1448) ack 1 win 23 22:30:12.084726 IP 10.0.0.2.32822 > 10.0.0.1.12865: . 5793:7241(1448) ack 1 win 23 22:30:12.084938 IP 10.0.0.1.12865 > 10.0.0.2.32822: . ack 5793 win 68 22:30:12.084939 IP 10.0.0.1.12865 > 10.0.0.2.32822: . 
ack 7241 win 80 22:30:12.084975 IP 10.0.0.2.32822 > 10.0.0.1.12865: P 7241:8677(1436) ack 1 win 23 22:30:12.085187 IP 10.0.0.1.12865 > 10.0.0.2.32822: . ack 8677 win 91 22:30:12.085188 IP 10.0.0.1.12865 > 10.0.0.2.32822: . 1:1449(1448) ack 8677 win 91 22:30:12.085189 IP 10.0.0.1.12865 > 10.0.0.2.32822: . 1449:2897(1448) ack 8677 win 91 22:30:12.085190 IP 10.0.0.1.12865 > 10.0.0.2.32822: P 2897:4345(1448) ack 8677 win 91 22:30:12.085218 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 1449 win 35 22:30:12.085228 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 2897 win 46 22:30:12.085234 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 4345 win 57 22:30:12.085438 IP 10.0.0.1.12865 > 10.0.0.2.32822: . 4345:5793(1448) ack 8677 win 91 22:30:12.085439 IP 10.0.0.1.12865 > 10.0.0.2.32822: . 5793:7241(1448) ack 8677 win 91 22:30:12.085464 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 5793 win 69 22:30:12.085474 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 7241 win 80 22:30:12.085687 IP 10.0.0.1.12865 > 10.0.0.2.32822: P 7241:8677(1436) ack 8677 win 91 22:30:12.085705 IP 10.0.0.2.32822 > 10.0.0.1.12865: . ack 8677 win 91 22:30:12.087676 IP bad-len 0 22:30:12.087687 IP bad-len 0 22:30:12.087937 IP 10.0.0.1.12865 > 10.0.0.2.32822: . ack 14469 win 136 --ZPt4rx8FFjLCG7dd Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="divisor_2_single.wGmSv6" 22:30:12.096111 IP 10.0.0.2.32824 > 10.0.0.1.32823: S 4033033338:4033033338(0) win 5840 22:30:12.096308 IP 10.0.0.1.32823 > 10.0.0.2.32824: S 4044444323:4044444323(0) ack 4033033339 win 5792 22:30:12.096327 IP 10.0.0.2.32824 > 10.0.0.1.32823: . ack 1 win 23 22:30:12.096354 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 1:1449(1448) ack 1 win 23 22:30:12.096362 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 1449:2897(1448) ack 1 win 23 22:30:12.096377 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 2897:4345(1448) ack 1 win 23 22:30:12.096558 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 1449 win 34 22:30:12.096559 IP 10.0.0.1.32823 > 10.0.0.2.32824: . 
ack 2897 win 46 22:30:12.096560 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 4345 win 57 22:30:12.096664 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 4345:5793(1448) ack 1 win 23 22:30:12.096670 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 5793:7241(1448) ack 1 win 23 22:30:12.096685 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 7241:8689(1448) ack 1 win 23 22:30:12.096691 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 8689:10137(1448) ack 1 win 23 22:30:12.096706 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 10137:11585(1448) ack 1 win 23 22:30:12.096711 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 11585:13033(1448) ack 1 win 23 22:30:12.096934 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 5793 win 68 22:30:12.096966 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 13033:14481(1448) ack 1 win 23 22:30:12.096970 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 14481:15929(1448) ack 1 win 23 22:30:12.096935 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 7241 win 80 22:30:12.096984 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 15929:17377(1448) ack 1 win 23 22:30:12.096988 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 17377:18825(1448) ack 1 win 23 22:30:12.096936 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 8689 win 91 22:30:12.097004 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 18825:20273(1448) ack 1 win 23 22:30:12.097008 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 20273:21721(1448) ack 1 win 23 22:30:12.096937 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 10137 win 102 22:30:12.097021 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 21721:23169(1448) ack 1 win 23 22:30:12.097026 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 23169:24617(1448) ack 1 win 23 22:30:12.096938 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 11585 win 114 22:30:12.097040 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 24617:26065(1448) ack 1 win 23 22:30:12.097043 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 26065:27513(1448) ack 1 win 23 22:30:12.096938 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 13033 win 125 22:30:12.097060 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 
27513:28961(1448) ack 1 win 23 22:30:12.097066 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 28961:30409(1448) ack 1 win 23 22:30:12.097183 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 14481 win 136 22:30:12.097216 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 30409:31857(1448) ack 1 win 23 22:30:12.097221 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 31857:33305(1448) ack 1 win 23 22:30:12.097184 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 15929 win 148 22:30:12.097240 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 33305:34753(1448) ack 1 win 23 22:30:12.097244 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 34753:36201(1448) ack 1 win 23 22:30:12.097185 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 17377 win 159 22:30:12.097258 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 36201:37649(1448) ack 1 win 23 22:30:12.097262 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 37649:39097(1448) ack 1 win 23 22:30:12.097186 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 18825 win 170 22:30:12.097276 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 39097:40545(1448) ack 1 win 23 22:30:12.097280 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 40545:41993(1448) ack 1 win 23 22:30:12.097187 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 20273 win 181 22:30:12.097293 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 41993:43441(1448) ack 1 win 23 22:30:12.097297 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 43441:44889(1448) ack 1 win 23 22:30:12.097187 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 21721 win 193 22:30:12.097311 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 44889:46337(1448) ack 1 win 23 22:30:12.097316 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 46337:47785(1448) ack 1 win 23 22:30:12.097309 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 23169 win 204 22:30:12.097344 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 47785:49233(1448) ack 1 win 23 22:30:12.097348 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 49233:50681(1448) ack 1 win 23 22:30:12.097310 IP 10.0.0.1.32823 > 10.0.0.2.32824: . 
ack 24617 win 215 22:30:12.097360 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 50681:52129(1448) ack 1 win 23 22:30:12.097364 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 52129:53577(1448) ack 1 win 23 22:30:12.097311 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 26065 win 227 22:30:12.097375 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 53577:55025(1448) ack 1 win 23 22:30:12.097378 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 55025:56473(1448) ack 1 win 23 22:30:12.097311 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 27513 win 238 22:30:12.097390 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 56473:57921(1448) ack 1 win 23 22:30:12.097393 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 57921:59369(1448) ack 1 win 23 22:30:12.097312 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 28961 win 249 22:30:12.097405 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 59369:60817(1448) ack 1 win 23 22:30:12.097408 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 60817:62265(1448) ack 1 win 23 22:30:12.097313 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 30409 win 261 22:30:12.097419 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 62265:63713(1448) ack 1 win 23 22:30:12.097423 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 63713:65161(1448) ack 1 win 23 22:30:12.097434 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 31857 win 272 22:30:12.097477 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 65161:66609(1448) ack 1 win 23 22:30:12.097482 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 66609:68057(1448) ack 1 win 23 22:30:12.097435 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 33305 win 283 22:30:12.097495 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 68057:69505(1448) ack 1 win 23 22:30:12.097497 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 69505:70953(1448) ack 1 win 23 22:30:12.097435 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 34753 win 295 22:30:12.097510 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 70953:72401(1448) ack 1 win 23 22:30:12.097514 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 72401:73849(1448) ack 1 win 23 22:30:12.097436 IP 10.0.0.1.32823 > 10.0.0.2.32824: . 
ack 36201 win 306 22:30:12.097535 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 73849:75297(1448) ack 1 win 23 22:30:12.097538 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 75297:76745(1448) ack 1 win 23 22:30:12.097437 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 37649 win 317 22:30:12.097551 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 76745:78193(1448) ack 1 win 23 22:30:12.097555 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 78193:79641(1448) ack 1 win 23 22:30:12.097437 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 39097 win 329 22:30:12.097575 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 79641:81089(1448) ack 1 win 23 22:30:12.097579 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 81089:82537(1448) ack 1 win 23 22:30:12.097558 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 40545 win 340 22:30:12.097614 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 82537:83985(1448) ack 1 win 23 22:30:12.097619 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 83985:85433(1448) ack 1 win 23 22:30:12.097559 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 41993 win 351 22:30:12.097634 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 85433:86881(1448) ack 1 win 23 22:30:12.097638 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 86881:88329(1448) ack 1 win 23 22:30:12.097561 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 43441 win 362 22:30:12.097650 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 88329:89777(1448) ack 1 win 23 22:30:12.097654 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 89777:91225(1448) ack 1 win 23 22:30:12.097562 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 44889 win 374 22:30:12.097666 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 91225:92673(1448) ack 1 win 23 22:30:12.097669 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 92673:94121(1448) ack 1 win 23 22:30:12.097562 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 46337 win 385 22:30:12.097683 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 94121:95569(1448) ack 1 win 23 22:30:12.097688 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 95569:97017(1448) ack 1 win 23 22:30:12.097563 IP 10.0.0.1.32823 > 10.0.0.2.32824: . 
ack 47785 win 396 22:30:12.097704 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 97017:98465(1448) ack 1 win 23 22:30:12.097709 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 98465:99913(1448) ack 1 win 23 22:30:12.097683 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 57921 win 476 22:30:12.097564 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 49233 win 408 22:30:12.097728 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 99913:101361(1448) ack 1 win 23 22:30:12.097733 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 101361:102809(1448) ack 1 win 23 22:30:12.097737 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 102809:104257(1448) ack 1 win 23 22:30:12.097741 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 104257:105705(1448) ack 1 win 23 22:30:12.097744 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 105705:107153(1448) ack 1 win 23 22:30:12.097748 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 107153:108601(1448) ack 1 win 23 22:30:12.097752 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 108601:110049(1448) ack 1 win 23 22:30:12.097755 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 110049:111497(1448) ack 1 win 23 22:30:12.097684 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 59369 win 487 22:30:12.097565 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 50681 win 419 22:30:12.097768 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 111497:112945(1448) ack 1 win 23 22:30:12.097773 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 112945:114393(1448) ack 1 win 23 22:30:12.097685 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 60817 win 498 22:30:12.097565 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 52129 win 430 22:30:12.097786 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 114393:115841(1448) ack 1 win 23 22:30:12.097790 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 115841:117289(1448) ack 1 win 23 22:30:12.097686 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 62265 win 510 22:30:12.097586 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 53577 win 442 22:30:12.097804 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 117289:118737(1448) ack 1 win 23 22:30:12.097809 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 
118737:120185(1448) ack 1 win 23 22:30:12.097702 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 63713 win 521 22:30:12.097586 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 55025 win 453 22:30:12.097826 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 120185:121633(1448) ack 1 win 23 22:30:12.097832 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 121633:123081(1448) ack 1 win 23 22:30:12.097703 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 65161 win 532 22:30:12.097587 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 56473 win 464 22:30:12.097809 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 66609 win 543 22:30:12.097849 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 123081:124529(1448) ack 1 win 23 22:30:12.097855 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 124529:125977(1448) ack 1 win 23 22:30:12.097873 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 125977:127425(1448) ack 1 win 23 22:30:12.097881 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 127425:128873(1448) ack 1 win 23 22:30:12.097810 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 68057 win 555 22:30:12.097897 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 128873:130321(1448) ack 1 win 23 22:30:12.097903 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 130321:131769(1448) ack 1 win 23 22:30:12.097811 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 69505 win 566 22:30:12.097915 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 131769:133217(1448) ack 1 win 23 22:30:12.097919 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 133217:134665(1448) ack 1 win 23 22:30:12.097812 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 70953 win 577 22:30:12.097932 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 134665:136113(1448) ack 1 win 23 22:30:12.097940 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 136113:137561(1448) ack 1 win 23 22:30:12.097812 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 72401 win 589 22:30:12.097956 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 137561:139009(1448) ack 1 win 23 22:30:12.097935 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 86881 win 702 22:30:12.097964 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 
139009:140457(1448) ack 1 win 23 22:30:12.097813 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 73849 win 600 22:30:12.097986 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 140457:141905(1448) ack 1 win 23 22:30:12.097992 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 141905:143353(1448) ack 1 win 23 22:30:12.097996 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 143353:144801(1448) ack 1 win 23 22:30:12.098000 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 144801:146249(1448) ack 1 win 23 22:30:12.098004 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 146249:147697(1448) ack 1 win 23 22:30:12.098008 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 147697:149145(1448) ack 1 win 23 22:30:12.098011 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 149145:150593(1448) ack 1 win 23 22:30:12.098014 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 150593:152041(1448) ack 1 win 23 22:30:12.098018 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 152041:153489(1448) ack 1 win 23 22:30:12.098026 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 153489:154937(1448) ack 1 win 23 22:30:12.098031 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 154937:156385(1448) ack 1 win 23 22:30:12.097936 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 88329 win 713 22:30:12.097814 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 75297 win 611 22:30:12.098044 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 156385:157833(1448) ack 1 win 23 22:30:12.098048 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 157833:159281(1448) ack 1 win 23 22:30:12.097937 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 89777 win 724 22:30:12.097815 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 76745 win 623 22:30:12.098064 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 159281:160729(1448) ack 1 win 23 22:30:12.098058 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 102809 win 770 22:30:12.098072 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 160729:162177(1448) ack 1 win 23 22:30:12.097938 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 91225 win 736 22:30:12.098089 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 
162177:163625(1448) ack 1 win 23 22:30:12.098094 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 163625:165073(1448) ack 1 win 23 22:30:12.098098 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 165073:166521(1448) ack 1 win 23 22:30:12.098101 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 166521:167969(1448) ack 1 win 23 22:30:12.098105 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 167969:169417(1448) ack 1 win 23 22:30:12.098108 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 169417:170865(1448) ack 1 win 23 22:30:12.098112 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 170865:172313(1448) ack 1 win 23 22:30:12.098115 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 172313:173761(1448) ack 1 win 23 22:30:12.098119 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 173761:175209(1448) ack 1 win 23 22:30:12.098123 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 175209:176657(1448) ack 1 win 23 22:30:12.098059 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 114393 win 770 22:30:12.097940 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 92673 win 747 22:30:12.097816 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 78193 win 634 22:30:12.098139 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 176657:178105(1448) ack 1 win 23 22:30:12.098144 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 178105:179553(1448) ack 1 win 23 22:30:12.098147 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 179553:181001(1448) ack 1 win 23 22:30:12.098150 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 181001:182449(1448) ack 1 win 23 22:30:12.098154 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 182449:183897(1448) ack 1 win 23 22:30:12.098157 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 183897:185345(1448) ack 1 win 23 22:30:12.098161 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 185345:186793(1448) ack 1 win 23 22:30:12.098164 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 186793:188241(1448) ack 1 win 23 22:30:12.098168 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 188241:189689(1448) ack 1 win 23 22:30:12.097941 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 94121 win 758 22:30:12.097818 IP 10.0.0.1.32823 > 10.0.0.2.32824: . 
ack 79641 win 645 22:30:12.097942 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 95569 win 770 22:30:12.097818 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 81089 win 657 22:30:12.097943 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 97017 win 770 22:30:12.097819 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 82537 win 668 22:30:12.097944 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 98465 win 770 22:30:12.097820 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 83985 win 679 22:30:12.097837 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 85433 win 691 22:30:12.098183 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 127425 win 770 22:30:12.098210 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 189689:191137(1448) ack 1 win 23 22:30:12.098215 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 191137:192585(1448) ack 1 win 23 22:30:12.098217 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 192585:194033(1448) ack 1 win 23 22:30:12.098221 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 194033:195481(1448) ack 1 win 23 22:30:12.098225 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 195481:196929(1448) ack 1 win 23 22:30:12.098229 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 196929:198377(1448) ack 1 win 23 22:30:12.098233 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 198377:199825(1448) ack 1 win 23 22:30:12.098237 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 199825:201273(1448) ack 1 win 23 22:30:12.098240 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 201273:202721(1448) ack 1 win 23 22:30:12.098244 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 202721:204169(1448) ack 1 win 23 22:30:12.098307 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 133217 win 770 22:30:12.098343 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 204169:205617(1448) ack 1 win 23 22:30:12.098349 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 205617:207065(1448) ack 1 win 23 22:30:12.098353 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 207065:208513(1448) ack 1 win 23 22:30:12.098357 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 208513:209961(1448) ack 1 win 23 22:30:12.098360 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 
209961:211409(1448) ack 1 win 23 22:30:12.098308 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 136113 win 770 22:30:12.098372 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 211409:212857(1448) ack 1 win 23 22:30:12.098378 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 212857:214305(1448) ack 1 win 23 22:30:12.098381 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 214305:215753(1448) ack 1 win 23 22:30:12.098309 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 139009 win 770 22:30:12.098393 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 215753:217201(1448) ack 1 win 23 22:30:12.098399 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 217201:218649(1448) ack 1 win 23 22:30:12.098402 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 218649:220097(1448) ack 1 win 23 22:30:12.098310 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 141905 win 770 22:30:12.098415 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 220097:221545(1448) ack 1 win 23 22:30:12.098420 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 221545:222993(1448) ack 1 win 23 22:30:12.098423 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 222993:224441(1448) ack 1 win 23 22:30:12.098310 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 144801 win 770 22:30:12.098433 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 156385 win 770 22:30:12.098486 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 224441:225889(1448) ack 1 win 23 22:30:12.098493 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 225889:227337(1448) ack 1 win 23 22:30:12.098497 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 227337:228785(1448) ack 1 win 23 22:30:12.098511 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 228785:230233(1448) ack 1 win 23 22:30:12.098516 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 230233:231681(1448) ack 1 win 23 22:30:12.098520 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 231681:233129(1448) ack 1 win 23 22:30:12.098524 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 233129:234577(1448) ack 1 win 23 22:30:12.098526 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 234577:236025(1448) ack 1 win 23 22:30:12.098530 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 
236025:237473(1448) ack 1 win 23 22:30:12.098534 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 237473:238921(1448) ack 1 win 23 22:30:12.098538 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 238921:240369(1448) ack 1 win 23 22:30:12.098542 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 240369:241817(1448) ack 1 win 23 22:30:12.098558 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 167969 win 770 22:30:12.098559 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 170865 win 770 22:30:12.098618 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 241817:243265(1448) ack 1 win 23 22:30:12.098624 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 243265:244713(1448) ack 1 win 23 22:30:12.098628 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 244713:246161(1448) ack 1 win 23 22:30:12.098630 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 246161:247609(1448) ack 1 win 23 22:30:12.098634 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 247609:249057(1448) ack 1 win 23 22:30:12.098639 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 249057:250505(1448) ack 1 win 23 22:30:12.098643 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 250505:251953(1448) ack 1 win 23 22:30:12.098647 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 251953:253401(1448) ack 1 win 23 22:30:12.098650 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 253401:254849(1448) ack 1 win 23 22:30:12.098662 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 254849:256297(1448) ack 1 win 23 22:30:12.098667 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 256297:257745(1448) ack 1 win 23 22:30:12.098670 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 257745:259193(1448) ack 1 win 23 22:30:12.098683 IP 10.0.0.1.32823 > 10.0.0.2.32824: . ack 186793 win 770 22:30:12.098716 IP 10.0.0.2.32824 > 10.0.0.1.32823: P 259193:260641(1448) ack 1 win 23 22:30:12.098720 IP 10.0.0.2.32824 > 10.0.0.1.32823: . 260641:262089(1448) ack 1 win 23 22:30:12.098724 IP bad-len 0 22:30:12.098728 IP bad-len 0 22:30:12.098732 IP bad-len 0 22:30:12.098808 IP 10.0.0.1.32823 > 10.0.0.2.32824: . 
ack 195481 win 770 --ZPt4rx8FFjLCG7dd Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=divisor_1_both 23:11:58.916027 IP 10.0.0.2.32844 > 10.0.0.1.12865: S 2407413166:2407413166(0) win 5840 23:11:58.916302 IP 10.0.0.1.12865 > 10.0.0.2.32844: S 2406390542:2406390542(0) ack 2407413167 win 5792 23:11:58.916320 IP 10.0.0.2.32844 > 10.0.0.1.12865: . ack 1 win 23 23:11:58.916340 IP bad-len 0 23:11:58.916552 IP 10.0.0.1.12865 > 10.0.0.2.32844: . ack 1449 win 34 --ZPt4rx8FFjLCG7dd Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=divisor_1_single 23:11:58.928030 IP 10.0.0.2.32846 > 10.0.0.1.32845: S 2408171237:2408171237(0) win 5840 23:11:58.928298 IP 10.0.0.1.32845 > 10.0.0.2.32846: S 2406622866:2406622866(0) ack 2408171238 win 5792 23:11:58.928315 IP 10.0.0.2.32846 > 10.0.0.1.32845: . ack 1 win 23 23:11:58.928348 IP bad-len 0 23:11:58.928548 IP 10.0.0.1.32845 > 10.0.0.2.32846: . ack 1449 win 34 --ZPt4rx8FFjLCG7dd-- From romieu@fr.zoreil.com Sun Dec 5 15:39:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 15:39:09 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB5Nd1GC010916 for ; Sun, 5 Dec 2004 15:39:02 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB5Nc1vr030837; Mon, 6 Dec 2004 00:38:01 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB5Nbudk030836; Mon, 6 Dec 2004 00:37:56 +0100 Date: Mon, 6 Dec 2004 00:37:56 +0100 From: Francois Romieu To: Dorn Hetzel Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, jgarzik@pobox.com Subject: Re: r8169.c Message-ID: <20041205233756.GB29236@electric-eye.fr.zoreil.com> References: <20041119162920.GA26836@lilah.hetzel.org> <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> 
<20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> <20041205235519.GA21885@lilah.hetzel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041205235519.GA21885@lilah.hetzel.org> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 12452 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Dorn Hetzel : [...] > I was wondering if the above 4 patches have made it into one of the > rc? releases, or at least a rc?-mm? ? They need to apply on top of -netdev which is included in -mm. I'll send the patch for inclusion in -mm so there is no need for Jeff to hurry. -- Ueimor From sfeldma@pobox.com Sun Dec 5 16:28:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 16:28:18 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB60SDT1015396 for ; Sun, 5 Dec 2004 16:28:14 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 5909A2F90B2; Sun, 5 Dec 2004 19:27:51 -0500 (EST) Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id 7177F2F8FDE; Sun, 5 Dec 2004 19:27:47 -0500 (EST) Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 From: Scott Feldman Reply-To: sfeldma@pobox.com To: Peter Kjellstroem Cc: netdev@oss.sgi.com, ganesh.venkatesan@intel.com, john.ronciak@intel.com In-Reply-To: References: Content-Type: text/plain Message-Id: <1102293033.3343.152.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sun, 05 Dec 2004 16:30:33 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12453 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 13:39, Peter Kjellstroem wrote: > this is the original fix and triggers for both e1000_82547 (your 547EI > right?) and e1000_82547_rev_2 (my 547GI). If your NIC can't stand the fix > and mine needs it isn't the obvious solution the following? Ganesh/John, what do you think? Is there a clear dividing line between 82547EI and 82547GI wrt this issue? If so, could Peter's patch go in? -scott From sfeldma@pobox.com Sun Dec 5 17:21:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 17:21:17 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB61LC2n016909 for ; Sun, 5 Dec 2004 17:21:12 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 2371C2F87FB; Sun, 5 Dec 2004 20:20:50 -0500 (EST) Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id F1FEB2FAE6B; Sun, 5 Dec 2004 20:20:38 -0500 (EST) Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) From: Scott Feldman Reply-To: sfeldma@pobox.com To: Lennert Buytenhek Cc: Martin Josefsson , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <20041205212559.GA4338@xi.wantstofly.org> References: <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <1102281141.3343.138.camel@sfeldma-mobl.dsl-verizon.net> <20041205212559.GA4338@xi.wantstofly.org> 
Content-Type: text/plain Message-Id: <1102296205.3343.206.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sun, 05 Dec 2004 17:23:25 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12454 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 13:25, Lennert Buytenhek wrote: > What your patch does is (correct me if I'm wrong): > - Masking TXDW, effectively preventing it from delivering TXdone ints. > - Not setting E1000_TXD_CMD_IDE in the TXD command field, which causes > the chip to 'ignore the TIDV' register, which is the 'TX Interrupt > Delay Value'. What exactly does this? A descriptor with IDE, when written back, starts the Tx delay timers countdown. Never setting IDE means the Tx delay timers never expire. > - Not setting the "Report Packet Sent"/"Report Status" bits in the TXD > command field. Is this the equivalent of the TXdone interrupt? > > Just exactly which bit avoids the descriptor writeback? As the name implies, Report Status (RS) instructs the controller to indicate the status of the descriptor by doing a write-back (DMA) to the descriptor memory. The only status we care about is the "done" indicator. By reading TDH (Tx head), we can figure out where hardware is without reading the status of each descriptor. Since we don't need status, we can turn off RS. > I'm also a bit worried that only freeing packets 1ms later will mess up > socket accounting and such. Any ideas on that? Well the timer solution is less than ideal, and any protocols that are sensitive to getting Tx resources returned by the driver as quickly as possible are not going to be happy. I don't know if 1ms is quick enough. 
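The TDH-based cleanup described above can be sketched roughly as follows. This is a toy model in Python, not e1000 driver code: `descriptors_done`, `clean_tx_ring`, `next_to_clean` and `hw_head` are names assumed here for illustration, and the ring is a plain list standing in for descriptor memory. The point it shows is that one head-pointer read tells software how far hardware has progressed, so no per-descriptor status write-back is needed.

```python
from typing import List, Optional, Tuple


def descriptors_done(next_to_clean: int, hw_head: int, ring_size: int) -> int:
    """Count descriptors consumed by hardware since the last cleanup.

    hw_head models the value read from the Tx head register (TDH).
    Equal indices are treated as "nothing to clean" in this model.
    """
    return (hw_head - next_to_clean) % ring_size


def clean_tx_ring(
    buffers: List[Optional[str]], next_to_clean: int, hw_head: int
) -> Tuple[int, List[str]]:
    """Release every buffer between next_to_clean and hw_head, with wrap-around.

    Returns the new next_to_clean index and the list of released buffers.
    """
    ring_size = len(buffers)
    freed: List[str] = []
    while next_to_clean != hw_head:
        buf = buffers[next_to_clean]
        if buf is not None:
            freed.append(buf)
            buffers[next_to_clean] = None  # slot is free for reuse
        next_to_clean = (next_to_clean + 1) % ring_size
    return next_to_clean, freed


if __name__ == "__main__":
    ring: List[Optional[str]] = [f"pkt{i}" for i in range(8)]
    # Hardware head has advanced from index 6, wrapped, and now sits at 2.
    print(descriptors_done(6, 2, len(ring)))
    print(clean_tx_ring(ring, next_to_clean=6, hw_head=2))
```

In this model the cost is one head read per cleanup pass regardless of how many descriptors completed, which is why batching the cleanup matters.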
You could eliminate the timer by doing the cleanup first thing in xmit_frame, but then you have two problems: 1) you might end up reading TDH for each send, and that's going to be expensive; 2) calls to xmit_frame might stop, leaving uncleaned work until xmit_frame is called again. -scott From herbert@gondor.apana.org.au Sun Dec 5 17:28:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 17:28:39 -0800 (PST) Received: from arnor.apana.org.au (c211-30-229-77.rivrw4.nsw.optusnet.com.au [211.30.229.77]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB61SUeu017379 for ; Sun, 5 Dec 2004 17:28:31 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1Cb7fL-0007rP-00; Mon, 06 Dec 2004 12:27:51 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1Cb7dy-000588-00; Mon, 06 Dec 2004 12:26:26 +1100 From: Herbert Xu To: anton@samba.org (Anton Blanchard) Subject: Re: TSO + e1000 Cc: netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com, john.ronciak@intel.com Organization: Core In-Reply-To: <20041205232226.GA5757@krispykreme.ozlabs.ibm.com> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Mon, 06 Dec 2004 12:26:26 +1100 X-archive-position: 12455 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Anton Blanchard wrote: > > 1. tcpdump doesnt report local TSO packets, it simply prints bad-len. I > think this is because the e1000 driver is zeroing out the IP header > length in e1000_tso: > > skb->nh.iph->tot_len = 0; > > Does the card require this for TSO to operate? Ive worked around it in > tcpdump for the time being. This is a bug in e1000. 
Even if it is required it isn't allowed to modify a cloned packet. It'll need to copy it so that other clone users aren't affected. > Now if we do a single direction test it takes forever for TSO to kick > in. At the moment Im not sure why this is happening, tcp_current_mss > seems to grow correctly in both cases. Is there somewhere else we are > capping TSO packets? What call does your application use to do the sending? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From anton@ozlabs.org Sun Dec 5 17:45:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 17:45:31 -0800 (PST) Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB61jOPj018162 for ; Sun, 5 Dec 2004 17:45:25 -0800 Received: by ozlabs.org (Postfix, from userid 1010) id B32292BE83; Mon, 6 Dec 2004 12:44:57 +1100 (EST) Date: Mon, 6 Dec 2004 12:43:49 +1100 From: Anton Blanchard To: Herbert Xu Cc: netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com, john.ronciak@intel.com Subject: Re: TSO + e1000 Message-ID: <20041206014348.GB8751@krispykreme.ozlabs.ibm.com> References: <20041205232226.GA5757@krispykreme.ozlabs.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6+20040907i X-archive-position: 12456 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev > > Now if we do a single direction test it takes forever for TSO to kick > > in. At the moment Im not sure why this is happening, tcp_current_mss > > seems to grow correctly in both cases. Is there somewhere else we are > > capping TSO packets? > > What call does your application use to do the sending? 
Just doing big writes: send(4, " !\"#$%&\'()*+,-./0123456789:;<=>?"..., 262144, 0) = 262144 Anton
From rayl@mail.com Sun Dec 5 18:45:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 18:45:05 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB62j0rF020984 for ; Sun, 5 Dec 2004 18:45:00 -0800 Received: by ray.lehtiniemi.com (Postfix, from userid 500) id BBC2111F149; Sun, 5 Dec 2004 19:44:37 -0700 (MST) Date: Sun, 5 Dec 2004 19:44:37 -0700 From: Ray Lehtiniemi To: netdev@oss.sgi.com Subject: how to tune a pair of e1000 cards on intel e7501-based system?
Message-ID: <20041206024437.GB7891@mail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline User-Agent: Mutt/1.5.6i X-archive-position: 12458 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev

hi all

i'm trying to understand how to tune a pair of e1000 cards in a server box. the box is a dual xeon 3.06 with hyperthreading, using the intel e7501 chipset, with both cards on hub interface D. the application involves small UDP packets, generally under 300 bytes. i need to maximize the number of packets per second transferred between the two cards.

at the moment, i'm looking at the PCI bus in this box to see what might be tweakable. lspci output for the relevant parts is attached below.

could anyone give me an idea:

 - what kind of packets per second i could expect to achieve from this particular system (for small packets)

 - what parameters i can tweak at the PCI level (or any other level, for that matter...) to achieve that level of performance

for example, how could i get my e1000 cards to say '64bit+ 133MHz+' to match the secondary side of the 82870P2 bridge?

thank you

-------------------------------------------------------------------------
lspci -t
-------------------------------------------------------------------------
-[00]-+-00.0
      +-00.1
      +-04.0-[01-03]--+-1c.0
      |               +-1d.0-[02]--+-01.0
      |               |            \-02.0
      |               +-1e.0
      |               \-1f.0-[03]--
      +-1d.0
      +-1d.1
      +-1d.2
      +-1e.0-[04]--+-03.0
      |            \-06.0
      +-1f.0
      +-1f.1
      \-1f.3
-------------------------------------------------------------------------
lspci -vv (selected items)
-------------------------------------------------------------------------
0000:00:00.0 Host bridge: Intel Corp. E7501 Memory Controller Hub (rev 01) Subsystem: Intel Corp.
E7501 Memory Controller Hub Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- Reset- FastB2B- 0000:01:1c.0 PIC: Intel Corp. 82870P2 P64H2 I/OxAPIC (rev 04) (prog-if 20 [IO(X)-APIC]) Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- TAbort- SERR- Reset- FastB2B- Capabilities: [50] PCI-X bridge device. Secondary Status: 64bit+, 133MHz+, SCD-, USC-, SCO-, SRD- Freq=3 Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, SCO-, SRD- : Upstream: Capacity=0, Commitment Limit=0 : Downstream: Capacity=0, Commitment Limit=0 0000:02:01.0 Ethernet controller: Intel Corp. 82544EI Gigabit Ethernet Controller (Copper) (rev 02) Subsystem: Intel Corp. PRO/1000 XT Server Adapter Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- [disabled] [size=128K] Capabilities: [dc] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=1 PME- Capabilities: [e4] PCI-X non-bridge device. Command: DPERE- ERO+ RBC=0 OST=0 Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM- Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 0000:02:02.0 Ethernet controller: Intel Corp. 82544EI Gigabit Ethernet Controller (Copper) (rev 02) Subsystem: Intel Corp. 
PRO/1000 XT Server Adapter Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- Status: Cap+ 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR- [disabled] [size=128K] Capabilities: [dc] Power Management version 2 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+) Status: D0 PME-Enable- DSel=0 DScale=1 PME- Capabilities: [e4] PCI-X non-bridge device. Command: DPERE- ERO+ RBC=0 OST=0 Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM- Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable- Address: 0000000000000000 Data: 0000 -- ---------------------------------------------------------------------- Ray L From ganesh.venkatesan@intel.com Sun Dec 5 19:22:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 19:22:48 -0800 (PST) Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB63Mida022269 for ; Sun, 5 Dec 2004 19:22:44 -0800 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id iB63MEOC023697; Mon, 6 Dec 2004 03:22:14 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id iB63MAAn028769; Mon, 6 Dec 2004 03:22:11 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004120519221104662 ; Sun, 05 Dec 2004 19:22:11 -0800 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.0); Sun, 5 Dec 2004 19:22:11 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: 
text/plain; charset="us-ascii" Subject: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 Date: Sun, 5 Dec 2004 19:22:10 -0800 Message-ID: <468F3FDA28AA87429AD807992E22D07E037ED6A3@orsmsx408> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Re: e1000>5.2.30 unstable with InterruptThrottleRate=0 Thread-Index: AcTbQsWACHV+MVeTTAWnjvTW8IBfxg== From: "Venkatesan, Ganesh" To: Cc: , X-OriginalArrivalTime: 06 Dec 2004 03:22:11.0563 (UTC) FILETIME=[C65E9BB0:01C4DB42] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iB63Mida022269 X-archive-position: 12459 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ganesh.venkatesan@intel.com Precedence: bulk X-list: netdev Hi Peter: Thanks for the thorough analysis. I will have to perform a set of tests with your patch included in our latest version of the driver. I will respond to you after I am done with my tests. Ganesh. From sfeldma@pobox.com Sun Dec 5 19:31:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 19:32:01 -0800 (PST) Received: from orb.pobox.com (orb.pobox.com [207.8.226.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB63VucW022812 for ; Sun, 5 Dec 2004 19:31:57 -0800 Received: from orb (localhost [127.0.0.1]) by orb.pobox.com (Postfix) with ESMTP id 813FE2FBA06; Sun, 5 Dec 2004 22:31:34 -0500 (EST) Received: from [192.168.0.19] (wbar2.sea1-4-5-062-153.sea1.dsl-verizon.net [4.5.62.153]) by orb.sasl.smtp.pobox.com (Postfix) with ESMTP id EA96B2FB9D4; Sun, 5 Dec 2004 22:31:32 -0500 (EST) Subject: Re: how to tune a pair of e1000 cards on intel e7501-based system? 
From: Scott Feldman Reply-To: sfeldma@pobox.com To: Ray Lehtiniemi Cc: netdev@oss.sgi.com In-Reply-To: <20041206024437.GB7891@mail.com> References: <20041206024437.GB7891@mail.com> Content-Type: text/plain Message-Id: <1102304058.3343.217.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Sun, 05 Dec 2004 19:34:18 -0800 Content-Transfer-Encoding: 7bit X-archive-position: 12460 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sfeldma@pobox.com Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 18:44, Ray Lehtiniemi wrote: > i'm trying to understand how to tune a pair of e1000 cards in > a server box. the box is a dual xeon 3.06 with hyperthreading, > using the intel e7501 chipset, with both cards on hub interface > D. the application involves small UDP packets, generally under These are 82544 cards hanging of P64H2, so they're running at PCI-X, but at what speed? Run ethtool -d eth | grep Bus. The cards support 133Mhz, but having them adjacent on the same P64H2 probably bumps them down to 100Mhz. Can you put one on D and the other on another bus? > 300 bytes. i need to maximize the number of packets per second > transferred between the two cards. There is a lot of current traffic on netdev about this topic. netdev is the official e1000 mailing this weekend. :-) > at the moment, i'm looking at the PCI bus in this box to see > what might be tweakable. lspci output for the relevant parts is > attached below. > > could anyone give me an idea: > > - what kind of packets per second i could expect to achieve > from this particular system (for small packets) What kind of numbers are you getting? What kernel are you using? What driver tweaks have you made, if any? 
-scott From rayl@mail.com Sun Dec 5 20:10:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 20:10:30 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB64APdC024075 for ; Sun, 5 Dec 2004 20:10:26 -0800 Received: by ray.lehtiniemi.com (Postfix, from userid 500) id 5DB4114D73B; Sun, 5 Dec 2004 21:10:03 -0700 (MST) Date: Sun, 5 Dec 2004 21:10:03 -0700 From: Ray Lehtiniemi To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: how to tune a pair of e1000 cards on intel e7501-based system? Message-ID: <20041206041002.GC7891@mail.com> References: <20041206024437.GB7891@mail.com> <1102304058.3343.217.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102304058.3343.217.camel@sfeldma-mobl.dsl-verizon.net> User-Agent: Mutt/1.5.6i X-archive-position: 12461 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 07:34:18PM -0800, Scott Feldman wrote: > On Sun, 2004-12-05 at 18:44, Ray Lehtiniemi wrote: > > i'm trying to understand how to tune a pair of e1000 cards in > > a server box. the box is a dual xeon 3.06 with hyperthreading, > > using the intel e7501 chipset, with both cards on hub interface > > D. the application involves small UDP packets, generally under > > These are 82544 cards hanging of P64H2, so they're running at PCI-X, but > at what speed? Run ethtool -d eth | grep Bus. :-) just found ethtool and compiled it before i read this email. any other useful tools i should get? > The cards support > 133Mhz, but having them adjacent on the same P64H2 probably bumps them > down to 100Mhz. 
  # ethtool -d eth0 | grep Bus
  Bus type: PCI-X
  Bus speed: 133MHz
  Bus width: 64-bit

  # ethtool -d eth1 | grep Bus
  Bus type: PCI-X
  Bus speed: 133MHz
  Bus width: 64-bit

so that looks okay... any idea why lspci -vv shows non-64bit, non-133 MHz? (i am assuming that is what the minus sign means)

  Capabilities: [e4] PCI-X non-bridge device.
    Command: DPERE- ERO+ RBC=0 OST=0
    Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM-

> Can you put one on D and the other on another bus?

not sure... have to look at the chassis tomorrow morning. a co-worker actually built the box, i've not seen it in person yet.

> There is a lot of current traffic on netdev about this topic. netdev is
> the official e1000 mailing this weekend. :-)

i noticed that almost half of the messages in the e1000-devel archives since oct 2002 were sent in the last 10 days :-)

> What kind of numbers are you getting?

will start testing tomorrow, just starting my background research tonight

> What kernel are you using?

2.4.20 with driver 4.4.12-k1, and 2.6.bkcurr with driver 5.bkcurr

> What driver tweaks have you made, if any?

none yet. based on mailing list searches, i plan to:

  - InterruptThrottleRate 15000
  - TxIntDelay 0

i also plan to set rp_filter to zero for all interfaces.
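For reference, the rp_filter part of the plan above could be scripted roughly as follows. This is only a sketch in Python: the function name and the conf_root parameter are made up for illustration, and it assumes the usual /proc/sys/net/ipv4/conf/<interface>/rp_filter layout and root privileges. InterruptThrottleRate and TxIntDelay are e1000 module parameters set at driver load time and are not touched here.

```python
from pathlib import Path


def disable_rp_filter(conf_root="/proc/sys/net/ipv4/conf"):
    """Write 0 to every <conf_root>/<name>/rp_filter file.

    conf_root is a parameter only so the sketch can be tried against a
    scratch directory; on a real box the default path needs root.
    Returns the names of the entries that were updated.
    """
    changed = []
    for entry in sorted(Path(conf_root).glob("*/rp_filter")):
        entry.write_text("0\n")
        changed.append(entry.parent.name)
    return changed
```

On a live system the glob also matches the "all" and "default" entries, so those are zeroed along with the per-interface ones.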
thanks -- ---------------------------------------------------------------------- Ray L From anton@ozlabs.org Sun Dec 5 20:17:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 20:17:59 -0800 (PST) Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB64HtJm024509 for ; Sun, 5 Dec 2004 20:17:55 -0800 Received: by ozlabs.org (Postfix, from userid 1010) id 662DA2BDB5; Mon, 6 Dec 2004 15:17:28 +1100 (EST) Date: Mon, 6 Dec 2004 15:16:56 +1100 From: Anton Blanchard To: Herbert Xu Cc: netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com, john.ronciak@intel.com Subject: Re: TSO + e1000 Message-ID: <20041206041656.GE8751@krispykreme.ozlabs.ibm.com> References: <20041205232226.GA5757@krispykreme.ozlabs.ibm.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6+20040907i X-archive-position: 12462 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: anton@samba.org Precedence: bulk X-list: netdev > This is a bug in e1000. Even if it is required it isn't allowed to > modify a cloned packet. It'll need to copy it so that other clone > users aren't affected. It looks like the tg3 is doing a similar thing. Anton From davem@davemloft.net Sun Dec 5 21:29:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 05 Dec 2004 21:29:11 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB65T5km029787 for ; Sun, 5 Dec 2004 21:29:06 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CbBGB-0005NL-00; Sun, 05 Dec 2004 21:18:07 -0800 Date: Sun, 5 Dec 2004 21:18:07 -0800 From: "David S. 
Miller" To: Anton Blanchard Cc: herbert@gondor.apana.org.au, netdev@oss.sgi.com, ganesh.venkatesan@intel.com, jesse.brandeburg@intel.com, john.ronciak@intel.com Subject: Re: TSO + e1000 Message-Id: <20041205211807.6f8622f6.davem@davemloft.net> In-Reply-To: <20041206041656.GE8751@krispykreme.ozlabs.ibm.com> References: <20041205232226.GA5757@krispykreme.ozlabs.ibm.com> <20041206041656.GE8751@krispykreme.ozlabs.ibm.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12463 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 6 Dec 2004 15:16:56 +1100 Anton Blanchard wrote: > > This is a bug in e1000. Even if it is required it isn't allowed to > > modify a cloned packet. It'll need to copy it so that other clone > > users aren't affected. > > It looks like the tg3 is doing a similar thing. As does ixgb. Most TSO drivers need to modify the IP header in a similar way. It has to do with how Microsoft's driver API defines the TSO interface, which is what all the cards implement. They want the checksum field clear, and the tot_len field of the IP header to be what the normal packets will have. Typhoon and S2IO seem to be a notable exceptions. 
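To make the header rewrite described above concrete, here is a small Python sketch. It is not driver code: the function name and its arguments are hypothetical, and it only models the two fields mentioned (checksum cleared, tot_len set to the size of one ordinary segment). It works on a private copy of the header, which is the point raised earlier in the thread about not modifying a cloned packet in place.

```python
import struct


def prepare_tso_ip_header(ip_header: bytes, tcp_header_len: int, mss: int) -> bytes:
    """Return a modified copy of an IPv4 header for a TSO-style hand-off.

    The input is left untouched; only the copy is changed.
    """
    hdr = bytearray(ip_header)              # private copy, never the original
    ihl = (hdr[0] & 0x0F) * 4               # IPv4 header length in bytes
    segment_len = ihl + tcp_header_len + mss
    struct.pack_into("!H", hdr, 2, segment_len)  # tot_len of one normal segment
    struct.pack_into("!H", hdr, 10, 0)           # clear the header checksum
    return bytes(hdr)
```

With a 20-byte IP header, a 20-byte TCP header and an MSS of 1460, the copy carries tot_len 1500 and a zero checksum, while the bytes passed in still hold their original values.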
From michael.vittrup.larsen@ericsson.com Mon Dec 6 00:18:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 00:18:56 -0800 (PST) Received: from penguin.ericsson.se (penguin.ericsson.se [193.180.251.47]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB68Ifae001838 for ; Mon, 6 Dec 2004 00:18:42 -0800 Received: from esealmw143.al.sw.ericsson.se ([153.88.254.118]) by penguin.ericsson.se (8.12.10/8.12.10/WIREfire-1.8b) with ESMTP id iB68IJh5029400 for ; Mon, 6 Dec 2004 09:18:19 +0100 (MET) Received: from esealnt613.al.sw.ericsson.se ([153.88.254.125]) by esealmw143.al.sw.ericsson.se with Microsoft SMTPSVC(6.0.3790.211); Mon, 6 Dec 2004 09:18:18 +0100 Received: from unixmail.ted.dk.eu.ericsson.se (unixmail.ted.ericsson.se [213.159.188.246]) by esealnt613.al.sw.ericsson.se with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72) id X8A3Y20H; Mon, 6 Dec 2004 09:18:18 +0100 Received: from diadem.ted.dk.eu.ericsson.se (diadem.ted.ericsson.se [213.159.189.76]) by unixmail.ted.dk.eu.ericsson.se (8.10.1/8.10.1/TEDmain-1.0) with ESMTP id iB68I5314048; Mon, 6 Dec 2004 09:18:10 +0100 (MET) X-Sybari-Space: 00000000 00000000 00000000 00000000 From: Michael Vittrup Larsen Organization: Ericsson To: Stephen Hemminger Subject: Re: [PATCH] tcp: efficient port randomisation (revised) Date: Mon, 6 Dec 2004 09:18:04 +0100 User-Agent: KMail/1.7 Cc: "David S. 
Miller" , netdev@oss.sgi.com References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> <41B14E57.5080803@osdl.org> In-Reply-To: <41B14E57.5080803@osdl.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412060918.04441.michael.vittrup.larsen@ericsson.com> X-OriginalArrivalTime: 06 Dec 2004 08:18:18.0610 (UTC) FILETIME=[245AD520:01C4DB6C] X-archive-position: 12464 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: michael.vittrup.larsen@ericsson.com Precedence: bulk X-list: netdev Measuring non-blocking connect using the loopback address I agree with Stephen' conclusion, that the cost of the MD4 gets lost in the noise. I did not measure the variance, since this probably describe scheduling and not the actual ephemeral bind. However, I did measure the minimum value, and the assumption is that this is close to a measurement of an uninterrupted connect. Median filtering probably would be more correct... Below are the results from 10 successive tests. 
Unmodified (average and minimum values):

  connect 24433 (min 21820) [ticks/op]
  connect 24504 (min 21927) [ticks/op]
  connect 24530 (min 21952) [ticks/op]
  connect 24244 (min 21607) [ticks/op]
  connect 24220 (min 21613) [ticks/op]
  connect 24117 (min 21665) [ticks/op]
  connect 24148 (min 21663) [ticks/op]
  connect 24079 (min 21648) [ticks/op]
  connect 23998 (min 21700) [ticks/op]
  connect 23906 (min 21682) [ticks/op]

Modified (average and minimum values):

  connect 23961 (min 21774) [ticks/op]
  connect 23894 (min 21750) [ticks/op]
  connect 23927 (min 21776) [ticks/op]
  connect 23881 (min 21757) [ticks/op]
  connect 23956 (min 21749) [ticks/op]
  connect 23872 (min 21710) [ticks/op]
  connect 23848 (min 21694) [ticks/op]
  connect 23729 (min 21769) [ticks/op]
  connect 23656 (min 21618) [ticks/op]
  connect 23723 (min 21699) [ticks/op]

From jos@xos037.xos.nl Mon Dec 6 01:43:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 01:43:13 -0800 (PST) Received: from simba.xos.nl (simba.xos.nl [212.26.207.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB69h6XZ014912 for ; Mon, 6 Dec 2004 01:43:07 -0800 Received: from xos037.xos.nl (xos037.xos.nl [212.26.207.37]) by simba.xos.nl (8.12.8/8.12.8) with ESMTP id iB69gdmA005761; Mon, 6 Dec 2004 10:42:40 +0100 Received: (from jos@localhost) by xos037.xos.nl (8.11.0/8.11.0) id iB69gdN24177; Mon, 6 Dec 2004 10:42:39 +0100 Date: Mon, 6 Dec 2004 10:42:39 +0100 From: Jos Vos To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter Message-ID: <20041206104239.A24173@xos037.xos.nl> References: <200412032325.iB3NPdG04693@xos037.xos.nl> <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net>; from sfeldma@pobox.com on Sat, Dec 04, 2004 at 04:00:01PM -0800 X-Organization: X/OS Experts in Open Systems BV, Amsterdam,
The Netherlands X-archive-position: 12465 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jos@xos.nl Precedence: bulk X-list: netdev On Sat, Dec 04, 2004 at 04:00:01PM -0800, Scott Feldman wrote: > The 4-port card has two 82546 dual-port controllers behind a PCI-X > bridge. Your symptoms suggest interrupt routing didn't get setup > correctly for the controllers behind the bridge. I've seen more than > one case where the BIOS gets this wrong. The fix is to upgrade to the > latest BIOS. Hopefully this is the fix for you. :-) Unfortunately, the latest BIOS for the MB is already installed :-(. -- -- Jos Vos -- X/OS Experts in Open Systems BV | Phone: +31 20 6938364 -- Amsterdam, The Netherlands | Fax: +31 20 6948204 From hadi@cyberus.ca Mon Dec 6 03:01:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 03:01:54 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6B1nkg022663 for ; Mon, 6 Dec 2004 03:01:49 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbGcO-0007qB-CP for netdev@oss.sgi.com; Mon, 06 Dec 2004 06:01:24 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbGcJ-0008GI-VJ; Mon, 06 Dec 2004 06:01:20 -0500 Subject: Post Network dev questions to netdev Please WAS(Re: [patch 4/10] s390: network driver. 
From: jamal Reply-To: hadi@cyberus.ca To: Paul Jakma Cc: Thomas Spatzier , jgarzik@pobox.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolous Message-Id: <1102330877.1042.2187.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 06:01:18 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12466 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 01:25, Paul Jakma wrote: > > This has always been (AFAIK) the behaviour yes. We started getting > reports of the new queuing behaviour with, iirc, a version of Intel's > e100 driver for 2.4.2x, which was later changed back to the old > behaviour. However now that the queue behaviour is apparently the > mandated behaviour we really need to work out what to do about the > sending-long-stale packets problem. > I missed the beginings of this thread. Seems some patch was posted on lkml which started this discussion. I am pretty sure what the lkml FAQ says is to post on netdev. Ok, If you insist posting on lkml (because that the way to glory, good fortune and fame), then please have the courtesy to post to netdev. Now lets see if we can help. Followups only on netdev. 
cheers, jamal From hadi@cyberus.ca Mon Dec 6 03:28:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 03:28:35 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6BST71025052 for ; Mon, 6 Dec 2004 03:28:29 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbH2C-0002fe-Rc for netdev@oss.sgi.com; Mon, 06 Dec 2004 06:28:04 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbH29-0002WO-Lu; Mon, 06 Dec 2004 06:28:01 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Paul Jakma Cc: Thomas Spatzier , jgarzik@pobox.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolous Message-Id: <1102332479.1048.2243.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 06:27:59 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12467 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 01:25, Paul Jakma wrote: [..] > Anyway, we do, I think, need some way to deal with the > sending-stale-packet-on-link-back problem. Either a way to flush this > driver queue or else a guarantee that writes to sockets whose > protocol makes no reliability guarantee will either return ENOBUFS or > drop the packet. > > Otherwise we will start getting reports of "Quagga on Linux sent an > ancient {RIP,IRDP,RA} packet when we fixed a switch problem, and it > caused an outage for a section of our network due to bad routes", I > think. > > Some comment or advice would be useful. (Am I kill-filed by all of > netdev? feels like it). 
Don't post networking related patches on other lists. I haven't seen said patch, but it seems someone is complaining about some behavior changing?

In regards to link down and packets being queued: agreed, this is a little problematic for some apps/transports. TCP is definitely not one of them. TCP in Linux actually is told if the drop is local. This way it can make a better judgement (and not unnecessarily adjust windows, for example). SCTP AFAIK is the only transport that provides its apps the opportunity to obsolete messages already sent. I am not sure how well implemented it is, or whether it is implemented at all. Someone working on SCTP could comment.

In the case the netdevice is administratively downed, both the qdisc and DMA ring packets are flushed. Newer packets will never be queued and you should quickly be able to find from your app that the device is down. In the case of the netdevice being operationally down (I am hoping this is what the discussion is about, having jumped in on it), both queues stay intact. What you can certainly do from user space is admin down/up the device when you receive a netlink carrier off notification. I am struggling to see whether dropping the packet inside the tx path once it is operationally down is so blasphemous ... need to get caffeine first.
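As a rough illustration of the user space side of this, the Python sketch below picks carrier state out of rtnetlink RTM_NEWLINK messages. It is a sketch under stated assumptions, not a tested daemon: the names carrier_events and watch_links are made up here, IFF_RUNNING is used as the carrier indicator, and the actual admin down/up step is left to the caller.

```python
import socket
import struct

RTMGRP_LINK = 1          # multicast group carrying link notifications
RTM_NEWLINK = 16         # rtnetlink message type for link state updates
IFF_RUNNING = 0x40       # set in ifi_flags while the link is operational

NLMSG_HDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")   # family, pad, type, index, flags, change


def carrier_events(data):
    """Yield (ifindex, carrier_up) for each RTM_NEWLINK message in data."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if length < NLMSG_HDR.size or offset + length > len(data):
            break                      # truncated or malformed message
        if msg_type == RTM_NEWLINK and length >= NLMSG_HDR.size + IFINFOMSG.size:
            _fam, _type, index, flags, _change = IFINFOMSG.unpack_from(
                data, offset + NLMSG_HDR.size)
            yield index, bool(flags & IFF_RUNNING)
        offset += (length + 3) & ~3    # netlink messages are 4-byte aligned


def watch_links():
    """Print carrier changes as they arrive (Linux only, runs forever)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))
    while True:
        for index, up in carrier_events(sock.recv(65535)):
            print(index, "carrier on" if up else "carrier off")
```

An application that cares about stale packets could react to a "carrier off" event by taking the interface administratively down and up again, which per the above flushes both the qdisc and the DMA ring.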
cheers, jamal From hadi@cyberus.ca Mon Dec 6 03:33:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 03:33:17 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6BXBfG025481 for ; Mon, 6 Dec 2004 03:33:11 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbH6k-0007wP-Sw for netdev@oss.sgi.com; Mon, 06 Dec 2004 06:32:46 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbH6d-00033c-OH; Mon, 06 Dec 2004 06:32:40 -0500 Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) From: jamal Reply-To: hadi@cyberus.ca To: Martin Josefsson Cc: Lennert Buytenhek , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: References: <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102332757.1042.2255.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 06:32:37 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12468 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-05 at 12:54, Martin Josefsson wrote: > On Sun, 5 Dec 2004, Lennert Buytenhek wrote: > > > I've tested all packet 
sizes now, and delayed TDT updating once per jiffy > > (instead of once per packet) indeed gives about 25kpps more on 60,61,62 > > byte packets, and is hardly worth it for bigger packets. > > Maybe we can't see any real gains here now, I wonder if it has any effect > if you have lots of nics on the same bus. I mean, in theory it saves a > whole lot of traffic on the bus. > This sounds like really exciting stuff happening here over the weekend. Scott, you had to leave Intel before giving us this tip? ;-> Someone correct me if i am wrong - but does it appear as if all these changes are only useful on PCI but not PCI-X? cheers, jamal From hadi@cyberus.ca Mon Dec 6 03:41:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 03:41:20 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6BfEPc026043 for ; Mon, 6 Dec 2004 03:41:14 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbHEX-0003qW-KU for netdev@oss.sgi.com; Mon, 06 Dec 2004 06:40:49 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbHES-000442-NL; Mon, 06 Dec 2004 06:40:45 -0500 Subject: Re: [PATCH] rtnetlink & address family problem From: jamal Reply-To: hadi@cyberus.ca To: Michal Ludvig Cc: Andrew Morton , netdev@oss.sgi.com, Jan Kara In-Reply-To: <41B0A5B4.6060108@suse.cz> References: <41B0A5B4.6060108@suse.cz> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102333242.1048.2268.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 06:40:43 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12469 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev I think this patch will 
break more than it fixes. You need to do a lot more testing to verify it doesnt. Actually you should probably fix whats being invoked for the ifa messages when PF_UNSPEC is selected to check that it only flushes v6 addresses when V6 is on and reject when it is not compiled in. cheers, jamal On Fri, 2004-12-03 at 12:43, Michal Ludvig wrote: > Hi, > > running 'ip -6 addr flush dev eth0' on a kernel without IPv6 support > flushes *all* addresses from the interface, even those IPv4 ones, > because the unsupported protocol is substituted by PF_UNSPEC. > IMHO it should better return with an error EAFNOSUPPORT. > > Attached patch fixes it. Please apply. > > BTW Credits to Jan Kara for discovering and analysing > this bug. > > Michal Ludvig From jos@xos037.xos.nl Mon Dec 6 03:52:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 03:52:29 -0800 (PST) Received: from simba.xos.nl (simba.xos.nl [212.26.207.226]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6BqK0m026693 for ; Mon, 6 Dec 2004 03:52:23 -0800 Received: from xos037.xos.nl (xos037.xos.nl [212.26.207.37]) by simba.xos.nl (8.12.8/8.12.8) with ESMTP id iB6BpqmA006528; Mon, 6 Dec 2004 12:51:52 +0100 Received: (from jos@localhost) by xos037.xos.nl (8.11.0/8.11.0) id iB6BppN24812; Mon, 6 Dec 2004 12:51:51 +0100 Date: Mon, 6 Dec 2004 12:51:51 +0100 From: Jos Vos To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: e1000 driver problem with Intel Pro/1000 MT adapter Message-ID: <20041206125151.A24753@xos037.xos.nl> References: <200412032325.iB3NPdG04693@xos037.xos.nl> <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <1102204801.3343.68.camel@sfeldma-mobl.dsl-verizon.net>; from sfeldma@pobox.com on Sat, Dec 04, 2004 at 04:00:01PM -0800 X-Organization: X/OS Experts in Open Systems BV, Amsterdam, The Netherlands X-archive-position: 12470 X-ecartis-version: Ecartis v1.0.0 
Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jos@xos.nl Precedence: bulk X-list: netdev On Sat, Dec 04, 2004 at 04:00:01PM -0800, Scott Feldman wrote: > The 4-port card has two 82546 dual-port controllers behind a PCI-X > bridge. Your symptoms suggest interrupt routing didn't get setup > correctly for the controllers behind the bridge. I've seen more than > one case where the BIOS gets this wrong. The fix is to upgrade to the > latest BIOS. Hopefully this is the fix for you. :-) I just got a prerelease of their next BIOS from Supermicro and that seems to solve the problem! I still have to boot the SMP kernel to get APIC working; in UP mode incoming packets are not seen, but I can live with that. -- -- Jos Vos -- X/OS Experts in Open Systems BV | Phone: +31 20 6938364 -- Amsterdam, The Netherlands | Fax: +31 20 6948204 From buytenh@wantstofly.org Mon Dec 6 04:12:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 04:12:21 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6CCDmY028855 for ; Mon, 6 Dec 2004 04:12:14 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 1C6342B0ED; Mon, 6 Dec 2004 13:11:51 +0100 (MET) Date: Mon, 6 Dec 2004 13:11:51 +0100 From: Lennert Buytenhek To: jamal Cc: Martin Josefsson , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041206121151.GA12598@xi.wantstofly.org> References: <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org>
<1102332757.1042.2255.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102332757.1042.2255.camel@jzny.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 12471 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Mon, Dec 06, 2004 at 06:32:37AM -0500, jamal wrote: > Someone correct me if i am wrong - but does it appear as if all these > changes are only useful on PCI but not PCI-X? They are useful on PCI-X as well as regular PCI. On my 64/100 NIC I get ~620kpps on 2.6.9, ~1Mpps with 2.6.9 plus tx rework plus TXDMAC=0. Martin gets the ~1Mpps number with just the tx rework, and even more with TXDMAC=0 added in as well. --L From hadi@cyberus.ca Mon Dec 6 04:21:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 04:21:20 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6CLElk004437 for ; Mon, 6 Dec 2004 04:21:14 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CbInN-0004GP-Vh for netdev@oss.sgi.com; Mon, 06 Dec 2004 08:20:53 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbHrC-0008Dv-BM; Mon, 06 Dec 2004 07:20:46 -0500 Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) From: jamal Reply-To: hadi@cyberus.ca To: Lennert Buytenhek Cc: Martin Josefsson , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <20041206121151.GA12598@xi.wantstofly.org> References: <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> 
<1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> <1102332757.1042.2255.camel@jzny.localdomain> <20041206121151.GA12598@xi.wantstofly.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102335643.1048.2311.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 07:20:43 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12472 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-06 at 07:11, Lennert Buytenhek wrote: > On Mon, Dec 06, 2004 at 06:32:37AM -0500, jamal wrote: > > > Someone correct me if i am wrong - but does it appear as if all these > > changes are only useful on PCI but not PCI-X? > > They are useful on PCI-X as well as regular PCI. On my 64/100 NIC I > get ~620kpps on 2.6.9, ~1Mpps with 2.6.9 plus tx rework plus TXDMAC=0. > > Martin gets the ~1Mpps number with just the tx rework, and even more > with TXDMAC=0 added in as well. Right, but so far when i scan the results all i see is PCI not PCI-X. Which of your (or Martins) boards has PCI-X? 
cheers, jamal From buytenh@wantstofly.org Mon Dec 6 04:23:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 04:23:27 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6CNM75004717 for ; Mon, 6 Dec 2004 04:23:22 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 5095C2B0ED; Mon, 6 Dec 2004 13:23:00 +0100 (MET) Date: Mon, 6 Dec 2004 13:23:00 +0100 From: Lennert Buytenhek To: jamal Cc: Martin Josefsson , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041206122300.GA12763@xi.wantstofly.org> References: <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> <1102332757.1042.2255.camel@jzny.localdomain> <20041206121151.GA12598@xi.wantstofly.org> <1102335643.1048.2311.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102335643.1048.2311.camel@jzny.localdomain> User-Agent: Mutt/1.4.1i X-archive-position: 12473 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Mon, Dec 06, 2004 at 07:20:43AM -0500, jamal wrote: > > > Someone correct me if i am wrong - but does it appear as if all these > > > changes are only useful on PCI but not PCI-X? > > > > They are useful on PCI-X as well as regular PCI. On my 64/100 NIC I > > get ~620kpps on 2.6.9, ~1Mpps with 2.6.9 plus tx rework plus TXDMAC=0. > > > > Martin gets the ~1Mpps number with just the tx rework, and even more > > with TXDMAC=0 added in as well. 
> > Right, but so far when i scan the results all i see is PCI not PCI-X. > Which of your (or Martins) boards has PCI-X? I've tested 32/33 PCI, 32/66 PCI, and 64/100 PCI-X. I _think_ Martin was running at 64/133 PCI-X. --L From gandalf@wlug.westbo.se Mon Dec 6 04:31:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 04:31:13 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6CV7sK005476 for ; Mon, 6 Dec 2004 04:31:08 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 0DB082C0023; Mon, 6 Dec 2004 13:30:42 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 9FDCD2C0103; Mon, 6 Dec 2004 13:30:41 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id DEFF12C0023; Mon, 6 Dec 2004 13:30:40 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id EB1133FA7; Mon, 6 Dec 2004 13:30:40 +0100 (CET) Date: Mon, 6 Dec 2004 13:30:40 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Lennert Buytenhek Cc: jamal , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: <20041206122300.GA12763@xi.wantstofly.org> Message-ID: References: <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> <1102332757.1042.2255.camel@jzny.localdomain> <20041206121151.GA12598@xi.wantstofly.org> <1102335643.1048.2311.camel@jzny.localdomain> <20041206122300.GA12763@xi.wantstofly.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 
on null.rsn.bth.se X-archive-position: 12474 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Mon, 6 Dec 2004, Lennert Buytenhek wrote: > > Right, but so far when i scan the results all i see is PCI not PCI-X. > > Which of your (or Martins) boards has PCI-X? > > I've tested 32/33 PCI, 32/66 PCI, and 64/100 PCI-X. I _think_ Martin > was running at 64/133 PCI-X. I don't have any motherboards with PCI-X so no :) I'm running the 82546GB (dualport) at 64/66 and the 82540EM (desktop adapter) at 32/66, both are able to send at wirespeed. /Martin From hadi@cyberus.ca Mon Dec 6 05:11:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 05:11:42 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6DBcd3007808 for ; Mon, 6 Dec 2004 05:11:38 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbIe1-0004XW-Nb for netdev@oss.sgi.com; Mon, 06 Dec 2004 08:11:13 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbIdt-000504-BI; Mon, 06 Dec 2004 08:11:05 -0500 Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) From: jamal Reply-To: hadi@cyberus.ca To: Martin Josefsson Cc: Lennert Buytenhek , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: References: <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> <1102332757.1042.2255.camel@jzny.localdomain> <20041206121151.GA12598@xi.wantstofly.org> 
<1102335643.1048.2311.camel@jzny.localdomain> <20041206122300.GA12763@xi.wantstofly.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102338662.1042.2365.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 08:11:02 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12475 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Hopefully someone will beat me to testing to see if our forwarding capacity now goes up with this new recipe. cheers, jamal On Mon, 2004-12-06 at 07:30, Martin Josefsson wrote: > On Mon, 6 Dec 2004, Lennert Buytenhek wrote: > > > > Right, but so far when i scan the results all i see is PCI not PCI-X. > > > Which of your (or Martins) boards has PCI-X? > > > > I've tested 32/33 PCI, 32/66 PCI, and 64/100 PCI-X. I _think_ Martin > > was running at 64/133 PCI-X. > > I don't have any motherboards with PCI-X so no :) > I'm running the 82546GB (dualport) at 64/66 and the 82540EM (desktop > adapter) at 32/66, both are able to send at wirespeed. 
> > /Martin > > From tgraf@suug.ch Mon Dec 6 06:02:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 06:02:26 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6E2KhA010085 for ; Mon, 6 Dec 2004 06:02:21 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 3010CF; Mon, 6 Dec 2004 15:01:33 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id E2C991C0EA; Mon, 6 Dec 2004 15:02:14 +0100 (CET) Date: Mon, 6 Dec 2004 15:02:14 +0100 From: Thomas Graf To: Michal Ludvig Cc: Andrew Morton , netdev@oss.sgi.com, Jan Kara Subject: Re: [PATCH] rtnetlink & address family problem Message-ID: <20041206140214.GA749@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B0A5B4.6060108@suse.cz> X-archive-position: 12476 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Michal Ludvig <41B0A5B4.6060108@suse.cz> 2004-12-03 18:43 > running 'ip -6 addr flush dev eth0' on a kernel without IPv6 support > flushes *all* addresses from the interface, even those IPv4 ones, > because the unsupported protocol is substituted by PF_UNSPEC. > IMHO it should better return with an error EAFNOSUPPORT. 
> > diff -Nru a/net/core/rtnetlink.c b/net/core/rtnetlink.c > --- a/net/core/rtnetlink.c 2004-12-03 18:30:33 +01:00 > +++ b/net/core/rtnetlink.c 2004-12-03 18:30:33 +01:00 > @@ -477,8 +477,10 @@ > } > > link_tab = rtnetlink_links[family]; > - if (link_tab == NULL) > - link_tab = rtnetlink_links[PF_UNSPEC]; > + if (link_tab == NULL) { > + *errp = -EAFNOSUPPORT; > + return -1; > + } > link = &link_tab[type]; > > sz_idx = type>>2; Your patch would fix this issue but might break various things. The actual problem is that iproute2 doesn't check the family in its filter. It blindly assumes that the kernel only returns addresses of the kind it has requested. I can understand if you think the current behaviour is wrong but we shouldn't change it in the middle of a stable tree. --- iproute2-2.6.9.orig/ip/ipaddress.c 2004-10-19 22:49:02.000000000 +0200 +++ iproute2-2.6.9/ip/ipaddress.c 2004-12-06 14:55:58.000000000 +0100 @@ -330,6 +330,8 @@ return 0; } } + if (filter.family && filter.family != ifa->ifa_family) + return 0; if (filter.flushb) { struct nlmsghdr *fn; From hasso@estpak.ee Mon Dec 6 06:42:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 06:42:40 -0800 (PST) Received: from arena (test.estpak.ee [194.126.115.47]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6EgXrQ012227 for ; Mon, 6 Dec 2004 06:42:34 -0800 Received: from arena ([127.0.0.1] ident=hasso) by arena with esmtp (Exim 3.36 #1 (Debian)) id 1CbK41-0000ah-00; Mon, 06 Dec 2004 16:42:09 +0200 From: Hasso Tepper To: hadi@cyberus.ca Subject: Re: [patch 4/10] s390: network driver. Date: Mon, 6 Dec 2004 16:42:00 +0200 User-Agent: KMail/1.7.1 Cc: Paul Jakma , Thomas Spatzier , jgarzik@pobox.com, netdev@oss.sgi.com References: <1102332479.1048.2243.camel@jzny.localdomain> In-Reply-To: <1102332479.1048.2243.camel@jzny.localdomain> Organization: Elion Enterprises Ltd. 
MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412061642.00552.hasso@estpak.ee> X-archive-position: 12477 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hasso@estpak.ee Precedence: bulk X-list: netdev jamal wrote: > Dont post networking related patches on other lists. I havent seen said > patch, but it seems someone is complaining about some behavior changing? For some strange reason I couldn't find the beginning of the thread in the archive either; the first post seems to be http://oss.sgi.com/projects/netdev/archive/2004-11/msg01015.html. I'm trying to summarize. The approach - one raw socket to send/receive messages no matter how many interfaces are used - worked for Quagga/Zebra routing software daemons till now. If one of these interfaces was operationally down, the socket wasn't blocked even if the queue was full. In fact, "man sendmsg" still has this text: ENOBUFS The output queue for a network interface was full. This generally indicates that the interface has stopped sending, but may be caused by transient congestion. (Normally, this does not occur in Linux. Packets are just silently dropped when a device queue overflows.) It seems that this is no longer true. The kernel now seems to try as hard as possible not to lose any data: data is queued, and when the queue is full, all related sockets are blocked to notify the application. So the one-socket approach doesn't work any more for Quagga/Zebra. No problem, we can take the "one socket per interface" approach. And we already have link detection implemented to notify daemons. But there will still be a potential problem - sending the "interface down" message from kernel to the zebra daemon and then to the routing protocol daemon takes time. 
And during this time the daemon can send packets which will sit in the queue and may cause many problems if sent to the network later (if the link comes up again). Think about stateless routing protocols like rip(ng), ipv6 router advertisements etc. They may carry info that's no longer true, causing routing loops etc. And to clarify a little bit: no - Quagga/Zebra didn't work perfectly with the previous approach. In fact, with link detection and a socket per interface it will be better than ever, no matter what the kernel behaviour is. We just want to make sure that the solution will be bulletproof. So, the problem - how can we make sure that no potentially dangerous aged (routing) info will be in the network? -- Hasso Tepper Elion Enterprises Ltd. WAN administrator From P@draigBrady.com Mon Dec 6 09:32:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 09:32:53 -0800 (PST) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6HWjlc024640 for ; Mon, 6 Dec 2004 09:32:47 -0800 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id iB6HWHwS032781; Mon, 6 Dec 2004 17:32:17 GMT (envelope-from P@draigBrady.com) Message-ID: <41B497A1.6090600@draigBrady.com> Date: Mon, 06 Dec 2004 17:32:17 +0000 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Robert Olsson CC: Lennert Buytenhek , jamal , Martin Josefsson , Scott Feldman , mellia@prezzemolo.polito.it, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) References: <20041205174401.GJ647@xi.wantstofly.org> <20041205175133.GK647@xi.wantstofly.org> <1102332757.1042.2255.camel@jzny.localdomain> <20041206121151.GA12598@xi.wantstofly.org> <1102335643.1048.2311.camel@jzny.localdomain> <20041206122300.GA12763@xi.wantstofly.org> 
<1102338662.1042.2365.camel@jzny.localdomain> <20041206132907.GA13411@xi.wantstofly.org> <16820.37049.396306.295878@robur.slu.se> In-Reply-To: <16820.37049.396306.295878@robur.slu.se> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 12478 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev Robert Olsson wrote: > Lennert Buytenhek writes: > > On Mon, Dec 06, 2004 at 08:11:02AM -0500, jamal wrote: > > > > > Hopefully someone will beat me to testing to see if our forwarding > > > capacity now goes up with this new recipe. > > > A breakthrough we now can send small packets at wire speed it will make > development and testing much easier... It surely will!! Just to recap, 2 people have been able to tx @ wire speed. The original poster was able to receive at wire speed, but could only TX at about 50% wire speed. It would be really cool if we could combine this to bridge @ wire speed. -- Pádraig Brady - http://www.pixelbeat.org -- From shemminger@osdl.org Mon Dec 6 09:43:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 09:43:13 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6Hh5KH025589 for ; Mon, 6 Dec 2004 09:43:05 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB6HgZ920498; Mon, 6 Dec 2004 09:42:35 -0800 Date: Mon, 6 Dec 2004 09:42:34 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: Michael Vittrup Larsen , netdev@oss.sgi.com Subject: [PATCH] tcp: efficient port randomisation (rev 3) Message-Id: <20041206094234.34861c78@dxpl.pdx.osdl.net> In-Reply-To: <200412060918.04441.michael.vittrup.larsen@ericsson.com> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> <41B14E57.5080803@osdl.org> <200412060918.04441.michael.vittrup.larsen@ericsson.com> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12479 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Third revision of the TCP port randomization patch. It randomizes TCP ephemeral ports of incoming connections using variation of existing sequence number hash. This one avoids the MD4 for the loopback case since there is no reason to bother over loopback and it improves benchmark numbers. Signed-off-by: Stephen Hemminger Thanks to original author Michael Larsen. http://www.ietf.org/internet-drafts/draft-larsen-tsvwg-port-randomisation-00.txt diff -urNp -X dontdiff test-2.6/drivers/char/random.c tcpport/drivers/char/random.c --- test-2.6/drivers/char/random.c 2004-11-30 16:26:41.000000000 -0800 +++ tcpport/drivers/char/random.c 2004-12-03 17:04:18.267850607 -0800 @@ -2347,6 +2347,24 @@ __u32 secure_ip_id(__u32 daddr) return halfMD4Transform(hash, keyptr->secret); } +/* Generate secure starting point for ephemeral TCP port search */ +u32 secure_tcp_port_ephemeral(__u32 saddr, __u32 daddr, __u16 dport) +{ + struct keydata *keyptr = get_keyptr(); + u32 hash[4]; + + /* + * Pick a unique starting offset for each ephemeral port search + * (saddr, daddr, dport) and 48bits of random data. 
+ */ + hash[0] = saddr; + hash[1] = daddr; + hash[2] = dport ^ keyptr->secret[10]; + hash[3] = keyptr->secret[11]; + + return halfMD4Transform(hash, keyptr->secret); +} + #ifdef CONFIG_SYN_COOKIES /* * Secure SYN cookie computation. This is the algorithm worked out by diff -urNp -X dontdiff test-2.6/include/linux/random.h tcpport/include/linux/random.h --- test-2.6/include/linux/random.h 2004-11-30 16:26:51.000000000 -0800 +++ tcpport/include/linux/random.h 2004-12-02 17:07:13.000000000 -0800 @@ -52,6 +52,7 @@ extern void get_random_bytes(void *buf, void generate_random_uuid(unsigned char uuid_out[16]); extern __u32 secure_ip_id(__u32 daddr); +extern u32 secure_tcp_port_ephemeral(__u32 saddr, __u32 daddr, __u16 dport); extern __u32 secure_tcp_sequence_number(__u32 saddr, __u32 daddr, __u16 sport, __u16 dport); extern __u32 secure_tcp_syn_cookie(__u32 saddr, __u32 daddr, diff -urNp -X dontdiff test-2.6/net/ipv4/tcp_ipv4.c tcpport/net/ipv4/tcp_ipv4.c --- test-2.6/net/ipv4/tcp_ipv4.c 2004-11-30 16:26:51.000000000 -0800 +++ tcpport/net/ipv4/tcp_ipv4.c 2004-12-03 17:04:26.454562583 -0800 @@ -636,10 +636,18 @@ not_unique: return -EADDRNOTAVAIL; } +static inline u32 connect_port_offset(const struct sock *sk) +{ + const struct inet_opt *inet = inet_sk(sk); + + return secure_tcp_port_ephemeral(inet->rcv_saddr, inet->daddr, + inet->dport); +} + /* * Bind a port for a connect operation and hash it. 
*/ -static int tcp_v4_hash_connect(struct sock *sk) +static int tcp_v4_hash_connect(struct sock *sk, int loopback) { unsigned short snum = inet_sk(sk)->num; struct tcp_bind_hashbucket *head; @@ -647,36 +655,23 @@ static int tcp_v4_hash_connect(struct so int ret; if (!snum) { - int rover; int low = sysctl_local_port_range[0]; int high = sysctl_local_port_range[1]; - int remaining = (high - low) + 1; + int range = high - low; + int i; + int port; + static u32 hint; + u32 offset = hint; struct hlist_node *node; struct tcp_tw_bucket *tw = NULL; + if (!loopback) + offset += connect_port_offset(sk); + local_bh_disable(); - - /* TODO. Actually it is not so bad idea to remove - * tcp_portalloc_lock before next submission to Linus. - * As soon as we touch this place at all it is time to think. - * - * Now it protects single _advisory_ variable tcp_port_rover, - * hence it is mostly useless. - * Code will work nicely if we just delete it, but - * I am afraid in contented case it will work not better or - * even worse: another cpu just will hit the same bucket - * and spin there. - * So some cpu salt could remove both contention and - * memory pingpong. Any ideas how to do this in a nice way? - */ - spin_lock(&tcp_portalloc_lock); - rover = tcp_port_rover; - - do { - rover++; - if ((rover < low) || (rover > high)) - rover = low; - head = &tcp_bhash[tcp_bhashfn(rover)]; + for (i = 1; i <= range; i++) { + port = low + (i + offset) % range; + head = &tcp_bhash[tcp_bhashfn(port)]; spin_lock(&head->lock); /* Does not bother with rcv_saddr checks, @@ -684,19 +679,19 @@ static int tcp_v4_hash_connect(struct so * unique enough. 
*/ tb_for_each(tb, node, &head->chain) { - if (tb->port == rover) { + if (tb->port == port) { BUG_TRAP(!hlist_empty(&tb->owners)); if (tb->fastreuse >= 0) goto next_port; if (!__tcp_v4_check_established(sk, - rover, + port, &tw)) goto ok; goto next_port; } } - tb = tcp_bucket_create(head, rover); + tb = tcp_bucket_create(head, port); if (!tb) { spin_unlock(&head->lock); break; @@ -706,22 +701,18 @@ static int tcp_v4_hash_connect(struct so next_port: spin_unlock(&head->lock); - } while (--remaining > 0); - tcp_port_rover = rover; - spin_unlock(&tcp_portalloc_lock); - + } local_bh_enable(); return -EADDRNOTAVAIL; ok: - /* All locks still held and bhs disabled */ - tcp_port_rover = rover; - spin_unlock(&tcp_portalloc_lock); + hint += i; - tcp_bind_hash(sk, tb, rover); + /* Head lock still held and bh's disabled */ + tcp_bind_hash(sk, tb, port); if (sk_unhashed(sk)) { - inet_sk(sk)->sport = htons(rover); + inet_sk(sk)->sport = htons(port); __tcp_v4_hash(sk, 0); } spin_unlock(&head->lock); @@ -832,7 +823,7 @@ int tcp_v4_connect(struct sock *sk, stru * complete initialization after this. */ tcp_set_state(sk, TCP_SYN_SENT); - err = tcp_v4_hash_connect(sk); + err = tcp_v4_hash_connect(sk, rt->rt_flags & RTCF_LOCAL); if (err) goto failure; From paul@clubi.ie Mon Dec 6 10:45:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 10:45:23 -0800 (PST) Received: from hibernia.jakma.org (hibernia.jakma.org [212.17.55.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6Ij8Vh028799 for ; Mon, 6 Dec 2004 10:45:13 -0800 Received: from hibernia.jakma.org (IDENT:paul@hibernia.jakma.org [192.168.0.3]) by hibernia.jakma.org (8.12.11/8.12.11) with ESMTP id iB6Ii9gB014516; Mon, 6 Dec 2004 18:44:12 GMT Date: Mon, 6 Dec 2004 18:44:09 +0000 (GMT) From: Paul Jakma X-X-Sender: paul@hibernia.jakma.org To: jamal cc: Thomas Spatzier , jgarzik@pobox.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [patch 4/10] s390: network driver. 
In-Reply-To: <1102332479.1048.2243.camel@jzny.localdomain> Message-ID: References: <1102332479.1048.2243.camel@jzny.localdomain> Mail-Followup-To: paul@hibernia.jakma.org X-NSA: arafat al aqsar jihad musharef jet-A1 avgas ammonium qran inshallah allah al-akbar martyr iraq saddam hammas hisballah rabin ayatollah korea vietnam revolt mustard gas british airways washington MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/614/Wed Dec 1 15:44:43 2004 clamav-milter version 0.80j on hibernia.jakma.org X-Virus-Status: Clean X-archive-position: 12480 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paul@clubi.ie Precedence: bulk X-list: netdev On Mon, 6 Dec 2004, jamal wrote: > Dont post networking related patches on other lists. I havent seen said > patch, but it seems someone is complaining about some behavior changing? I missed the beginning of the thread too, but saw Jeff's reply to Thomas on netdev. It appears the original patch was to make the s390 network driver discard packets on link-down. Jeff had replied to say this was bad, that queues are meant to fill and that this was what other drivers (e1000, tg3) did. > In regards to link down and packets being queued. Agreed this is a > little problematic for some apps/transports. Tis yes. Particularly for apps using raw and UDP+IP_HDRINCL sockets. This problem came to light when we got reports of ospfd blocking because link was down, late in 2.4 with a certain version of the (iirc) e100 driver. ospfd uses one single socket for all interfaces, and relies on IP_HDRINCL to have the packet routed out right interface. However this approach doesnt play well if the socket can be blocked completely because of /one/ interface having its link down. 
The behaviour we expected (and got up until now) is to receive either ENOBUFS or else, if the kernel accepts the packet write, for it to drop it if it can not be sent. We can work around that by moving to a socket/interface. However it still leaves the problem of packets being queued indefinitely while the link is down and being sent when link comes back. This is *not* good for RIP, IPv4 IRDP and IPv6 RA. > In the case the netdevice is administratively downed both the qdisc > and DMA ring packets are flushed. What about any packets remaining in the socket buffer? (if that makes sense - i dont know enough about internals sadly). Are those queued? > Newer packets will never be queued That no longer appears to be the case though. The socket blocks, and /some/ packets are queued (presumably those which still were in the socket buffer? i dont know exactly..). > and you should quickly be able to find from your app that > the device is down. We can yes, via rtnetlink - but impossible to guarantee we'll know the link is down before we try write a packet. > In the case of netdevice being operationally down ? As in 'ip link set dev ... down'? > - I am hoping this is what the discussion is, having jumped on it - No, its for link-down, AIUI. > both queues stay intact. What you can do is certainly from user > space admin down/up the device when you receive a netlink carrier > off notification. That seems possible, but quite a hack. Something to work at a socket level would possibly be nicer. (Socket being the primary handle our application has). > I am struggling to see whether dropping the packet inside the tx > path once it is operationaly down is so blasphemous ... need to get > caffeine first. As long as reliable transports have some other transport specific queue, shouldnt be a problem. For UDP and raw no reliability or guarantees are expected by applications (least shouldnt be), and queueing some packets on link-down interferes with application-layer expectations. 
> cheers, > jamal regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: The UPS doesn't have a battery backup. From Robert.Olsson@data.slu.se Mon Dec 6 11:11:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 11:11:16 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6JB8V6030365 for ; Mon, 6 Dec 2004 11:11:09 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB6JAgKO029803; Mon, 6 Dec 2004 20:10:43 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id BF556EC001; Mon, 6 Dec 2004 20:10:42 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16820.44722.748743.6711@robur.slu.se> Date: Mon, 6 Dec 2004 20:10:42 +0100 To: Lennert Buytenhek Cc: jamal , Martin Josefsson , Scott Feldman , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) X-Mailer: VM 7.18 under Emacs 21.3.1 X-archive-position: 12481 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Lennert Buytenhek writes: > On Mon, Dec 06, 2004 at 08:11:02AM -0500, jamal wrote: > > > Hopefully someone will beat me to testing to see if our forwarding > > capacity now goes up with this new recipe. Yes a breakthrough as we now can send small packets at GIGE wire speed this will make development and testing much easier... A first router test with our setup below. Opteron 1.6 GHz SMP kernel. using 1 CPU. 82546 EB + 82456 GB and PCI-X 100 Mhz & 133 MHz. pktgen performance is measured on router box. Remember Scotts patch uses 4096 TX buffers and w. pktgen we use clone_skb. 
So with real skb's we probably see lower performance due to this. This may explain results below so routing performance doesn't follow pktgen performance as seen. T-PUT is routing performance. Also pktgen pure TX performance is given this on the router. Input rate for routing test is 2*765 kpps for all three runs. Input Packets input to eth0 is routed to eth1 and eth2 to eth3. Vanilla. T-PUT 657 kpps. pktgen TX perf 818 kpps ------------------------------------------------- Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flags eth0 1500 0 4312682 8253078 8253078 5687318 5 0 0 0 BRU eth1 1500 0 1 0 0 0 4312199 0 0 0 BRU eth2 1500 0 4311018 8386504 8386504 5688982 5 0 0 0 BRU eth3 1500 0 1 0 0 0 4310791 0 0 0 BRU CPU0 0: 116665 IO-APIC-edge timer 1: 208 IO-APIC-edge i8042 8: 0 IO-APIC-edge rtc 9: 0 IO-APIC-level acpi 14: 21943 IO-APIC-edge ide0 26: 66 IO-APIC-level eth0 27: 58638 IO-APIC-level eth1 28: 68 IO-APIC-level eth2 29: 58497 IO-APIC-level eth3 NMI: 0 LOC: 116605 ERR: 0 MIS: 0 e1000-TX-prefetch+scott tx patch. T-PUT 540 kpps. pktgen TX perf 1.48 Mpps -------------------------------------------------------------------------- Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flags eth0 1500 0 3533795 8618637 8618637 6466205 5 0 0 0 BRU eth1 1500 0 3 0 0 0 3533803 0 0 0 BRU eth2 1500 0 3535804 8697149 8697149 6464196 5 0 0 0 BRU eth3 1500 0 1 0 0 0 3535321 0 0 0 BRU CPU0 0: 1372774 IO-APIC-edge timer 1: 663 IO-APIC-edge i8042 8: 0 IO-APIC-edge rtc 9: 0 IO-APIC-level acpi 14: 22631 IO-APIC-edge ide0 26: 686 IO-APIC-level eth0 27: 693 IO-APIC-level eth1 28: 687 IO-APIC-level eth2 29: 682 IO-APIC-level eth3 NMI: 0 LOC: 1372804 ERR: 0 MIS: 0 e1000-TX-prefetch. T-PUT 657 kpps. 
pktgen TX perf 1.15 Mpps ----------------------------------------------------------- Kernel Interface table Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flags eth0 1500 0 4311848 8288270 8288270 5688152 5 0 0 0 BRU eth1 1500 0 4 0 0 0 4311388 0 0 0 BRU eth2 1500 0 4309082 8400892 8400892 5690918 5 0 0 0 BRU eth3 1500 0 1 0 0 0 4308271 0 0 0 BRU lo 16436 0 0 0 0 0 0 0 0 0 LRU CPU0 0: 224310 IO-APIC-edge timer 1: 250 IO-APIC-edge i8042 8: 0 IO-APIC-edge rtc 9: 0 IO-APIC-level acpi 14: 22055 IO-APIC-edge ide0 26: 122 IO-APIC-level eth0 27: 58001 IO-APIC-level eth1 28: 123 IO-APIC-level eth2 29: 57681 IO-APIC-level eth3 NMI: 0 LOC: 224251 ERR: 0 MIS: 0 --ro From kdesler@soohrt.org Mon Dec 6 12:53:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 12:53:36 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6KrV4c005582 for ; Mon, 6 Dec 2004 12:53:32 -0800 Received: (qmail 18579 invoked by uid 1000); 6 Dec 2004 20:53:05 -0000 Date: Mon, 6 Dec 2004 21:53:05 +0100 From: Karsten Desler To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041206205305.GA11970@soohrt.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12482 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev Hi, I'm running a dual Opteron 244 router with two active e1000 interfaces, that mostly deals with small (<200b) udp packets. I'm using Linux 2.6.10-rc3 (32bit), but I tried 64bit with early 2.6.9-rc versions, and it didn't make much of a difference. The irq of eth0 is bound to cpu1 while eth1 is bound to cpu0. NAPI is enabled. 
Current packet load on eth0 (and reversed on eth1):
115kpps tx
135kpps rx

There are about 200 iptables rules, but the common packet only has to
traverse about 20. conntrack is not loaded.
eth0 and eth1 are running on the same 66MHz/64bit PCI bus.
Kernel profiling is running; I don't know how much that contributes to the
overall load.

I don't know if that has anything to do with it, but the system clock is
getting out of sync _fast_ (openntpd can't keep up).

ntpdate -b ntp.soohrt.org; sleep 60; ntpdate ntp.soohrt.org:
 6 Dec 21:40:39 ntpdate[30146]: adjust time server 134.100.177.5 offset 0.000092 sec
 6 Dec 21:41:39 ntpdate[30218]: adjust time server 134.100.177.5 offset 0.006971 sec

Is that the expected cpu usage?
I'd appreciate _any_ pointers, thanks in advance,
 Karsten

Current cpu usage:
Cpu0 :  0.0% us,  0.0% sy,  0.0% ni, 46.3% id,  0.0% wa,  0.0% hi, 53.7% si
Cpu1 :  0.0% us,  0.0% sy,  0.0% ni, 21.4% id,  0.0% wa,  0.0% hi, 78.6% si

vmstat 5:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd    free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 0  0      0 1804612 100488 104464    0    0     0    32 5965   167  3 68 30  0
 0  1      0 1804548 100488 104464    0    0     0   157 5994    18  0 67 33  0
 1  1      0 1804684 100488 104464    0    0     0    63 5998    19  0 67 33  0
 0  1      0 1804684 100492 104460    0    0     0     4 5985    10  0 68 33  0
 0  1      0 1804620 100492 104460    0    0     0    12 6032    15  0 68 32  0

lspci -vt:
-[00]-+-06.0-[03]--+-00.0  Advanced Micro Devices [AMD] AMD-8111 USB
      [...]
      +-0a.1  Advanced Micro Devices [AMD] AMD-8131 PCI-X APIC
      +-0b.0-[01]--+-01.0  Intel Corp. 82545EM Gigabit Ethernet Controller (Fiber) <- eth0
      |            +-03.0  Intel Corp. 82546GB Gigabit Ethernet Controller <- eth1
      |            \-03.1  Intel Corp. 82546GB Gigabit Ethernet Controller

arp -n|grep -c incomplete: 73
arp -n|grep -vc incomplete: 778
iptables -nL|grep -v Chain|grep -vc source: 199

readprofile -r; sleep 60; readprofile|sort -n +2:
    76 __do_softirq                0.3654
    73 dst_alloc                   0.4148
   489 ip_rcv                      0.4187
    96 fib_semantic_match          0.4615
    15 memset                      0.4688
     8 _write_lock                 0.5000
    37 match                       0.5781
    72 handle_IRQ_event            0.6429
   138 fn_hash_lookup              0.7188
   401 qdisc_restart               0.7371
    13 _read_unlock                0.8125
    28 _spin_lock_bh               0.8750
    70 e1000_rx_checksum           0.8750
   536 ip_rcv_finish               0.9306
    33 kfree_skbmem                1.0312
   248 e1000_intr                  1.0333
    30 _read_lock                  1.8750
  1198 ip_forward                  1.9199
   910 ip_finish_output2           1.9612
   282 rt_hash_code                2.2031
   286 pfifo_fast_enqueue          2.2344
    36 _spin_unlock                2.2500
  1657 dev_queue_xmit              2.3014
   230 ip_output                   2.3958
   905 netif_receive_skb           2.4592
  2852 e1000_clean_rx_irq          2.6213
   697 nf_hook_slow                2.7227
   674 e1000_alloc_rx_buffers      2.8083
   228 ip_forward_finish           2.8500
   187 ipt_hook                    3.8958
   346 kmem_cache_free             4.3250
   357 pfifo_fast_dequeue          4.4625
   626 local_bh_enable             4.8906
   250 e1000_irq_enable            5.2083
   425 kmem_cache_alloc            5.3125
  3100 e1000_clean_tx_irq          5.6985
  1014 nf_iterate                  5.7614
   284 ipt_route_hook              5.9167
   701 kfree                       6.2589
  1063 __kmalloc                   8.3047
  1605 skb_release_data           10.0312
  2692 eth_type_trans             11.2167
  4013 __kfree_skb                15.6758
  4554 alloc_skb                  18.9750
 20017 ipt_do_table               24.0589
 10700 ip_route_input             30.3977
  1341 _spin_lock                 83.8125
  1483 _read_unlock_bh            92.6875
  3345 _read_lock_bh             104.5312
  3185 _spin_unlock_irqrestore   199.0625
 44402 default_idle              693.7812

From davem@davemloft.net Mon Dec 6 13:50:31 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 13:50:37 -0800 (PST)
Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105])
        by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6LoVuH008312
        for ; Mon, 6 Dec 2004 13:50:31 -0800
Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem)
        by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian))
        id 1CbQiw-0001qy-00; Mon, 06 Dec 2004 13:48:50 -0800
Date: Mon, 6 Dec 2004 13:48:49 -0800
From: "David S. Miller"
To: Karsten Desler
Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets
Message-Id: <20041206134849.498bfc93.davem@davemloft.net>
In-Reply-To: <20041206205305.GA11970@soohrt.org>
References: <20041206205305.GA11970@soohrt.org>
X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu)
X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-archive-position: 12483
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev

On Mon, 6 Dec 2004 21:53:05 +0100
Karsten Desler wrote:

> 20017 ipt_do_table 24.0589

It's spending nearly half of its time in iptables.
Try to consolidate your rules if possible. This is the
part of netfilter that really doesn't scale well at all.
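One way to put a number on a statement like this is to take a symbol's tick count as a share of all non-idle ticks in the readprofile listing. The sketch below is only an illustration: the struct and function names are made up here, it assumes the idle loop shows up as default_idle as it does in the listing, and it can only account for the rows it is given, so a truncated listing will overstate every share.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of readprofile output: tick count and symbol name
 * (hypothetical type, not part of any kernel or util-linux API). */
struct prof_row {
    unsigned long ticks;
    const char *symbol;
};

/* Sum the ticks of all rows except the idle loop. */
unsigned long busy_ticks(const struct prof_row *rows, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        if (strcmp(rows[i].symbol, "default_idle") != 0)
            sum += rows[i].ticks;
    return sum;
}

/* Percentage of non-idle ticks attributed to one symbol.
 * Returns -1.0 if the symbol is not listed or nothing was busy. */
double busy_share_percent(const struct prof_row *rows, size_t n,
                          const char *symbol)
{
    unsigned long busy = busy_ticks(rows, n);
    size_t i;

    assert(symbol != NULL);
    if (busy == 0)
        return -1.0;
    for (i = 0; i < n; i++)
        if (strcmp(rows[i].symbol, symbol) == 0)
            return 100.0 * (double)rows[i].ticks / (double)busy;
    return -1.0;
}
```

Filling an array of prof_row from the first two columns of the listing and calling busy_share_percent(rows, n, "ipt_do_table") gives the iptables share of the busy time for that 60 second sample.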
From gandalf@wlug.westbo.se Mon Dec 6 14:30:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 14:30:25 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6MUJ1V010327 for ; Mon, 6 Dec 2004 14:30:20 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 554092C00B1; Mon, 6 Dec 2004 23:29:53 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 6F3F52C011E; Mon, 6 Dec 2004 23:29:52 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 80DA72C00B1; Mon, 6 Dec 2004 23:29:51 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by tux.rsn.bth.se (Postfix) with ESMTP id DC2EC3FA7; Mon, 6 Dec 2004 23:29:51 +0100 (CET) Date: Mon, 6 Dec 2004 23:29:51 +0100 (CET) From: Martin Josefsson X-X-Sender: gandalf@tux.rsn.bth.se To: Robert Olsson Cc: Lennert Buytenhek , jamal , Scott Feldman , P@draigBrady.com, mellia@prezzemolo.polito.it, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) In-Reply-To: <16820.44722.748743.6711@robur.slu.se> Message-ID: References: <16820.44722.748743.6711@robur.slu.se> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-archive-position: 12484 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev On Mon, 6 Dec 2004, Robert Olsson wrote: > pktgen performance is measured on router box. Remember Scotts patch uses > 4096 TX buffers and w. pktgen we use clone_skb. So with real skb's we probably > see lower performance due to this. This may explain results below so routing > performance doesn't follow pktgen performance as seen. 
I've performed some tests with and without clone_skb with various versions
of the driver.

> Vanilla. T-PUT 657 kpps. pktgen TX perf 818 kpps
> e1000-TX-prefetch+scott tx patch. T-PUT 540 kpps. pktgen TX perf 1.48 Mpps
> e1000-TX-prefetch. T-PUT 657 kpps. pktgen TX perf 1.15 Mpps

This matches the data I see in my tests here with and without clone_skb.

I've included a lot of pps numbers below; they might need some description.
I tested generating packets with four different drivers, with and without
clone_skb.

vanilla is the vanilla driver in 2.6.10-rc3.

copy is using the patch found at the bottom of this mail, just a small
test to see if there's any gain or loss using "static" buffers to dma from.
Prefetch doesn't help at all here, just makes things worse, even for
clone_skb. Tried with delayed TDT updating as well, didn't help.

vanilla + prefetch is just the vanilla driver + prefetching.

feldman tx is using Scott's tx-path rewrite patch.

I didn't bother listing feldman tx + prefetch as the results were even
lower for the non clone_skb case.

The only thing I can think of that can cause this is cache thrashing, or
overhead in slab when we have a lot of skb's in the wild.

I don't have oprofile on my test machine at the moment and it's time to go
to bed now, maybe tomorrow...

Does anyone have any suggestions of what to test next?

vanilla
size      clone  noclone
  60     854886   748552
  64     772341   702464
  68     759531   649066
  72     758872   671992
  76     758926   680251
  80     761136   627711
  84     742109   625468
  88     742070   640115
  92     741616   679365
  96     744083   650544
 100     727430   666423
 104     725242   652057
 108     724153   665821
 112     725841   679443
 116     707331   652507
 120     706000   661279
 124     704923   648627
 128     662547   635780

copy
size      clone  noclone
  60     897165   882227
  64     872767   649614
  68     750694   691327
  72     750427   700706
  76     749583   700795
  80     748242   696594
  84     732760   686016
  88     731129   691689
  92     732603   696136
  96     732631   691348
 100     717123   684596
 104     717678   687800
 108     716839   689218
 112     719258   671483
 116     703824   675867
 120     706047   679089
 124     701885   672385
 128     695575   650148

vanilla + prefetch
size      clone  noclone
  60    1300075   738538
  64    1079069   800927
  68    1082091   719832
  72    1068791   725353
  76    1067630   822738
  80    1026222   743134
  84    1053055   813520
  88    1024442   721522
  92    1032112   797838
  96    1014844   724031
 100     991346   812198
 104     976483   717811
 108     947019   713072
 112     919193   789771
 116     892863   696027
 120     868054   682168
 124     844679   749020
 128     822347   703233

feldman tx
size      clone  noclone
  60    1029997   626646
  64     916706   628148
  68     898601   628935
  72     895378   625084
  76     896171   623527
  80     898594   623510
  84     861434   624286
  88     861446   625086
  92     861444   623907
  96     863669   630199
 100     837624   613933
 104     836225   618025
 108     835528   620326
 112     835527   607884
 116     817102   606124
 120     817101   538434
 124     817100   531699
 128     757683   532719

diff -X /home/gandalf/dontdiff.ny -urNp drivers/net/e1000-vanilla/e1000_main.c drivers/net/e1000/e1000_main.c
--- drivers/net/e1000-vanilla/e1000_main.c 2004-12-05 18:27:50.000000000 +0100
+++ drivers/net/e1000/e1000_main.c 2004-12-06 22:21:10.000000000 +0100
@@ -132,6 +132,7 @@ static void
e1000_irq_disable(struct e10 static void e1000_irq_enable(struct e1000_adapter *adapter); static irqreturn_t e1000_intr(int irq, void *data, struct pt_regs *regs); static boolean_t e1000_clean_tx_irq(struct e1000_adapter *adapter); +static boolean_t e1000_alloc_tx_buffers(struct e1000_adapter *adapter); #ifdef CONFIG_E1000_NAPI static int e1000_clean(struct net_device *netdev, int *budget); static boolean_t e1000_clean_rx_irq(struct e1000_adapter *adapter, @@ -264,6 +265,7 @@ e1000_up(struct e1000_adapter *adapter) e1000_restore_vlan(adapter); e1000_configure_tx(adapter); + e1000_alloc_tx_buffers(adapter); e1000_setup_rctl(adapter); e1000_configure_rx(adapter); e1000_alloc_rx_buffers(adapter); @@ -1048,10 +1052,21 @@ e1000_configure_rx(struct e1000_adapter void e1000_free_tx_resources(struct e1000_adapter *adapter) { + struct e1000_desc_ring *tx_ring = &adapter->tx_ring; + struct e1000_buffer *buffer_info; struct pci_dev *pdev = adapter->pdev; + unsigned int i; e1000_clean_tx_ring(adapter); + for(i = 0; i < tx_ring->count; i++) { + buffer_info = &tx_ring->buffer_info[i]; + if(buffer_info->skb) { + kfree(buffer_info->skb); + buffer_info->skb = NULL; + } + } + vfree(adapter->tx_ring.buffer_info); adapter->tx_ring.buffer_info = NULL; @@ -1079,16 +1094,12 @@ e1000_clean_tx_ring(struct e1000_adapter for(i = 0; i < tx_ring->count; i++) { buffer_info = &tx_ring->buffer_info[i]; - if(buffer_info->skb) { - + if(buffer_info->dma) { pci_unmap_page(pdev, buffer_info->dma, buffer_info->length, PCI_DMA_TODEVICE); - - dev_kfree_skb(buffer_info->skb); - - buffer_info->skb = NULL; + buffer_info->dma = 0; } } @@ -1579,8 +1590,6 @@ e1000_tx_map(struct e1000_adapter *adapt struct e1000_buffer *buffer_info; unsigned int len = skb->len; unsigned int offset = 0, size, count = 0, i; - unsigned int f; - len -= skb->data_len; i = tx_ring->next_to_use; @@ -1600,10 +1609,12 @@ e1000_tx_map(struct e1000_adapter *adapt size > 4)) size -= 4; + skb_copy_bits(skb, offset, buffer_info->skb, size); 
+ buffer_info->length = size; buffer_info->dma = pci_map_single(adapter->pdev, - skb->data + offset, + buffer_info->skb, size, PCI_DMA_TODEVICE); buffer_info->time_stamp = jiffies; @@ -1614,50 +1625,11 @@ e1000_tx_map(struct e1000_adapter *adapt if(unlikely(++i == tx_ring->count)) i = 0; } - for(f = 0; f < nr_frags; f++) { - struct skb_frag_struct *frag; - - frag = &skb_shinfo(skb)->frags[f]; - len = frag->size; - offset = frag->page_offset; - - while(len) { - buffer_info = &tx_ring->buffer_info[i]; - size = min(len, max_per_txd); -#ifdef NETIF_F_TSO - /* Workaround for premature desc write-backs - * in TSO mode. Append 4-byte sentinel desc */ - if(unlikely(mss && f == (nr_frags-1) && size == len && size > 8)) - size -= 4; -#endif - /* Workaround for potential 82544 hang in PCI-X. - * Avoid terminating buffers within evenly-aligned - * dwords. */ - if(unlikely(adapter->pcix_82544 && - !((unsigned long)(frag->page+offset+size-1) & 4) && - size > 4)) - size -= 4; - - buffer_info->length = size; - buffer_info->dma = - pci_map_page(adapter->pdev, - frag->page, - offset, - size, - PCI_DMA_TODEVICE); - buffer_info->time_stamp = jiffies; - - len -= size; - offset += size; - count++; - if(unlikely(++i == tx_ring->count)) i = 0; - } - } - i = (i == 0) ? 
tx_ring->count - 1 : i - 1; - tx_ring->buffer_info[i].skb = skb; tx_ring->buffer_info[first].next_to_watch = i; + dev_kfree_skb_any(skb); + return count; } @@ -2213,11 +2185,6 @@ e1000_clean_tx_irq(struct e1000_adapter buffer_info->dma = 0; } - if(buffer_info->skb) { - dev_kfree_skb_any(buffer_info->skb); - buffer_info->skb = NULL; - } - tx_desc->buffer_addr = 0; tx_desc->lower.data = 0; tx_desc->upper.data = 0; @@ -2243,6 +2210,28 @@ e1000_clean_tx_irq(struct e1000_adapter return cleaned; } + +static boolean_t +e1000_alloc_tx_buffers(struct e1000_adapter *adapter) +{ + struct e1000_desc_ring *tx_ring = &adapter->tx_ring; + struct e1000_buffer *buffer_info; + unsigned int i; + + for (i = 0; i < tx_ring->count; i++) { + buffer_info = &tx_ring->buffer_info[i]; + if (!buffer_info->skb) { + buffer_info->skb = kmalloc(2048, GFP_ATOMIC); + if (unlikely(!buffer_info->skb)) { + printk("eek!\n"); + return FALSE; + } + } + } + + return TRUE; +} + /** * e1000_clean_rx_irq - Send received data up the network stack * @adapter: board private structure /Martin From kdesler@soohrt.org Mon Dec 6 14:41:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 14:41:39 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6MfXHU011199 for ; Mon, 6 Dec 2004 14:41:34 -0800 Received: (qmail 10304 invoked by uid 1000); 6 Dec 2004 22:41:07 -0000 Date: Mon, 6 Dec 2004 23:41:07 +0100 From: Karsten Desler To: "David S. 
Miller" Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041206224107.GA8529@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20041206134849.498bfc93.davem@davemloft.net> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12485 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * David S. Miller wrote: > It's spending nearly half of it's time in iptables. > Try to consolidate your rules if possible. This is the > part of netfilter that really doesn't scale well at all. > Removing the iptables rules helps reducing the load a little, but the majority of time is still spent somewhere else. 50kpps rx and 43kpps tx. procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 1802956 101032 104464 0 0 0 18 74 26 0 21 79 0 0 0 0 1802956 101032 104464 0 0 0 61 8810 28 0 25 75 0 0 0 0 1802956 101032 104464 0 0 0 233 8867 17 0 24 76 0 0 0 0 1802892 101032 104464 0 0 0 0 8865 16 0 21 79 0 0 0 0 1802892 101032 104464 0 0 0 0 8772 8 0 18 82 0 <- iptables -F 0 0 0 1802892 101032 104464 0 0 0 36 8863 18 0 19 81 0 0 0 0 1802892 101032 104464 0 0 0 80 8700 18 0 18 82 0 0 0 0 1802956 101032 104464 0 0 0 0 8779 7 0 17 83 0 0 0 0 1802948 101032 104464 0 0 0 223 8716 278 4 19 76 0 - Karsten From rayl@mail.com Mon Dec 6 15:12:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 15:13:03 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6NCv21013369 for ; Mon, 6 Dec 2004 15:12:58 -0800 Received: by ray.lehtiniemi.com 
(Postfix, from userid 500) id 9EAE11D7803; Mon, 6 Dec 2004 16:12:35 -0700 (MST) Date: Mon, 6 Dec 2004 16:12:35 -0700 From: Ray Lehtiniemi To: Scott Feldman Cc: netdev@oss.sgi.com Subject: Re: how to tune a pair of e1000 cards on intel e7501-based system? Message-ID: <20041206231235.GA18097@mail.com> References: <20041206024437.GB7891@mail.com> <1102304058.3343.217.camel@sfeldma-mobl.dsl-verizon.net> <20041206041002.GC7891@mail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20041206041002.GC7891@mail.com> User-Agent: Mutt/1.5.6i X-archive-position: 12486 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev On Sun, Dec 05, 2004 at 09:10:03PM -0700, Ray Lehtiniemi wrote: > > any idea why lspci -vv shows non-64bit, non-133 MHz? (i am assuming > that is what the minus sign means) > > Capabilities: [e4] PCI-X non-bridge device. > Command: DPERE- ERO+ RBC=0 OST=0 > Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=0, DMOST=0, DMCRS=0, RSCEM- turns out this is a bug in pci-utils-2.1.11. it was not correctly fetching the config information from the card, and was displaying zeroed data instead. i've included a patch at the end of this email and have also forwarded it to martin mares. > > Can you put one on D and the other on another bus? > > not sure... have to look at the chassis tomorrow morning. a co-worker > actually built the box, i've not seen it in person yet. nope, can't move things around. this is a NexGate NSA 2040G, and everything is built into the motherboard. > > What kind of numbers are you getting? i'm seeing about 100Kpps, with all settings at their defaults on the 2.4.20 kernel. basically, i have a couple of desktop PCs generating 480 streams of UDP data at 50 packets per second. Packet size on the wire, including 96 bits of IFG, is 128 bytes. 
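
As a rough sanity check of these figures, the arithmetic can be sketched as below. The function names are made up for illustration, and the reading of "about 100Kpps" as receive plus transmit over two passes through the box (out to an echoer and back) is an assumption, not something the mail states outright.

```c
#include <assert.h>

/* Packets per second offered by the generators. */
unsigned long long offered_pps(unsigned streams, unsigned pps_per_stream)
{
    return (unsigned long long)streams * pps_per_stream;
}

/* Bits per second on the wire for one pass; bytes_on_wire is taken
 * to include the inter-frame gap, as in the figure quoted above. */
unsigned long long wire_bps(unsigned long long pps, unsigned bytes_on_wire)
{
    return pps * bytes_on_wire * 8ULL;
}

/* Packets handled per second by a forwarding box, counting each pass
 * through it as one receive plus one transmit (an assumption). */
unsigned long long box_packets_per_sec(unsigned long long pps, unsigned passes)
{
    assert(passes > 0);
    return pps * passes * 2ULL;
}
```

With 480 streams at 50 packets per second, offered_pps gives 24000 pps. At 128 bytes on the wire that is 24576000 bit/s per pass, and with two passes box_packets_per_sec gives 96000, which is in the neighbourhood of the quoted 100Kpps.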
these packets are forwarded through a user process on the NexGen box to an echoer process which is also running on the traffic generator boxes. the echoer sends them back to the NexGen user process, which forwards them back to the generator process. timestamps are logged for each packet at send, loop and recv points. anything over 480 streams, and i start to get large latencies and packet drops, as measured by the timestamps in the sender and echoer process. does 100Kpps sound reasonable for an untweaked 2.4.20 kernel? thanks diff -ur pciutils-2.1.11/lspci.c pciutils-2.1.11-rayl/lspci.c --- pciutils-2.1.11/lspci.c 2002-12-26 13:24:50.000000000 -0700 +++ pciutils-2.1.11-rayl/lspci.c 2004-12-06 15:54:33.573313973 -0700 @@ -476,16 +476,19 @@ static void show_pcix_nobridge(struct device *d, int where) { - u16 command = get_conf_word(d, where + PCI_PCIX_COMMAND); - u32 status = get_conf_long(d, where + PCI_PCIX_STATUS); + u16 command; + u32 status; printf("PCI-X non-bridge device.\n"); if (verbose < 2) return; + config_fetch(d, where, 8); + command = get_conf_word(d, where + PCI_PCIX_COMMAND); printf("\t\tCommand: DPERE%c ERO%c RBC=%d OST=%d\n", FLAG(command, PCI_PCIX_COMMAND_DPERE), FLAG(command, PCI_PCIX_COMMAND_ERO), ((command & PCI_PCIX_COMMAND_MAX_MEM_READ_BYTE_COUNT) >> 2U), ((command & PCI_PCIX_COMMAND_MAX_OUTSTANDING_SPLIT_TRANS) >> 4U)); + status = get_conf_long(d, where + PCI_PCIX_STATUS); printf("\t\tStatus: Bus=%u Dev=%u Func=%u 64bit%c 133MHz%c SCD%c USC%c, DC=%s, DMMRBC=%u, DMOST=%u, DMCRS=%u, RSCEM%c", ((status >> 8) & 0xffU), // bus ((status >> 3) & 0x1fU), // dev @@ -509,6 +512,7 @@ printf("PCI-X bridge device.\n"); if (verbose < 2) return; + config_fetch(d, where, 8); secstatus = get_conf_word(d, where + PCI_PCIX_BRIDGE_SEC_STATUS); printf("\t\tSecondary Status: 64bit%c, 133MHz%c, SCD%c, USC%c, SCO%c, SRD%c Freq=%d\n", FLAG(secstatus, PCI_PCIX_BRIDGE_SEC_STATUS_64BIT), -- ---------------------------------------------------------------------- Ray L From 
kernel@kolivas.org Mon Dec 6 15:57:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 15:57:40 -0800 (PST) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB6NvW7S015654 for ; Mon, 6 Dec 2004 15:57:35 -0800 Received: from mail.kolivas.org (c210-49-199-147.lowrp1.vic.optusnet.com.au [210.49.199.147]) by mail07.syd.optusnet.com.au (8.12.11/8.12.11) with ESMTP id iB6NuwtR029277; Tue, 7 Dec 2004 10:57:01 +1100 Received: from pc.kolivas.org (pc [192.168.1.251]) by mail.kolivas.org (Postfix) with ESMTP id 090373EA9E; Tue, 7 Dec 2004 10:56:58 +1100 (EST) References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> Message-ID: X-Mailer: http://www.courier-mta.org/cone/ From: Con Kolivas To: Karsten Desler Cc: David =?ISO-8859-1?B?Uy4=?= Miller , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Date: Tue, 07 Dec 2004 10:56:58 +1100 Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset="US-ASCII" Content-Disposition: inline Content-Transfer-Encoding: 7bit X-archive-position: 12487 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@kolivas.org Precedence: bulk X-list: netdev Karsten Desler writes: > * David S. Miller wrote: >> It's spending nearly half of it's time in iptables. >> Try to consolidate your rules if possible. This is the >> part of netfilter that really doesn't scale well at all. >> > > Removing the iptables rules helps reducing the load a little, but the > majority of time is still spent somewhere else. I had a similar scenario recently with a very low spec box and found it to be the QoS. Disabling traffic shaping and removing the QoS modules made it much faster. I don't know if you're using them but it's worth pointing out. 
Cheers,
Con

From romieu@fr.zoreil.com Mon Dec 6 16:15:30 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:15:36 -0800 (PST)
Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224])
        by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70FSlr016672
        for ; Mon, 6 Dec 2004 16:15:29 -0800
Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1])
        by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB70ELvr018671;
        Tue, 7 Dec 2004 01:14:21 +0100
Received: (from romieu@localhost)
        by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB70EKoH018670;
        Tue, 7 Dec 2004 01:14:20 +0100
Date: Tue, 7 Dec 2004 01:14:19 +0100
From: Francois Romieu
To: akpm@osdl.org
Cc: netdev@oss.sgi.com, jgarzik@pobox.com, Dorn Hetzel
Subject: [patch 1/5] r8169: missing netif_poll_enable and irq ack
Message-ID: <20041207001419.GB12838@electric-eye.fr.zoreil.com>
References: <20041119162920.GA26836@lilah.hetzel.org> <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> <20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> <20041205235519.GA21885@lilah.hetzel.org> <20041205233756.GB29236@electric-eye.fr.zoreil.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041205233756.GB29236@electric-eye.fr.zoreil.com>
User-Agent: Mutt/1.4.1i
X-Organisation: Land of Sunshine Inc.
X-archive-position: 12488
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: romieu@fr.zoreil.com
Precedence: bulk
X-list: netdev

- (noticed by Jon D. Mason) rtl8169_wait_for_quiescence() needs to disable
  the NAPI processing but it has no reason to lock any part of the driver
  which would try to do the same at a later time. Let's reenable NAPI
  processing as soon as possible.

- properly ack any aborted interrupt: a reset of the device is not always
  enough.
Signed-off-by: Francois Romieu diff -puN drivers/net/r8169.c~r8169-250 drivers/net/r8169.c --- linux-2.6.10-rc2/drivers/net/r8169.c~r8169-250 2004-12-05 22:36:14.592843434 +0100 +++ linux-2.6.10-rc2-fr/drivers/net/r8169.c 2004-12-05 22:36:14.596842782 +0100 @@ -1742,10 +1742,19 @@ static void rtl8169_schedule_work(struct static void rtl8169_wait_for_quiescence(struct net_device *dev) { + struct rtl8169_private *tp = netdev_priv(dev); + void __iomem *ioaddr = tp->mmio_addr; + synchronize_irq(dev->irq); /* Wait for any pending NAPI task to complete */ netif_poll_disable(dev); + + RTL_W16(IntrMask, 0x0000); + + RTL_W16(IntrStatus, 0xffff); + + netif_poll_enable(dev); } static void rtl8169_reinit_task(void *_data) _ From kdesler@soohrt.org Mon Dec 6 16:18:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:18:47 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70IeVI017395 for ; Mon, 6 Dec 2004 16:18:41 -0800 Received: (qmail 31476 invoked by uid 1018); 7 Dec 2004 00:18:15 -0000 Date: Tue, 7 Dec 2004 01:18:15 +0100 From: Karsten Desler To: Con Kolivas Cc: "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207001815.GA30674@quickstop.soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="GvXjxJ+pjyke8COw" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12489 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev --GvXjxJ+pjyke8COw Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Con Kolivas wrote: > I had a similar scenario recently with a very low spec box and found it to > be the QoS. Disabling traffic shaping and removing the QoS modules made it > much faster. I don't know if you're using them but it's worth pointing out. No QoS loaded. .config is attached. 
Thanks, Karsten --GvXjxJ+pjyke8COw Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=config CONFIG_X86=y CONFIG_MMU=y CONFIG_UID16=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_EXPERIMENTAL=y CONFIG_CLEAN_COMPILE=y CONFIG_LOCK_KERNEL=y CONFIG_LOCALVERSION="" CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_POSIX_MQUEUE=y CONFIG_SYSCTL=y CONFIG_LOG_BUF_SHIFT=15 CONFIG_KOBJECT_UEVENT=y CONFIG_EMBEDDED=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SHMEM=y CONFIG_CC_ALIGN_FUNCTIONS=0 CONFIG_CC_ALIGN_LABELS=0 CONFIG_CC_ALIGN_LOOPS=0 CONFIG_CC_ALIGN_JUMPS=0 CONFIG_X86_PC=y CONFIG_MK8=y CONFIG_X86_CMPXCHG=y CONFIG_X86_XADD=y CONFIG_X86_L1_CACHE_SHIFT=6 CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_GOOD_APIC=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y CONFIG_SMP=y CONFIG_NR_CPUS=2 CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y CONFIG_X86_TSC=y CONFIG_X86_MCE=y CONFIG_HIGHMEM4G=y CONFIG_HIGHMEM=y CONFIG_MTRR=y CONFIG_HAVE_DEC_LOCK=y CONFIG_REGPARM=y CONFIG_ACPI=y CONFIG_ACPI_BOOT=y CONFIG_ACPI_INTERPRETER=y CONFIG_ACPI_BLACKLIST_YEAR=0 CONFIG_ACPI_BUS=y CONFIG_ACPI_EC=y CONFIG_ACPI_POWER=y CONFIG_ACPI_PCI=y CONFIG_ACPI_SYSTEM=y CONFIG_PCI=y CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_MMCONFIG=y CONFIG_PCI_MSI=y CONFIG_PCI_NAMES=y CONFIG_BINFMT_ELF=y CONFIG_STANDALONE=y CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_INITRAMFS_SOURCE="" CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_CFQ=y CONFIG_SCSI=y CONFIG_BLK_DEV_SD=y CONFIG_SCSI_SATA=y CONFIG_SCSI_SATA_SIL=y CONFIG_SCSI_QLA2XXX=y CONFIG_MD=y CONFIG_BLK_DEV_MD=y CONFIG_MD_RAID1=y CONFIG_NET=y CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_UNIX=y CONFIG_INET=y CONFIG_IP_TCPDIAG=y CONFIG_NETFILTER=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_MATCH_LIMIT=y CONFIG_IP_NF_MATCH_IPRANGE=y CONFIG_IP_NF_MATCH_MULTIPORT=y CONFIG_IP_NF_MATCH_LENGTH=y 
CONFIG_IP_NF_MATCH_HASHLIMIT=y CONFIG_IP_NF_FILTER=y CONFIG_IP_NF_TARGET_REJECT=y CONFIG_IP_NF_TARGET_LOG=y CONFIG_IP_NF_TARGET_ULOG=y CONFIG_IP_NF_MANGLE=y CONFIG_NETDEVICES=y CONFIG_NET_ETHERNET=y CONFIG_MII=y CONFIG_NET_PCI=y CONFIG_E100=y CONFIG_E100_NAPI=y CONFIG_E1000=y CONFIG_E1000_NAPI=y CONFIG_TIGON3=y CONFIG_INPUT=y CONFIG_SOUND_GAMEPORT=y CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y CONFIG_VT=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_SERIAL_8250_ACPI=y CONFIG_SERIAL_8250_NR_UARTS=4 CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_UNIX98_PTYS=y CONFIG_WATCHDOG=y CONFIG_SOFT_WATCHDOG=y CONFIG_RTC=y CONFIG_HANGCHECK_TIMER=y CONFIG_VGA_CONSOLE=y CONFIG_DUMMY_CONSOLE=y CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_EXT2_FS=y CONFIG_EXT3_FS=y CONFIG_JBD=y CONFIG_DNOTIFY=y CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_RAMFS=y CONFIG_PARTITION_ADVANCED=y CONFIG_MSDOS_PARTITION=y CONFIG_NLS=y CONFIG_NLS_DEFAULT="iso8859-1" CONFIG_NLS_CODEPAGE_437=y CONFIG_NLS_ISO8859_1=y CONFIG_4KSTACKS=y CONFIG_X86_FIND_SMP_CONFIG=y CONFIG_X86_MPPARSE=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_X86_SMP=y CONFIG_X86_HT=y CONFIG_X86_BIOS_REBOOT=y CONFIG_X86_TRAMPOLINE=y --GvXjxJ+pjyke8COw-- From romieu@fr.zoreil.com Mon Dec 6 16:19:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:19:33 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70JP3o020724 for ; Mon, 6 Dec 2004 16:19:26 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB70Favr018744; Tue, 7 Dec 2004 01:15:36 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB70FZ5r018743; Tue, 7 Dec 2004 01:15:35 +0100 Date: Tue, 7 Dec 2004 
01:15:35 +0100 From: Francois Romieu To: akpm@osdl.org Cc: netdev@oss.sgi.com, jgarzik@pobox.com, Dorn Hetzel Subject: [patch 2/5] r8169: C 101 Message-ID: <20041207001535.GA18672@electric-eye.fr.zoreil.com> References: <20041119162920.GA26836@lilah.hetzel.org> <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> <20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> <20041205235519.GA21885@lilah.hetzel.org> <20041205233756.GB29236@electric-eye.fr.zoreil.com> <20041207001419.GB12838@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207001419.GB12838@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 12490 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Back to C101 and code which gives the expected result. 
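
For readers wondering what the "C 101" refers to: C does not chain equality tests, so an expression of the form a == b == 0 is parsed as (a == b) == 0, which is true whenever a and b differ. The userland sketch below contrasts that with the intended "both are zero" test. Only the names dirty_rx and cur_rx come from the driver; the two helper functions are made up for illustration.

```c
#include <assert.h>

/* What the chained form computes: (dirty_rx == cur_rx) == 0,
 * i.e. "the two counters differ". The parentheses only make the
 * compiler's own grouping explicit. */
int chained_check(unsigned int dirty_rx, unsigned int cur_rx)
{
    return (dirty_rx == cur_rx) == 0;
}

/* What was meant: both counters are still zero. */
int both_zero(unsigned int dirty_rx, unsigned int cur_rx)
{
    return !dirty_rx && !cur_rx;
}
```

With both counters at zero, chained_check() returns 0 while both_zero() returns 1, so the two forms disagree in exactly the case the condition was written for.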
Signed-off-by: Francois Romieu diff -puN drivers/net/r8169.c~r8169-255 drivers/net/r8169.c --- linux-2.6.10-rc2/drivers/net/r8169.c~r8169-255 2004-12-05 22:36:19.717006983 +0100 +++ linux-2.6.10-rc2-fr/drivers/net/r8169.c 2004-12-05 22:36:19.721006330 +0100 @@ -1978,7 +1978,7 @@ static void rtl8169_pcierr_interrupt(str PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_SIG_TARGET_ABORT)); /* The infamous DAC f*ckup only happens at boot time */ - if ((tp->cp_cmd & PCIDAC) && (tp->dirty_rx == tp->cur_rx == 0)) { + if ((tp->cp_cmd & PCIDAC) && !tp->dirty_rx && !tp->cur_rx) { printk(KERN_INFO PFX "%s: disabling PCI DAC.\n", dev->name); tp->cp_cmd &= ~PCIDAC; RTL_W16(CPlusCmd, tp->cp_cmd); _ From romieu@fr.zoreil.com Mon Dec 6 16:19:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:19:35 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70JPv9020726 for ; Mon, 6 Dec 2004 16:19:26 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB70GLvr018748; Tue, 7 Dec 2004 01:16:21 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB70GLKm018747; Tue, 7 Dec 2004 01:16:21 +0100 Date: Tue, 7 Dec 2004 01:16:21 +0100 From: Francois Romieu To: akpm@osdl.org Cc: netdev@oss.sgi.com, jgarzik@pobox.com, Dorn Hetzel Subject: [patch 3/5] r8169: Large Send enablement Message-ID: <20041207001621.GB18672@electric-eye.fr.zoreil.com> References: <20041119162920.GA26836@lilah.hetzel.org> <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> <20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> <20041205235519.GA21885@lilah.hetzel.org> <20041205233756.GB29236@electric-eye.fr.zoreil.com> <20041207001419.GB12838@electric-eye.fr.zoreil.com> <20041207001535.GA18672@electric-eye.fr.zoreil.com> 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207001535.GA18672@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 12491 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Large Send enablement. Acked-by: Francois Romieu Signed-off-by: Jon Mason diff -puN drivers/net/r8169.c~r8169-260 drivers/net/r8169.c --- linux-2.6.10-rc2/drivers/net/r8169.c~r8169-260 2004-12-05 22:36:22.079621298 +0100 +++ linux-2.6.10-rc2-fr/drivers/net/r8169.c 2004-12-05 22:36:22.084620482 +0100 @@ -112,7 +112,7 @@ static int multicast_filter_limit = 32; #define RX_DMA_BURST 6 /* Maximum PCI burst, '6' is 1024 */ #define TX_DMA_BURST 6 /* Maximum PCI burst, '6' is 1024 */ #define EarlyTxThld 0x3F /* 0x3F means NO early transmit */ -#define RxPacketMaxSize 0x0800 /* Maximum size supported is 16K-1 */ +#define RxPacketMaxSize 0x3FE8 /* 16K - 1 - ETH_HLEN - VLAN - CRC */ #define InterFrameGap 0x03 /* 3 means InterFrameGap = the shortest one */ #define R8169_REGS_SIZE 256 @@ -426,6 +426,9 @@ static void rtl8169_tx_timeout(struct ne static struct net_device_stats *rtl8169_get_stats(struct net_device *netdev); static int rtl8169_rx_interrupt(struct net_device *, struct rtl8169_private *, void __iomem *); +static int rtl8169_change_mtu(struct net_device *netdev, int new_mtu); +static void rtl8169_down(struct net_device *dev); + #ifdef CONFIG_R8169_NAPI static int rtl8169_poll(struct net_device *dev, int *budget); #endif @@ -1237,8 +1240,6 @@ rtl8169_init_board(struct pci_dev *pdev, } tp->chipset = i; - tp->rx_buf_sz = RX_BUF_SIZE; - *ioaddr_out = ioaddr; *dev_out = dev; out: @@ -1320,6 +1321,7 @@ rtl8169_init_one(struct pci_dev *pdev, c dev->watchdog_timeo = RTL8169_TX_TIMEOUT; dev->irq = pdev->irq; dev->base_addr = (unsigned long) ioaddr; + dev->change_mtu 
= rtl8169_change_mtu; #ifdef CONFIG_R8169_NAPI dev->poll = rtl8169_poll; @@ -1448,13 +1450,22 @@ static int rtl8169_resume(struct pci_dev #endif /* CONFIG_PM */ -static int -rtl8169_open(struct net_device *dev) +static void rtl8169_set_rxbufsize(struct rtl8169_private *tp, + struct net_device *dev) +{ + unsigned int mtu = dev->mtu; + + tp->rx_buf_sz = (mtu > RX_BUF_SIZE) ? mtu + ETH_HLEN + 8 : RX_BUF_SIZE; +} + +static int rtl8169_open(struct net_device *dev) { struct rtl8169_private *tp = netdev_priv(dev); struct pci_dev *pdev = tp->pci_dev; int retval; + rtl8169_set_rxbufsize(tp, dev); + retval = request_irq(dev->irq, rtl8169_interrupt, SA_SHIRQ, dev->name, dev); if (retval < 0) @@ -1534,8 +1545,8 @@ rtl8169_hw_start(struct net_device *dev) RTL_W8(ChipCmd, CmdTxEnb | CmdRxEnb); RTL_W8(EarlyTxThres, EarlyTxThld); - // For gigabit rtl8169 - RTL_W16(RxMaxSize, RxPacketMaxSize); + // For gigabit rtl8169, MTU + header + CRC + VLAN + RTL_W16(RxMaxSize, tp->rx_buf_sz); // Set Rx Config register i = rtl8169_rx_config | @@ -1576,6 +1587,37 @@ rtl8169_hw_start(struct net_device *dev) netif_start_queue(dev); } +static int rtl8169_change_mtu(struct net_device *dev, int new_mtu) +{ + struct rtl8169_private *tp = netdev_priv(dev); + int ret = 0; + + if (new_mtu < ETH_ZLEN || new_mtu > RxPacketMaxSize) + return -EINVAL; + + dev->mtu = new_mtu; + + if (!netif_running(dev)) + goto out; + + rtl8169_down(dev); + + rtl8169_set_rxbufsize(tp, dev); + + ret = rtl8169_init_ring(dev); + if (ret < 0) + goto out; + + rtl8169_hw_start(dev); + + netif_poll_enable(dev); + + rtl8169_request_timer(dev); + +out: + return ret; +} + static inline void rtl8169_make_unusable_by_asic(struct RxDesc *desc) { desc->addr = 0x0badbadbadbadbadull; @@ -2264,19 +2306,17 @@ static int rtl8169_poll(struct net_devic } #endif -static int -rtl8169_close(struct net_device *dev) +static void rtl8169_down(struct net_device *dev) { struct rtl8169_private *tp = netdev_priv(dev); - struct pci_dev *pdev = tp->pci_dev; 
void __iomem *ioaddr = tp->mmio_addr; + rtl8169_delete_timer(dev); + netif_stop_queue(dev); flush_scheduled_work(); - rtl8169_delete_timer(dev); - spin_lock_irq(&tp->lock); /* Stop the chip's Tx and Rx DMA processes. */ @@ -2291,13 +2331,27 @@ rtl8169_close(struct net_device *dev) spin_unlock_irq(&tp->lock); - free_irq(dev->irq, dev); + synchronize_irq(dev->irq); netif_poll_disable(dev); + /* Give a racing hard_start_xmit a few cycles to complete. */ + set_current_state(TASK_UNINTERRUPTIBLE); + schedule_timeout(1); + rtl8169_tx_clear(tp); rtl8169_rx_clear(tp); +} + +static int rtl8169_close(struct net_device *dev) +{ + struct rtl8169_private *tp = netdev_priv(dev); + struct pci_dev *pdev = tp->pci_dev; + + rtl8169_down(dev); + + free_irq(dev->irq, dev); pci_free_consistent(pdev, R8169_RX_RING_BYTES, tp->RxDescArray, tp->RxPhyAddr); _ From romieu@fr.zoreil.com Mon Dec 6 16:19:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:19:36 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70JPcs020727 for ; Mon, 6 Dec 2004 16:19:26 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB70HNvr018753; Tue, 7 Dec 2004 01:17:23 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB70HMPY018752; Tue, 7 Dec 2004 01:17:22 +0100 Date: Tue, 7 Dec 2004 01:17:22 +0100 From: Francois Romieu To: akpm@osdl.org Cc: netdev@oss.sgi.com, jgarzik@pobox.com, Dorn Hetzel Subject: [patch 4/5] r8169: reduce max MTU for large frames Message-ID: <20041207001722.GC18672@electric-eye.fr.zoreil.com> References: <20041119162920.GA26836@lilah.hetzel.org> <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> <20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> 
<20041205235519.GA21885@lilah.hetzel.org> <20041205233756.GB29236@electric-eye.fr.zoreil.com> <20041207001419.GB12838@electric-eye.fr.zoreil.com> <20041207001535.GA18672@electric-eye.fr.zoreil.com> <20041207001621.GB18672@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207001621.GB18672@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-archive-position: 12492 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev The device does not support the whole mtu range it claims. Experimenting with the Tx threshold and/or the PCI burst size does not seem to improve the behavior. Signed-off-by: Francois Romieu diff -puN drivers/net/r8169.c~r8169-265 drivers/net/r8169.c --- linux-2.6.10-rc2/drivers/net/r8169.c~r8169-265 2004-12-05 22:36:25.000000000 +0100 +++ linux-2.6.10-rc2-fr/drivers/net/r8169.c 2004-12-07 00:54:48.313082500 +0100 @@ -112,7 +112,8 @@ static int multicast_filter_limit = 32; #define RX_DMA_BURST 6 /* Maximum PCI burst, '6' is 1024 */ #define TX_DMA_BURST 6 /* Maximum PCI burst, '6' is 1024 */ #define EarlyTxThld 0x3F /* 0x3F means NO early transmit */ -#define RxPacketMaxSize 0x3FE8 /* 16K - 1 - ETH_HLEN - VLAN - CRC */ +#define RxPacketMaxSize 0x3FE8 /* 16K - 1 - ETH_HLEN - VLAN - CRC... */ +#define SafeMtu 0x1c20 /* ... 
actually life sucks beyond ~7k */
 #define InterFrameGap	0x03	/* 3 means InterFrameGap = the shortest one */
 
 #define R8169_REGS_SIZE		256
@@ -1592,9 +1593,9 @@ static int rtl8169_change_mtu(struct net
 	struct rtl8169_private *tp = netdev_priv(dev);
 	int ret = 0;
 
-	if (new_mtu < ETH_ZLEN || new_mtu > RxPacketMaxSize)
+	if (new_mtu < ETH_ZLEN || new_mtu > SafeMtu)
 		return -EINVAL;
-
+
 	dev->mtu = new_mtu;
 
 	if (!netif_running(dev))

_

From kdesler@soohrt.org Mon Dec 6 16:20:39 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:20:47 -0800 (PST)
Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70KcXf021563 for ; Mon, 6 Dec 2004 16:20:38 -0800
Received: (qmail 31879 invoked by uid 1018); 7 Dec 2004 00:20:12 -0000
Date: Tue, 7 Dec 2004 01:20:12 +0100
From: Karsten Desler
To: Bernd Eckenfels
Cc: "David S. Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets
Message-ID: <20041207002012.GB30674@quickstop.soohrt.org>
References: <20041206224107.GA8529@soohrt.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.6+20040722i
X-archive-position: 12493
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kdesler@soohrt.org
Precedence: bulk
X-list: netdev

Bernd Eckenfels wrote:
> In article <20041206224107.GA8529@soohrt.org> you wrote:
> > Removing the iptables rules helps reducing the load a little, but the
> > majority of time is still spent somewhere else.
>
> In handling Interrupts. Are those equally distributed on eth0 and eth1?

Yes they are.
Thanks,
 Karsten

           CPU0       CPU1
  0:  117199776  133677244    IO-APIC-edge  timer
  1:          0          9    IO-APIC-edge  i8042
  8:          0          4    IO-APIC-edge  rtc
  9:          0          0   IO-APIC-level  acpi
169:        139  893669684   IO-APIC-level  eth0
177:  919803109      30665   IO-APIC-level  eth1
209:     414257     413316   IO-APIC-level  libata
NMI:          0          0
LOC:  250918849  250918819
ERR:          0
MIS:          0

From romieu@fr.zoreil.com Mon Dec 6 16:23:33 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:23:40 -0800 (PST)
Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70NRon022516 for ; Mon, 6 Dec 2004 16:23:32 -0800
Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB70M8vr018922; Tue, 7 Dec 2004 01:22:08 +0100
Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB70M8de018921; Tue, 7 Dec 2004 01:22:08 +0100
Date: Tue, 7 Dec 2004 01:22:08 +0100
From: Francois Romieu
To: akpm@osdl.org
Cc: netdev@oss.sgi.com, jgarzik@pobox.com, Dorn Hetzel
Subject: [patch 5/5] r8169: oversized driver field for ethtool
Message-ID: <20041207002208.GA18910@electric-eye.fr.zoreil.com>
References: <20041119201203.GA13522@electric-eye.fr.zoreil.com> <20041120003754.GA32133@lilah.hetzel.org> <20041120002946.GA18059@electric-eye.fr.zoreil.com> <20041122181307.GA3625@lilah.hetzel.org> <20041205235519.GA21885@lilah.hetzel.org> <20041205233756.GB29236@electric-eye.fr.zoreil.com> <20041207001419.GB12838@electric-eye.fr.zoreil.com> <20041207001535.GA18672@electric-eye.fr.zoreil.com> <20041207001621.GB18672@electric-eye.fr.zoreil.com> <20041207001722.GC18672@electric-eye.fr.zoreil.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041207001722.GC18672@electric-eye.fr.zoreil.com>
User-Agent: Mutt/1.4.1i
X-Organisation: Land of Sunshine Inc.
X-archive-position: 12494 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Reported by Richard Dawe : - RTL8169_DRIVER_NAME contains more than the 32 characters allowed for the driver field; - remove RTL8169_DRIVER_NAME as it is only used once. Signed-off-by: Francois Romieu diff -puN drivers/net/r8169.c~r8169-280 drivers/net/r8169.c --- linux-2.6.10-rc2/drivers/net/r8169.c~r8169-280 2004-12-07 00:56:46.864676094 +0100 +++ linux-2.6.10-rc2-fr/drivers/net/r8169.c 2004-12-07 00:57:11.031721347 +0100 @@ -63,7 +63,6 @@ VERSION 1.6LK <2004/04/14> #define RTL8169_VERSION "1.6LK" #define MODULENAME "r8169" -#define RTL8169_DRIVER_NAME MODULENAME " Gigabit Ethernet driver " RTL8169_VERSION #define PFX MODULENAME ": " #ifdef RTL8169_DEBUG @@ -564,8 +563,8 @@ static void rtl8169_get_drvinfo(struct n { struct rtl8169_private *tp = netdev_priv(dev); - strcpy(info->driver, RTL8169_DRIVER_NAME); - strcpy(info->version, RTL8169_VERSION ); + strcpy(info->driver, MODULENAME); + strcpy(info->version, RTL8169_VERSION); strcpy(info->bus_info, pci_name(tp->pci_dev)); } @@ -1282,7 +1281,8 @@ rtl8169_init_one(struct pci_dev *pdev, c board_idx++; if (!printed_version) { - printk(KERN_INFO RTL8169_DRIVER_NAME " loaded\n"); + printk(KERN_INFO "%s Gigabit Ethernet driver %s loaded\n", + MODULENAME, RTL8169_VERSION); printed_version = 1; } _ From andreaf@cs.columbia.edu Mon Dec 6 16:28:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 16:28:30 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB70SPqb023077 for ; Mon, 6 Dec 2004 16:28:26 -0800 Received: from lion.cs.columbia.edu (IDENT:6RR5CZKiC9xg06/FDPktbe+pMtp8DrwM@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iB70S1TO017519 for ; Mon, 6 Dec 2004 19:28:02 -0500 (EST) Received: from 
[128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iB70S1Kj024362 for ; Mon, 6 Dec 2004 19:28:01 -0500
Message-ID: <41B4F914.2070401@cs.columbia.edu>
Date: Mon, 06 Dec 2004 19:28:04 -0500
From: Andrea G Forte
User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: netdev@oss.sgi.com
Subject: IP aliasing and IP change delay.
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.6.29
X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0'
X-archive-position: 12495
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: andreaf@cs.columbia.edu
Precedence: bulk
X-list: netdev

Hello all,

I am new to this list but I hope you can help me.
I have been trying to use two different IP addresses on the same PCMCIA
wireless card. To do this I tried the classic

ifconfig wlan0:0 inet xxx.xxx.xxx.xxx
route ......

and I also tried

ip address add xxx.xxx.xxx.xxx dev wlan0

The problem is that after I issue the command, the IP address actually
changes several hundred milliseconds later. If I do not create an alias
and instead change the IP address twice on the same interface (using
ifconfig), the change is really fast: in practice it takes effect from
the packet following the command.

Does anybody have an idea why the behaviour differs between using wlan0
together with wlan0:0 and using only wlan0?

Thank you very much for your help, Andrea From herbert@gondor.apana.org.au Mon Dec 6 17:19:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 17:19:31 -0800 (PST) Received: from arnor.apana.org.au (c211-30-229-77.rivrw4.nsw.optusnet.com.au [211.30.229.77]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB71JNtm029150 for ; Mon, 6 Dec 2004 17:19:24 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1CbTyw-0000If-00; Tue, 07 Dec 2004 12:17:34 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1CbTv3-000732-00; Tue, 07 Dec 2004 12:13:33 +1100 From: Herbert Xu To: hasso@estpak.ee (Hasso Tepper) Subject: Re: [patch 4/10] s390: network driver. Cc: hadi@cyberus.ca, paul@clubi.ie, thomas.spatzier@de.ibm.com, jgarzik@pobox.com, netdev@oss.sgi.com Organization: Core In-Reply-To: <200412061642.00552.hasso@estpak.ee> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Tue, 07 Dec 2004 12:13:33 +1100 X-archive-position: 12496 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Hasso Tepper wrote: > > Seems that it's no longer true. Seems that kernel is now trying as hard as > possible not to loose any data - data is queued and if the queue will be > full, all related sockets will be blocked to notify application. So, one > socket approach don't work any more for Quagga/Zebra. No problem, we can > take the "one socket per interface" approach. And we already have link > detection implemented to notify daemons. I don't see why this should be happening. Can you please provide a minimal program that reproduces this blocking problem? 
For example, something that sends a packet to a downed interface and then sends one to an interface that's up? -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From hadi@cyberus.ca Mon Dec 6 18:23:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 18:23:42 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB72NYV8032161 for ; Mon, 6 Dec 2004 18:23:35 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CbVwZ-0007ho-Ix for netdev@oss.sgi.com; Mon, 06 Dec 2004 22:23:15 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbV0G-0007vh-CS; Mon, 06 Dec 2004 21:23:00 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Herbert Xu Cc: Hasso Tepper , paul@clubi.ie, thomas.spatzier@de.ibm.com, jgarzik@pobox.com, netdev@oss.sgi.com In-Reply-To: References: Content-Type: text/plain Organization: jamalopolous Message-Id: <1102386174.1093.21.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 21:22:55 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12497 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-06 at 20:13, Herbert Xu wrote: > Hasso Tepper wrote: > > > > Seems that it's no longer true. Seems that kernel is now trying as hard as > > possible not to loose any data - data is queued and if the queue will be > > full, all related sockets will be blocked to notify application. So, one > > socket approach don't work any more for Quagga/Zebra. 
No problem, we can > > take the "one socket per interface" approach. And we already have link > > detection implemented to notify daemons. > > I don't see why this should be happening. Can you please provide a > minimal program that reproduces this blocking problem? For example, > something that sends a packet to a downed interface and then sends > one to an interface that's up? This may be something to do with socket level changes maybe? Does this happen with sockets that are not raw? Having said that: I think getting rid of obsolete messages is important. One of the suggested schemes of operation sounds to be the least brutal. Jgarzik, I thought about it a little and it seems to me that allowing the device driver to drop packets on txmit when netcarrier is off is not that bad. This is almost like pretending packets were dropped on the wire. Thoughts? cheers, jamal From hadi@cyberus.ca Mon Dec 6 18:28:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 18:28:29 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB72SO9T032711 for ; Mon, 6 Dec 2004 18:28:24 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbW0h-0002UR-DI for netdev@oss.sgi.com; Mon, 06 Dec 2004 22:27:31 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbV4s-00004u-R6; Mon, 06 Dec 2004 21:27:47 -0500 Subject: Re: [PATCH] rtnetlink & address family problem From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara In-Reply-To: <20041206140214.GA749@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102386461.1093.26.camel@jzny.localdomain> Mime-Version: 1.0 
X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 21:27:41 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12498 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-06 at 09:02, Thomas Graf wrote: > Your patch would fix this issue but might break various things. The > actual problem is that iproute2 doesn't check the family in its filter. > It blindly assumes that the kernel only returns addresses of the kind it > has requested. I can understand if you think the current behaviour > is wrong but we shouldn't change it in the middle of a stable tree. Why would it be wrong? The PF_UNSPEC is there for a purpose. If user space decides it wants to flush ipv4 addresses blindly that user spaces fault. The patch you attached seems legit. did you verify it? BTW, Stephen - are you still updating iproute2? cheers, jamal From hadi@cyberus.ca Mon Dec 6 18:40:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 18:40:36 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB72eVwh001240 for ; Mon, 6 Dec 2004 18:40:31 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CbWCu-0008IW-T2 for netdev@oss.sgi.com; Mon, 06 Dec 2004 22:40:08 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbVGZ-0001ld-De; Mon, 06 Dec 2004 21:39:51 -0500 Subject: Re: [patch 4/10] s390: network driver. 
From: jamal Reply-To: hadi@cyberus.ca To: Hasso Tepper Cc: Paul Jakma , Thomas Spatzier , jgarzik@pobox.com, netdev@oss.sgi.com In-Reply-To: <200412061642.00552.hasso@estpak.ee> References: <1102332479.1048.2243.camel@jzny.localdomain> <200412061642.00552.hasso@estpak.ee> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102387179.1091.39.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 21:39:39 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12499 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-06 at 09:42, Hasso Tepper wrote: > I'm trying to summarize. The approach - one raw socket to send/receive > messages no matter how many interfaces are used - worked for Quagga/Zebra > routing software daemons till now. If some of these interfaces was > operationally down, socket wasn't blocked even if the queue was full. > > In > fact "man sendmsg" has still text: > > ENOBUFS > The output queue for a network interface was full. This gener- > ally indicates that the interface has stopped sending, but may > be caused by transient congestion. (Normally, this does not > occur in Linux. Packets are just silently dropped when a device > queue overflows.) > > Seems that it's no longer true. Seems that kernel is now trying as hard as > possible not to loose any data - data is queued and if the queue will be > full, all related sockets will be blocked to notify application. We need to investigate why this happens. It doesnt sound like good behavior neither legit. Does this happen to only raw sockets? Herbert asked for a sample small program that reproduces it. > So, one > socket approach don't work any more for Quagga/Zebra. No problem, we can > take the "one socket per interface" approach. And we already have link > detection implemented to notify daemons. 
> I dont think it would be necessary with latest changes to notifications in 2.6.x > But there will be still potential problem - sending the "interface down" > message from kernel to the zebra daemon and then to the routing protocol > daemon takes time. And during this time daemon can send packets which will > sit in queue and may cause many problems if sent to the network later (if > link comes up again). Think about statelss routing protocols like rip(ng), > ipv6 router advertisements etc. They may carry the info that's no longer > true causing routing loops etc. > > And to clarify a little bit: no - the Quagga/Zebra didn't work with previous > approach perfectly. I fact with link detection and socket per interface it > will be better than ever no matter what's the kernel behaviour. We just > want to make sure that solution will be bulletproof. > > So, problem - how can we make sure that no potentially dangerous aged > (routing) info will be in the network? I think the idea of having driver junk packets when link is down does sound plausible as a solution. cheers, jamal From hadi@cyberus.ca Mon Dec 6 18:47:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 18:47:12 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB72l54R001960 for ; Mon, 6 Dec 2004 18:47:06 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbVN9-0001sn-E3 for netdev@oss.sgi.com; Mon, 06 Dec 2004 21:46:39 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbVNA-0002rO-Lh; Mon, 06 Dec 2004 21:46:40 -0500 Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets From: jamal Reply-To: hadi@cyberus.ca To: Karsten Desler Cc: Bernd Eckenfels , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041207002012.GB30674@quickstop.soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102387595.1088.48.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 21:46:35 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12500 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Your numbers are very suspect. You may be having other issues in the box. You should be able to do much higher packet rates even with iptables compiled in. Some numbers at: http://www.suug.ch/sucon/04/slides/pkt_cls.pdf If all you need is std filtering then consider using tc actions. I do have a suspicion that your problem has to do with your machine more than it does with Linux. cheers, jamal On Mon, 2004-12-06 at 19:20, Karsten Desler wrote: > Bernd Eckenfels wrote: > > In article <20041206224107.GA8529@soohrt.org> you wrote: > > > Removing the iptables rules helps reducing the load a little, but the > > > majority of time is still spent somewhere else. > > > > In handling Interrupts. Are those equally sidtributed on eth0 and eth1? > > Yes they are. 
> > Thanks, > Karsten > > CPU0 CPU1 > 0: 117199776 133677244 IO-APIC-edge timer > 1: 0 9 IO-APIC-edge i8042 > 8: 0 4 IO-APIC-edge rtc > 9: 0 0 IO-APIC-level acpi > 169: 139 893669684 IO-APIC-level eth0 > 177: 919803109 30665 IO-APIC-level eth1 > 209: 414257 413316 IO-APIC-level libata > NMI: 0 0 > LOC: 250918849 250918819 > ERR: 0 > MIS: 0 > > From kdesler@soohrt.org Mon Dec 6 18:55:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 18:55:27 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB72tLMH002670 for ; Mon, 6 Dec 2004 18:55:22 -0800 Received: (qmail 2092 invoked by uid 1000); 7 Dec 2004 02:54:56 -0000 Date: Tue, 7 Dec 2004 03:54:56 +0100 From: Karsten Desler To: jamal Cc: Bernd Eckenfels , "David S. Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207025456.GA525@soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102387595.1088.48.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12501 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * jamal wrote: > > Your numbers are very suspect. You may be having other issues in the > box. You should be able to do much higher packet rates even with > iptables compiled in. I know, and I have no idea why I'm not. > Some numbers at: > > http://www.suug.ch/sucon/04/slides/pkt_cls.pdf > > If all you need is std filtering then consider using tc actions. Thanks, I'll look into it. 
> I do have a suspicion that your problem has to do with your machine
> more than it does with Linux.

But what could be the reason? I'm really out of ideas.
The only thing I can think of is the 66/64 PCI bus and the
disadvantageous placement of the PCI cards, but neither should cause
higher CPU usage. If the bus couldn't keep up, I'd get packet loss.

- Karsten

From hadi@cyberus.ca Mon Dec 6 19:19:25 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 19:19:29 -0800 (PST)
Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB73JOSL004074 for ; Mon, 6 Dec 2004 19:19:25 -0800
Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbVsQ-0002BC-ML for netdev@oss.sgi.com; Mon, 06 Dec 2004 22:18:58 -0500
Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbVsR-0007WW-1j; Mon, 06 Dec 2004 22:18:59 -0500
Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets
From: jamal
Reply-To: hadi@cyberus.ca
To: Karsten Desler
Cc: Bernd Eckenfels , "David S.
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org
In-Reply-To: <20041207025456.GA525@soohrt.org>
References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org>
Content-Type: text/plain
Organization: jamalopolous
Message-Id: <1102389533.1089.51.camel@jzny.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2
Date: 06 Dec 2004 22:18:54 -0500
Content-Transfer-Encoding: 7bit
X-archive-position: 12502
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

On Mon, 2004-12-06 at 21:54, Karsten Desler wrote:
> The only thing I can think of is the 66/64 PCI bus and the
> disadvantageous placement of the PCI cards, but neither should cause
> higher CPU usage. If the bus couldn't keep up, I'd get packet loss.
>

Can't tell offhand; it looks like a modern piece of hardware.
Are you sure you are using NAPI? This is an e1000, correct?

cheers, jamal From hadi@cyberus.ca Mon Dec 6 19:21:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 19:21:25 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB73LH19004288 for ; Mon, 6 Dec 2004 19:21:20 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbWpp-0003al-T6 for netdev@oss.sgi.com; Mon, 06 Dec 2004 23:20:22 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbVu3-0007nj-1O; Mon, 06 Dec 2004 22:20:39 -0500 Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) From: jamal Reply-To: hadi@cyberus.ca To: Martin Josefsson Cc: Robert Olsson , Lennert Buytenhek , Scott Feldman , P@draigBrady.com, mellia@prezzemolo.polito.it, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: References: <16820.44722.748743.6711@robur.slu.se> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102389633.1091.54.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 22:20:34 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12503 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Can someone post the patches and a small README? As luck would have it my ext3 just decided to fail me on my first -rc3 boot. Dammit. cheers, jamal On Mon, 2004-12-06 at 17:29, Martin Josefsson wrote: > On Mon, 6 Dec 2004, Robert Olsson wrote: > > > pktgen performance is measured on router box. Remember Scotts patch uses > > 4096 TX buffers and w. pktgen we use clone_skb. So with real skb's we probably > > see lower performance due to this. 
This may explain results below so routing > > performance doesn't follow pktgen performance as seen. > > I've performed some tests with and without clone_skb with various versions > of the driver. > > > Vanilla. T-PUT 657 kpps. pktgen TX perf 818 kpps > > > e1000-TX-prefetch+scott tx patch. T-PUT 540 kpps. pktgen TX perf 1.48 Mpps > > > e1000-TX-prefetch. T-PUT 657 kpps. pktgen TX perf 1.15 Mpps > > This matches the data I see in my tests here with and without clone_skb. > > I've included a lot of pps numbers below, they might need some > description. > > I tested generating packets with four different drivers with and without > clone_skb. > > vanilla is the vanilla driver in 2.6.10-rc3 > > copy is using the patch found at the bottom of this mail, just a small > test to see if there's any gain or loss using "static" buffers to DMA > from. Prefetch doesn't help at all here, just makes things worse, even for > clone_skb. Tried with delayed TDT updating as well, didn't help. > > vanilla + prefetch is just the vanilla driver + prefetching. > > feldman tx is using Scott's tx-path rewrite patch. > I didn't bother listing feldman tx + prefetch as the results were even > lower for the non clone_skb case. > The only thing I can think of that can cause this is cache thrashing, or > overhead in slab when we have a lot of skb's in the wild. > > I don't have oprofile on my test machine at the moment and it's time to go > to bed now, maybe tomorrow... > > Does anyone have any suggestions of what to test next? 
> > > vanilla and clone > 60 854886 > 64 772341 > 68 759531 > 72 758872 > 76 758926 > 80 761136 > 84 742109 > 88 742070 > 92 741616 > 96 744083 > 100 727430 > 104 725242 > 108 724153 > 112 725841 > 116 707331 > 120 706000 > 124 704923 > 128 662547 > > vanilla and noclone > 60 748552 > 64 702464 > 68 649066 > 72 671992 > 76 680251 > 80 627711 > 84 625468 > 88 640115 > 92 679365 > 96 650544 > 100 666423 > 104 652057 > 108 665821 > 112 679443 > 116 652507 > 120 661279 > 124 648627 > 128 635780 > > copy and clone > 60 897165 > 64 872767 > 68 750694 > 72 750427 > 76 749583 > 80 748242 > 84 732760 > 88 731129 > 92 732603 > 96 732631 > 100 717123 > 104 717678 > 108 716839 > 112 719258 > 116 703824 > 120 706047 > 124 701885 > 128 695575 > > copy and noclone > 60 882227 > 64 649614 > 68 691327 > 72 700706 > 76 700795 > 80 696594 > 84 686016 > 88 691689 > 92 696136 > 96 691348 > 100 684596 > 104 687800 > 108 689218 > 112 671483 > 116 675867 > 120 679089 > 124 672385 > 128 650148 > > vanilla + prefetch and clone > 60 1300075 > 64 1079069 > 68 1082091 > 72 1068791 > 76 1067630 > 80 1026222 > 84 1053055 > 88 1024442 > 92 1032112 > 96 1014844 > 100 991346 > 104 976483 > 108 947019 > 112 919193 > 116 892863 > 120 868054 > 124 844679 > 128 822347 > > vanilla + prefetch and noclone > 60 738538 > 64 800927 > 68 719832 > 72 725353 > 76 822738 > 80 743134 > 84 813520 > 88 721522 > 92 797838 > 96 724031 > 100 812198 > 104 717811 > 108 713072 > 112 789771 > 116 696027 > 120 682168 > 124 749020 > 128 703233 > > feldman tx and clone > 60 1029997 > 64 916706 > 68 898601 > 72 895378 > 76 896171 > 80 898594 > 84 861434 > 88 861446 > 92 861444 > 96 863669 > 100 837624 > 104 836225 > 108 835528 > 112 835527 > 116 817102 > 120 817101 > 124 817100 > 128 757683 > > feldman tx and noclone > 60 626646 > 64 628148 > 68 628935 > 72 625084 > 76 623527 > 80 623510 > 84 624286 > 88 625086 > 92 623907 > 96 630199 > 100 613933 > 104 618025 > 108 620326 > 112 607884 > 116 606124 > 120 538434 > 124 531699 > 
128 532719 > > > > diff -X /home/gandalf/dontdiff.ny -urNp drivers/net/e1000-vanilla/e1000_main.c drivers/net/e1000/e1000_main.c > --- drivers/net/e1000-vanilla/e1000_main.c 2004-12-05 18:27:50.000000000 +0100 > +++ drivers/net/e1000/e1000_main.c 2004-12-06 22:21:10.000000000 +0100 > @@ -132,6 +132,7 @@ static void e1000_irq_disable(struct e10 > static void e1000_irq_enable(struct e1000_adapter *adapter); > static irqreturn_t e1000_intr(int irq, void *data, struct pt_regs *regs); > static boolean_t e1000_clean_tx_irq(struct e1000_adapter *adapter); > +static boolean_t e1000_alloc_tx_buffers(struct e1000_adapter *adapter); > #ifdef CONFIG_E1000_NAPI > static int e1000_clean(struct net_device *netdev, int *budget); > static boolean_t e1000_clean_rx_irq(struct e1000_adapter *adapter, > @@ -264,6 +265,7 @@ e1000_up(struct e1000_adapter *adapter) > e1000_restore_vlan(adapter); > > e1000_configure_tx(adapter); > + e1000_alloc_tx_buffers(adapter); > e1000_setup_rctl(adapter); > e1000_configure_rx(adapter); > e1000_alloc_rx_buffers(adapter); > @@ -1048,10 +1052,21 @@ e1000_configure_rx(struct e1000_adapter > void > e1000_free_tx_resources(struct e1000_adapter *adapter) > { > + struct e1000_desc_ring *tx_ring = &adapter->tx_ring; > + struct e1000_buffer *buffer_info; > struct pci_dev *pdev = adapter->pdev; > + unsigned int i; > > e1000_clean_tx_ring(adapter); > > + for(i = 0; i < tx_ring->count; i++) { > + buffer_info = &tx_ring->buffer_info[i]; > + if(buffer_info->skb) { > + kfree(buffer_info->skb); > + buffer_info->skb = NULL; > + } > + } > + > vfree(adapter->tx_ring.buffer_info); > adapter->tx_ring.buffer_info = NULL; > > @@ -1079,16 +1094,12 @@ e1000_clean_tx_ring(struct e1000_adapter > > for(i = 0; i < tx_ring->count; i++) { > buffer_info = &tx_ring->buffer_info[i]; > - if(buffer_info->skb) { > - > + if(buffer_info->dma) { > pci_unmap_page(pdev, > buffer_info->dma, > buffer_info->length, > PCI_DMA_TODEVICE); > - > - dev_kfree_skb(buffer_info->skb); > - > - 
buffer_info->skb = NULL; > + buffer_info->dma = 0; > } > } > > @@ -1579,8 +1590,6 @@ e1000_tx_map(struct e1000_adapter *adapt > struct e1000_buffer *buffer_info; > unsigned int len = skb->len; > unsigned int offset = 0, size, count = 0, i; > - unsigned int f; > - len -= skb->data_len; > > i = tx_ring->next_to_use; > > @@ -1600,10 +1609,12 @@ e1000_tx_map(struct e1000_adapter *adapt > size > 4)) > size -= 4; > > + skb_copy_bits(skb, offset, buffer_info->skb, size); > + > buffer_info->length = size; > buffer_info->dma = > pci_map_single(adapter->pdev, > - skb->data + offset, > + buffer_info->skb, > size, > PCI_DMA_TODEVICE); > buffer_info->time_stamp = jiffies; > @@ -1614,50 +1625,11 @@ e1000_tx_map(struct e1000_adapter *adapt > if(unlikely(++i == tx_ring->count)) i = 0; > } > > - for(f = 0; f < nr_frags; f++) { > - struct skb_frag_struct *frag; > - > - frag = &skb_shinfo(skb)->frags[f]; > - len = frag->size; > - offset = frag->page_offset; > - > - while(len) { > - buffer_info = &tx_ring->buffer_info[i]; > - size = min(len, max_per_txd); > -#ifdef NETIF_F_TSO > - /* Workaround for premature desc write-backs > - * in TSO mode. Append 4-byte sentinel desc */ > - if(unlikely(mss && f == (nr_frags-1) && size == len && size > 8)) > - size -= 4; > -#endif > - /* Workaround for potential 82544 hang in PCI-X. > - * Avoid terminating buffers within evenly-aligned > - * dwords. */ > - if(unlikely(adapter->pcix_82544 && > - !((unsigned long)(frag->page+offset+size-1) & 4) && > - size > 4)) > - size -= 4; > - > - buffer_info->length = size; > - buffer_info->dma = > - pci_map_page(adapter->pdev, > - frag->page, > - offset, > - size, > - PCI_DMA_TODEVICE); > - buffer_info->time_stamp = jiffies; > - > - len -= size; > - offset += size; > - count++; > - if(unlikely(++i == tx_ring->count)) i = 0; > - } > - } > - > i = (i == 0) ? 
tx_ring->count - 1 : i - 1; > - tx_ring->buffer_info[i].skb = skb; > tx_ring->buffer_info[first].next_to_watch = i; > > + dev_kfree_skb_any(skb); > + > return count; > } > > @@ -2213,11 +2185,6 @@ e1000_clean_tx_irq(struct e1000_adapter > buffer_info->dma = 0; > } > > - if(buffer_info->skb) { > - dev_kfree_skb_any(buffer_info->skb); > - buffer_info->skb = NULL; > - } > - > tx_desc->buffer_addr = 0; > tx_desc->lower.data = 0; > tx_desc->upper.data = 0; > @@ -2243,6 +2210,28 @@ e1000_clean_tx_irq(struct e1000_adapter > return cleaned; > } > > + > +static boolean_t > +e1000_alloc_tx_buffers(struct e1000_adapter *adapter) > +{ > + struct e1000_desc_ring *tx_ring = &adapter->tx_ring; > + struct e1000_buffer *buffer_info; > + unsigned int i; > + > + for (i = 0; i < tx_ring->count; i++) { > + buffer_info = &tx_ring->buffer_info[i]; > + if (!buffer_info->skb) { > + buffer_info->skb = kmalloc(2048, GFP_ATOMIC); > + if (unlikely(!buffer_info->skb)) { > + printk("eek!\n"); > + return FALSE; > + } > + } > + } > + > + return TRUE; > +} > + > /** > * e1000_clean_rx_irq - Send received data up the network stack > * @adapter: board private structure > > /Martin > > From kdesler@soohrt.org Mon Dec 6 19:25:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 19:25:09 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB73P4RT004877 for ; Mon, 6 Dec 2004 19:25:05 -0800 Received: (qmail 8969 invoked by uid 1000); 7 Dec 2004 03:24:38 -0000 Date: Tue, 7 Dec 2004 04:24:38 +0100 From: Karsten Desler To: jamal Cc: Bernd Eckenfels , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207032438.GA7767@soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org> <1102389533.1089.51.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102389533.1089.51.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12504 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * jamal wrote: > > The only thing I can think off is the 66/64 PCI bus and the > > disadvantageous placement of the PCI cards, but neither should cause a > > higher CPU usage. If the bus couldn't keep up, I'd get packetloss. > > > > cant tell offhand; it looks like a modern piece of hardware. > Are you sure you are using NAPI? This is an e1000, correct? > Yes and yes. 0000:01:01.0 Ethernet controller: Intel Corp. 82545EM Gigabit Ethernet Controller (Fiber) (rev 01) 0000:01:03.0 Ethernet controller: Intel Corp. 
82546GB Gigabit Ethernet Controller (rev 03) driver: e1000 version: 5.5.4-k2-NAPI firmware-version: N/A bus-info: 0000:01:03.0 driver: e1000 version: 5.5.4-k2-NAPI firmware-version: N/A bus-info: 0000:01:01.0 From hadi@cyberus.ca Mon Dec 6 19:31:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 19:31:30 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB73VM3X005469 for ; Mon, 6 Dec 2004 19:31:24 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CbX07-0007PE-SJ for netdev@oss.sgi.com; Mon, 06 Dec 2004 23:30:59 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbW3u-0000eJ-94; Mon, 06 Dec 2004 22:30:50 -0500 Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets From: jamal Reply-To: hadi@cyberus.ca To: Karsten Desler Cc: Bernd Eckenfels , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041207032438.GA7767@soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org> <1102389533.1089.51.camel@jzny.localdomain> <20041207032438.GA7767@soohrt.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102390241.1093.59.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 06 Dec 2004 22:30:41 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12505 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-06 at 22:24, Karsten Desler wrote: > roller (rev 03) > > driver: e1000 > version: 5.5.4-k2-NAPI > firmware-version: N/A > bus-info: 0000:01:03.0 > > driver: e1000 > version: 5.5.4-k2-NAPI > firmware-version: N/A > bus-info: 0000:01:01.0 Beats me. Make sure it boots NAPI. Also if you can turn off ITR; intel loves to turn on that silly feature. cheers, jamal From kdesler@soohrt.org Mon Dec 6 20:03:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 20:03:10 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7430XP007091 for ; Mon, 6 Dec 2004 20:03:03 -0800 Received: (qmail 16775 invoked by uid 1000); 7 Dec 2004 04:02:35 -0000 Date: Tue, 7 Dec 2004 05:02:35 +0100 From: Karsten Desler To: jamal Cc: Bernd Eckenfels , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207040235.GA10501@soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org> <1102389533.1089.51.camel@jzny.localdomain> <20041207032438.GA7767@soohrt.org> <1102390241.1093.59.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102390241.1093.59.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12506 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * jamal wrote: > Beats me. Make sure it boots NAPI. Also if you can turn off ITR; intel > loves to turn on that silly feature. ITR was in fact activated. I think i've disabled it now (e1000.InterruptThrottleRate=0 in the kernel cmdline). And as I'm reading the e1000 code, there is no way to enable/disable NAPI without a recompile. So the fact that ethtool spat out -NAPI in the version string means that NAPI is actually used. 
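For what it's worth, a minimal sketch of that check in Python, assuming the only indicator is the "-NAPI" suffix in the version field printed by ethtool -i. The helper name driver_uses_napi is made up for illustration, and the sample text is the ethtool output quoted earlier in this thread.

```python
def driver_uses_napi(ethtool_info: str) -> bool:
    """Return True if the 'version' field of `ethtool -i` output ends in -NAPI."""
    for line in ethtool_info.splitlines():
        # Split on the first colon only, so values like 0000:01:03.0 stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "version":
            return value.strip().endswith("-NAPI")
    return False  # no version field found


# Sample taken from the ethtool output posted in this thread.
sample = """driver: e1000
version: 5.5.4-k2-NAPI
firmware-version: N/A
bus-info: 0000:01:03.0"""

print(driver_uses_napi(sample))
```

This only tells how the driver was built. It says nothing about the interrupt rate the box actually sees under load.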
- Karsten From mitch@sfgoth.com Mon Dec 6 21:45:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 21:45:39 -0800 (PST) Received: from gaz.sfgoth.com (gaz.sfgoth.com [69.36.241.230]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB75jYQM015279 for ; Mon, 6 Dec 2004 21:45:35 -0800 Received: from gaz.sfgoth.com (localhost.sfgoth.com [127.0.0.1]) by gaz.sfgoth.com (8.12.10/8.12.10) with ESMTP id iB75mfi0065495; Mon, 6 Dec 2004 21:48:41 -0800 (PST) (envelope-from mitch@gaz.sfgoth.com) Received: (from mitch@localhost) by gaz.sfgoth.com (8.12.10/8.12.6/Submit) id iB75me00065494; Mon, 6 Dec 2004 21:48:41 -0800 (PST) (envelope-from mitch) Date: Mon, 6 Dec 2004 21:48:40 -0800 From: Mitchell Blank Jr To: Phil Oester Cc: shemminger@osdl.org, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Recent select() handling change breaks Poptop Message-ID: <20041207054840.GD61527@gaz.sfgoth.com> References: <20041207003525.GA22933@linuxace.com> <20041207025218.GB61527@gaz.sfgoth.com> <20041207045302.GA23746@linuxace.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207045302.GA23746@linuxace.com> User-Agent: Mutt/1.4.2.1i X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.2.2 (gaz.sfgoth.com [127.0.0.1]); Mon, 06 Dec 2004 21:48:41 -0800 (PST) X-archive-position: 12507 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch@sfgoth.com Precedence: bulk X-list: netdev (adding netdev to cc:) Phil Oester wrote: > > 2. a "tcpdump -nvv" of its udp traffic (ideally captured from a seperate > > server, but from the server would probably be OK too) > > PPTP uses TCP 1723 and GRE (proto 47), so there is no udp traffic involved. > I suspect the change was made to all datagram traffic with the assumption > that UDP was the only protocol impacted. Perhaps GRE was not considered? 
Yeah, it looks like the problem for sure. The patch modifies the structure "inet_dgram_ops" to use udp_poll(), but looking farther down: static struct inet_protosw inetsw_array[] = [...] .type = SOCK_DGRAM, .protocol = IPPROTO_UDP, .prot = &udp_prot, .ops = &inet_dgram_ops, [...] .type = SOCK_RAW, .protocol = IPPROTO_IP, /* wild card */ .prot = &raw_prot, .ops = &inet_dgram_ops, [...] so it looks like udp_poll() will end up getting used for both SOCK_DGRAM and SOCK_RAW inet sockets; obviously Poptop is using the latter and failing as a result. No need for the strace/tcpdump data I guess. The fix is to just make a copy of the inet_dgram_ops called inet_udp_ops and make the udp_poll() change only in that one (and obviously change the SOCK_DGRAM case there to use &inet_udp_ops). I don't have time right this second to spin a patch, but could you try that out and see if it fixes your problem. -Mitch From kaber@trash.net Mon Dec 6 22:35:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 22:35:57 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB76Zq3K017788 for ; Mon, 6 Dec 2004 22:35:53 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CbYwE-000226-A0; Tue, 07 Dec 2004 07:35:06 +0100 Message-ID: <41B54F1A.6050905@trash.net> Date: Tue, 07 Dec 2004 07:35:06 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Cataldo CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 References: <1102380430.6103.6.camel@buffy> In-Reply-To: <1102380430.6103.6.camel@buffy> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 12508 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Cataldo wrote: >Hi, > >Tonight I upgraded to 2.6.10-rc3. Everything was fine until I started >wondershaper to setup my Qos rules : > >wondershaper eth0 255 16 > >And the machine freezed hard. No magic sysrq working, no oops in my >logs. > > Please try to find out which line causes the lockup. Are you using the same config options as with 2.6.9 ? Regards Patrick From akpm@osdl.org Mon Dec 6 22:45:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 06 Dec 2004 22:45:30 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB76jOb3018581 for ; Mon, 6 Dec 2004 22:45:24 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iB76ir926758; Mon, 6 Dec 2004 22:44:53 -0800 Date: Mon, 6 Dec 2004 22:44:41 -0800 From: Andrew Morton To: Thomas Cataldo Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, tgraf@suug.ch, "David S. Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-Id: <20041206224441.628e7885.akpm@osdl.org> In-Reply-To: <1102380430.6103.6.camel@buffy> References: <1102380430.6103.6.camel@buffy> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12509 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Thomas Cataldo wrote: > > Tonight I upgraded to 2.6.10-rc3. Everything was fine until I started > wondershaper to setup my Qos rules : > > wondershaper eth0 255 16 > > And the machine freezed hard. No magic sysrq working, no oops in my > logs. > > The computer is an x86 smp (dual p3) > > wondershaper was working fine with 2.6.9. 
Me too, with your .config: Using http://lartc.org/wondershaper/wondershaper-1.1a.tar.gz vmm:/home/akpm/wondershaper-1.1a# ./wshaper eth0 255 16 u32 classifier Perfomance counters on input device check on Actions configured Unable to handle kernel NULL pointer dereference at virtual address 00000000 printing eip: c0290b58 *pde = 00000000 Oops: 0002 [#1] SMP Modules linked in: police sch_ingress cls_u32 sch_sfq sch_cbq usbcore CPU: 1 EIP: 0060:[] Not tainted VLI EFLAGS: 00010206 (2.6.10-rc3) EIP is at _spin_lock_bh+0x10/0x20 eax: cf690000 ebx: ce0d4d80 ecx: 00000008 edx: 00000000 esi: cf691ba0 edi: 00000000 ebp: 00000000 esp: cf691b48 ds: 007b es: 007b ss: 0068 Process tc (pid: 2743, threadinfo=cf690000 task=cf07b520) Stack: c02372ab ce0d4d80 cd4d8800 cebd1c40 ce0d4d80 c0237346 ce0d4d80 00000008 00000000 00000000 00000000 cf691ba0 cf691ba0 c02478d6 ce0d4d80 00000008 00000000 cf691ba0 ce0d4d80 cf6a6070 30960094 cec44380 00000000 00019000 Call Trace: [] gnet_stats_start_copy_compat+0x1b/0x98 [] gnet_stats_start_copy+0x1e/0x24 [] tcf_action_copy_stats+0x26/0xa0 [] tcf_action_dump_old+0x36/0x3c [] u32_dump+0x2c8/0x344 [cls_u32] [] u32_dump+0x2fa/0x344 [cls_u32] [] tcf_fill_node+0x11d/0x170 [] tfilter_notify+0x50/0xa0 [] tc_ctl_tfilter+0x542/0x570 [] rtnetlink_rcv+0x23d/0x360 [] netlink_data_ready+0x1c/0x54 [] netlink_sendskb+0x21/0x40 [] netlink_unicast+0xe3/0xec [] netlink_sendmsg+0x27c/0x28c [] sock_sendmsg+0xd5/0xf8 [] sock_sendmsg+0xd5/0xf8 [] copy_from_user+0x30/0x60 [] copy_from_user+0x30/0x60 [] autoremove_wake_function+0x0/0x40 [] sys_sendmsg+0x18f/0x1f4 [] handle_mm_fault+0x80/0x11c [] do_page_fault+0x1a3/0x554 [] copy_from_user+0x30/0x60 [] sys_socketcall+0x1d8/0x1f4 [] sysenter_past_esp+0x52/0x71 Code: 3a 00 7e f9 fa eb e9 c3 8d 76 00 fa f0 fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 89 c2 b8 00 e0 ff ff 21 e0 81 <0>Kernel panic - not syncing: Fatal exception in interrupt Somehow I don't think this is because "Performance" was misspelled ;) 
tcf_act_hdr.stats_lock is NULL in tcf_action_copy_stats() From lenar@vision.ee Tue Dec 7 00:44:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 00:44:29 -0800 (PST) Received: from mail.city.ee (tristate.vision.ee [194.204.30.144]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB78iGAr027552 for ; Tue, 7 Dec 2004 00:44:21 -0800 Received: (qmail 8023 invoked from network); 7 Dec 2004 08:43:54 -0000 Received: from unknown (HELO ?127.0.0.1?) (127.0.0.1) by localhost with SMTP; 7 Dec 2004 08:43:54 -0000 Message-ID: <41B56D85.9080105@vision.ee> Date: Tue, 07 Dec 2004 10:44:53 +0200 From: =?UTF-8?B?TGVuYXIgTMO1aG11cw==?= User-Agent: Mozilla Thunderbird 1.0RC1 (Windows/20041201) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu , netdev@oss.sgi.com Subject: Re: status of via velocity in 2.6.9 References: <41B4F447.2060808@ccs.neu.edu> <41B56518.2070108@vision.ee> <20041207082106.GA24306@electric-eye.fr.zoreil.com> In-Reply-To: <20041207082106.GA24306@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 12510 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lenar@vision.ee Precedence: bulk X-list: netdev Hi, Francois Romieu wrote: >Lenar Lõhmus : >[...] > > >>the machine just locked up after ifconfig up. With 2.6.9, it doesn't >>lock up, but it doesn't work either (data seems to go to black hole >>or sth). But there seem to be some success reports too with this kernel. >> >> > >Can you check if the computer hosts the latest bios from its vendor and > > At the time 2.6.9 was released it had latest bios. >if booting with "acpi=off" makes a difference ? > > If I remember correctly - i tried, but no difference. >The content of /proc/interrupts after a known number of TX packets could >give some hint (use ping or such and correlate ifconfig output with >/proc/interrupts). 
> > With that I can't help, although the machine is accessible to me, it is now 'in production' with 3Com NIC and so I can't test and play with it anymore. When I get my hands on another similar box and it still exhibits same problems, I'll be back reporting it, and hopefully I have time to test suggested things. Lenar From kdesler@soohrt.org Tue Dec 7 02:21:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 02:22:04 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7ALvrT030739 for ; Tue, 7 Dec 2004 02:21:58 -0800 Received: (qmail 29859 invoked by uid 1018); 7 Dec 2004 10:21:32 -0000 Date: Tue, 7 Dec 2004 11:21:32 +0100 From: Karsten Desler To: Karsten Desler Cc: jamal , Bernd Eckenfels , "David S. Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207102132.GA28588@quickstop.soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org> <1102389533.1089.51.camel@jzny.localdomain> <20041207032438.GA7767@soohrt.org> <1102390241.1093.59.camel@jzny.localdomain> <20041207040235.GA10501@soohrt.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207040235.GA10501@soohrt.org> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12511 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev Karsten Desler wrote: > * jamal wrote: > > Beats me. Make sure it boots NAPI. Also if you can turn off ITR; intel > > loves to turn on that silly feature. > > ITR was in fact activated. I think i've disabled it now > (e1000.InterruptThrottleRate=0 in the kernel cmdline). 
> And as I'm reading the e1000 code, there is no way to enable/disable > NAPI without a recompile. So the fact that ethtool spat out -NAPI in > the version string means that NAPI is actually used. But looking and the int/s number, I'm not so sure anymore. Is there any other way to find out? # ethtool -i eth0|grep ^vers version: 5.5.4-k2-NAPI # ethtool -i eth1|grep ^vers version: 5.5.4-k2-NAPI CPU0 CPU1 169: 5 115554253 IO-APIC-level eth0 177: 78998347 5568 IO-APIC-level eth1 # sar -I 169 5 5 11:20:05 INTR intr/s 11:20:10 169 10401.40 11:20:15 169 10579.80 11:20:20 169 10965.20 11:20:25 169 10768.20 11:20:30 169 10460.60 Average: 169 10635.04 # sar -I 177 5 5 11:18:50 INTR intr/s 11:18:55 177 4769.74 11:19:00 177 4780.80 11:19:05 177 4669.74 11:19:10 177 4724.55 11:19:15 177 4748.50 Average: 177 4738.67 Cheers, Karsten From P@draigBrady.com Tue Dec 7 02:48:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 02:48:37 -0800 (PST) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7AmSHL001259 for ; Tue, 7 Dec 2004 02:48:29 -0800 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id iB7AlqwS058748; Tue, 7 Dec 2004 10:47:55 GMT (envelope-from P@draigBrady.com) Message-ID: <41B58A58.8010007@draigBrady.com> Date: Tue, 07 Dec 2004 10:47:52 +0000 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Karsten Desler CC: "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> In-Reply-To: <20041206224107.GA8529@soohrt.org> Content-Type: multipart/mixed; boundary="------------000001040401020404000706" X-archive-position: 12512 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------000001040401020404000706 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Karsten Desler wrote: > * David S. Miller wrote: > >>It's spending nearly half of it's time in iptables. >>Try to consolidate your rules if possible. This is the >>part of netfilter that really doesn't scale well at all. >> > > Removing the iptables rules helps reducing the load a little, but the > majority of time is still spent somewhere else. Well with NAPI it can be hard to tell CPU usage. You may need to use something like cyclesoak to get a true idea of CPU left. Also have a look at http://www.hipac.org/ as netfilter has silly scalability properties. I also notice that a lot of time is spent allocating and freeing the packet buffers (and possible hidden time due to cache misses due to allocating on one CPU and freeing on another?). How many [RT]xDescriptors do you have configured by the way? Anyway attached is a small patch that I used to make the e1000 "own" the packet buffers, and hence it does not alloc/free per packet at all. Now this has only been tested in one configuration where I was just sniffing the packets, so definitely YMMV. 
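To make the idea a bit more concrete, here is a toy model in Python of a ring that owns its buffers. It is only a sketch under simplifying assumptions, not the attached patch: the names Buf, RxRing, refill and users are invented for illustration, and a plain integer stands in for the skb reference count that the patch tests with skb_shared().

```python
class Buf:
    """Stand-in for an skb: a data area plus a reference count."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.users = 1  # the ring's own reference


class RxRing:
    """Toy receive ring that keeps ownership of its buffers."""

    def __init__(self, count, buf_size):
        self.slots = [None] * count
        self.buf_size = buf_size
        self.next_to_use = 0
        self.allocs = 0  # fresh allocations
        self.reuses = 0  # buffers recycled in place

    def refill(self):
        """Refill slots from next_to_use; stop at the first buffer still held elsewhere."""
        i = self.next_to_use
        for _ in range(len(self.slots)):
            buf = self.slots[i]
            if buf is None:
                self.slots[i] = Buf(self.buf_size)  # first use: allocate
                self.allocs += 1
            elif buf.users == 1:
                buf.data[:] = bytes(self.buf_size)  # nobody else holds it: reset and reuse
                self.reuses += 1
            else:
                break  # still shared, better luck next round
            i = (i + 1) % len(self.slots)
        self.next_to_use = i


ring = RxRing(count=4, buf_size=2048)
ring.refill()                 # first pass allocates every slot
ring.slots[2].users += 1      # pretend the stack still holds slot 2
ring.refill()                 # recycles slots 0 and 1, stops at slot 2
print(ring.allocs, ring.reuses, ring.next_to_use)
```

After the first pass the allocation counter stops growing. Later passes only reset buffers that nobody else references, which is the saving the patch is after.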
-- Pádraig Brady - http://www.pixelbeat.org -- --------------000001040401020404000706 Content-Type: application/x-texinfo; name="linux-2.4.20-5.2.52-realloc.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="linux-2.4.20-5.2.52-realloc.diff" diff -Naur linux-2.4.20-5.2.52/drivers/net/e1000/e1000_main.c linux-2.4.20-pb/drivers/net/e1000/e1000_main.c --- linux-2.4.20-5.2.52/drivers/net/e1000/e1000_main.c 2004-05-17 22:59:53.000000000 +0000 +++ linux-2.4.20-pb/drivers/net/e1000/e1000_main.c 2004-12-07 10:16:16.000000000 +0000 @@ -2319,9 +2323,9 @@ E1000_DBG("%s: Receive packet consumed multiple buffers\n", netdev->name); - dev_kfree_skb_irq(skb); + //dev_kfree_skb_irq(skb); //PB rx_desc->status = 0; - buffer_info->skb = NULL; + //buffer_info->skb = NULL; //PB if(++i == rx_ring->count) i = 0; @@ -2347,9 +2351,9 @@ length--; } else { - dev_kfree_skb_irq(skb); + //dev_kfree_skb_irq(skb); //PB rx_desc->status = 0; - buffer_info->skb = NULL; + //buffer_info->skb = NULL; //PB if(++i == rx_ring->count) i = 0; @@ -2393,7 +2398,7 @@ netdev->last_rx = jiffies; rx_desc->status = 0; - buffer_info->skb = NULL; + //buffer_info->skb = NULL; //PB: E1000 doesn't reallocate packets if(++i == rx_ring->count) i = 0; @@ -2421,20 +2426,37 @@ struct e1000_rx_desc *rx_desc; struct e1000_buffer *buffer_info; struct sk_buff *skb; - int reserve_len = 2; + int reserve_len = 18; unsigned int i; i = rx_ring->next_to_use; - buffer_info = &rx_ring->buffer_info[i]; - while(!buffer_info->skb) { + while(1) { + buffer_info = &rx_ring->buffer_info[i]; + if (!buffer_info->skb) { + ; /* try to allocate new buf */ + } else { + if (!skb_shared(buffer_info->skb)) { + ; /* try to reallocate unused buf */ + } else { + break; /* Better luck next round */ + } + } + rx_desc = E1000_RX_DESC(*rx_ring, i); - skb = dev_alloc_skb(adapter->rx_buffer_len + reserve_len); + if (!buffer_info->skb) { + /* TODO: optimize this alloc size */ + skb = alloc_skb(adapter->rx_buffer_len + reserve_len, 
GFP_ATOMIC); + } else { + skb = realloc_skb(buffer_info->skb, adapter->rx_buffer_len + reserve_len, GFP_ATOMIC); + } if(!skb) { - /* Better luck next round */ - break; + break; /* Better luck next round */ + } else { + skb->dev = netdev; + skb_get(skb); /* It's ours. Don't let others deallocate */ } /* Make buffer alignment 2 beyond a 16 byte boundary @@ -2443,8 +2465,6 @@ */ skb_reserve(skb, reserve_len); - skb->dev = netdev; - buffer_info->skb = skb; buffer_info->length = adapter->rx_buffer_len; buffer_info->dma = @@ -2466,7 +2486,6 @@ } if(++i == rx_ring->count) i = 0; - buffer_info = &rx_ring->buffer_info[i]; } rx_ring->next_to_use = i; diff -Naur linux-2.4.20-5.2.52/include/linux/skbuff.h linux-2.4.20-pb/include/linux/skbuff.h --- linux-2.4.20-5.2.52/include/linux/skbuff.h 2002-08-03 00:39:46.000000000 +0000 +++ linux-2.4.20-pb/include/linux/skbuff.h 2004-12-07 10:16:16.000000000 +0000 @@ -230,6 +230,7 @@ extern void __kfree_skb(struct sk_buff *skb); extern struct sk_buff * alloc_skb(unsigned int size, int priority); +extern struct sk_buff * realloc_skb(struct sk_buff *skb, unsigned int size, int priority); extern void kfree_skbmem(struct sk_buff *skb); extern struct sk_buff * skb_clone(struct sk_buff *skb, int priority); extern struct sk_buff * skb_copy(const struct sk_buff *skb, int priority); @@ -240,6 +241,7 @@ int newheadroom, int newtailroom, int priority); +extern struct sk_buff * skb_pad(struct sk_buff *skb, int pad); #define dev_kfree_skb(a) kfree_skb(a) extern void skb_over_panic(struct sk_buff *skb, int len, void *here); extern void skb_under_panic(struct sk_buff *skb, int len, void *here); @@ -1082,6 +1084,26 @@ } /** + * skb_padto - pad an skbuff up to a minimal size + * @skb: buffer to pad + * @len: minimal length + * + * Pads up a buffer to ensure the trailing bytes exist and are + * blanked. If the buffer already contains sufficient data it + * is untouched. 
Returns the buffer, which may be a replacement + * for the original, or NULL for out of memory - in which case + * the original buffer is still freed. + */ + +static inline struct sk_buff *skb_padto(struct sk_buff *skb, unsigned int len) +{ + unsigned int size = skb->len; + if(likely(size >= len)) + return skb; + return skb_pad(skb, len-size); +} + +/** * skb_linearize - convert paged skb to linear one * @skb: buffer to linarize * @gfp: allocation mode diff -Naur linux-2.4.20-5.2.52/net/core/skbuff.c linux-2.4.20-pb/net/core/skbuff.c --- linux-2.4.20-5.2.52/net/core/skbuff.c 2002-08-03 00:39:46.000000000 +0000 +++ linux-2.4.20-pb/net/core/skbuff.c 2004-12-07 10:16:16.000000000 +0000 @@ -216,7 +216,6 @@ return NULL; } - /* * Slab constructor for a skb head. */ @@ -251,6 +250,59 @@ #endif } +/** + * realloc_skb - reset skb for new packet. + * @size: size to allocate + * @gfp_mask: allocation mask + * + * Allocate a new &sk_buff. The returned buffer has no headroom and a + * tail room of size bytes. The object has a reference count of one. + * The return is the buffer. On a failure the return is %NULL. + * + * Buffers may only be allocated from interrupts using a @gfp_mask of + * %GFP_ATOMIC. + */ + +struct sk_buff *realloc_skb(struct sk_buff* skb, unsigned int size, int gfp_mask) +{ + int truesize=skb->truesize; + u8 *data=skb->head; + + skb_headerinit(skb, (kmem_cache_t *)NULL, 0); + + /* Get the DATA. Size must match skb_add_mtu(). */ + size = SKB_DATA_ALIGN(size); + if ((size+sizeof(struct sk_buff)) > truesize) { + skb_release_data(skb); + data = kmalloc(size + sizeof(struct skb_shared_info), gfp_mask); + if (data == NULL) + goto nodata; + } + + /* XXX: does not include slab overhead */ + skb->truesize = size + sizeof(struct sk_buff); + + /* Load the data pointers. 
*/ + skb->head = data; + skb->data = data; + skb->tail = data; + skb->end = data + size; + + /* Set up other state */ + skb->len = 0; + skb->cloned = 0; + skb->data_len = 0; + + atomic_set(&skb->users, 1); + atomic_set(&(skb_shinfo(skb)->dataref), 1); + skb_shinfo(skb)->nr_frags = 0; + skb_shinfo(skb)->frag_list = NULL; + return skb; + +nodata: + return NULL; +} + static void skb_drop_fraglist(struct sk_buff *skb) { struct sk_buff *list = skb_shinfo(skb)->frag_list; @@ -731,6 +783,36 @@ return n; } +/** + * skb_pad - zero pad the tail of an skb + * @skb: buffer to pad + * @pad: space to pad + * + * Ensure that a buffer is followed by a padding area that is zero + * filled. Used by network drivers which may DMA or transfer data + * beyond the buffer end onto the wire. + * + * May return NULL in out of memory cases. + */ + +struct sk_buff *skb_pad(struct sk_buff *skb, int pad) +{ + struct sk_buff *nskb; + + /* If the skbuff is non linear tailroom is always zero.. */ + if(skb_tailroom(skb) >= pad) + { + memset(skb->data+skb->len, 0, pad); + return skb; + } + + nskb = skb_copy_expand(skb, skb_headroom(skb), skb_tailroom(skb) + pad, GFP_ATOMIC); + kfree_skb(skb); + if(nskb) + memset(nskb->data+nskb->len, 0, pad); + return nskb; +} + /* Trims skb to length len. It can change skb pointers, if "realloc" is 1. * If realloc==0 and trimming is impossible without change of data, * it is BUG(). --------------000001040401020404000706-- From kdesler@soohrt.org Tue Dec 7 03:22:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 03:22:10 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7BM4ZC002905 for ; Tue, 7 Dec 2004 03:22:05 -0800 Received: (qmail 10058 invoked by uid 1000); 7 Dec 2004 11:21:39 -0000 Date: Tue, 7 Dec 2004 12:21:39 +0100 From: Karsten Desler To: P@draigBrady.com Cc: "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207112139.GA3610@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <41B58A58.8010007@draigBrady.com> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12513 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * P@draigBrady.com wrote: > Karsten Desler wrote: > >* David S. Miller wrote: > > > >>It's spending nearly half of it's time in iptables. > >>Try to consolidate your rules if possible. This is the > >>part of netfilter that really doesn't scale well at all. > >> > > > >Removing the iptables rules helps reducing the load a little, but the > >majority of time is still spent somewhere else. > > Well with NAPI it can be hard to tell CPU usage. > You may need to use something like cyclesoak to get > a true idea of CPU left. Thanks, I'll look into it. > Also have a look at http://www.hipac.org/ as netfilter > has silly scalability properties. I did before, but I read on Harald Weltes' weblog that 2.4 gives slightly worse pps results than 2.6, and since the cpu usage is as high as it is, I didn't want to take any more performance hits. I'll try to see what performance impact the netfilter rules have during peak load. > I also notice that a lot of time is spent allocating > and freeing the packet buffers (and possible hidden > time due to cache misses due to allocating on one > CPU and freeing on another?). > How many [RT]xDescriptors do you have configured by the way? 256. 
I increased them to 1024 shortly after the profiling run, but didn't notice any change in the cpu usage (will try again with cyclesoak). > Anyway attached is a small patch that I used to make the e1000 > "own" the packet buffers, and hence it does not alloc/free > per packet at all. Now this has only been tested in one > configuration where I was just sniffing the packets, so > definitely YMMV. Thanks, I'll give it a spin. Cheers, Karsten From hadi@cyberus.ca Tue Dec 7 04:29:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 04:29:45 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7CTcXn008190 for ; Tue, 7 Dec 2004 04:29:39 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbeSu-00023Z-Bw for netdev@oss.sgi.com; Tue, 07 Dec 2004 07:29:12 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbeSq-0005L8-VP; Tue, 07 Dec 2004 07:29:09 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Andrew Morton Cc: Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, tgraf@suug.ch, "David S. 
Miller" In-Reply-To: <20041206224441.628e7885.akpm@osdl.org> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102422544.1088.98.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 07:29:04 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12514 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Can you do a: tc -V This seems to point to probably be a backward compat issue which was overlooked in the stats. cheers, jamal On Tue, 2004-12-07 at 01:44, Andrew Morton wrote: > Thomas Cataldo wrote: > > > > Tonight I upgraded to 2.6.10-rc3. Everything was fine until I started > > wondershaper to setup my Qos rules : > > > > wondershaper eth0 255 16 > > > > And the machine freezed hard. No magic sysrq working, no oops in my > > logs. > > > > The computer is an x86 smp (dual p3) > > > > wondershaper was working fine with 2.6.9. 
> > Me too, with your .config: > > Using http://lartc.org/wondershaper/wondershaper-1.1a.tar.gz > > vmm:/home/akpm/wondershaper-1.1a# ./wshaper eth0 255 16 > > > u32 classifier > Perfomance counters on > input device check on > Actions configured > Unable to handle kernel NULL pointer dereference at virtual address 00000000 > printing eip: > c0290b58 > *pde = 00000000 > Oops: 0002 [#1] > SMP > Modules linked in: police sch_ingress cls_u32 sch_sfq sch_cbq usbcore > CPU: 1 > EIP: 0060:[] Not tainted VLI > EFLAGS: 00010206 (2.6.10-rc3) > EIP is at _spin_lock_bh+0x10/0x20 > eax: cf690000 ebx: ce0d4d80 ecx: 00000008 edx: 00000000 > esi: cf691ba0 edi: 00000000 ebp: 00000000 esp: cf691b48 > ds: 007b es: 007b ss: 0068 > Process tc (pid: 2743, threadinfo=cf690000 task=cf07b520) > Stack: c02372ab ce0d4d80 cd4d8800 cebd1c40 ce0d4d80 c0237346 ce0d4d80 00000008 > 00000000 00000000 00000000 cf691ba0 cf691ba0 c02478d6 ce0d4d80 00000008 > 00000000 cf691ba0 ce0d4d80 cf6a6070 30960094 cec44380 00000000 00019000 > Call Trace: > [] gnet_stats_start_copy_compat+0x1b/0x98 > [] gnet_stats_start_copy+0x1e/0x24 > [] tcf_action_copy_stats+0x26/0xa0 > [] tcf_action_dump_old+0x36/0x3c > [] u32_dump+0x2c8/0x344 [cls_u32] > [] u32_dump+0x2fa/0x344 [cls_u32] > [] tcf_fill_node+0x11d/0x170 > [] tfilter_notify+0x50/0xa0 > [] tc_ctl_tfilter+0x542/0x570 > [] rtnetlink_rcv+0x23d/0x360 > [] netlink_data_ready+0x1c/0x54 > [] netlink_sendskb+0x21/0x40 > [] netlink_unicast+0xe3/0xec > [] netlink_sendmsg+0x27c/0x28c > [] sock_sendmsg+0xd5/0xf8 > [] sock_sendmsg+0xd5/0xf8 > [] copy_from_user+0x30/0x60 > [] copy_from_user+0x30/0x60 > [] autoremove_wake_function+0x0/0x40 > [] sys_sendmsg+0x18f/0x1f4 > [] handle_mm_fault+0x80/0x11c > [] do_page_fault+0x1a3/0x554 > [] copy_from_user+0x30/0x60 > [] sys_socketcall+0x1d8/0x1f4 > [] sysenter_past_esp+0x52/0x71 > Code: 3a 00 7e f9 fa eb e9 c3 8d 76 00 fa f0 fe 08 79 09 f3 90 80 38 00 7e f9 eb f2 c3 89 c2 b8 00 e0 ff ff 21 e0 81 > <0>Kernel panic - not syncing: 
Fatal exception in interrupt > Somehow I don't think this is because "Performance" was misspelled ;) > > tcf_act_hdr.stats_lock is NULL in tcf_action_copy_stats() > > > From hadi@cyberus.ca Tue Dec 7 04:34:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 04:34:40 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7CYY37008666 for ; Tue, 7 Dec 2004 04:34:34 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbeXh-0001Vs-Pj for netdev@oss.sgi.com; Tue, 07 Dec 2004 07:34:09 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbeXf-0005sK-EE; Tue, 07 Dec 2004 07:34:07 -0500 Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets From: jamal Reply-To: hadi@cyberus.ca To: Karsten Desler , Robert Olsson Cc: Bernd Eckenfels , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041207102132.GA28588@quickstop.soohrt.org> References: <20041206224107.GA8529@soohrt.org> <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org> <1102389533.1089.51.camel@jzny.localdomain> <20041207032438.GA7767@soohrt.org> <1102390241.1093.59.camel@jzny.localdomain> <20041207040235.GA10501@soohrt.org> <20041207102132.GA28588@quickstop.soohrt.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102422845.1089.105.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 07:34:05 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12515 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-07 at 05:21, Karsten Desler wrote: > But looking and the int/s number, I'm not so sure anymore. Is there any > other way to find out? > > # sar -I 169 5 5 > 11:20:05 INTR intr/s > 11:20:10 169 10401.40 That doesnt seem to be too high. You have a dual opteron 244. You are supposed to be kicking ass with that machine - not 200Kpps+ you are getting with all that CPU overload. Something is wrong with your setup. Unfortunately i cant afford such a machine so i cant see it right off the bat. I know Robert has at least one similar machine; maybe he could help. Robert? 
cheers,
jamal

From Robert.Olsson@data.slu.se Tue Dec 7 04:39:29 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 04:39:33 -0800 (PST)
Received: from mail1.slu.se (mail1.slu.se [130.238.96.11])
    by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7CdLkg009072
    for ; Tue, 7 Dec 2004 04:39:28 -0800
Received: from robur.slu.se (robur.slu.se [130.238.98.12])
    by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB7CcuKO011805;
    Tue, 7 Dec 2004 13:38:56 +0100
Received: by robur.slu.se (Postfix, from userid 1000)
    id EFA5DEC001; Tue, 7 Dec 2004 13:38:56 +0100 (CET)
From: Robert Olsson
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Message-ID: <16821.42080.932184.167780@robur.slu.se>
Date: Tue, 7 Dec 2004 13:38:56 +0100
To: Karsten Desler
Cc: P@draigBrady.com, "David S. Miller" , netdev@oss.sgi.com,
    linux-kernel@vger.kernel.org
Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets
In-Reply-To: <20041207112139.GA3610@soohrt.org>
References: <20041206205305.GA11970@soohrt.org>
    <20041206134849.498bfc93.davem@davemloft.net>
    <20041206224107.GA8529@soohrt.org>
    <41B58A58.8010007@draigBrady.com>
    <20041207112139.GA3610@soohrt.org>
X-Mailer: VM 7.18 under Emacs 21.3.1
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iB7CdLkg009072
X-archive-position: 12516
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Robert.Olsson@data.slu.se
Precedence: bulk
X-list: netdev

Hello!

Well, my experience is that it is very hard, not to say almost impossible,
to extrapolate idle CPU into any network system capacity. I guess this is
what you are trying to do?

Rather, load and overload the system with traffic having the characteristics
you expect; as a bonus you will get some kind of proof of robustness and
responsiveness at max load. There are tools for this type of test.

Pádraig! Very funny...
I started hacking 2 hours ago on idea I had for long time, This to do a light version of skb recycling based skb->users (the pktgen trick) with very minimal kernel change.. > Anyway attached is a small patch that I used to make the e1000 > "own" the packet buffers, and hence it does not alloc/free > per packet at all. Now this has only been tested in one > configuration where I was just sniffing the packets, so > definitely YMMV. --ro From kdesler@soohrt.org Tue Dec 7 04:50:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 04:50:32 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7CoRD0009992 for ; Tue, 7 Dec 2004 04:50:28 -0800 Received: (qmail 29304 invoked by uid 1000); 7 Dec 2004 12:50:01 -0000 Date: Tue, 7 Dec 2004 13:50:01 +0100 From: Karsten Desler To: Robert Olsson Cc: P@draigBrady.com, "David S. Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207125001.GA26644@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> <16821.42080.932184.167780@robur.slu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <16821.42080.932184.167780@robur.slu.se> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12518 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * Robert Olsson wrote: > Well my experience is that it very hard not to say almost impossible to > extrapolate idle cpu into any network system capacity. I guess this is > what you are trying to do? Kinda, yes. 
I'm trying to evaluate if the behaviour I'm seeing is expected, which would heavily influence my choice of hardware/software for future projects (and of course to optimize the current setup). Currently I'm having problems capturing packets with tcpdump (lots of "packets dropped by kernel") which indicates to me that there's genuinely not much (enough) idle time sitting around. > Rather load and overload the system with traffic having the characteristics > you expect as a bonus you will get some kind proof of robustness and > responsiveness a max load. There are tools for this type of tests. Will do, that could take a couple of days though. Cheers, Karsten From tgraf@suug.ch Tue Dec 7 04:49:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 04:49:30 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7CnOxR009775 for ; Tue, 7 Dec 2004 04:49:25 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id DF242F; Tue, 7 Dec 2004 13:48:39 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id BA4AE1C0EA; Tue, 7 Dec 2004 13:49:22 +0100 (CET) Date: Tue, 7 Dec 2004 13:49:22 +0100 From: Thomas Graf To: jamal Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara Subject: Re: [PATCH] rtnetlink & address family problem Message-ID: <20041207124922.GA1371@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102386461.1093.26.camel@jzny.localdomain> X-archive-position: 12517 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

* jamal <1102386461.1093.26.camel@jzny.localdomain> 2004-12-06 21:27
> On Mon, 2004-12-06 at 09:02, Thomas Graf wrote:
>
> > Your patch would fix this issue but might break various things. The
> > actual problem is that iproute2 doesn't check the family in its filter.
> > It blindly assumes that the kernel only returns addresses of the kind it
> > has requested. I can understand if you think the current behaviour
> > is wrong but we shouldn't change it in the middle of a stable tree.
>
> Why would it be wrong? The PF_UNSPEC is there for a purpose.

I don't think it is wrong myself but I understand if someone does. If
one sends a GETADDR request for PF_INET6 one might expect to either
receive all IPv6 addresses or none, and to only receive addresses
of every type if PF_UNSPEC was specified.

> If user space decides it wants to flush IPv4 addresses blindly, that's
> user space's fault. The patch you attached seems legit. Did you verify it?

Not yet, it probably has to be applied to iproute.c as well. I'll have a
look at it and do some testing.
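The user-space fix discussed above amounts to checking ifa_family on every
reply instead of trusting the kernel to return only the requested family.
A minimal sketch in C, assuming a buffer of RTM_NEWADDR replies has already
been read from a NETLINK_ROUTE socket. count_matching_addrs() and
put_addr_msg() are hypothetical names used for illustration only; they are
not taken from iproute2 or from the patch under discussion.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper (not iproute2 code): walk a buffer of netlink
 * replies and count the RTM_NEWADDR entries whose ifa_family matches
 * the requested one. AF_UNSPEC means "accept every family".
 */
static int count_matching_addrs(void *buf, int len, unsigned char wanted)
{
	struct nlmsghdr *nlh;
	int matched = 0;

	for (nlh = buf; NLMSG_OK(nlh, len); nlh = NLMSG_NEXT(nlh, len)) {
		struct ifaddrmsg *ifa;

		if (nlh->nlmsg_type != RTM_NEWADDR)
			continue;
		if (nlh->nlmsg_len < NLMSG_LENGTH(sizeof(*ifa)))
			continue;	/* too short to hold an ifaddrmsg */

		ifa = NLMSG_DATA(nlh);
		if (wanted != AF_UNSPEC && ifa->ifa_family != wanted)
			continue;	/* a family we did not ask for: skip it */
		matched++;
	}
	return matched;
}

/*
 * Scaffolding for trying the filter without a socket: append one
 * synthetic RTM_NEWADDR message at offset off, return the new offset.
 */
static int put_addr_msg(char *buf, int size, int off, unsigned char family)
{
	struct nlmsghdr *nlh = (struct nlmsghdr *)(buf + off);
	struct ifaddrmsg *ifa;
	int space = NLMSG_SPACE(sizeof(*ifa));

	assert(off + space <= size);
	memset(nlh, 0, space);
	nlh->nlmsg_len = NLMSG_LENGTH(sizeof(*ifa));
	nlh->nlmsg_type = RTM_NEWADDR;
	ifa = NLMSG_DATA(nlh);
	ifa->ifa_family = family;
	return off + space;
}
```

With a buffer holding one AF_INET and two AF_INET6 entries, asking for
AF_INET6 counts two entries and AF_UNSPEC counts all three, which matches
the PF_UNSPEC semantics jamal refers to.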
From hadi@cyberus.ca Tue Dec 7 05:03:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:03:26 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7D3MX1010959 for ; Tue, 7 Dec 2004 05:03:22 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CbezZ-0003AQ-FI for netdev@oss.sgi.com; Tue, 07 Dec 2004 08:02:57 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbezW-0000Uo-Pd; Tue, 07 Dec 2004 08:02:55 -0500 Subject: Re: [PATCH] rtnetlink & address family problem From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara In-Reply-To: <20041207124922.GA1371@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> <20041207124922.GA1371@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102424568.1089.120.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 08:02:48 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12519 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-07 at 07:49, Thomas Graf wrote: > * jamal <1102386461.1093.26.camel@jzny.localdomain> 2004-12-06 21:27 > > On Mon, 2004-12-06 at 09:02, Thomas Graf wrote: > > > > > Your patch would fix this issue but might break various things. The > > > actual problem is that iproute2 doesn't check the family in its filter. > > > It blindly assumes that the kernel only returns addresses of the kind it > > > has requested. 
I can understand if you think the current behaviour > > > is wrong but we shouldn't change it in the middle of a stable tree. > > > > Why would it be wrong? The PF_UNSPEC is there for a purpose. > > I don't think it is wrong myself but I understand if someone does. > If > one sends a GETADDR request for PF_INET6 one might expect to either > receive all ipv6 addresses or none and to only receive all addresess > of any type if PF_UNSPEC was specified. > Thats debatable. Its user space that issues the flushing after a response from the kernel. It happens to be flushing IPV4 addresses. Thats why your filter in ip is the answer. BTW, did the gnet_stats patches to iproute2 ever get merged? If you have cycles, can you please look at that hang being reported using older tc with 2.6.10-rc3? cheers, jamal From hadi@cyberus.ca Tue Dec 7 05:05:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:05:08 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7D540i011306 for ; Tue, 7 Dec 2004 05:05:04 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Cbf1D-000441-IO for netdev@oss.sgi.com; Tue, 07 Dec 2004 08:04:39 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cbf1B-0000hw-1X; Tue, 07 Dec 2004 08:04:37 -0500 Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets From: jamal Reply-To: hadi@cyberus.ca To: Karsten Desler Cc: Robert Olsson , P@draigBrady.com, "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041207125001.GA26644@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> <16821.42080.932184.167780@robur.slu.se> <20041207125001.GA26644@soohrt.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102424673.1093.124.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 08:04:33 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12520 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-07 at 07:50, Karsten Desler wrote: > Currently I'm having problems capturing packets with tcpdump (lots of > "packets dropped by kernel") which indicates to me that there's > genuinely not much (enough) idle time sitting around. > Ah, more hints. So you are not trying to forward - rather just packet capturing? Are you using a tcpdump patched with mmaped packet socket? The 230-240Kpps you are reporting as a capture dont seem as unreasonable as i thought then. Neither would the CPU use. cheers, jamal From kdesler@soohrt.org Tue Dec 7 05:11:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:11:56 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7DBpvo011870 for ; Tue, 7 Dec 2004 05:11:52 -0800 Received: (qmail 1430 invoked by uid 1000); 7 Dec 2004 13:11:25 -0000 Date: Tue, 7 Dec 2004 14:11:25 +0100 From: Karsten Desler To: jamal Cc: Robert Olsson , P@draigBrady.com, "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207131125.GB26644@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> <16821.42080.932184.167780@robur.slu.se> <20041207125001.GA26644@soohrt.org> <1102424673.1093.124.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102424673.1093.124.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12521 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * jamal wrote: > On Tue, 2004-12-07 at 07:50, Karsten Desler wrote: > > > Currently I'm having problems capturing packets with tcpdump (lots of > > "packets dropped by kernel") which indicates to me that there's > > genuinely not much (enough) idle time sitting around. > > > > Ah, more hints. So you are not trying to forward - rather just packet > capturing? forward/routing is the goal. I was just trying to capture a tcpdump to analyze the traffic to generate something that could emulate the trafficpattern for further testing in a non-production environment. Cheers, Karsten From kdesler@soohrt.org Tue Dec 7 05:15:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:15:16 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7DFAP7012281 for ; Tue, 7 Dec 2004 05:15:11 -0800 Received: (qmail 2228 invoked by uid 1000); 7 Dec 2004 13:14:45 -0000 Date: Tue, 7 Dec 2004 14:14:45 +0100 From: Karsten Desler To: jamal Cc: Robert Olsson , Bernd Eckenfels , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207131445.GA1622@soohrt.org> References: <20041207002012.GB30674@quickstop.soohrt.org> <1102387595.1088.48.camel@jzny.localdomain> <20041207025456.GA525@soohrt.org> <1102389533.1089.51.camel@jzny.localdomain> <20041207032438.GA7767@soohrt.org> <1102390241.1093.59.camel@jzny.localdomain> <20041207040235.GA10501@soohrt.org> <20041207102132.GA28588@quickstop.soohrt.org> <1102422845.1089.105.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102422845.1089.105.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12522 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * jamal wrote: > > # sar -I 169 5 5 > > 11:20:05 INTR intr/s > > 11:20:10 169 10401.40 > > That doesnt seem to be too high. To recap: 169 is a fibre e1000, eth1 is one of two ports on a dualport e1000 copper nic. eth1 is still running at about 4k int/s. 
14:12:18         INTR    intr/s
14:12:23          169  34012.80
14:12:28          169  33977.60
14:12:33          169  34218.16
14:12:38          169  34060.60
14:12:43          169  34252.60
Average:          169  34104.40

Cheers,
 Karsten

From tgraf@suug.ch Tue Dec 7 05:17:10 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:17:14 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191])
    by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7DH9Sj012639
    for ; Tue, 7 Dec 2004 05:17:09 -0800
Received: from postel.suug.ch (unknown [195.134.158.23])
    (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits))
    (No client certificate requested)
    by b.mx.projectdream.org (Postfix) with ESMTP id D0243F;
    Tue, 7 Dec 2004 14:16:24 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001)
    id BF4C01C0EA; Tue, 7 Dec 2004 14:17:06 +0100 (CET)
Date: Tue, 7 Dec 2004 14:17:06 +0100
From: Thomas Graf
To: jamal
Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger ,
    netdev@oss.sgi.com, Jan Kara
Subject: Re: [PATCH] rtnetlink & address family problem
Message-ID: <20041207131706.GB1371@postel.suug.ch>
References: <41B0A5B4.6060108@suse.cz>
    <20041206140214.GA749@postel.suug.ch>
    <1102386461.1093.26.camel@jzny.localdomain>
    <20041207124922.GA1371@postel.suug.ch>
    <1102424568.1089.120.camel@jzny.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1102424568.1089.120.camel@jzny.localdomain>
X-archive-position: 12523
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

> > I don't think it is wrong myself but I understand if someone does. If
> > one sends a GETADDR request for PF_INET6 one might expect to either
> > receive all IPv6 addresses or none, and to only receive addresses
> > of every type if PF_UNSPEC was specified.
> >
>
> That's debatable.
> Its user space that issues the flushing after a response from the > kernel. It happens to be flushing IPV4 addresses. > Thats why your filter in ip is the answer. Agreed. > BTW, did the gnet_stats patches to iproute2 ever get merged? Not sure, I will check that. > If you have cycles, can you please look at that hang being reported > using older tc with 2.6.10-rc3? It's not really related to the gnet_stats code. stats_lock isn't set in the action code when using an older iproute2. I haven't tested this case because it was marked as broken anyway. I compiled an older version of iproute2 and will look into it today. From hadi@cyberus.ca Tue Dec 7 05:20:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:20:54 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7DKobc013048 for ; Tue, 7 Dec 2004 05:20:50 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CbfGU-0005w1-1D for netdev@oss.sgi.com; Tue, 07 Dec 2004 08:20:26 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbfGP-0002LV-Vn; Tue, 07 Dec 2004 08:20:22 -0500 Subject: Re: [PATCH] rtnetlink & address family problem From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara In-Reply-To: <20041207131706.GB1371@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> <20041207124922.GA1371@postel.suug.ch> <1102424568.1089.120.camel@jzny.localdomain> <20041207131706.GB1371@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102425618.1089.133.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 08:20:18 -0500 
Content-Transfer-Encoding: 7bit X-archive-position: 12524 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-07 at 08:17, Thomas Graf wrote: > It's not really related to the gnet_stats code. stats_lock isn't set > in the action code when using an older iproute2. I haven't tested this > case because it was marked as broken anyway. Can you ping my memory on this? Is this tc with initial support for actions or something much older than that. cheers, jamal From P@draigBrady.com Tue Dec 7 05:39:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 05:39:56 -0800 (PST) Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7DdnMI013945 for ; Tue, 7 Dec 2004 05:39:50 -0800 Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id iB7DdEwS066571; Tue, 7 Dec 2004 13:39:15 GMT (envelope-from P@draigBrady.com) Message-ID: <41B5B282.9040909@draigBrady.com> Date: Tue, 07 Dec 2004 13:39:14 +0000 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Karsten Desler , Robert Olsson , "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> <16821.42080.932184.167780@robur.slu.se> <20041207125001.GA26644@soohrt.org> <1102424673.1093.124.camel@jzny.localdomain> In-Reply-To: <1102424673.1093.124.camel@jzny.localdomain> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-archive-position: 12525 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev jamal wrote: > On Tue, 2004-12-07 at 07:50, Karsten Desler wrote: > > >>Currently I'm having problems capturing packets with tcpdump (lots of >>"packets dropped by kernel") which indicates to me that there's >>genuinely not much (enough) idle time sitting around. >> > > Ah, more hints. So you are not trying to forward - rather just packet > capturing? > Are you using a tcpdump patched with mmaped packet socket? > > The 230-240Kpps you are reporting as a capture dont seem as unreasonable > as i thought then. Neither would the CPU use. Yes this is vital Karsten, otherwise tcpdump will do 2 syscalls per packet, which is the bottleneck in my experience. 
You may want to try a simpler capture program that uses the kernel
PACKET_MMAP feature directly:
http://www.scaramanga.co.uk/code-fu/lincap.c

--
Pádraig Brady - http://www.pixelbeat.org
--

From tgraf@suug.ch Tue Dec 7 06:10:38 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 06:10:42 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7EAbCY015053 for ; Tue, 7 Dec 2004 06:10:37 -0800
Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 5D7B0F; Tue, 7 Dec 2004 15:09:50 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001) id 668491C0EA; Tue, 7 Dec 2004 15:10:33 +0100 (CET)
Date: Tue, 7 Dec 2004 15:10:33 +0100
From: Thomas Graf
To: jamal
Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara
Subject: Re: [PATCH] rtnetlink & address family problem
Message-ID: <20041207141033.GD1371@postel.suug.ch>
References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> <20041207124922.GA1371@postel.suug.ch> <1102424568.1089.120.camel@jzny.localdomain> <20041207131706.GB1371@postel.suug.ch> <1102425618.1089.133.camel@jzny.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1102425618.1089.133.camel@jzny.localdomain>
X-archive-position: 12526
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

* jamal <1102425618.1089.133.camel@jzny.localdomain> 2004-12-07 08:20
> On Tue, 2004-12-07 at 08:17, Thomas Graf wrote:
>
> > It's not really related to the gnet_stats code. stats_lock isn't set
> > in the action code when using an older iproute2.
I haven't tested this > > case because it was marked as broken anyway. > > Can you ping my memory on this? Is this tc with initial support > for actions or something much older than that. I'm not sure, I'm testing with a version having no action support at all. It should be fairly easy to find the bug once I have the time to really look into it. I'm still getting interrupted all the time at the moment. All actions created via tcf_hash_create, tcf_police_locate, and tcf_act_police_locate should be fine. There must be some bogus path related to older tc versions. From mitch@sfgoth.com Tue Dec 7 07:05:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 07:05:35 -0800 (PST) Received: from gaz.sfgoth.com (gaz.sfgoth.com [69.36.241.230]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7F5Ssj016842 for ; Tue, 7 Dec 2004 07:05:30 -0800 Received: from gaz.sfgoth.com (localhost.sfgoth.com [127.0.0.1]) by gaz.sfgoth.com (8.12.10/8.12.10) with ESMTP id iB7F8Yi0076313; Tue, 7 Dec 2004 07:08:35 -0800 (PST) (envelope-from mitch@gaz.sfgoth.com) Received: (from mitch@localhost) by gaz.sfgoth.com (8.12.10/8.12.6/Submit) id iB7F8Y0f076312; Tue, 7 Dec 2004 07:08:34 -0800 (PST) (envelope-from mitch) Date: Tue, 7 Dec 2004 07:08:34 -0800 From: Mitchell Blank Jr To: Phil Oester , "David S. 
Miller" , shemminger@osdl.org Cc: linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [PATCH] fix select() for SOCK_RAW sockets Message-ID: <20041207150834.GA75700@gaz.sfgoth.com> References: <20041207003525.GA22933@linuxace.com> <20041207025218.GB61527@gaz.sfgoth.com> <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207054840.GD61527@gaz.sfgoth.com> User-Agent: Mutt/1.4.2.1i X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.2.2 (gaz.sfgoth.com [127.0.0.1]); Tue, 07 Dec 2004 07:08:35 -0800 (PST) X-archive-position: 12527 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mitch@sfgoth.com Precedence: bulk X-list: netdev Phil: Here's a real patch for you to test. I actually left inet_dgram_ops alone since it's an exported symbol (two of the users just want the .do_ioctl value which is the same between SOCK_DGRAM and SOCK_RAW; the other is ipv6 where it's clearly dealing with a UDP socket -- therefore I think its safest to leave inet_dgram_ops to have the UDP behavior) Davem: I only tested that this doesn't break UDP; if it works for Phil and Stephen can verify that it doesn't break his bad-checksum UDP tests then please push it for 2.6.10. Patch is versus 2.6.10-rc3. 
Signed-off-by: Mitchell Blank Jr -Mitch --- linux-2.6.10-rc3-VIRGIN/net/ipv4/af_inet.c 2004-12-07 06:37:52.480082706 -0800 +++ linux-2.6.10-rc3/net/ipv4/af_inet.c 2004-12-07 06:57:47.799013216 -0800 @@ -821,6 +821,31 @@ .sendpage = inet_sendpage, }; +/* + * For SOCK_RAW sockets; should be the same as inet_dgram_ops but without + * udp_poll + */ +static struct proto_ops inet_sockraw_ops = { + .family = PF_INET, + .owner = THIS_MODULE, + .release = inet_release, + .bind = inet_bind, + .connect = inet_dgram_connect, + .socketpair = sock_no_socketpair, + .accept = sock_no_accept, + .getname = inet_getname, + .poll = datagram_poll, + .ioctl = inet_ioctl, + .listen = sock_no_listen, + .shutdown = inet_shutdown, + .setsockopt = sock_common_setsockopt, + .getsockopt = sock_common_getsockopt, + .sendmsg = inet_sendmsg, + .recvmsg = sock_common_recvmsg, + .mmap = sock_no_mmap, + .sendpage = inet_sendpage, +}; + static struct net_proto_family inet_family_ops = { .family = PF_INET, .create = inet_create, @@ -861,7 +886,7 @@ .type = SOCK_RAW, .protocol = IPPROTO_IP, /* wild card */ .prot = &raw_prot, - .ops = &inet_dgram_ops, + .ops = &inet_sockraw_ops, .capability = CAP_NET_RAW, .no_check = UDP_CSUM_DEFAULT, .flags = INET_PROTOSW_REUSE, From tgraf@suug.ch Tue Dec 7 08:55:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 08:55:19 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7Gt4UC024966 for ; Tue, 7 Dec 2004 08:55:07 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 47AE8F; Tue, 7 Dec 2004 17:54:20 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id AEFB31C0EA; Tue, 7 Dec 2004 17:55:01 +0100 (CET) Date: Tue, 7 Dec 2004 17:55:01 +0100 From: Thomas Graf To: jamal Cc: Michal Ludvig , 
Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara Subject: Re: [PATCH] rtnetlink & address family problem Message-ID: <20041207165501.GE1371@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> <20041207124922.GA1371@postel.suug.ch> <1102424568.1089.120.camel@jzny.localdomain> <20041207131706.GB1371@postel.suug.ch> <1102425618.1089.133.camel@jzny.localdomain> <20041207141033.GD1371@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207141033.GD1371@postel.suug.ch> X-archive-position: 12528 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Thomas Graf <20041207141033.GD1371@postel.suug.ch> 2004-12-07 15:10 > * jamal <1102425618.1089.133.camel@jzny.localdomain> 2004-12-07 08:20 > > On Tue, 2004-12-07 at 08:17, Thomas Graf wrote: > > > > > It's not really related to the gnet_stats code. stats_lock isn't set > > > in the action code when using an older iproute2. I haven't tested this > > > case because it was marked as broken anyway. > > > > Can you ping my memory on this? Is this tc with initial support > > for actions or something much older than that. > > I'm not sure, I'm testing with a version having no action support at > all. It should be fairly easy to find the bug once I have the time to > really look into it. I'm still getting interrupted all the time at > the moment. One major problem is that the tc_dump_action path doesn't take care of TCA_OLD_COMPAT resulting in calling tcf_action_copy_stats for policers which is a bad thing since their a->priv is set to tcf_police instead of the generic header and thus causes random behaviour. One solution would be to make tcf_police compatible to tca_gen. Thoughts? 
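The layout mismatch described above can be illustrated outside the kernel. What follows is a minimal user-space sketch, not the kernel code: `gen_hdr` and `police_old` are hypothetical, simplified stand-ins for the `tca_gen` header and the pre-patch `struct tcf_police`, and a single `long stats` member stands in for the statistics block. It only shows that identically named fields sit at different offsets, so code that treats a policer's `a->priv` as the generic header would read the wrong memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the generic action header (tca_gen order). */
struct gen_hdr {
    struct gen_hdr *next;
    unsigned int index;
    int refcnt;
    int bindcnt;
    unsigned int capab;
    int action;
    long stats; /* stands in for bstats/qstats/rate_est */
};

/* Hypothetical stand-in for the pre-patch policer: same names, other order. */
struct police_old {
    struct police_old *next;
    int refcnt;
    int bindcnt;
    unsigned int index;
    int action;
    int result;
    long stats;
};

/* 1 if `field` sits at the same byte offset in both layouts, else 0. */
#define SAME_OFFSET(field) \
    (offsetof(struct gen_hdr, field) == offsetof(struct police_old, field))

/* 1 only if every field the generic code touches lines up. */
int layouts_match(void)
{
    return SAME_OFFSET(next) && SAME_OFFSET(index) &&
           SAME_OFFSET(refcnt) && SAME_OFFSET(bindcnt) &&
           SAME_OFFSET(action) && SAME_OFFSET(stats);
}

/* Guard a caller could run before casting; with these two layouts it
 * aborts in any build that does not define NDEBUG. */
void require_matching_layouts(void)
{
    assert(layouts_match());
}
```

With these stand-ins `layouts_match()` returns 0 because `index` and `refcnt` are swapped relative to the generic order, which is the kind of disagreement that makes a shared stats routine unsafe for policers.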
--- linux-2.6.10-rc2-bk13.orig/include/net/act_api.h 2004-11-30 14:01:11.000000000 +0100 +++ linux-2.6.10-rc2-bk13/include/net/act_api.h 2004-12-07 17:49:50.000000000 +0100 @@ -8,15 +8,42 @@ #include #include +#ifdef CONFIG_NET_CLS_ACT + +#define ACT_P_CREATED 1 +#define ACT_P_DELETED 1 +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + +#endif + struct tcf_police { +#ifdef CONFIG_NET_CLS_ACT + tca_gen(police); +#else struct tcf_police *next; int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif u32 index; int action; + spinlock_t lock; + struct gnet_stats_basic bstats; + struct gnet_stats_queue qstats; + struct gnet_stats_rate_est rate_est; + spinlock_t *stats_lock; +#endif + int result; u32 ewma_rate; u32 burst; @@ -24,34 +51,12 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT -#define ACT_P_CREATED 1 -#define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock - struct tcf_act_hdr { tca_gen(act_hdr); From kaber@trash.net Tue Dec 7 09:00:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:00:51 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7H0gun025650 for ; Tue, 7 Dec 2004 09:00:43 -0800 Received: from 
localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cbigq-0001M5-Hg; Tue, 07 Dec 2004 17:59:52 +0100 Message-ID: <41B5E188.5050800@trash.net> Date: Tue, 07 Dec 2004 17:59:52 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, tgraf@suug.ch, "David S. Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> In-Reply-To: <1102422544.1088.98.camel@jzny.localdomain> Content-Type: multipart/mixed; boundary="------------010705030109060209080805" X-archive-position: 12529 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------010705030109060209080805 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit jamal wrote: >Can you do a: >tc -V > >This seems to point to probably be a backward compat issue which was >overlooked in the stats. > That's also what I thought at first. But the problem is in tcf_action_copy_stats, it assumes a->priv has the same layout as struct tcf_act_hdr, which is not true for struct tcf_police. This patch rearranges struct tcf_police to match tcf_act_hdr. 
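The idea of making the policer begin with the generic fields can be sketched in user space as follows. This is an illustration under stated assumptions, not the patch itself: `GEN_FIELDS`, `struct act_hdr`, `struct police` and `read_stats` are hypothetical simplifications, with `long stats` standing in for the gnet_stats members. The kernel casts the private pointer directly; the sketch copies the shared prefix with `memcpy` instead so that it stays within ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared prefix, in the spirit of the tca_gen(name) macro. */
#define GEN_FIELDS(name) \
    struct name *next;   \
    unsigned int index;  \
    int refcnt;          \
    int bindcnt;         \
    int action;          \
    long stats

struct act_hdr {
    GEN_FIELDS(act_hdr);
};

struct police {
    GEN_FIELDS(police); /* generic part first, same order as act_hdr */
    int result;         /* policer-specific members follow the prefix */
    unsigned int burst;
};

/* Read the stats through the generic view of any object that starts
 * with GEN_FIELDS. Only the shared prefix is copied and inspected. */
long read_stats(const void *priv)
{
    struct act_hdr hdr;

    assert(priv != NULL);
    memcpy(&hdr, priv, sizeof(hdr));
    return hdr.stats;
}
```

Because both structures expand the same macro first, every generic field has the same offset in each, so a routine written against the header sees the policer's own values.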
Signed-off-by: Patrick McHardy --------------010705030109060209080805 Content-Type: text/plain; name="x" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="x" ===== include/net/act_api.h 1.4 vs edited ===== --- 1.4/include/net/act_api.h 2004-11-06 01:33:12 +01:00 +++ edited/include/net/act_api.h 2004-12-07 17:53:37 +01:00 @@ -8,15 +8,23 @@ #include #include +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + struct tcf_police { - struct tcf_police *next; - int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif - u32 index; - int action; + tca_gen(police); int result; u32 ewma_rate; u32 burst; @@ -24,33 +32,14 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT #define ACT_P_CREATED 1 #define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock struct tcf_act_hdr { --------------010705030109060209080805-- From tgraf@suug.ch Tue Dec 7 09:07:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:07:56 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7H7pFZ026129 for ; Tue, 7 Dec 2004 09:07:51 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 
with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id A09EFF; Tue, 7 Dec 2004 18:07:06 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 88BA81C0EA; Tue, 7 Dec 2004 18:07:48 +0100 (CET) Date: Tue, 7 Dec 2004 18:07:48 +0100 From: Thomas Graf To: Patrick McHardy Cc: hadi@cyberus.ca, Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-ID: <20041207170748.GF1371@postel.suug.ch> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B5E188.5050800@trash.net> X-archive-position: 12530 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41B5E188.5050800@trash.net> 2004-12-07 17:59 > jamal wrote: > > >Can you do a: > >tc -V > > > >This seems to point to probably be a backward compat issue which was > >overlooked in the stats. > > > That's also what I thought at first. But the problem is in > tcf_action_copy_stats, it assumes a->priv has the same layout as > struct tcf_act_hdr, which is not true for struct tcf_police. This > patch rearranges struct tcf_police to match tcf_act_hdr. Hehe, see my post a few minutes back. I came up with nearly the same solution ;-> The only difference to my patch is that I don't touch tcf_police if the action code isn't compiled. 
--- linux-2.6.10-rc2-bk13.orig/include/net/act_api.h 2004-11-30 14:01:11.000000000 +0100 +++ linux-2.6.10-rc2-bk13/include/net/act_api.h 2004-12-07 17:49:50.000000000 +0100 @@ -8,15 +8,42 @@ #include #include +#ifdef CONFIG_NET_CLS_ACT + +#define ACT_P_CREATED 1 +#define ACT_P_DELETED 1 +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + +#endif + struct tcf_police { +#ifdef CONFIG_NET_CLS_ACT + tca_gen(police); +#else struct tcf_police *next; int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif u32 index; int action; + spinlock_t lock; + struct gnet_stats_basic bstats; + struct gnet_stats_queue qstats; + struct gnet_stats_rate_est rate_est; + spinlock_t *stats_lock; +#endif + int result; u32 ewma_rate; u32 burst; @@ -24,34 +51,12 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT -#define ACT_P_CREATED 1 -#define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock - struct tcf_act_hdr { tca_gen(act_hdr); From tgraf@suug.ch Tue Dec 7 09:23:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:23:56 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7HNpXP026893 for ; Tue, 7 Dec 2004 
09:23:51 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 2D965F; Tue, 7 Dec 2004 18:23:07 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id E95A01C0EA; Tue, 7 Dec 2004 18:23:49 +0100 (CET) Date: Tue, 7 Dec 2004 18:23:49 +0100 From: Thomas Graf To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: validate policer configuration TLVs Message-ID: <20041207172349.GG1371@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-archive-position: 12531 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Adds TLV size sanity checks for policer configuration. Signed-off-by: Thomas Graf --- linux-2.6.10-rc2-bk13.orig/net/sched/police.c 2004-11-30 14:01:12.000000000 +0100 +++ linux-2.6.10-rc2-bk13/net/sched/police.c 2004-12-07 17:24:01.000000000 +0100 @@ -180,7 +180,8 @@ if (rtattr_parse(tb, TCA_POLICE_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return -1; - if (tb[TCA_POLICE_TBF-1] == NULL) + if (tb[TCA_POLICE_TBF-1] == NULL || + RTA_PAYLOAD(tb[TCA_POLICE_TBF-1]) != sizeof(*parm)) return -1; parm = RTA_DATA(tb[TCA_POLICE_TBF-1]); @@ -220,11 +221,17 @@ goto failure; } } - if (tb[TCA_POLICE_RESULT-1]) + if (tb[TCA_POLICE_RESULT-1]) { + if (RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32)) + goto failure; p->result = *(int*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); + } #ifdef CONFIG_NET_ESTIMATOR - if (tb[TCA_POLICE_AVRATE-1]) + if (tb[TCA_POLICE_AVRATE-1]) { + if (RTA_PAYLOAD(tb[TCA_POLICE_AVRATE-1]) != sizeof(u32)) + goto failure; p->ewma_rate = *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]); + } #endif p->toks = p->burst = parm->burst; p->mtu = parm->mtu; @@ -424,7 +431,8 @@ if (rtattr_parse(tb, TCA_POLICE_MAX, 
RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return NULL; - if (tb[TCA_POLICE_TBF-1] == NULL) + if (tb[TCA_POLICE_TBF-1] == NULL || + RTA_PAYLOAD(tb[TCA_POLICE_TBF-1]) != sizeof(*parm)) return NULL; parm = RTA_DATA(tb[TCA_POLICE_TBF-1]); @@ -449,11 +457,17 @@ (p->P_tab = qdisc_get_rtab(&parm->peakrate, tb[TCA_POLICE_PEAKRATE-1])) == NULL) goto failure; } - if (tb[TCA_POLICE_RESULT-1]) + if (tb[TCA_POLICE_RESULT-1]) { + if (RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32)) + goto failure; p->result = *(int*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); + } #ifdef CONFIG_NET_ESTIMATOR - if (tb[TCA_POLICE_AVRATE-1]) + if (tb[TCA_POLICE_AVRATE-1]) { + if (RTA_PAYLOAD(tb[TCA_POLICE_AVRATE-1]) != sizeof(u32)) + goto failure; p->ewma_rate = *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]); + } #endif p->toks = p->burst = parm->burst; p->mtu = parm->mtu; From kaber@trash.net Tue Dec 7 09:24:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:24:35 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7HORuZ026998 for ; Tue, 7 Dec 2004 09:24:28 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cbj3y-0001TU-2L; Tue, 07 Dec 2004 18:23:46 +0100 Message-ID: <41B5E722.2080600@trash.net> Date: Tue, 07 Dec 2004 18:23:46 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: hadi@cyberus.ca, Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. 
Miller"
Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9
References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch>
In-Reply-To: <20041207170748.GF1371@postel.suug.ch>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-archive-position: 12532
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev

Thomas Graf wrote:

>* Patrick McHardy <41B5E188.5050800@trash.net> 2004-12-07 17:59
>
>
>>That's also what I thought at first. But the problem is in
>>tcf_action_copy_stats, it assumes a->priv has the same layout as
>>struct tcf_act_hdr, which is not true for struct tcf_police. This
>>patch rearranges struct tcf_police to match tcf_act_hdr.
>>
>>
>
>Hehe, see my post a few minutes back. I came up with nearly the same
>solution ;-> The only difference to my patch is that I don't touch
>tcf_police if the action code isn't compiled.
> > Either one is fine with me, although I would prefer to see the number of ifdefs in this area going down, not up :) Regards Patrick From holt@lnx-holt.americas.sgi.com Tue Dec 7 09:25:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:25:49 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7HPiYZ027588 for ; Tue, 7 Dec 2004 09:25:45 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iB7IjxSd002036 for ; Tue, 7 Dec 2004 10:45:59 -0800 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iB7HPCam3777258; Tue, 7 Dec 2004 11:25:12 -0600 (CST) Received: from lnx-holt.americas.sgi.com (IDENT:U2FsdGVkX1/B1RZRupXVrXPgWg4srt3WGQAx3W9OB5U@lnx-holt.americas.sgi.com [128.162.233.109]) by thistle-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iB7HPBtC16861234; Tue, 7 Dec 2004 11:25:11 -0600 (CST) Received: from lnx-holt.americas.sgi.com (localhost.localdomain [127.0.0.1]) by lnx-holt.americas.sgi.com (8.13.1/8.12.11) with ESMTP id iB7HPBJC012584; Tue, 7 Dec 2004 11:25:11 -0600 Received: (from holt@localhost) by lnx-holt.americas.sgi.com (8.13.1/8.13.1/Submit) id iB7HPBta012583; Tue, 7 Dec 2004 11:25:11 -0600 Date: Tue, 7 Dec 2004 11:25:10 -0600 From: Robin Holt To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: What is a reasonable upper limit to the rt_hash_table. 
Message-ID: <20041207172510.GC11423@lnx-holt.americas.sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
X-archive-position: 12533
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: holt@sgi.com
Precedence: bulk
X-list: netdev

We have a system with a very large amount of memory. We are noticing
pauses of approximately 5 seconds every 10 minutes. We tracked it down
to rt_run_flush holding off other timer processing while it scans the
rt_hash_table. The following is from the boot:

IP: routing cache hash table of 33554432 buckets, 524288Kbytes

This seems like an outrageously large value. I realize the 2.6 kernel
has rhash_entries as a boot option. Can I get some guidance on what a
reasonable upper limit would be? What is this guidance based upon?
What is the reason for not making that upper limit a default and let
rhash_entries override to make it larger if a site actually needed it?

Thank you in advance,
Robin Holt

From kernel@linuxace.com Tue Dec 7 09:28:43 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:28:47 -0800 (PST)
Received: from home.linuxace.com (adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB7HSdLJ027951 for ; Tue, 7 Dec 2004 09:28:43 -0800
Received: (qmail 25817 invoked by uid 0); 7 Dec 2004 17:28:12 -0000
Date: Tue, 7 Dec 2004 09:28:12 -0800
From: Phil Oester
To: Mitchell Blank Jr
Cc: "David S.
Miller" , shemminger@osdl.org, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] fix select() for SOCK_RAW sockets Message-ID: <20041207172812.GA25810@linuxace.com> References: <20041207003525.GA22933@linuxace.com> <20041207025218.GB61527@gaz.sfgoth.com> <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207150834.GA75700@gaz.sfgoth.com> User-Agent: Mutt/1.4.1i X-archive-position: 12534 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@linuxace.com Precedence: bulk X-list: netdev On Tue, Dec 07, 2004 at 07:08:34AM -0800, Mitchell Blank Jr wrote: > Phil: Here's a real patch for you to test. I actually left inet_dgram_ops > alone since it's an exported symbol (two of the users just want the .do_ioctl > value which is the same between SOCK_DGRAM and SOCK_RAW; the other is ipv6 > where it's clearly dealing with a UDP socket -- therefore I think its safest > to leave inet_dgram_ops to have the UDP behavior) > > Davem: I only tested that this doesn't break UDP; if it works for Phil and > Stephen can verify that it doesn't break his bad-checksum UDP tests then > please push it for 2.6.10. Yup, that does indeed fix it for me, thanks. 
Phil From yoshfuji@linux-ipv6.org Tue Dec 7 09:34:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:34:48 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7HYJ5t028361 for ; Tue, 7 Dec 2004 09:34:42 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 79EF433CE5; Wed, 8 Dec 2004 02:35:31 +0900 (JST) Date: Wed, 08 Dec 2004 02:35:30 +0900 (JST) Message-Id: <20041208.023530.26430801.yoshfuji@linux-ipv6.org> To: mitch@sfgoth.com Cc: kernel@linuxace.com, davem@davemloft.net, shemminger@osdl.org, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH] fix select() for SOCK_RAW sockets From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20041207150834.GA75700@gaz.sfgoth.com> References: <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-archive-position: 12535 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20041207150834.GA75700@gaz.sfgoth.com> (at Tue, 7 Dec 2004 07:08:34 -0800), Mitchell Blank Jr says: > Phil: Here's a real patch for you to test. 
I actually left inet_dgram_ops > alone since it's an exported symbol (two of the users just want the .do_ioctl > value which is the same between SOCK_DGRAM and SOCK_RAW; the other is ipv6 > where it's clearly dealing with a UDP socket -- therefore I think its safest > to leave inet_dgram_ops to have the UDP behavior) Probably, we need to do the same for ipv6, don't we? --yoshfuji From shemminger@osdl.org Tue Dec 7 09:46:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:46:10 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7Hk5Mv028936 for ; Tue, 7 Dec 2004 09:46:06 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB7HjZ926090; Tue, 7 Dec 2004 09:45:35 -0800 Date: Tue, 7 Dec 2004 09:45:35 -0800 From: Stephen Hemminger To: Mitchell Blank Jr Cc: Phil Oester , "David S. Miller" , linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] fix select() for SOCK_RAW sockets Message-Id: <20041207094535.11080082@dxpl.pdx.osdl.net> In-Reply-To: <20041207150834.GA75700@gaz.sfgoth.com> References: <20041207003525.GA22933@linuxace.com> <20041207025218.GB61527@gaz.sfgoth.com> <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12536 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Tue, 7 Dec 2004 07:08:34 -0800 Mitchell Blank Jr wrote: > Phil: Here's a real patch for you to test. 
I actually left inet_dgram_ops > alone since it's an exported symbol (two of the users just want the .do_ioctl > value which is the same between SOCK_DGRAM and SOCK_RAW; the other is ipv6 > where it's clearly dealing with a UDP socket -- therefore I think its safest > to leave inet_dgram_ops to have the UDP behavior) > > Davem: I only tested that this doesn't break UDP; if it works for Phil and > Stephen can verify that it doesn't break his bad-checksum UDP tests then > please push it for 2.6.10. > Thanks, I'll retest UDP today, but it looks right. From tgraf@suug.ch Tue Dec 7 09:53:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 09:53:15 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7Hr2iQ029417 for ; Tue, 7 Dec 2004 09:53:02 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E5372F; Tue, 7 Dec 2004 18:52:17 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id B8A6F1C0EA; Tue, 7 Dec 2004 18:52:59 +0100 (CET) Date: Tue, 7 Dec 2004 18:52:59 +0100 From: Thomas Graf To: jamal Cc: Michal Ludvig , Andrew Morton , Stephen Hemminger , netdev@oss.sgi.com, Jan Kara Subject: Re: [PATCH] rtnetlink & address family problem Message-ID: <20041207175259.GH1371@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> <20041207124922.GA1371@postel.suug.ch> <1102424568.1089.120.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102424568.1089.120.camel@jzny.localdomain> X-archive-position: 12537 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch 
Precedence: bulk X-list: netdev > BTW, did the gnet_stats patches to iproute2 ever get merged? > If you have cycles, can you please look at that hang being reported > using older tc with 2.6.10-rc3? They're not in bk://developer.osdl.org/iproute2 so I guess not. I've put a iproute2 including all my changes into a tarball at: http://people.suug.ch/~tgr/iproute2/iproute2-2.6.9-tgr.tar.gz From shemminger@osdl.org Tue Dec 7 10:02:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 10:02:34 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7I2Rel030066 for ; Tue, 7 Dec 2004 10:02:30 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB7I1e928573; Tue, 7 Dec 2004 10:01:40 -0800 Date: Tue, 7 Dec 2004 10:01:40 -0800 From: Stephen Hemminger To: YOSHIFUJI Hideaki / =?ISO-8859-1?B?X19fX19fX19fX19f?= Cc: mitch@sfgoth.com, kernel@linuxace.com, davem@davemloft.net, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: [PATCH] fix select() for SOCK_RAW sockets (ipv6) Message-Id: <20041207100140.781f4c00@dxpl.pdx.osdl.net> In-Reply-To: <20041208.023530.26430801.yoshfuji@linux-ipv6.org> References: <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> <20041208.023530.26430801.yoshfuji@linux-ipv6.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12538 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev > Probably, we need to do the same for ipv6, don't we? 
diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c --- a/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 +++ b/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 @@ -513,6 +513,27 @@ .sendpage = sock_no_sendpage, }; +struct proto_ops inet6_raw_ops = { + .family = PF_INET6, + .owner = THIS_MODULE, + .release = inet6_release, + .bind = inet6_bind, + .connect = inet_dgram_connect, /* ok */ + .socketpair = sock_no_socketpair, /* a do nothing */ + .accept = sock_no_accept, /* a do nothing */ + .getname = inet6_getname, + .poll = datagram_poll, /* ok */ + .ioctl = inet6_ioctl, /* must change */ + .listen = sock_no_listen, /* ok */ + .shutdown = inet_shutdown, /* ok */ + .setsockopt = sock_common_setsockopt, /* ok */ + .getsockopt = sock_common_getsockopt, /* ok */ + .sendmsg = inet_sendmsg, /* ok */ + .recvmsg = sock_common_recvmsg, /* ok */ + .mmap = sock_no_mmap, + .sendpage = sock_no_sendpage, +}; + static struct net_proto_family inet6_family_ops = { .family = PF_INET6, .create = inet6_create, @@ -528,7 +549,7 @@ .type = SOCK_RAW, .protocol = IPPROTO_IP, /* wild card */ .prot = &rawv6_prot, - .ops = &inet6_dgram_ops, + .ops = &inet6_raw_ops, .capability = CAP_NET_RAW, .no_check = UDP_CSUM_DEFAULT, .flags = INET_PROTOSW_REUSE, From kdesler@soohrt.org Tue Dec 7 10:39:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 10:39:19 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7IdBCg031069 for ; Tue, 7 Dec 2004 10:39:12 -0800 Received: (qmail 3135 invoked by uid 1018); 7 Dec 2004 18:38:45 -0000 Date: Tue, 7 Dec 2004 19:38:45 +0100 From: Karsten Desler To: Karsten Desler Cc: P@draigBrady.com, "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207183845.GA2078@quickstop.soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207112139.GA3610@soohrt.org> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12539 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev Karsten Desler wrote: > > Also have a look at http://www.hipac.org/ as netfilter > > has silly scalability properties. > > I did before, but I read on Harald Weltes' weblog that 2.4 gives > slightly worse pps results than 2.6, and since the cpu usage is as high > as it is, I didn't want to take any more performance hits. > I'll try to see what performance impact the netfilter rules have during > peak load. using 2 CPUs System load: 61.4% || Free: 51.0%(0) 26.3%(1) System load: 59.6% || Free: 53.6%(0) 27.3%(1) System load: 59.6% || Free: 53.6%(0) 27.3%(1) System load: 59.7% || Free: 53.6%(0) 27.0%(1) System load: 60.3% || Free: 53.0%(0) 26.4%(1) System load: 51.9% || Free: 60.4%(0) 35.8%(1) <- iptables -F System load: 50.1% || Free: 62.1%(0) 37.7%(1) System load: 50.1% || Free: 62.0%(0) 37.8%(1) System load: 50.6% || Free: 61.6%(0) 37.2%(1) System load: 50.5% || Free: 61.7%(0) 37.3%(1) > > I also notice that a lot of time is spent allocating > > and freeing the packet buffers (and possible hidden > > time due to cache misses due to allocating on one > > CPU and freeing on another?). > > How many [RT]xDescriptors do you have configured by the way? > > 256. 
I increased them to 1024 shortly after the profiling run, but > didn't notice any change in the cpu usage (will try again with cyclesoak). Again, no effect. Cheers, Karsten From davem@davemloft.net Tue Dec 7 10:50:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 10:50:43 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7IoXF8031739 for ; Tue, 7 Dec 2004 10:50:36 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CbkNj-0000Gd-00; Tue, 07 Dec 2004 10:48:15 -0800 Date: Tue, 7 Dec 2004 10:48:15 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: yoshfuji@linux-ipv6.org, mitch@sfgoth.com, kernel@linuxace.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] fix select() for SOCK_RAW sockets (ipv6) Message-Id: <20041207104815.3f7a4684.davem@davemloft.net> In-Reply-To: <20041207100140.781f4c00@dxpl.pdx.osdl.net> References: <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> <20041208.023530.26430801.yoshfuji@linux-ipv6.org> <20041207100140.781f4c00@dxpl.pdx.osdl.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12540 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 7 Dec 2004 10:01:40 -0800 Stephen Hemminger wrote: > > Probably, we need to do the same for ipv6, don't we? 
> > diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c > --- a/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 > +++ b/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 > @@ -513,6 +513,27 @@ We didn't do the "UDP select() fix" on ipv6 because we don't have the delayed checksumming optimization there. So the ipv6 side really doesn't need this change. Or did I miss something? From shemminger@osdl.org Tue Dec 7 10:57:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 10:57:12 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7Iv5kV032284 for ; Tue, 7 Dec 2004 10:57:06 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB7IuK908517; Tue, 7 Dec 2004 10:56:20 -0800 Date: Tue, 7 Dec 2004 10:56:20 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: yoshfuji@linux-ipv6.org, mitch@sfgoth.com, kernel@linuxace.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] fix select() for SOCK_RAW sockets (ipv6) Message-Id: <20041207105620.241652d0@dxpl.pdx.osdl.net> In-Reply-To: <20041207104815.3f7a4684.davem@davemloft.net> References: <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> <20041208.023530.26430801.yoshfuji@linux-ipv6.org> <20041207100140.781f4c00@dxpl.pdx.osdl.net> <20041207104815.3f7a4684.davem@davemloft.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12541 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Tue, 7 Dec 2004 10:48:15 -0800 "David S. 
Miller" wrote: > On Tue, 7 Dec 2004 10:01:40 -0800 > Stephen Hemminger wrote: > > > > Probably, we need to do the same for ipv6, don't we? > > > > diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c > > --- a/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 > > +++ b/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 > > @@ -513,6 +513,27 @@ > > We didn't do the "UDP select() fix" on ipv6 because we don't > have the delayed checksumming optimization there. So the > ipv6 side really doesn't need this change. > > Or did I miss something? yeah, didn't look that deeply. From shemminger@osdl.org Tue Dec 7 11:13:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 11:13:25 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7JDION000549 for ; Tue, 7 Dec 2004 11:13:19 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB7JC6911442; Tue, 7 Dec 2004 11:12:06 -0800 Date: Tue, 7 Dec 2004 11:12:06 -0800 From: Stephen Hemminger To: Thomas Graf Cc: jamal , Michal Ludvig , Andrew Morton , netdev@oss.sgi.com, Jan Kara Subject: Re: [PATCH] rtnetlink & address family problem Message-Id: <20041207111206.5060ccfc@dxpl.pdx.osdl.net> In-Reply-To: <20041207175259.GH1371@postel.suug.ch> References: <41B0A5B4.6060108@suse.cz> <20041206140214.GA749@postel.suug.ch> <1102386461.1093.26.camel@jzny.localdomain> <20041207124922.GA1371@postel.suug.ch> <1102424568.1089.120.camel@jzny.localdomain> <20041207175259.GH1371@postel.suug.ch> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-archive-position: 12542 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Tue, 7 Dec 
2004 18:52:59 +0100 Thomas Graf wrote: > http://people.suug.ch/~tgr/iproute2/iproute2-2.6.9-tgr.tar.gz Thanks, I'm behind on iproute2 From brad@danga.com Tue Dec 7 11:18:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 11:18:35 -0800 (PST) Received: from danga.com (danga.com [66.150.15.140]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7JITDm001015 for ; Tue, 7 Dec 2004 11:18:30 -0800 Received: by danga.com (Postfix, from userid 1000) id 02DAA3BC0B5; Tue, 7 Dec 2004 11:18:04 -0800 (PST) Received: from localhost (localhost [127.0.0.1]) by danga.com (Postfix) with ESMTP id E2DCD87C121; Tue, 7 Dec 2004 11:18:04 -0800 (PST) Date: Tue, 7 Dec 2004 11:18:04 -0800 (PST) From: Brad Fitzpatrick X-X-Sender: bradfitz@danga.com To: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: e1000, 2.6.9: swapper: page allocation failure. order:1, mode:0x20 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12543 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: brad@danga.com Precedence: bulk X-list: netdev Hello, On a busy database server I'm getting tons of these w/ 2.6.9 and e1000, no NAPI, no jumbo frames: swapper: page allocation failure. 
order:1, mode:0x20 [] __alloc_pages+0x2e9/0x30c [] e1000_xmit_frame+0x816/0x820 [] __get_free_pages+0x1d/0x30 [] kmem_getpages+0x26/0xd4 [] cache_grow+0xb3/0x148 [] cache_alloc_refill+0x1ae/0x1fc [] kmem_cache_alloc+0x43/0x4c [] sk_alloc+0x26/0x98 [] tcp_create_openreq_child+0x24/0x538 [] tcp_v4_syn_recv_sock+0x4c/0x2d8 [] tcp_check_req+0x23f/0x3e0 [] e1000_xmit_frame+0x816/0x820 [] ip_output+0x76/0x7c [] qdisc_restart+0x1f/0x1e8 [] dev_queue_xmit+0x234/0x244 [] ip_finish_output+0x165/0x1b0 [] ip_output+0x76/0x7c [] ip_queue_xmit+0x3a9/0x428 [] recalc_task_prio+0x128/0x138 [] activate_task+0x9f/0xb0 [] recalc_task_prio+0x128/0x138 [] activate_task+0x9f/0xb0 [] recalc_task_prio+0x128/0x138 [] activate_task+0x9f/0xb0 [] try_to_wake_up+0x25a/0x268 [] default_wake_function+0x17/0x1c [] default_wake_function+0x17/0x1c [] tcp_v4_hnd_req+0x32/0x190 [] tcp_v4_hnd_req+0x4c/0x190 [] tcp_v4_do_rcv+0xbb/0x120 [] tcp_v4_do_rcv+0x94/0x120 [] tcp_v4_rcv+0x499/0x758 [] ip_route_input+0x33/0x140 [] ip_local_deliver+0x9c/0x13c [] ip_rcv+0x34d/0x3ec [] netif_receive_skb+0x149/0x180 [] process_backlog+0x85/0x114 [] net_rx_action+0x80/0x128 [] __do_softirq+0x6a/0xd4 [] do_softirq+0x28/0x30 [] do_IRQ+0x10e/0x124 [] common_interrupt+0x18/0x20 [] default_idle+0x29/0x34 [] cpu_idle+0x30/0x44 [] rest_init+0x47/0x48 [] start_kernel+0x161/0x168 From following lists, I read this as e1000 in interrupt context tried to allocate two contiguous pages without sleeping, and the "emergency pools" or whatnot weren't full enough? We're not using jumbo frames, ... what is e1000 allocating 8k for? (or am I misreading this?) Also not using TSO, unless it's the default again in 2.6.9. What should I tune to avoid this from happening? Thanks! 
Brad From kdesler@soohrt.org Tue Dec 7 13:11:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 13:11:07 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7LB0JT009307 for ; Tue, 7 Dec 2004 13:11:02 -0800 Received: (qmail 2600 invoked by uid 1018); 7 Dec 2004 21:10:35 -0000 Date: Tue, 7 Dec 2004 22:10:35 +0100 From: Karsten Desler To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, "David S. Miller" , jamal , Robert Olsson , P@draigBrady.com Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041207211035.GA20286@quickstop.soohrt.org> References: <20041206205305.GA11970@soohrt.org> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="y0ulUmNC+osPPQO6" Content-Disposition: inline In-Reply-To: <20041206205305.GA11970@soohrt.org> User-Agent: Mutt/1.5.6+20040722i X-archive-position: 12544 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev --y0ulUmNC+osPPQO6 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Karsten Desler wrote: > Current packetload on eth0 (and reversed on eth1): > 115kpps tx > 135kpps rx I totally forgot to mention: There are approximately 100k concurrent flows. From dmesg: IP: routing cache hash table of 16384 buckets, 128Kbytes Maybe there is some contention on the rt_hash_table spinlocks? Is the attached patch enough to increase the size? 
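A minimal sketch of the sizing arithmetic the question is about, for readers who want to see what changing the shift from 26 to 23 does. It mirrors only the `goal`/`order` lines visible in the attached patch. The 4 KiB page size, the 8-byte bucket size (inferred from the dmesg line: 128 Kbytes / 16384 buckets), the final pages-to-buckets conversion, and the helper name `rt_hash_buckets` are assumptions for illustration, not taken from the kernel source.

```c
/* Assumption: 4 KiB pages. */
#define PAGE_SHIFT   12
/* Assumption: bucket size inferred from dmesg (128 Kbytes / 16384 buckets). */
#define BUCKET_BYTES 8UL

/*
 * Hypothetical helper: estimate the number of route cache hash buckets
 * for a machine with 'num_physpages' pages of RAM. 'shift' is 26 in the
 * unpatched code and 23 with the attached patch.
 */
static unsigned long rt_hash_buckets(unsigned long num_physpages, int shift)
{
    unsigned long goal = num_physpages >> (shift - PAGE_SHIFT);
    unsigned long order;

    /* Same loop as in the patch context: smallest order with 2^order >= goal. */
    for (order = 0; (1UL << order) < goal; order++)
        ;

    /* Assumed conversion: 2^order pages of memory, divided into buckets. */
    return ((1UL << order) << PAGE_SHIFT) / BUCKET_BYTES;
}
```

Under these assumptions, a hypothetical machine with 2 GiB of RAM (524288 pages) gets 16384 buckets with the shift at 26, which matches the dmesg line, and 131072 buckets with the shift at 23.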
- Karsten --y0ulUmNC+osPPQO6 Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="rtcachesize.patch" --- linux/net/ipv4/route.c~old 2004-12-07 21:55:22.000000000 +0100 +++ linux/net/ipv4/route.c 2004-12-07 21:55:32.000000000 +0100 @@ -2728,7 +2728,7 @@ if (!ipv4_dst_ops.kmem_cachep) panic("IP: failed to allocate ip_dst_cache\n"); - goal = num_physpages >> (26 - PAGE_SHIFT); + goal = num_physpages >> (23 - PAGE_SHIFT); if (rhash_entries) goal = (rhash_entries * sizeof(struct rt_hash_bucket)) >> PAGE_SHIFT; for (order = 0; (1UL << order) < goal; order++) --y0ulUmNC+osPPQO6-- From buytenh@wantstofly.org Tue Dec 7 14:25:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 14:25:51 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7MPi5n011327 for ; Tue, 7 Dec 2004 14:25:47 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 5972E2B0ED; Tue, 7 Dec 2004 23:25:22 +0100 (MET) Date: Tue, 7 Dec 2004 23:25:22 +0100 From: Lennert Buytenhek To: Robert Olsson Cc: hadi@cyberus.ca, netdev@oss.sgi.com Subject: inter-packet gap in pktgen Message-ID: <20041207222522.GA30266@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-archive-position: 12545 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev Hi Robert, For TX/RX tests, I've been trying to convince pktgen to spit out exactly N packets per second (for various values of N) by tweaking the inter-packet gap parameter, and I've noticed the following. 
The 'ipg' parameter to pktgen seems to be neither (i) the actual on-the-wire inter-packet gap (which is the time between the FCS of one packet and the preamble of the next), nor (ii) the time between the first data bit of one packet and that of the next. From observing that the pps rate for a given 'ipg' does not depend on the packet size, it would seem that 'ipg' attempts to be (ii). But it doesn't quite succeed in that -- it appears to be always ~496ns off on my box. For example, if I send 500B packets with param 'ipg' equal to 10000ns, I would expect to end up either with 71kpps(i) or with 100kpps(ii). But what I get is 95kpps, 1e9/(10000+496). At 5000ns ipg, I get neither 109kpps(i) nor 200kpps(ii) but 182kpps, 1e9/(5000+496). Presumably this 496ns is the CPU cost of shoving one packet towards the NIC, and pktgen only after sending a packet starts waiting for 'ipg' ns before transmitting the next packet. Can we not compensate for this cost so that we either always get (i) or (ii)? Possibly by first getting the current time X, then transmitting the packet, and then waiting until X+ipg, which would then give us (ii). (We'd have to rename it to 'inter-packet-start gap' though, or something like that.) By tweaking the 'ipg' parameter I can generate pretty much any packet rate I want, as long as I set ipg=(1e9/rate)-496 instead of something possibly more straightforward. 
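The arithmetic above can be sketched as follows. This is only an illustration of the two formulas in the text, rate = 1e9/(ipg+496) and ipg = (1e9/rate)-496; the function names are hypothetical and the 496 ns figure is the value observed on this one box, passed in as a parameter rather than treated as a constant.

```c
#define NSEC_PER_SEC 1000000000UL

/* Rate actually observed for a given 'ipg' setting: 1e9 / (ipg + overhead). */
static unsigned long pps_for_ipg(unsigned long ipg_ns, unsigned long overhead_ns)
{
    return NSEC_PER_SEC / (ipg_ns + overhead_ns);
}

/*
 * 'ipg' value to request for a target rate: (1e9 / rate) - overhead.
 * target_pps must be non-zero. Returns 0 when the target rate is higher
 * than the per-packet overhead allows.
 */
static unsigned long ipg_for_pps(unsigned long target_pps, unsigned long overhead_ns)
{
    unsigned long period_ns = NSEC_PER_SEC / target_pps;

    return period_ns > overhead_ns ? period_ns - overhead_ns : 0;
}
```

With an overhead of 496 ns, `pps_for_ipg(10000, 496)` gives 95274 (the ~95kpps seen above) and `ipg_for_pps(100000, 496)` gives 9504, the setting needed to actually reach 100kpps.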
thanks, Lennert From Robert.Olsson@data.slu.se Tue Dec 7 14:40:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 14:40:57 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7Meqmc011819 for ; Tue, 7 Dec 2004 14:40:53 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB7MeMKO017845; Tue, 7 Dec 2004 23:40:22 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 4BA96EC001; Tue, 7 Dec 2004 23:40:22 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16822.12630.275389.575326@robur.slu.se> Date: Tue, 7 Dec 2004 23:40:22 +0100 To: Karsten Desler Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, "David S. Miller" , jamal , Robert Olsson , P@draigBrady.com Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets In-Reply-To: <20041207211035.GA20286@quickstop.soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041207211035.GA20286@quickstop.soohrt.org> X-Mailer: VM 7.18 under Emacs 21.3.1 X-archive-position: 12546 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Karsten Desler writes: > I totally forgot to mention: There are approximately 100k concurrent > flows. > From dmesg: > IP: routing cache hash table of 16384 buckets, 128Kbytes You can take a look at the stats with rtstat: hash spinning, how many new entries are created, and how many warm hits you get. > Maybe there is some contention on the rt_hash_table spinlocks? > Is the attached patch enough to increase the size?
There is a boot option for this now: rhash_entries= [KNL,NET] Set number of hash buckets for route cache --ro From greearb@candelatech.com Tue Dec 7 14:47:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 14:47:47 -0800 (PST) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7MlgdJ012261 for ; Tue, 7 Dec 2004 14:47:42 -0800 Received: from [4.33.45.22] (evrtwa1-ar2-4-33-045-022.evrtwa1.dsl-verizon.net [4.33.45.22]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id iB7MxZLH005248; Tue, 7 Dec 2004 14:59:36 -0800 Message-ID: <41B632F3.1090104@candelatech.com> Date: Tue, 07 Dec 2004 14:47:15 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041020 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Lennert Buytenhek CC: Robert Olsson , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: inter-packet gap in pktgen References: <20041207222522.GA30266@xi.wantstofly.org> In-Reply-To: <20041207222522.GA30266@xi.wantstofly.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-archive-position: 12547 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Lennert Buytenhek wrote: > By tweaking the 'ipg' parameter I can generate pretty much any packet rate > I want, as long as I set ipg=(1e9/rate)-496 instead of something possibly > more straightforward. That 496 will also change with load on the system, at least on average. I dealt with this by having a user-space app sample the rate and adjust the ipg to keep the average rate where I want it. So, I'd suggest leaving the ipg as it is, and using external tools to get the exact pps that you are looking for.
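One step of such a user-space adjustment might look like the sketch below. This is not Ben's tool: the function name `adjust_ipg` and the correction rule (shift the ipg by the difference between the target and the measured packet period) are assumptions for illustration, and how the rate is sampled and how the new ipg is written back to pktgen is deliberately left out.

```c
/*
 * Hypothetical single step of a feedback loop: given the ipg currently
 * configured (ns) and the rate measured over the last sampling interval,
 * return the ipg to configure next.
 */
static long adjust_ipg(long ipg_ns, long measured_pps, long target_pps)
{
    long measured_period_ns, target_period_ns, next;

    /* Nothing measured (or no target): leave the setting alone. */
    if (measured_pps <= 0 || target_pps <= 0)
        return ipg_ns;

    /* The measured period includes the hidden per-packet cost. */
    measured_period_ns = 1000000000L / measured_pps;
    target_period_ns = 1000000000L / target_pps;

    /* Shrink or grow the gap by the observed error; never go negative. */
    next = ipg_ns + (target_period_ns - measured_period_ns);
    return next < 0 ? 0 : next;
}
```

Calling this once per sampling interval tracks a per-packet cost that drifts with system load, which a fixed 496 ns correction cannot do.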
Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From sven@zion.homelinux.com Tue Dec 7 15:14:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 15:14:21 -0800 (PST) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.184]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB7NEFmC013367 for ; Tue, 7 Dec 2004 15:14:16 -0800 Received: from [212.227.126.155] (helo=mrelayng.kundenserver.de) by moutng.kundenserver.de with esmtp (Exim 3.35 #1) id 1CboWb-0003u4-00; Wed, 08 Dec 2004 00:13:41 +0100 Received: from [80.136.68.240] (helo=zion.homelinux.com) by mrelayng.kundenserver.de with asmtp (Exim 3.35 #1) id 1CboWa-0004tU-00; Wed, 08 Dec 2004 00:13:41 +0100 Received: from localhost (zion.homelinux.com [127.0.0.1]) by stage2.zion.homelinux.com (Postfix) with ESMTP id CC2812CAF6; Wed, 8 Dec 2004 00:13:38 +0100 (CET) Received: from zion.homelinux.com ([127.0.0.1]) by localhost (zion [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 27557-05; Wed, 8 Dec 2004 00:13:36 +0100 (CET) Received: by zion.homelinux.com (Postfix, from userid 1022) id 557522D179; Wed, 8 Dec 2004 00:13:36 +0100 (CET) Date: Wed, 8 Dec 2004 00:13:36 +0100 From: Sven Schuster To: shemminger@osdl.org Cc: netdev@oss.sgi.com Subject: [PATCH][iproute2] make lnstat default count '1' Message-ID: <20041207231336.GA29839@zion.homelinux.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="s2ZSL+KKDSLx8OML" Content-Disposition: inline User-Agent: Mutt/1.5.6i X-Virus-Scanned: by amavisd-new-2.2.0 (20041102) at zion.homelinux.com X-Provags-ID: kundenserver.de abuse@kundenserver.de auth:38b5f051b8cd178556c5593940405c4a X-archive-position: 12548 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: schuster.sven@gmx.de Precedence: bulk X-list: netdev --s2ZSL+KKDSLx8OML Content-Type: multipart/mixed; 
boundary="X1bOJ3K7DJ5YkBrT" Content-Disposition: inline --X1bOJ3K7DJ5YkBrT Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi, I just compiled the latest iproute2 (from the bk repository) and tested Harald's lnstat utility. I found it quite annoying that when I just called rtstat it displayed nothing. The reason is that without the parameter '-c 1' the default count for viewing the information is 0. Wouldn't 1 be a better default value?? If yes, tiny patch attached :-) Sven -- Linux zion 2.6.10-rc3 #1 Mon Dec 6 22:51:51 CET 2004 i686 athlon i386 GNU/Linux 00:08:17 up 1 day, 2 min, 2 users, load average: 0.00, 0.03, 0.05 --X1bOJ3K7DJ5YkBrT Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: attachment; filename="iproute2-count.patch" Content-Transfer-Encoding: quoted-printable --- iproute2-20041207/misc/lnstat.c.orig 2004-12-08 00:04:21.960353599 +0100 +++ iproute2-20041207/misc/lnstat.c 2004-12-08 00:02:32.425889013 +0100 @@ -218,7 +218,7 @@ MODE_NORMAL, } mode = MODE_NORMAL; - unsigned long count = 0; + unsigned long count = 1; static struct field_params fp; int num_req_files = 0; char *req_files[LNSTAT_MAX_FILES]; --X1bOJ3K7DJ5YkBrT-- --s2ZSL+KKDSLx8OML Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (GNU/Linux) iD8DBQFBtjkgo4FAdB2PneQRAuNzAJ9XW0u3dyn+1TtytHlef8TmKAHX0wCdFYBy gB2MV8RVbfayglONP01p7MU= =/jMn -----END PGP SIGNATURE----- --s2ZSL+KKDSLx8OML-- From sri@us.ibm.com Tue Dec 7 16:58:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 16:58:40 -0800 (PST) Received: from e5.ny.us.ibm.com (e5.ny.us.ibm.com [32.97.182.145]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB80wMTs019825 for ; Tue, 7 Dec 2004 16:58:29 -0800 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e5.ny.us.ibm.com (8.12.10/8.12.10) with ESMTP id iB80voGW027690 for ; Tue,
7 Dec 2004 19:57:50 -0500 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay02.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iB80voT2094676 for ; Tue, 7 Dec 2004 19:57:50 -0500 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id iB80voqq025920 for ; Tue, 7 Dec 2004 19:57:50 -0500 Received: from w-sridhar.beaverton.ibm.com (w-sridhar.beaverton.ibm.com [9.47.18.19]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id iB80vnxB025879; Tue, 7 Dec 2004 19:57:49 -0500 Date: Tue, 7 Dec 2004 16:57:48 -0800 (PST) From: Sridhar Samudrala X-X-Sender: sridhar@w-sridhar.beaverton.ibm.com To: Adrian Bunk cc: lksctp-developers@lists.sourceforge.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [2.6 patch] SCTP: possible cleanups In-Reply-To: <20041125172412.GG3537@stusta.de> Message-ID: References: <20041125172412.GG3537@stusta.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-archive-position: 12549 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sri@us.ibm.com Precedence: bulk X-list: netdev Adrian, The patch looks fine. But instead of marking some functions directly as static, I would like to use the SCTP_STATIC macro, which is defined as static within the kernel. This allows us to export certain internal static functions to our user-level sctp test framework that uses the kernel implementation. I will submit this patch with the above change as part of the next SCTP update. Thanks Sridhar On Thu, 25 Nov 2004, Adrian Bunk wrote: > The patch below contains the following possible cleanups for the SCTP > code: > - remove unused code > - make needlessly global code static > > This patch is not intended for being blindly applied, please review and > check whether part of it might conflict with pending patches.
> > > diffstat output: > include/net/sctp/command.h | 13 ----- > include/net/sctp/constants.h | 4 - > include/net/sctp/sctp.h | 10 ---- > include/net/sctp/sm.h | 61 -------------------------- > include/net/sctp/structs.h | 22 --------- > include/net/sctp/tsnmap.h | 16 ------ > include/net/sctp/ulpevent.h | 2 > include/net/sctp/ulpqueue.h | 1 > net/sctp/associola.c | 72 +++++++++++-------------------- > net/sctp/bind_addr.c | 17 ------- > net/sctp/chunk.c | 8 +-- > net/sctp/command.c | 23 --------- > net/sctp/debug.c | 17 ------- > net/sctp/endpointola.c | 54 +++++++++++------------ > net/sctp/input.c | 46 ++++++++++--------- > net/sctp/inqueue.c | 13 ----- > net/sctp/ipv6.c | 20 ++++---- > net/sctp/objcnt.c | 2 > net/sctp/outqueue.c | 15 ------ > net/sctp/proc.c | 2 > net/sctp/protocol.c | 34 +++++++------- > net/sctp/sm_make_chunk.c | 81 ++++++++++------------------------- > net/sctp/sm_sideeffect.c | 66 +++++++++++++++++++--------- > net/sctp/sm_statefuns.c | 81 +++++++++++++++++++++++------------ > net/sctp/sm_statetable.c | 27 ++++++++--- > net/sctp/socket.c | 4 - > net/sctp/ssnmap.c | 7 ++- > net/sctp/transport.c | 56 ++++++++++++------------ > net/sctp/tsnmap.c | 39 ++-------------- > net/sctp/ulpevent.c | 18 ++++--- > net/sctp/ulpqueue.c | 21 --------- > 31 files changed, 306 insertions(+), 546 deletions(-) > > > Signed-off-by: Adrian Bunk > > --- linux-2.6.10-rc2-mm3-full/include/net/sctp/structs.h.old 2004-11-25 00:33:15.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/structs.h 2004-11-25 03:29:32.000000000 +0100 > @@ -406,7 +406,6 @@ > int malloced; > }; > > -struct sctp_ssnmap *sctp_ssnmap_init(struct sctp_ssnmap *, __u16, __u16); > struct sctp_ssnmap *sctp_ssnmap_new(__u16 in, __u16 out, int gfp); > void sctp_ssnmap_free(struct sctp_ssnmap *map); > void sctp_ssnmap_clear(struct sctp_ssnmap *map); > @@ -538,12 +537,9 @@ > struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *, > struct sctp_sndrcvinfo *, > struct msghdr *, 
int len);
> -struct sctp_datamsg *sctp_datamsg_new(int gfp);
> void sctp_datamsg_put(struct sctp_datamsg *);
> -void sctp_datamsg_hold(struct sctp_datamsg *);
> void sctp_datamsg_free(struct sctp_datamsg *);
> void sctp_datamsg_track(struct sctp_chunk *);
> -void sctp_datamsg_assign(struct sctp_datamsg *, struct sctp_chunk *);
> void sctp_chunk_fail(struct sctp_chunk *, int error);
> int sctp_chunk_abandoned(struct sctp_chunk *);
>
> @@ -651,8 +647,6 @@
> void sctp_chunk_put(struct sctp_chunk *);
> int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
> struct iovec *data);
> -struct sctp_chunk *sctp_make_chunk(const struct sctp_association *, __u8 type,
> - __u8 flags, int size);
> void sctp_chunk_free(struct sctp_chunk *);
> void *sctp_addto_chunk(struct sctp_chunk *, int len, const void *data);
> struct sctp_chunk *sctp_chunkify(struct sk_buff *,
> @@ -922,15 +916,12 @@
> };
>
> struct sctp_transport *sctp_transport_new(const union sctp_addr *, int);
> -struct sctp_transport *sctp_transport_init(struct sctp_transport *,
> - const union sctp_addr *, int);
> void sctp_transport_set_owner(struct sctp_transport *,
> struct sctp_association *);
> void sctp_transport_route(struct sctp_transport *, union sctp_addr *,
> struct sctp_opt *);
> void sctp_transport_pmtu(struct sctp_transport *);
> void sctp_transport_free(struct sctp_transport *);
> -void sctp_transport_destroy(struct sctp_transport *);
> void sctp_transport_reset_timers(struct sctp_transport *);
> void sctp_transport_hold(struct sctp_transport *);
> void sctp_transport_put(struct sctp_transport *);
> @@ -961,7 +952,6 @@
> int malloced; /* Is this structure kfree()able?
*/
> };
>
> -struct sctp_inq *sctp_inq_new(void);
> void sctp_inq_init(struct sctp_inq *);
> void sctp_inq_free(struct sctp_inq *);
> void sctp_inq_push(struct sctp_inq *, struct sctp_chunk *packet);
> @@ -1029,7 +1019,6 @@
> char malloced;
> };
>
> -struct sctp_outq *sctp_outq_new(struct sctp_association *);
> void sctp_outq_init(struct sctp_association *, struct sctp_outq *);
> void sctp_outq_teardown(struct sctp_outq *);
> void sctp_outq_free(struct sctp_outq*);
> @@ -1070,7 +1059,6 @@
> int malloced; /* Are we kfree()able? */
> };
>
> -struct sctp_bind_addr *sctp_bind_addr_new(int gfp_mask);
> void sctp_bind_addr_init(struct sctp_bind_addr *, __u16 port);
> void sctp_bind_addr_free(struct sctp_bind_addr *);
> int sctp_bind_addr_copy(struct sctp_bind_addr *dest,
> @@ -1220,8 +1208,6 @@
>
> /* These are function signatures for manipulating endpoints. */
> struct sctp_endpoint *sctp_endpoint_new(struct sock *, int);
> -struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *,
> - struct sock *, int gfp);
> void sctp_endpoint_free(struct sctp_endpoint *);
> void sctp_endpoint_put(struct sctp_endpoint *);
> void sctp_endpoint_hold(struct sctp_endpoint *);
> @@ -1243,8 +1229,6 @@
> int sctp_process_init(struct sctp_association *, sctp_cid_t cid,
> const union sctp_addr *peer,
> sctp_init_chunk_t *init, int gfp);
> -int sctp_process_param(struct sctp_association *, union sctp_params param,
> - const union sctp_addr *from, int gfp);
> __u32 sctp_generate_tag(const struct sctp_endpoint *);
> __u32 sctp_generate_tsn(const struct sctp_endpoint *);
>
> @@ -1690,10 +1674,6 @@
> struct sctp_association *
> sctp_association_new(const struct sctp_endpoint *, const struct sock *,
> sctp_scope_t scope, int gfp);
> -struct sctp_association *
> -sctp_association_init(struct sctp_association *, const struct sctp_endpoint *,
> - const struct sock *, sctp_scope_t scope,
> - int gfp);
> void sctp_association_free(struct sctp_association *);
> void sctp_association_put(struct
sctp_association *);
> void sctp_association_hold(struct sctp_association *);
> @@ -1722,7 +1702,6 @@
> struct sctp_association *new);
>
> __u32 sctp_association_get_next_tsn(struct sctp_association *);
> -__u32 sctp_association_get_tsn_block(struct sctp_association *, int);
>
> void sctp_assoc_sync_pmtu(struct sctp_association *);
> void sctp_assoc_rwnd_increase(struct sctp_association *, unsigned);
> @@ -1736,7 +1715,6 @@
> int sctp_cmp_addr_exact(const union sctp_addr *ss1,
> const union sctp_addr *ss2);
> struct sctp_chunk *sctp_get_ecne_prepend(struct sctp_association *asoc);
> -struct sctp_chunk *sctp_get_no_prepend(struct sctp_association *asoc);
>
> /* A convenience structure to parse out SCTP specific CMSGs. */
> typedef struct sctp_cmsgs {
> --- linux-2.6.10-rc2-mm3-full/include/net/sctp/command.h.old 2004-11-25 00:37:53.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/command.h 2004-11-25 00:38:13.000000000 +0100
> @@ -189,11 +189,6 @@
> } sctp_cmd_seq_t;
>
>
> -/* Create a new sctp_command_sequence.
> - * Return NULL if creating a new sequence fails.
> - */
> -sctp_cmd_seq_t *sctp_new_cmd_seq(int gfp);
> -
> /* Initialize a block of memory as a command sequence.
> * Return 0 if the initialization fails.
> */
> @@ -207,18 +202,10 @@
> */
> int sctp_add_cmd(sctp_cmd_seq_t *seq, sctp_verb_t verb, sctp_arg_t obj);
>
> -/* Rewind an sctp_cmd_seq_t to iterate from the start.
> - * Return 0 if the rewind fails.
> - */
> -int sctp_rewind_sequence(sctp_cmd_seq_t *seq);
> -
> /* Return the next command structure in an sctp_cmd_seq.
> * Return NULL at the end of the sequence.
> */
> sctp_cmd_t *sctp_next_cmd(sctp_cmd_seq_t *seq);
>
> -/* Dispose of a command sequence.
*/
> -void sctp_free_cmd_seq(sctp_cmd_seq_t *seq);
> -
> #endif /* __net_sctp_command_h__ */
>
> --- linux-2.6.10-rc2-mm3-full/include/net/sctp/constants.h.old 2004-11-25 00:39:10.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/constants.h 2004-11-25 00:39:20.000000000 +0100
> @@ -155,10 +155,6 @@
> - (unsigned long)(c->chunk_hdr)\
> - sizeof(sctp_data_chunk_t)))
>
> -/* This is a table of printable names of sctp_param_t's. */
> -extern const char *sctp_param_tbl[];
> -
> -
> #define SCTP_MAX_ERROR_CAUSE SCTP_ERROR_NONEXIST_IP
> #define SCTP_NUM_ERROR_CAUSE 10
>
> --- linux-2.6.10-rc2-mm3-full/include/net/sctp/sctp.h.old 2004-11-25 00:41:46.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/sctp.h 2004-11-25 01:34:35.000000000 +0100
> @@ -162,17 +162,9 @@
> int sctp_rcv(struct sk_buff *skb);
> void sctp_v4_err(struct sk_buff *skb, u32 info);
> void sctp_hash_established(struct sctp_association *);
> -void __sctp_hash_established(struct sctp_association *);
> void sctp_unhash_established(struct sctp_association *);
> -void __sctp_unhash_established(struct sctp_association *);
> void sctp_hash_endpoint(struct sctp_endpoint *);
> -void __sctp_hash_endpoint(struct sctp_endpoint *);
> void sctp_unhash_endpoint(struct sctp_endpoint *);
> -void __sctp_unhash_endpoint(struct sctp_endpoint *);
> -struct sctp_association *__sctp_lookup_association(
> - const union sctp_addr *,
> - const union sctp_addr *,
> - struct sctp_transport **);
> struct sock *sctp_err_lookup(int family, struct sk_buff *,
> struct sctphdr *, struct sctp_endpoint **,
> struct sctp_association **,
> @@ -310,8 +302,6 @@
>
> int sctp_v6_init(void);
> void sctp_v6_exit(void);
> -void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
> - int type, int code, int offset, __u32 info);
>
> #else /* #ifdef defined(CONFIG_IPV6) */
>
> --- linux-2.6.10-rc2-mm3-full/include/net/sctp/sm.h.old 2004-11-25 01:42:52.000000000 +0100
> +++
linux-2.6.10-rc2-mm3-full/include/net/sctp/sm.h 2004-11-25 03:14:05.000000000 +0100
> @@ -128,7 +128,6 @@
> sctp_state_fn_t sctp_sf_do_ecn_cwr;
> sctp_state_fn_t sctp_sf_do_ecne;
> sctp_state_fn_t sctp_sf_ootb;
> -sctp_state_fn_t sctp_sf_shut_8_4_5;
> sctp_state_fn_t sctp_sf_pdiscard;
> sctp_state_fn_t sctp_sf_violation;
> sctp_state_fn_t sctp_sf_discard_chunk;
> @@ -138,7 +137,6 @@
> sctp_state_fn_t sctp_sf_unk_chunk;
> sctp_state_fn_t sctp_sf_do_8_5_1_E_sa;
> sctp_state_fn_t sctp_sf_cookie_echoed_err;
> -sctp_state_fn_t sctp_sf_do_5_2_6_stale;
> sctp_state_fn_t sctp_sf_do_asconf;
> sctp_state_fn_t sctp_sf_do_asconf_ack;
> sctp_state_fn_t sctp_sf_do_9_2_reshutack;
> @@ -200,19 +198,10 @@
> struct sctp_chunk *sctp_make_cwr(const struct sctp_association *,
> const __u32 lowest_tsn,
> const struct sctp_chunk *);
> -struct sctp_chunk *sctp_make_datafrag(struct sctp_association *,
> - const struct sctp_sndrcvinfo *sinfo,
> - int len, const __u8 *data,
> - __u8 flags, __u16 ssn);
> struct sctp_chunk * sctp_make_datafrag_empty(struct sctp_association *,
> const struct sctp_sndrcvinfo *sinfo,
> int len, const __u8 flags,
> __u16 ssn);
> -struct sctp_chunk *sctp_make_data(struct sctp_association *,
> - const struct sctp_sndrcvinfo *sinfo,
> - int len, const __u8 *data);
> -struct sctp_chunk *sctp_make_data_empty(struct sctp_association *,
> - const struct sctp_sndrcvinfo *, int len);
> struct sctp_chunk *sctp_make_ecne(const struct sctp_association *,
> const __u32);
> struct sctp_chunk *sctp_make_sack(const struct sctp_association *);
> @@ -246,17 +235,12 @@
> const void *payload,
> size_t paylen);
>
> -struct sctp_chunk *sctp_make_asconf(struct sctp_association *asoc,
> - union sctp_addr *addr,
> - int vparam_len);
> struct sctp_chunk *sctp_make_asconf_update_ip(struct sctp_association *,
> union sctp_addr *,
> struct sockaddr *,
> int, __u16);
> struct sctp_chunk *sctp_make_asconf_set_prim(struct sctp_association *asoc,
> union sctp_addr *addr);
> -struct sctp_chunk
*sctp_make_asconf_ack(const struct sctp_association *asoc,
> - __u32 serial, int vparam_len);
> struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc,
> struct sctp_chunk *asconf);
> int sctp_process_asconf_ack(struct sctp_association *asoc,
> @@ -277,71 +261,26 @@
> void *event_arg,
> int gfp);
>
> -int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype,
> - sctp_state_t state,
> - struct sctp_endpoint *,
> - struct sctp_association *asoc,
> - void *event_arg,
> - sctp_disposition_t status,
> - sctp_cmd_seq_t *commands,
> - int gfp);
> -
> /* 2nd level prototypes */
> -int sctp_cmd_interpreter(sctp_event_t, sctp_subtype_t, sctp_state_t,
> - struct sctp_endpoint *, struct sctp_association *,
> - void *event_arg, sctp_disposition_t,
> - sctp_cmd_seq_t *retval, int gfp);
> -
> -
> -int sctp_gen_sack(struct sctp_association *, int force, sctp_cmd_seq_t *);
> void sctp_generate_t3_rtx_event(unsigned long peer);
> void sctp_generate_heartbeat_event(unsigned long peer);
>
> -sctp_sackhdr_t *sctp_sm_pull_sack(struct sctp_chunk *);
> -struct sctp_packet *sctp_abort_pkt_new(const struct sctp_endpoint *,
> - const struct sctp_association *,
> - struct sctp_chunk *chunk,
> - const void *payload,
> - size_t paylen);
> -struct sctp_packet *sctp_ootb_pkt_new(const struct sctp_association *,
> - const struct sctp_chunk *);
> void sctp_ootb_pkt_free(struct sctp_packet *);
>
> -struct sctp_cookie_param *
> -sctp_pack_cookie(const struct sctp_endpoint *, const struct sctp_association *,
> - const struct sctp_chunk *, int *cookie_len,
> - const __u8 *, int addrs_len);
> struct sctp_association *sctp_unpack_cookie(const struct sctp_endpoint *,
> const struct sctp_association *,
> struct sctp_chunk *, int gfp, int *err,
> struct sctp_chunk **err_chk_p);
> int sctp_addip_addr_config(struct sctp_association *, sctp_param_t,
> struct sockaddr_storage*, int);
> -void sctp_send_stale_cookie_err(const struct sctp_endpoint *ep,
> - const struct sctp_association
*asoc,
> - const struct sctp_chunk *chunk,
> - sctp_cmd_seq_t *commands,
> - struct sctp_chunk *err_chunk);
> -int sctp_eat_data(const struct sctp_association *asoc,
> - struct sctp_chunk *chunk,
> - sctp_cmd_seq_t *commands);
>
> /* 3rd level prototypes */
> __u32 sctp_generate_tag(const struct sctp_endpoint *);
> __u32 sctp_generate_tsn(const struct sctp_endpoint *);
>
> /* Extern declarations for major data structures. */
> -const sctp_sm_table_entry_t *sctp_chunk_event_lookup(sctp_cid_t, sctp_state_t);
> -extern const sctp_sm_table_entry_t
> -primitive_event_table[SCTP_NUM_PRIMITIVE_TYPES][SCTP_STATE_NUM_STATES];
> -extern const sctp_sm_table_entry_t
> -other_event_table[SCTP_NUM_OTHER_TYPES][SCTP_STATE_NUM_STATES];
> -extern const sctp_sm_table_entry_t
> -timeout_event_table[SCTP_NUM_TIMEOUT_TYPES][SCTP_STATE_NUM_STATES];
> extern sctp_timer_event_t *sctp_timer_events[SCTP_NUM_TIMEOUT_TYPES];
>
> -/* These are some handy utility macros... */
> -
>
> /* Get the size of a DATA chunk payload. */
> static inline __u16 sctp_data_size(struct sctp_chunk *chunk)
> --- linux-2.6.10-rc2-mm3-full/net/sctp/associola.c.old 2004-11-25 00:33:40.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/associola.c 2004-11-25 00:34:52.000000000 +0100
> @@ -66,33 +66,8 @@
>
> /* 1st Level Abstractions. */
>
> -/* Allocate and initialize a new association */
> -struct sctp_association *sctp_association_new(const struct sctp_endpoint *ep,
> - const struct sock *sk,
> - sctp_scope_t scope, int gfp)
> -{
> - struct sctp_association *asoc;
> -
> - asoc = t_new(struct sctp_association, gfp);
> - if (!asoc)
> - goto fail;
> -
> - if (!sctp_association_init(asoc, ep, sk, scope, gfp))
> - goto fail_init;
> -
> - asoc->base.malloced = 1;
> - SCTP_DBG_OBJCNT_INC(assoc);
> -
> - return asoc;
> -
> -fail_init:
> - kfree(asoc);
> -fail:
> - return NULL;
> -}
> -
> /* Initialize a new association from provided memory.
*/
> -struct sctp_association *sctp_association_init(struct sctp_association *asoc,
> +static struct sctp_association *sctp_association_init(struct sctp_association *asoc,
> const struct sctp_endpoint *ep,
> const struct sock *sk,
> sctp_scope_t scope,
> @@ -296,6 +271,31 @@
> return NULL;
> }
>
> +/* Allocate and initialize a new association */
> +struct sctp_association *sctp_association_new(const struct sctp_endpoint *ep,
> + const struct sock *sk,
> + sctp_scope_t scope, int gfp)
> +{
> + struct sctp_association *asoc;
> +
> + asoc = t_new(struct sctp_association, gfp);
> + if (!asoc)
> + goto fail;
> +
> + if (!sctp_association_init(asoc, ep, sk, scope, gfp))
> + goto fail_init;
> +
> + asoc->base.malloced = 1;
> + SCTP_DBG_OBJCNT_INC(assoc);
> +
> + return asoc;
> +
> +fail_init:
> + kfree(asoc);
> +fail:
> + return NULL;
> +}
> +
> /* Free this association if possible. There may still be users, so
> * the actual deallocation may be delayed.
> */
> @@ -714,18 +714,6 @@
> return retval;
> }
>
> -/* Allocate 'num' TSNs by incrementing the association's TSN by num. */
> -__u32 sctp_association_get_tsn_block(struct sctp_association *asoc, int num)
> -{
> - __u32 retval = asoc->next_tsn;
> -
> - asoc->next_tsn += num;
> - asoc->unack_data += num;
> -
> - return retval;
> -}
> -
> -
> /* Compare two addresses to see if they match. Wildcard addresses
> * only match themselves.
> */
> @@ -760,14 +748,6 @@
> return chunk;
> }
>
> -/* Use this function for the packet prepend callback when no ECNE
> - * packet is desired (e.g. some packets don't like to be bundled).
> - */
> -struct sctp_chunk *sctp_get_no_prepend(struct sctp_association *asoc)
> -{
> - return NULL;
> -}
> -
> /*
> * Find which transport this TSN was sent on.
> */
> --- linux-2.6.10-rc2-mm3-full/net/sctp/bind_addr.c.old 2004-11-25 00:35:13.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/bind_addr.c 2004-11-25 00:35:23.000000000 +0100
> @@ -104,23 +104,6 @@
> return error;
> }
>
> -/* Create a new SCTP_bind_addr from nothing. */
> -struct sctp_bind_addr *sctp_bind_addr_new(int gfp)
> -{
> - struct sctp_bind_addr *retval;
> -
> - retval = t_new(struct sctp_bind_addr, gfp);
> - if (!retval)
> - goto nomem;
> -
> - sctp_bind_addr_init(retval, 0);
> - retval->malloced = 1;
> - SCTP_DBG_OBJCNT_INC(bind_addr);
> -
> -nomem:
> - return retval;
> -}
> -
> /* Initialize the SCTP_bind_addr structure for either an endpoint or
> * an association.
> */
> --- linux-2.6.10-rc2-mm3-full/net/sctp/chunk.c.old 2004-11-25 00:36:15.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/chunk.c 2004-11-25 00:36:55.000000000 +0100
> @@ -51,7 +51,7 @@
> */
>
> /* Initialize datamsg from memory. */
> -void sctp_datamsg_init(struct sctp_datamsg *msg)
> +static void sctp_datamsg_init(struct sctp_datamsg *msg)
> {
> atomic_set(&msg->refcnt, 1);
> msg->send_failed = 0;
> @@ -62,7 +62,7 @@
> }
>
> /* Allocate and initialize datamsg. */
> -struct sctp_datamsg *sctp_datamsg_new(int gfp)
> +static struct sctp_datamsg *sctp_datamsg_new(int gfp)
> {
> struct sctp_datamsg *msg;
> msg = kmalloc(sizeof(struct sctp_datamsg), gfp);
> @@ -124,7 +124,7 @@
> }
>
> /* Hold a reference. */
> -void sctp_datamsg_hold(struct sctp_datamsg *msg)
> +static void sctp_datamsg_hold(struct sctp_datamsg *msg)
> {
> atomic_inc(&msg->refcnt);
> }
> @@ -151,7 +151,7 @@
> }
>
> /* Assign a chunk to this datamsg.
*/
> -void sctp_datamsg_assign(struct sctp_datamsg *msg, struct sctp_chunk *chunk)
> +static void sctp_datamsg_assign(struct sctp_datamsg *msg, struct sctp_chunk *chunk)
> {
> sctp_datamsg_hold(msg);
> chunk->msg = msg;
> --- linux-2.6.10-rc2-mm3-full/net/sctp/command.c.old 2004-11-25 00:38:22.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/command.c 2004-11-25 00:38:55.000000000 +0100
> @@ -42,17 +42,6 @@
> #include
> #include
>
> -/* Create a new sctp_command_sequence. */
> -sctp_cmd_seq_t *sctp_new_cmd_seq(int gfp)
> -{
> - sctp_cmd_seq_t *retval = t_new(sctp_cmd_seq_t, gfp);
> -
> - if (retval)
> - sctp_init_cmd_seq(retval);
> -
> - return retval;
> -}
> -
> /* Initialize a block of memory as a command sequence. */
> int sctp_init_cmd_seq(sctp_cmd_seq_t *seq)
> {
> @@ -77,13 +66,6 @@
> return 0;
> }
>
> -/* Rewind an sctp_cmd_seq_t to iterate from the start. */
> -int sctp_rewind_sequence(sctp_cmd_seq_t *seq)
> -{
> - seq->next_cmd = 0;
> - return 1; /* We always succeed. */
> -}
> -
> /* Return the next command structure in a sctp_cmd_seq.
> * Returns NULL at the end of the sequence.
> */
> @@ -97,8 +79,3 @@
> return retval;
> }
>
> -/* Dispose of a command sequence. */
> -void sctp_free_cmd_seq(sctp_cmd_seq_t *seq)
> -{
> - kfree(seq);
> -}
> --- linux-2.6.10-rc2-mm3-full/net/sctp/debug.c.old 2004-11-25 00:39:29.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/debug.c 2004-11-25 00:39:39.000000000 +0100
> @@ -98,23 +98,6 @@
> return "unknown chunk";
> }
>
> -/* These are printable form of variable-length parameters. */
> -const char *sctp_param_tbl[SCTP_PARAM_ECN_CAPABLE + 1] = {
> - "",
> - "PARAM_HEARTBEAT_INFO",
> - "",
> - "",
> - "",
> - "PARAM_IPV4_ADDRESS",
> - "PARAM_IPV6_ADDRESS",
> - "PARAM_STATE_COOKIE",
> - "PARAM_UNRECOGNIZED_PARAMETERS",
> - "PARAM_COOKIE_PRESERVATIVE",
> - "",
> - "PARAM_HOST_NAME_ADDRESS",
> - "PARAM_SUPPORTED_ADDRESS_TYPES",
> -};
> -
> /* These are printable forms of the states.
*/
> const char *sctp_state_tbl[SCTP_STATE_NUM_STATES] = {
> "STATE_EMPTY",
> --- linux-2.6.10-rc2-mm3-full/net/sctp/endpointola.c.old 2004-11-25 00:40:01.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/endpointola.c 2004-11-25 00:41:19.000000000 +0100
> @@ -63,34 +63,11 @@
> /* Forward declarations for internal helpers. */
> static void sctp_endpoint_bh_rcv(struct sctp_endpoint *ep);
>
> -/* Create a sctp_endpoint with all that boring stuff initialized.
> - * Returns NULL if there isn't enough memory.
> - */
> -struct sctp_endpoint *sctp_endpoint_new(struct sock *sk, int gfp)
> -{
> - struct sctp_endpoint *ep;
> -
> - /* Build a local endpoint. */
> - ep = t_new(struct sctp_endpoint, gfp);
> - if (!ep)
> - goto fail;
> - if (!sctp_endpoint_init(ep, sk, gfp))
> - goto fail_init;
> - ep->base.malloced = 1;
> - SCTP_DBG_OBJCNT_INC(ep);
> - return ep;
> -
> -fail_init:
> - kfree(ep);
> -fail:
> - return NULL;
> -}
> -
> /*
> * Initialize the base fields of the endpoint structure.
> */
> -struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
> - struct sock *sk, int gfp)
> +static struct sctp_endpoint *sctp_endpoint_init(struct sctp_endpoint *ep,
> + struct sock *sk, int gfp)
> {
> struct sctp_opt *sp = sctp_sk(sk);
> memset(ep, 0, sizeof(struct sctp_endpoint));
> @@ -160,6 +137,29 @@
> return ep;
> }
>
> +/* Create a sctp_endpoint with all that boring stuff initialized.
> + * Returns NULL if there isn't enough memory.
> + */
> +struct sctp_endpoint *sctp_endpoint_new(struct sock *sk, int gfp)
> +{
> + struct sctp_endpoint *ep;
> +
> + /* Build a local endpoint. */
> + ep = t_new(struct sctp_endpoint, gfp);
> + if (!ep)
> + goto fail;
> + if (!sctp_endpoint_init(ep, sk, gfp))
> + goto fail_init;
> + ep->base.malloced = 1;
> + SCTP_DBG_OBJCNT_INC(ep);
> + return ep;
> +
> +fail_init:
> + kfree(ep);
> +fail:
> + return NULL;
> +}
> +
> /* Add an association to an endpoint.
*/
> void sctp_endpoint_add_asoc(struct sctp_endpoint *ep,
> struct sctp_association *asoc)
> @@ -184,7 +184,7 @@
> }
>
> /* Final destructor for endpoint. */
> -void sctp_endpoint_destroy(struct sctp_endpoint *ep)
> +static void sctp_endpoint_destroy(struct sctp_endpoint *ep)
> {
> SCTP_ASSERT(ep->base.dead, "Endpoint is not dead", return);
>
> @@ -257,7 +257,7 @@
> * We do a linear search of the associations for this endpoint.
> * We return the matching transport address too.
> */
> -struct sctp_association *__sctp_endpoint_lookup_assoc(
> +static struct sctp_association *__sctp_endpoint_lookup_assoc(
> const struct sctp_endpoint *ep,
> const union sctp_addr *paddr,
> struct sctp_transport **transport)
> --- linux-2.6.10-rc2-mm3-full/net/sctp/input.c.old 2004-11-25 00:42:03.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/input.c 2004-11-25 00:46:22.000000000 +0100
> @@ -63,11 +63,15 @@
>
> /* Forward declarations for internal helpers. */
> static int sctp_rcv_ootb(struct sk_buff *);
> -struct sctp_association *__sctp_rcv_lookup(struct sk_buff *skb,
> +static struct sctp_association *__sctp_rcv_lookup(struct sk_buff *skb,
> const union sctp_addr *laddr,
> const union sctp_addr *paddr,
> struct sctp_transport **transportp);
> -struct sctp_endpoint *__sctp_rcv_lookup_endpoint(const union sctp_addr *laddr);
> +static struct sctp_endpoint *__sctp_rcv_lookup_endpoint(const union sctp_addr *laddr);
> +static struct sctp_association *__sctp_lookup_association(
> + const union sctp_addr *local,
> + const union sctp_addr *peer,
> + struct sctp_transport **pt);
>
>
> /* Calculate the SCTP checksum of an SCTP packet. */
> @@ -522,7 +526,7 @@
> }
>
> /* Insert endpoint into the hash table. */
> -void __sctp_hash_endpoint(struct sctp_endpoint *ep)
> +static void __sctp_hash_endpoint(struct sctp_endpoint *ep)
> {
> struct sctp_ep_common **epp;
> struct sctp_ep_common *epb;
> @@ -552,7 +556,7 @@
> }
>
> /* Remove endpoint from the hash table.
*/
> -void __sctp_unhash_endpoint(struct sctp_endpoint *ep)
> +static void __sctp_unhash_endpoint(struct sctp_endpoint *ep)
> {
> struct sctp_hashbucket *head;
> struct sctp_ep_common *epb;
> @@ -584,7 +588,7 @@
> }
>
> /* Look up an endpoint. */
> -struct sctp_endpoint *__sctp_rcv_lookup_endpoint(const union sctp_addr *laddr)
> +static struct sctp_endpoint *__sctp_rcv_lookup_endpoint(const union sctp_addr *laddr)
> {
> struct sctp_hashbucket *head;
> struct sctp_ep_common *epb;
> @@ -610,16 +614,8 @@
> return ep;
> }
>
> -/* Add an association to the hash. Local BH-safe. */
> -void sctp_hash_established(struct sctp_association *asoc)
> -{
> - sctp_local_bh_disable();
> - __sctp_hash_established(asoc);
> - sctp_local_bh_enable();
> -}
> -
> /* Insert association into the hash table. */
> -void __sctp_hash_established(struct sctp_association *asoc)
> +static void __sctp_hash_established(struct sctp_association *asoc)
> {
> struct sctp_ep_common **epp;
> struct sctp_ep_common *epb;
> @@ -642,16 +638,16 @@
> sctp_write_unlock(&head->lock);
> }
>
> -/* Remove association from the hash table. Local BH-safe. */
> -void sctp_unhash_established(struct sctp_association *asoc)
> +/* Add an association to the hash. Local BH-safe. */
> +void sctp_hash_established(struct sctp_association *asoc)
> {
> sctp_local_bh_disable();
> - __sctp_unhash_established(asoc);
> + __sctp_hash_established(asoc);
> sctp_local_bh_enable();
> }
>
> /* Remove association from the hash table. */
> -void __sctp_unhash_established(struct sctp_association *asoc)
> +static void __sctp_unhash_established(struct sctp_association *asoc)
> {
> struct sctp_hashbucket *head;
> struct sctp_ep_common *epb;
> @@ -675,8 +671,16 @@
> sctp_write_unlock(&head->lock);
> }
>
> +/* Remove association from the hash table. Local BH-safe.
*/
> +void sctp_unhash_established(struct sctp_association *asoc)
> +{
> + sctp_local_bh_disable();
> + __sctp_unhash_established(asoc);
> + sctp_local_bh_enable();
> +}
> +
> /* Look up an association. */
> -struct sctp_association *__sctp_lookup_association(
> +static struct sctp_association *__sctp_lookup_association(
> const union sctp_addr *local,
> const union sctp_addr *peer,
> struct sctp_transport **pt)
> @@ -713,7 +717,7 @@
> }
>
> /* Look up an association. BH-safe. */
> -struct sctp_association *sctp_lookup_association(const union sctp_addr *laddr,
> +static struct sctp_association *sctp_lookup_association(const union sctp_addr *laddr,
> const union sctp_addr *paddr,
> struct sctp_transport **transportp)
> {
> @@ -821,7 +825,7 @@
> }
>
> /* Lookup an association for an inbound skb. */
> -struct sctp_association *__sctp_rcv_lookup(struct sk_buff *skb,
> +static struct sctp_association *__sctp_rcv_lookup(struct sk_buff *skb,
> const union sctp_addr *paddr,
> const union sctp_addr *laddr,
> struct sctp_transport **transportp)
> --- linux-2.6.10-rc2-mm3-full/net/sctp/ipv6.c.old 2004-11-25 01:34:44.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/ipv6.c 2004-11-25 01:35:57.000000000 +0100
> @@ -84,8 +84,8 @@
> };
>
> /* ICMP error handler. */
> -void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
> - int type, int code, int offset, __u32 info)
> +static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
> + int type, int code, int offset, __u32 info)
> {
> struct inet6_dev *idev;
> struct ipv6hdr *iph = (struct ipv6hdr *)skb->data;
> @@ -188,9 +188,9 @@
> /* Returns the dst cache entry for the given source and destination ip
> * addresses.
> */
> -struct dst_entry *sctp_v6_get_dst(struct sctp_association *asoc,
> - union sctp_addr *daddr,
> - union sctp_addr *saddr)
> +static struct dst_entry *sctp_v6_get_dst(struct sctp_association *asoc,
> + union sctp_addr *daddr,
> + union sctp_addr *saddr)
> {
> struct dst_entry *dst;
> struct flowi fl;
> @@ -251,8 +251,10 @@
> /* Fills in the source address(saddr) based on the destination address(daddr)
> * and asoc's bind address list.
> */
> -void sctp_v6_get_saddr(struct sctp_association *asoc, struct dst_entry *dst,
> - union sctp_addr *daddr, union sctp_addr *saddr)
> +static void sctp_v6_get_saddr(struct sctp_association *asoc,
> + struct dst_entry *dst,
> + union sctp_addr *daddr,
> + union sctp_addr *saddr)
> {
> struct sctp_bind_addr *bp;
> rwlock_t *addr_lock;
> @@ -577,8 +579,8 @@
> }
>
> /* Create and initialize a new sk for the socket to be returned by accept(). */
> -struct sock *sctp_v6_create_accept_sk(struct sock *sk,
> - struct sctp_association *asoc)
> +static struct sock *sctp_v6_create_accept_sk(struct sock *sk,
> + struct sctp_association *asoc)
> {
> struct inet_opt *inet = inet_sk(sk);
> struct sock *newsk;
> --- linux-2.6.10-rc2-mm3-full/net/sctp/inqueue.c.old 2004-11-25 00:46:50.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/inqueue.c 2004-11-25 00:46:57.000000000 +0100
> @@ -59,19 +59,6 @@
> queue->malloced = 0;
> }
>
> -/* Create an initialized sctp_inq. */
> -struct sctp_inq *sctp_inq_new(void)
> -{
> - struct sctp_inq *retval;
> -
> - retval = t_new(struct sctp_inq, GFP_ATOMIC);
> - if (retval) {
> - sctp_inq_init(retval);
> - retval->malloced = 1;
> - }
> - return retval;
> -}
> -
> /* Release the memory associated with an SCTP inqueue.
*/
> void sctp_inq_free(struct sctp_inq *queue)
> {
> --- linux-2.6.10-rc2-mm3-full/net/sctp/objcnt.c.old 2004-11-25 01:36:46.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/objcnt.c 2004-11-25 01:36:56.000000000 +0100
> @@ -62,7 +62,7 @@
> /* An array to make it easy to pretty print the debug information
> * to the proc fs.
> */
> -sctp_dbg_objcnt_entry_t sctp_dbg_objcnt[] = {
> +static sctp_dbg_objcnt_entry_t sctp_dbg_objcnt[] = {
> SCTP_DBG_OBJCNT_ENTRY(sock),
> SCTP_DBG_OBJCNT_ENTRY(ep),
> SCTP_DBG_OBJCNT_ENTRY(assoc),
> --- linux-2.6.10-rc2-mm3-full/net/sctp/outqueue.c.old 2004-11-25 01:37:31.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/outqueue.c 2004-11-25 01:37:47.000000000 +0100
> @@ -190,19 +190,6 @@
> return 0;
> }
>
> -/* Generate a new outqueue. */
> -struct sctp_outq *sctp_outq_new(struct sctp_association *asoc)
> -{
> - struct sctp_outq *q;
> -
> - q = t_new(struct sctp_outq, GFP_KERNEL);
> - if (q) {
> - sctp_outq_init(asoc, q);
> - q->malloced = 1;
> - }
> - return q;
> -}
> -
> /* Initialize an existing sctp_outq. This does the boring stuff.
> * You still need to define handlers if you really want to DO
> * something with this structure...
> @@ -362,7 +349,7 @@
> /* Insert a chunk into the sorted list based on the TSNs. The retransmit list
> * and the abandoned list are in ascending order.
> */
> -void sctp_insert_list(struct list_head *head, struct list_head *new)
> +static void sctp_insert_list(struct list_head *head, struct list_head *new)
> {
> struct list_head *pos;
> struct sctp_chunk *nchunk, *lchunk;
> --- linux-2.6.10-rc2-mm3-full/net/sctp/proc.c.old 2004-11-25 01:38:01.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/proc.c 2004-11-25 01:38:10.000000000 +0100
> @@ -39,7 +39,7 @@
> #include
> #include
>
> -struct snmp_mib sctp_snmp_list[] = {
> +static struct snmp_mib sctp_snmp_list[] = {
> SNMP_MIB_ITEM("SctpCurrEstab", SCTP_MIB_CURRESTAB),
> SNMP_MIB_ITEM("SctpActiveEstabs", SCTP_MIB_ACTIVEESTABS),
> SNMP_MIB_ITEM("SctpPassiveEstabs", SCTP_MIB_PASSIVEESTABS),
> --- linux-2.6.10-rc2-mm3-full/net/sctp/protocol.c.old 2004-11-25 01:39:00.000000000 +0100
> +++ linux-2.6.10-rc2-mm3-full/net/sctp/protocol.c 2004-11-25 01:41:45.000000000 +0100
> @@ -95,7 +95,7 @@
> }
>
> /* Set up the proc fs entry for the SCTP protocol. */
> -__init int sctp_proc_init(void)
> +static __init int sctp_proc_init(void)
> {
> if (!proc_net_sctp) {
> struct proc_dir_entry *ent;
> @@ -124,7 +124,7 @@
> * Note: Do not make this __exit as it is used in the init error
> * path.
> */
> -void sctp_proc_exit(void)
> +static void sctp_proc_exit(void)
> {
> sctp_snmp_proc_exit();
> sctp_eps_proc_exit();
> @@ -428,9 +428,9 @@
> * addresses. If an association is passed, trys to get a dst entry with a
> * source address that matches an address in the bind address list.
> */
> -struct dst_entry *sctp_v4_get_dst(struct sctp_association *asoc,
> - union sctp_addr *daddr,
> - union sctp_addr *saddr)
> +static struct dst_entry *sctp_v4_get_dst(struct sctp_association *asoc,
> + union sctp_addr *daddr,
> + union sctp_addr *saddr)
> {
> struct rtable *rt;
> struct flowi fl;
> @@ -520,10 +520,10 @@
> /* For v4, the source address is cached in the route entry(dst). So no need
> * to cache it separately and hence this is an empty routine.
> */
> -void sctp_v4_get_saddr(struct sctp_association *asoc,
> - struct dst_entry *dst,
> - union sctp_addr *daddr,
> - union sctp_addr *saddr)
> +static void sctp_v4_get_saddr(struct sctp_association *asoc,
> + struct dst_entry *dst,
> + union sctp_addr *daddr,
> + union sctp_addr *saddr)
> {
> struct rtable *rt = (struct rtable *)dst;
>
> @@ -547,8 +547,8 @@
> }
>
> /* Create and initialize a new sk for the socket returned by accept(). */
> -struct sock *sctp_v4_create_accept_sk(struct sock *sk,
> - struct sctp_association *asoc)
> +static struct sock *sctp_v4_create_accept_sk(struct sock *sk,
> + struct sctp_association *asoc)
> {
> struct sock *newsk;
> struct inet_opt *inet = inet_sk(sk);
> @@ -639,7 +639,7 @@
> * Initialize the control inode/socket with a control endpoint data
> * structure. This endpoint is reserved exclusively for the OOTB processing.
> */
> -int sctp_ctl_sock_init(void)
> +static int sctp_ctl_sock_init(void)
> {
> int err;
> sa_family_t family;
> @@ -808,7 +808,7 @@
> return ip_queue_xmit(skb, ipfragok);
> }
>
> -struct sctp_af sctp_ipv4_specific;
> +static struct sctp_af sctp_ipv4_specific;
>
> static struct sctp_pf sctp_pf_inet = {
> .event_msgname = sctp_inet_event_msgname,
> @@ -829,7 +829,7 @@
> };
>
> /* Socket operations. */
> -struct proto_ops inet_seqpacket_ops = {
> +static struct proto_ops inet_seqpacket_ops = {
> .family = PF_INET,
> .owner = THIS_MODULE,
> .release = inet_release, /* Needs to be wrapped... */
> @@ -878,7 +878,7 @@
> };
>
> /* IPv4 address related functions. */
> -struct sctp_af sctp_ipv4_specific = {
> +static struct sctp_af sctp_ipv4_specific = {
> .sctp_xmit = sctp_v4_xmit,
> .setsockopt = ip_setsockopt,
> .getsockopt = ip_getsockopt,
> @@ -959,7 +959,7 @@
> }
>
> /* Initialize the universe into something sensible. */
> -__init int sctp_init(void)
> +static __init int sctp_init(void)
> {
> int i;
> int status = -EINVAL;
> @@ -1196,7 +1196,7 @@
> }
>
> /* Exit handler for the SCTP protocol.
*/ > -__exit void sctp_exit(void) > +static __exit void sctp_exit(void) > { > /* BUG. This should probably do something useful like clean > * up all the remaining associations and all that memory. > --- linux-2.6.10-rc2-mm3-full/net/sctp/sm_make_chunk.c.old 2004-11-25 01:43:40.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/sm_make_chunk.c 2004-11-25 01:51:15.000000000 +0100 > @@ -67,6 +67,18 @@ > > extern kmem_cache_t *sctp_chunk_cachep; > > +static struct sctp_chunk *sctp_make_chunk(const struct sctp_association *asoc, > + __u8 type, __u8 flags, int paylen); > +static sctp_cookie_param_t *sctp_pack_cookie(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const struct sctp_chunk *init_chunk, > + int *cookie_len, > + const __u8 *raw_addrs, int addrs_len); > +static int sctp_process_param(struct sctp_association *asoc, > + union sctp_params param, > + const union sctp_addr *peer_addr, > + int gfp); > + > /* What was the inbound interface for this chunk? */ > int sctp_chunk_iif(const struct sctp_chunk *chunk) > { > @@ -559,52 +571,6 @@ > return retval; > } > > -/* Make a DATA chunk for the given association. Populate the data > - * payload. > - */ > -struct sctp_chunk *sctp_make_datafrag(struct sctp_association *asoc, > - const struct sctp_sndrcvinfo *sinfo, > - int data_len, const __u8 *data, > - __u8 flags, __u16 ssn) > -{ > - struct sctp_chunk *retval; > - > - retval = sctp_make_datafrag_empty(asoc, sinfo, data_len, flags, ssn); > - if (retval) > - sctp_addto_chunk(retval, data_len, data); > - > - return retval; > -} > - > -/* Make a DATA chunk for the given association to ride on stream id > - * 'stream', with a payload id of 'payload', and a body of 'data'. 
> - */ > -struct sctp_chunk *sctp_make_data(struct sctp_association *asoc, > - const struct sctp_sndrcvinfo *sinfo, > - int data_len, const __u8 *data) > -{ > - struct sctp_chunk *retval = NULL; > - > - retval = sctp_make_data_empty(asoc, sinfo, data_len); > - if (retval) > - sctp_addto_chunk(retval, data_len, data); > - return retval; > -} > - > -/* Make a DATA chunk for the given association to ride on stream id > - * 'stream', with a payload id of 'payload', and a body big enough to > - * hold 'data_len' octets of data. We use this version when we need > - * to build the message AFTER allocating memory. > - */ > -struct sctp_chunk *sctp_make_data_empty(struct sctp_association *asoc, > - const struct sctp_sndrcvinfo *sinfo, > - int data_len) > -{ > - __u8 flags = SCTP_DATA_NOT_FRAG; > - > - return sctp_make_datafrag_empty(asoc, sinfo, data_len, flags, 0); > -} > - > /* Create a selective ackowledgement (SACK) for the given > * association. This reports on which TSN's we've seen to date, > * including duplicates and gaps. > @@ -933,7 +899,7 @@ > /* Create an Operation Error chunk with the specified space reserved. > * This routine can be used for containing multiple causes in the chunk. > */ > -struct sctp_chunk *sctp_make_op_error_space( > +static struct sctp_chunk *sctp_make_op_error_space( > const struct sctp_association *asoc, > const struct sctp_chunk *chunk, > size_t size) > @@ -1062,8 +1028,8 @@ > /* Create a new chunk, setting the type and flags headers from the > * arguments, reserving enough space for a 'paylen' byte payload. > */ > -struct sctp_chunk *sctp_make_chunk(const struct sctp_association *asoc, > - __u8 type, __u8 flags, int paylen) > +static struct sctp_chunk *sctp_make_chunk(const struct sctp_association *asoc, > + __u8 type, __u8 flags, int paylen) > { > struct sctp_chunk *retval; > sctp_chunkhdr_t *chunk_hdr; > @@ -1261,7 +1227,7 @@ > /* Build a cookie representing asoc. 
> * This INCLUDES the param header needed to put the cookie in the INIT ACK. > */ > -sctp_cookie_param_t *sctp_pack_cookie(const struct sctp_endpoint *ep, > +static sctp_cookie_param_t *sctp_pack_cookie(const struct sctp_endpoint *ep, > const struct sctp_association *asoc, > const struct sctp_chunk *init_chunk, > int *cookie_len, > @@ -1912,8 +1878,10 @@ > * work we do. In particular, we should not build transport > * structures for the addresses. > */ > -int sctp_process_param(struct sctp_association *asoc, union sctp_params param, > - const union sctp_addr *peer_addr, int gfp) > +static int sctp_process_param(struct sctp_association *asoc, > + union sctp_params param, > + const union sctp_addr *peer_addr, > + int gfp) > { > union sctp_addr addr; > int i; > @@ -2078,8 +2046,9 @@ > * > * Address Parameter and other parameter will not be wrapped in this function > */ > -struct sctp_chunk *sctp_make_asconf(struct sctp_association *asoc, > - union sctp_addr *addr, int vparam_len) > +static struct sctp_chunk *sctp_make_asconf(struct sctp_association *asoc, > + union sctp_addr *addr, > + int vparam_len) > { > sctp_addiphdr_t asconf; > struct sctp_chunk *retval; > @@ -2248,8 +2217,8 @@ > * > * Create an ASCONF_ACK chunk with enough space for the parameter responses. 
> */ > -struct sctp_chunk *sctp_make_asconf_ack(const struct sctp_association *asoc, > - __u32 serial, int vparam_len) > +static struct sctp_chunk *sctp_make_asconf_ack(const struct sctp_association *asoc, > + __u32 serial, int vparam_len) > { > sctp_addiphdr_t asconf; > struct sctp_chunk *retval; > --- linux-2.6.10-rc2-mm3-full/net/sctp/sm_sideeffect.c.old 2004-11-25 01:52:16.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/sm_sideeffect.c 2004-11-25 03:24:28.000000000 +0100 > @@ -55,6 +55,24 @@ > #include > #include > > +static int sctp_cmd_interpreter(sctp_event_t event_type, > + sctp_subtype_t subtype, > + sctp_state_t state, > + struct sctp_endpoint *ep, > + struct sctp_association *asoc, > + void *event_arg, > + sctp_disposition_t status, > + sctp_cmd_seq_t *commands, > + int gfp); > +static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype, > + sctp_state_t state, > + struct sctp_endpoint *ep, > + struct sctp_association *asoc, > + void *event_arg, > + sctp_disposition_t status, > + sctp_cmd_seq_t *commands, > + int gfp); > + > /******************************************************************** > * Helper functions > ********************************************************************/ > @@ -134,8 +152,8 @@ > } > > /* Generate SACK if necessary. We call this at the end of a packet. 
*/ > -int sctp_gen_sack(struct sctp_association *asoc, int force, > - sctp_cmd_seq_t *commands) > +static int sctp_gen_sack(struct sctp_association *asoc, int force, > + sctp_cmd_seq_t *commands) > { > __u32 ctsn, max_tsn_seen; > struct sctp_chunk *sack; > @@ -276,31 +294,31 @@ > sctp_association_put(asoc); > } > > -void sctp_generate_t1_cookie_event(unsigned long data) > +static void sctp_generate_t1_cookie_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *) data; > sctp_generate_timeout_event(asoc, SCTP_EVENT_TIMEOUT_T1_COOKIE); > } > > -void sctp_generate_t1_init_event(unsigned long data) > +static void sctp_generate_t1_init_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *) data; > sctp_generate_timeout_event(asoc, SCTP_EVENT_TIMEOUT_T1_INIT); > } > > -void sctp_generate_t2_shutdown_event(unsigned long data) > +static void sctp_generate_t2_shutdown_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *) data; > sctp_generate_timeout_event(asoc, SCTP_EVENT_TIMEOUT_T2_SHUTDOWN); > } > > -void sctp_generate_t4_rto_event(unsigned long data) > +static void sctp_generate_t4_rto_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *) data; > sctp_generate_timeout_event(asoc, SCTP_EVENT_TIMEOUT_T4_RTO); > } > > -void sctp_generate_t5_shutdown_guard_event(unsigned long data) > +static void sctp_generate_t5_shutdown_guard_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *)data; > sctp_generate_timeout_event(asoc, > @@ -308,7 +326,7 @@ > > } /* sctp_generate_t5_shutdown_guard_event() */ > > -void sctp_generate_autoclose_event(unsigned long data) > +static void sctp_generate_autoclose_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *) data; > sctp_generate_timeout_event(asoc, SCTP_EVENT_TIMEOUT_AUTOCLOSE); > @@ -353,7 +371,7 @@ > } > > /* 
Inject a SACK Timeout event into the state machine. */ > -void sctp_generate_sack_event(unsigned long data) > +static void sctp_generate_sack_event(unsigned long data) > { > struct sctp_association *asoc = (struct sctp_association *) data; > sctp_generate_timeout_event(asoc, SCTP_EVENT_TIMEOUT_SACK); > @@ -857,14 +875,14 @@ > /***************************************************************** > * This the master state function side effect processing function. > *****************************************************************/ > -int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype, > - sctp_state_t state, > - struct sctp_endpoint *ep, > - struct sctp_association *asoc, > - void *event_arg, > - sctp_disposition_t status, > - sctp_cmd_seq_t *commands, > - int gfp) > +static int sctp_side_effects(sctp_event_t event_type, sctp_subtype_t subtype, > + sctp_state_t state, > + struct sctp_endpoint *ep, > + struct sctp_association *asoc, > + void *event_arg, > + sctp_disposition_t status, > + sctp_cmd_seq_t *commands, > + int gfp) > { > int error; > > @@ -944,11 +962,15 @@ > ********************************************************************/ > > /* This is the side-effect interpreter. 
*/ > -int sctp_cmd_interpreter(sctp_event_t event_type, sctp_subtype_t subtype, > - sctp_state_t state, struct sctp_endpoint *ep, > - struct sctp_association *asoc, void *event_arg, > - sctp_disposition_t status, sctp_cmd_seq_t *commands, > - int gfp) > +static int sctp_cmd_interpreter(sctp_event_t event_type, > + sctp_subtype_t subtype, > + sctp_state_t state, > + struct sctp_endpoint *ep, > + struct sctp_association *asoc, > + void *event_arg, > + sctp_disposition_t status, > + sctp_cmd_seq_t *commands, > + int gfp) > { > int error = 0; > int force; > --- linux-2.6.10-rc2-mm3-full/net/sctp/sm_statefuns.c.old 2004-11-25 01:56:27.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/sm_statefuns.c 2004-11-25 02:14:47.000000000 +0100 > @@ -65,6 +65,33 @@ > #include > #include > > +static struct sctp_packet *sctp_abort_pkt_new(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + struct sctp_chunk *chunk, > + const void *payload, > + size_t paylen); > +static int sctp_eat_data(const struct sctp_association *asoc, > + struct sctp_chunk *chunk, > + sctp_cmd_seq_t *commands); > +static struct sctp_packet *sctp_ootb_pkt_new(const struct sctp_association *asoc, > + const struct sctp_chunk *chunk); > +static void sctp_send_stale_cookie_err(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const struct sctp_chunk *chunk, > + sctp_cmd_seq_t *commands, > + struct sctp_chunk *err_chunk); > +static sctp_disposition_t sctp_sf_do_5_2_6_stale(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const sctp_subtype_t type, > + void *arg, > + sctp_cmd_seq_t *commands); > +static sctp_disposition_t sctp_sf_shut_8_4_5(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const sctp_subtype_t type, > + void *arg, > + sctp_cmd_seq_t *commands); > +static struct sctp_sackhdr *sctp_sm_pull_sack(struct sctp_chunk *chunk); > + > /********************************************************** > * 
These are the state functions for handling chunk events. > **********************************************************/ > @@ -748,11 +775,11 @@ > } > > /* Generate and sendout a heartbeat packet. */ > -sctp_disposition_t sctp_sf_heartbeat(const struct sctp_endpoint *ep, > - const struct sctp_association *asoc, > - const sctp_subtype_t type, > - void *arg, > - sctp_cmd_seq_t *commands) > +static sctp_disposition_t sctp_sf_heartbeat(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const sctp_subtype_t type, > + void *arg, > + sctp_cmd_seq_t *commands) > { > struct sctp_transport *transport = (struct sctp_transport *) arg; > struct sctp_chunk *reply; > @@ -1928,11 +1955,11 @@ > * > * The return value is the disposition of the chunk. > */ > -sctp_disposition_t sctp_sf_do_5_2_6_stale(const struct sctp_endpoint *ep, > - const struct sctp_association *asoc, > - const sctp_subtype_t type, > - void *arg, > - sctp_cmd_seq_t *commands) > +static sctp_disposition_t sctp_sf_do_5_2_6_stale(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const sctp_subtype_t type, > + void *arg, > + sctp_cmd_seq_t *commands) > { > struct sctp_chunk *chunk = arg; > time_t stale; > @@ -2853,11 +2880,11 @@ > * > * The return value is the disposition of the chunk. > */ > -sctp_disposition_t sctp_sf_shut_8_4_5(const struct sctp_endpoint *ep, > - const struct sctp_association *asoc, > - const sctp_subtype_t type, > - void *arg, > - sctp_cmd_seq_t *commands) > +static sctp_disposition_t sctp_sf_shut_8_4_5(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const sctp_subtype_t type, > + void *arg, > + sctp_cmd_seq_t *commands) > { > struct sctp_packet *packet = NULL; > struct sctp_chunk *chunk = arg; > @@ -4537,7 +4564,7 @@ > ********************************************************************/ > > /* Pull the SACK chunk based on the SACK header. 
*/ > -struct sctp_sackhdr *sctp_sm_pull_sack(struct sctp_chunk *chunk) > +static struct sctp_sackhdr *sctp_sm_pull_sack(struct sctp_chunk *chunk) > { > struct sctp_sackhdr *sack; > unsigned int len; > @@ -4564,7 +4591,7 @@ > /* Create an ABORT packet to be sent as a response, with the specified > * error causes. > */ > -struct sctp_packet *sctp_abort_pkt_new(const struct sctp_endpoint *ep, > +static struct sctp_packet *sctp_abort_pkt_new(const struct sctp_endpoint *ep, > const struct sctp_association *asoc, > struct sctp_chunk *chunk, > const void *payload, > @@ -4600,8 +4627,8 @@ > } > > /* Allocate a packet for responding in the OOTB conditions. */ > -struct sctp_packet *sctp_ootb_pkt_new(const struct sctp_association *asoc, > - const struct sctp_chunk *chunk) > +static struct sctp_packet *sctp_ootb_pkt_new(const struct sctp_association *asoc, > + const struct sctp_chunk *chunk) > { > struct sctp_packet *packet; > struct sctp_transport *transport; > @@ -4664,11 +4691,11 @@ > } > > /* Send a stale cookie error when a invalid COOKIE ECHO chunk is found */ > -void sctp_send_stale_cookie_err(const struct sctp_endpoint *ep, > - const struct sctp_association *asoc, > - const struct sctp_chunk *chunk, > - sctp_cmd_seq_t *commands, > - struct sctp_chunk *err_chunk) > +static void sctp_send_stale_cookie_err(const struct sctp_endpoint *ep, > + const struct sctp_association *asoc, > + const struct sctp_chunk *chunk, > + sctp_cmd_seq_t *commands, > + struct sctp_chunk *err_chunk) > { > struct sctp_packet *packet; > > @@ -4694,9 +4721,9 @@ > > > /* Process a data chunk */ > -int sctp_eat_data(const struct sctp_association *asoc, > - struct sctp_chunk *chunk, > - sctp_cmd_seq_t *commands) > +static int sctp_eat_data(const struct sctp_association *asoc, > + struct sctp_chunk *chunk, > + sctp_cmd_seq_t *commands) > { > sctp_datahdr_t *data_hdr; > struct sctp_chunk *err; > --- linux-2.6.10-rc2-mm3-full/net/sctp/sm_statetable.c.old 2004-11-25 02:15:50.000000000 +0100 > +++ 
linux-2.6.10-rc2-mm3-full/net/sctp/sm_statetable.c 2004-11-25 03:23:43.000000000 +0100 > @@ -50,6 +50,17 @@ > #include > #include > > +static const sctp_sm_table_entry_t > +primitive_event_table[SCTP_NUM_PRIMITIVE_TYPES][SCTP_STATE_NUM_STATES]; > +static const sctp_sm_table_entry_t > +other_event_table[SCTP_NUM_OTHER_TYPES][SCTP_STATE_NUM_STATES]; > +static const sctp_sm_table_entry_t > +timeout_event_table[SCTP_NUM_TIMEOUT_TYPES][SCTP_STATE_NUM_STATES]; > + > +static const sctp_sm_table_entry_t *sctp_chunk_event_lookup(sctp_cid_t cid, > + sctp_state_t state); > + > + > static const sctp_sm_table_entry_t bug = { > .fn = sctp_sf_bug, > .name = "sctp_sf_bug" > @@ -419,7 +430,7 @@ > * > * For base protocol (RFC 2960). > */ > -const sctp_sm_table_entry_t chunk_event_table[SCTP_NUM_BASE_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { > +static const sctp_sm_table_entry_t chunk_event_table[SCTP_NUM_BASE_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { > TYPE_SCTP_DATA, > TYPE_SCTP_INIT, > TYPE_SCTP_INIT_ACK, > @@ -482,7 +493,7 @@ > /* The primary index for this table is the chunk type. > * The secondary index for this table is the state. > */ > -const sctp_sm_table_entry_t addip_chunk_event_table[SCTP_NUM_ADDIP_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { > +static const sctp_sm_table_entry_t addip_chunk_event_table[SCTP_NUM_ADDIP_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { > TYPE_SCTP_ASCONF, > TYPE_SCTP_ASCONF_ACK, > }; /*state_fn_t addip_chunk_event_table[][] */ > @@ -511,7 +522,7 @@ > /* The primary index for this table is the chunk type. > * The secondary index for this table is the state. > */ > -const sctp_sm_table_entry_t prsctp_chunk_event_table[SCTP_NUM_PRSCTP_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { > +static const sctp_sm_table_entry_t prsctp_chunk_event_table[SCTP_NUM_PRSCTP_CHUNK_TYPES][SCTP_STATE_NUM_STATES] = { > TYPE_SCTP_FWD_TSN, > }; /*state_fn_t prsctp_chunk_event_table[][] */ > > @@ -684,7 +695,7 @@ > /* The primary index for this table is the primitive type. 
> * The secondary index for this table is the state. > */ > -const sctp_sm_table_entry_t primitive_event_table[SCTP_NUM_PRIMITIVE_TYPES][SCTP_STATE_NUM_STATES] = { > +static const sctp_sm_table_entry_t primitive_event_table[SCTP_NUM_PRIMITIVE_TYPES][SCTP_STATE_NUM_STATES] = { > TYPE_SCTP_PRIMITIVE_ASSOCIATE, > TYPE_SCTP_PRIMITIVE_SHUTDOWN, > TYPE_SCTP_PRIMITIVE_ABORT, > @@ -716,7 +727,7 @@ > {.fn = sctp_sf_ignore_other, .name = "sctp_sf_ignore_other"}, \ > } > > -const sctp_sm_table_entry_t other_event_table[SCTP_NUM_OTHER_TYPES][SCTP_STATE_NUM_STATES] = { > +static const sctp_sm_table_entry_t other_event_table[SCTP_NUM_OTHER_TYPES][SCTP_STATE_NUM_STATES] = { > TYPE_SCTP_OTHER_NO_PENDING_TSN, > }; > > @@ -931,7 +942,7 @@ > {.fn = sctp_sf_timer_ignore, .name = "sctp_sf_timer_ignore"}, \ > } > > -const sctp_sm_table_entry_t timeout_event_table[SCTP_NUM_TIMEOUT_TYPES][SCTP_STATE_NUM_STATES] = { > +static const sctp_sm_table_entry_t timeout_event_table[SCTP_NUM_TIMEOUT_TYPES][SCTP_STATE_NUM_STATES] = { > TYPE_SCTP_EVENT_TIMEOUT_NONE, > TYPE_SCTP_EVENT_TIMEOUT_T1_COOKIE, > TYPE_SCTP_EVENT_TIMEOUT_T1_INIT, > @@ -944,8 +955,8 @@ > TYPE_SCTP_EVENT_TIMEOUT_AUTOCLOSE, > }; > > -const sctp_sm_table_entry_t *sctp_chunk_event_lookup(sctp_cid_t cid, > - sctp_state_t state) > +static const sctp_sm_table_entry_t *sctp_chunk_event_lookup(sctp_cid_t cid, > + sctp_state_t state) > { > if (state > SCTP_STATE_MAX) > return &bug; > --- linux-2.6.10-rc2-mm3-full/net/sctp/socket.c.old 2004-11-25 02:19:43.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/socket.c 2004-11-25 02:20:25.000000000 +0100 > @@ -208,7 +208,7 @@ > * id are specified, the associations matching the address and the id should be > * the same. 
> */ > -struct sctp_transport *sctp_addr_id2transport(struct sock *sk, > +static struct sctp_transport *sctp_addr_id2transport(struct sock *sk, > struct sockaddr_storage *addr, > sctp_assoc_t id) > { > @@ -245,7 +245,7 @@ > * sockaddr_in6 [RFC 2553]), > * addr_len - the size of the address structure. > */ > -int sctp_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) > +static int sctp_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) > { > int retval = 0; > > --- linux-2.6.10-rc2-mm3-full/net/sctp/ssnmap.c.old 2004-11-25 02:20:54.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/ssnmap.c 2004-11-25 03:29:59.000000000 +0100 > @@ -42,6 +42,9 @@ > > #define MAX_KMALLOC_SIZE 131072 > > +static struct sctp_ssnmap *sctp_ssnmap_init(struct sctp_ssnmap *map, __u16 in, > + __u16 out); > + > /* Storage size needed for map includes 2 headers and then the > * specific needs of in or out streams. > */ > @@ -87,8 +90,8 @@ > > > /* Initialize a block of memory as a ssnmap. */ > -struct sctp_ssnmap *sctp_ssnmap_init(struct sctp_ssnmap *map, __u16 in, > - __u16 out) > +static struct sctp_ssnmap *sctp_ssnmap_init(struct sctp_ssnmap *map, __u16 in, > + __u16 out) > { > memset(map, 0x00, sctp_ssnmap_size(in, out)); > > --- linux-2.6.10-rc2-mm3-full/net/sctp/transport.c.old 2004-11-25 02:22:16.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/transport.c 2004-11-25 02:23:01.000000000 +0100 > @@ -54,34 +54,10 @@ > > /* 1st Level Abstractions. */ > > -/* Allocate and initialize a new transport. 
*/ > -struct sctp_transport *sctp_transport_new(const union sctp_addr *addr, int gfp) > -{ > - struct sctp_transport *transport; > - > - transport = t_new(struct sctp_transport, gfp); > - if (!transport) > - goto fail; > - > - if (!sctp_transport_init(transport, addr, gfp)) > - goto fail_init; > - > - transport->malloced = 1; > - SCTP_DBG_OBJCNT_INC(transport); > - > - return transport; > - > -fail_init: > - kfree(transport); > - > -fail: > - return NULL; > -} > - > /* Initialize a new transport from provided memory. */ > -struct sctp_transport *sctp_transport_init(struct sctp_transport *peer, > - const union sctp_addr *addr, > - int gfp) > +static struct sctp_transport *sctp_transport_init(struct sctp_transport *peer, > + const union sctp_addr *addr, > + int gfp) > { > /* Copy in the address. */ > peer->ipaddr = *addr; > @@ -144,6 +120,30 @@ > return peer; > } > > +/* Allocate and initialize a new transport. */ > +struct sctp_transport *sctp_transport_new(const union sctp_addr *addr, int gfp) > +{ > + struct sctp_transport *transport; > + > + transport = t_new(struct sctp_transport, gfp); > + if (!transport) > + goto fail; > + > + if (!sctp_transport_init(transport, addr, gfp)) > + goto fail_init; > + > + transport->malloced = 1; > + SCTP_DBG_OBJCNT_INC(transport); > + > + return transport; > + > +fail_init: > + kfree(transport); > + > +fail: > + return NULL; > +} > + > /* This transport is no longer needed. Free up if possible, or > * delay until it last reference count. > */ > @@ -161,7 +161,7 @@ > /* Destroy the transport data structure. > * Assumes there are no more users of this structure. 
> */ > -void sctp_transport_destroy(struct sctp_transport *transport) > +static void sctp_transport_destroy(struct sctp_transport *transport) > { > SCTP_ASSERT(transport->dead, "Transport is not dead", return); > > --- linux-2.6.10-rc2-mm3-full/include/net/sctp/tsnmap.h.old 2004-11-25 02:23:58.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/tsnmap.h 2004-11-25 02:24:21.000000000 +0100 > @@ -120,12 +120,6 @@ > __u32 start; > }; > > -/* Create a new tsnmap. */ > -struct sctp_tsnmap *sctp_tsnmap_new(__u16 len, __u32 init_tsn, int gfp); > - > -/* Dispose of a tsnmap. */ > -void sctp_tsnmap_free(struct sctp_tsnmap *); > - > /* This macro assists in creation of external storage for variable length > * internal buffers. We double allocate so the overflow map works. > */ > @@ -210,14 +204,4 @@ > /* Is there a gap in the TSN map? */ > int sctp_tsnmap_has_gap(const struct sctp_tsnmap *); > > -/* Initialize a gap ack block interator from user-provided memory. */ > -void sctp_tsnmap_iter_init(const struct sctp_tsnmap *, > - struct sctp_tsnmap_iter *); > - > -/* Get the next gap ack blocks. We return 0 if there are no more > - * gap ack blocks. > - */ > -int sctp_tsnmap_next_gap_ack(const struct sctp_tsnmap *, > - struct sctp_tsnmap_iter *,__u16 *start, __u16 *end); > - > #endif /* __sctp_tsnmap_h__ */ > --- linux-2.6.10-rc2-mm3-full/net/sctp/tsnmap.c.old 2004-11-25 02:24:29.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/tsnmap.c 2004-11-25 02:25:39.000000000 +0100 > @@ -52,29 +52,6 @@ > int *started, __u16 *start, > int *ended, __u16 *end); > > -/* Create a new sctp_tsnmap. > - * Allocate room to store at least 'len' contiguous TSNs. 
> - */ > -struct sctp_tsnmap *sctp_tsnmap_new(__u16 len, __u32 initial_tsn, int gfp) > -{ > - struct sctp_tsnmap *retval; > - > - retval = kmalloc(sizeof(struct sctp_tsnmap) + > - sctp_tsnmap_storage_size(len), gfp); > - if (!retval) > - goto fail; > - > - if (!sctp_tsnmap_init(retval, len, initial_tsn)) > - goto fail_map; > - retval->malloced = 1; > - return retval; > - > -fail_map: > - kfree(retval); > -fail: > - return NULL; > -} > - > /* Initialize a block of memory as a tsnmap. */ > struct sctp_tsnmap *sctp_tsnmap_init(struct sctp_tsnmap *map, __u16 len, > __u32 initial_tsn) > @@ -168,16 +145,9 @@ > } > > > -/* Dispose of a tsnmap. */ > -void sctp_tsnmap_free(struct sctp_tsnmap *map) > -{ > - if (map->malloced) > - kfree(map); > -} > - > /* Initialize a Gap Ack Block iterator from memory being provided. */ > -void sctp_tsnmap_iter_init(const struct sctp_tsnmap *map, > - struct sctp_tsnmap_iter *iter) > +static void sctp_tsnmap_iter_init(const struct sctp_tsnmap *map, > + struct sctp_tsnmap_iter *iter) > { > /* Only start looking one past the Cumulative TSN Ack Point. */ > iter->start = map->cumulative_tsn_ack_point + 1; > @@ -186,8 +156,9 @@ > /* Get the next Gap Ack Blocks. Returns 0 if there was not another block > * to get. 
> */ > -int sctp_tsnmap_next_gap_ack(const struct sctp_tsnmap *map, > - struct sctp_tsnmap_iter *iter, __u16 *start, __u16 *end) > +static int sctp_tsnmap_next_gap_ack(const struct sctp_tsnmap *map, > + struct sctp_tsnmap_iter *iter, > + __u16 *start, __u16 *end) > { > int started, ended; > __u16 _start, _end, offset; > --- linux-2.6.10-rc2-mm3-full/include/net/sctp/ulpevent.h.old 2004-11-25 02:26:08.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/ulpevent.h 2004-11-25 02:26:19.000000000 +0100 > @@ -77,8 +77,6 @@ > return (struct sctp_ulpevent *)skb->cb; > } > > -struct sctp_ulpevent *sctp_ulpevent_new(int size, int flags, int gfp); > -void sctp_ulpevent_init(struct sctp_ulpevent *, int flags); > void sctp_ulpevent_free(struct sctp_ulpevent *); > int sctp_ulpevent_is_notification(const struct sctp_ulpevent *); > void sctp_queue_purge_ulpevents(struct sk_buff_head *list); > --- linux-2.6.10-rc2-mm3-full/net/sctp/ulpevent.c.old 2004-11-25 02:26:26.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/ulpevent.c 2004-11-25 02:27:38.000000000 +0100 > @@ -65,8 +65,17 @@ > */ > } > > +/* Initialize an ULP event from an given skb. */ > +static void sctp_ulpevent_init(struct sctp_ulpevent *event, int msg_flags) > +{ > + memset(event, 0, sizeof(struct sctp_ulpevent)); > + event->msg_flags = msg_flags; > +} > + > /* Create a new sctp_ulpevent. */ > -struct sctp_ulpevent *sctp_ulpevent_new(int size, int msg_flags, int gfp) > +static struct sctp_ulpevent *sctp_ulpevent_new(int size, > + int msg_flags, > + int gfp) > { > struct sctp_ulpevent *event; > struct sk_buff *skb; > @@ -84,13 +93,6 @@ > return NULL; > } > > -/* Initialize an ULP event from an given skb. */ > -void sctp_ulpevent_init(struct sctp_ulpevent *event, int msg_flags) > -{ > - memset(event, 0, sizeof(struct sctp_ulpevent)); > - event->msg_flags = msg_flags; > -} > - > /* Is this a MSG_NOTIFICATION? 
*/ > int sctp_ulpevent_is_notification(const struct sctp_ulpevent *event) > { > --- linux-2.6.10-rc2-mm3-full/include/net/sctp/ulpqueue.h.old 2004-11-25 02:28:27.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/include/net/sctp/ulpqueue.h 2004-11-25 02:28:36.000000000 +0100 > @@ -57,7 +57,6 @@ > }; > > /* Prototypes. */ > -struct sctp_ulpq *sctp_ulpq_new(struct sctp_association *asoc, int gfp); > struct sctp_ulpq *sctp_ulpq_init(struct sctp_ulpq *, > struct sctp_association *); > void sctp_ulpq_free(struct sctp_ulpq *); > --- linux-2.6.10-rc2-mm3-full/net/sctp/ulpqueue.c.old 2004-11-25 02:28:44.000000000 +0100 > +++ linux-2.6.10-rc2-mm3-full/net/sctp/ulpqueue.c 2004-11-25 02:28:58.000000000 +0100 > @@ -56,25 +56,6 @@ > > /* 1st Level Abstractions */ > > -/* Create a new ULP queue. */ > -struct sctp_ulpq *sctp_ulpq_new(struct sctp_association *asoc, int gfp) > -{ > - struct sctp_ulpq *ulpq; > - > - ulpq = kmalloc(sizeof(struct sctp_ulpq), gfp); > - if (!ulpq) > - goto fail; > - if (!sctp_ulpq_init(ulpq, asoc)) > - goto fail_init; > - ulpq->malloced = 1; > - return ulpq; > - > -fail_init: > - kfree(ulpq); > -fail: > - return NULL; > -} > - > /* Initialize a ULP queue from a block of memory. */ > struct sctp_ulpq *sctp_ulpq_init(struct sctp_ulpq *ulpq, > struct sctp_association *asoc) > @@ -92,7 +73,7 @@ > > > /* Flush the reassembly and ordering queues. 
*/ > -void sctp_ulpq_flush(struct sctp_ulpq *ulpq) > +static void sctp_ulpq_flush(struct sctp_ulpq *ulpq) > { > struct sk_buff *skb; > struct sctp_ulpevent *event; > > From jesse.brandeburg@intel.com Tue Dec 7 17:13:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 17:13:47 -0800 (PST) Received: from orsfmr003.jf.intel.com (fmr18.intel.com [134.134.136.17]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB81Dgvh020382 for ; Tue, 7 Dec 2004 17:13:43 -0800 Received: from orsfmr100.jf.intel.com (orsfmr100.jf.intel.com [10.7.209.16]) by orsfmr003.jf.intel.com (8.12.10/8.12.10/d: major-outer.mc,v 1.1 2004/09/17 17:50:56 root Exp $) with ESMTP id iB81DFcE006153; Wed, 8 Dec 2004 01:13:15 GMT Received: from orsmsxvs040.jf.intel.com (orsmsxvs040.jf.intel.com [192.168.65.206]) by orsfmr100.jf.intel.com (8.12.10/8.12.10/d: major-inner.mc,v 1.2 2004/09/17 18:05:01 root Exp $) with SMTP id iB81D8Zp007661; Wed, 8 Dec 2004 01:13:08 GMT Received: from orsmsx332.amr.corp.intel.com ([192.168.65.60]) by orsmsxvs040.jf.intel.com (SAVSMTP 3.1.2.35) with SMTP id M2004120717102922328 ; Tue, 07 Dec 2004 17:10:32 -0800 Received: from orsmsx408.amr.corp.intel.com ([192.168.65.52]) by orsmsx332.amr.corp.intel.com with Microsoft SMTPSVC(6.0.3790.0); Tue, 7 Dec 2004 17:10:27 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Subject: RE: how to tune a pair of e1000 cards on intel e7501-based system? Date: Tue, 7 Dec 2004 17:10:26 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: how to tune a pair of e1000 cards on intel e7501-based system? 
Thread-Index: AcTb6WG4udppu+VYRkClljDSThg2tgA2HYsw From: "Brandeburg, Jesse" To: "Ray Lehtiniemi" , "Scott Feldman" Cc: X-OriginalArrivalTime: 08 Dec 2004 01:10:27.0208 (UTC) FILETIME=[B3D54080:01C4DCC2] X-Scanned-By: MIMEDefang 2.44 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iB81Dgvh020382 X-archive-position: 12550 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jesse.brandeburg@intel.com Precedence: bulk X-list: netdev > > > What kind of numbers are you getting? > > i'm seeing about 100Kpps, with all settings at their defaults on the > 2.4.20 > kernel. > > basically, i have a couple of desktop PCs generating 480 streams > of UDP data at 50 packets per second. Packet size on the wire, including > 96 bits of IFG, is 128 bytes. these packets are forwarded through a user > process on the NexGen box to an echoer process which is also running on > the > traffic generator boxes. the echoer sends them back to the NexGen user > process, > which forwards them back to the generator process. timestamps are logged > for each packet at send, loop and recv points. > I'm not much of an expert, but one easy thing to try is to raise your receive socket buffer sizes, as the defaults were particularly low on 2.4 series kernels, so UDP gets overrun pretty easily with gigabit NICs. I think if you set the value too high it just gets ignored, so if you see no change, try 256kB instead.
cat /proc/sys/net/core/rmem_default
cat /proc/sys/net/core/rmem_max
echo -n 512000 > /proc/sys/net/core/rmem_default
echo -n 512000 > /proc/sys/net/core/rmem_max

Jesse

From hadi@cyberus.ca Tue Dec 7 20:28:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 20:28:08 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB84Rxlo027978 for ; Tue, 7 Dec 2004 20:28:00 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbtQS-0007v3-92 for netdev@oss.sgi.com; Tue, 07 Dec 2004 23:27:40 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbtQI-0007Z6-Jb; Tue, 07 Dec 2004 23:27:30 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" In-Reply-To: <41B5E722.2080600@trash.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102480044.1050.9.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 23:27:25 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12551 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-07 at 12:23, Patrick McHardy wrote: > Either one is fine with me, although I would prefer to see > the number of ifdefs in this area going down, not up :) You guys pick one or other or a mix.
I run 4 base testcases for the policer typically: 1) Old kernel, uptodate TC - MUST pass 2) old kernel, old tc (trivial - expected to pass). 3) New Kernel, uptodate TC - MUST pass 4) New Kernel, uptodate TC - MUST pass (although trivial) Try both setting, dumping then deleting policies. If these tests pass, please push patch to Dave. cheers, jamal From hadi@cyberus.ca Tue Dec 7 20:32:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 20:32:33 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB84WSQx028360 for ; Tue, 7 Dec 2004 20:32:28 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CbtUn-0001AA-1E for netdev@oss.sgi.com; Tue, 07 Dec 2004 23:32:09 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CbtUh-0007vs-Cn; Tue, 07 Dec 2004 23:32:03 -0500 Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets From: jamal Reply-To: hadi@cyberus.ca To: Karsten Desler Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, "David S. 
Miller" , Robert Olsson , P@draigBrady.com In-Reply-To: <20041207211035.GA20286@quickstop.soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041207211035.GA20286@quickstop.soohrt.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102480318.1050.16.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 23:31:58 -0500 Content-Transfer-Encoding: 7bit X-archive-position: 12552 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-07 at 16:10, Karsten Desler wrote: > Karsten Desler wrote: > > Current packetload on eth0 (and reversed on eth1): > > 115kpps tx > > 135kpps rx > > I totally forgot to mention: There are approximately 100k concurrent > flows. ;-> Aha. That would make a huge difference. I know of no one who has actually done this level of testing. I have tried up to about 50K flows myself in early 2.6.x and was eventually compute bound. Really, try compiling out iptables/netfilter totally - it will make a difference. You may also want to try something like the LC-trie algorithm that Robert and co are playing with to see if it makes a difference with this many flows.
cheers, jamal From hadi@znyx.com Tue Dec 7 20:42:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 20:42:27 -0800 (PST) Received: from lotus.znyx.com (znx208-2-156-007.znyx.com [208.2.156.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB84gM8P029034 for ; Tue, 7 Dec 2004 20:42:22 -0800 Received: from [127.0.0.1] ([208.2.156.2]) by lotus.znyx.com (Lotus Domino Release 5.0.11) with ESMTP id 2004120720452509:52708 ; Tue, 7 Dec 2004 20:45:25 -0800 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: Jamal Hadi Salim Reply-To: hadi@znyx.com To: Patrick McHardy Cc: Thomas Graf , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" In-Reply-To: <1102480044.1050.9.camel@jzny.localdomain> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> Organization: ZNYX Networks Message-Id: <1102480913.1049.24.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 07 Dec 2004 23:41:53 -0500 X-MIMETrack: Itemize by SMTP Server on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 12/07/2004 08:45:25 PM, Serialize by Router on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 12/07/2004 08:45:30 PM, Serialize complete at 12/07/2004 08:45:30 PM Content-Transfer-Encoding: 7bit Content-Type: text/plain X-archive-position: 12553 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@znyx.com Precedence: bulk X-list: netdev BTW, old kernel in this case implies one that does not support tc actions at all. So pick something like 2.4.28. New is whatever 2.6.x with patch. Old tc is something that for example ships with redhat new tc is whatever one is patched. 
Supplementary tests are: in 2.6.x to compile the policer in two different ways a) via tc actions and b) using the old scheme which is understood by "old" tc. Repeat the tests i described earlier with b) pretending to be "old" kernel. In fact, come to think of it, i would also prefer to have the supplementary tests run as well. If you guys have no cycles, please pass the patch to me and i will test on the weekend. cheers, jamal On Tue, 2004-12-07 at 23:27, jamal wrote: > On Tue, 2004-12-07 at 12:23, Patrick McHardy wrote: > > > Either one is fine with me, although I would prefer to see > > the number of ifdefs in this area going down, not up :) > > You guys pick one or other or a mix. I run 4 base testcases for the > policer typically: > > 1) Old kernel, uptodate TC - MUST pass > 2) old kernel, old tc (trivial - expected to pass). > 3) New Kernel, uptodate TC - MUST pass > 4) New Kernel, uptodate TC - MUST pass (although trivial) > > Try both setting, dumping then deleting policies. > > If these tests pass, please push patch to Dave. > > cheers, > jamal > From kaber@trash.net Tue Dec 7 21:17:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 21:18:03 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB85Hu8C031042 for ; Tue, 7 Dec 2004 21:17:57 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CbuCT-000158-Jy; Wed, 08 Dec 2004 06:17:17 +0100 Message-ID: <41B68E5D.2080009@trash.net> Date: Wed, 08 Dec 2004 06:17:17 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@znyx.com CC: Thomas Graf , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. 
Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> In-Reply-To: <1102480913.1049.24.camel@jzny.localdomain> Content-Type: multipart/mixed; boundary="------------040309070208000504050501" X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12554 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------040309070208000504050501 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Jamal Hadi Salim wrote: >BTW, old kernel in this case implies one that does not support tc >actions at all. So pick something like 2.4.28. >New is whatever 2.6.x with patch. >Old tc is something that for example ships with redhat >new tc is whatever one is patched. > >Supplementary tests are: in 2.6.x to compile the policer >in two different ways a) via tc actions and b) using the old scheme >which is understood by "old" tc. Repeat the tests i described earlier >with b) pretending to be "old" kernel. > >Infact come to think of it i would also prefer to have the suplementary >tests run as well. >If you guys have no cycles, please pass the patch to me and i will test >on the weekend. > > I think these tests are a waste of time. struct tcf_police is not userspace-visible, so it's highly unlikely that the tc version matters. Why an old kernel needs to be tested is beyond me. 
For possible in-kernel breakage caused by the restructuring, without CONFIG_NET_CLS_ACT, struct tcf_police is only used in police.c, without any casts or assumptions about layout, so I can't see what could break. With CONFIG_NET_CLS_ACT, the only place where it is used outside of police.c is tcf_action_copy_stats, and this is exactly what this patch (tested) fixes. If you still want to do these test, please use the attached patch. Regards Patrick --------------040309070208000504050501 Content-Type: text/plain; name="x" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="x" ===== include/net/act_api.h 1.4 vs edited ===== --- 1.4/include/net/act_api.h 2004-11-06 01:33:12 +01:00 +++ edited/include/net/act_api.h 2004-12-07 17:53:37 +01:00 @@ -8,15 +8,23 @@ #include #include +#define tca_gen(name) \ +struct tcf_##name *next; \ + u32 index; \ + int refcnt; \ + int bindcnt; \ + u32 capab; \ + int action; \ + struct tcf_t tm; \ + struct gnet_stats_basic bstats; \ + struct gnet_stats_queue qstats; \ + struct gnet_stats_rate_est rate_est; \ + spinlock_t *stats_lock; \ + spinlock_t lock + struct tcf_police { - struct tcf_police *next; - int refcnt; -#ifdef CONFIG_NET_CLS_ACT - int bindcnt; -#endif - u32 index; - int action; + tca_gen(police); int result; u32 ewma_rate; u32 burst; @@ -24,33 +32,14 @@ u32 toks; u32 ptoks; psched_time_t t_c; - spinlock_t lock; struct qdisc_rate_table *R_tab; struct qdisc_rate_table *P_tab; - - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - spinlock_t *stats_lock; }; #ifdef CONFIG_NET_CLS_ACT #define ACT_P_CREATED 1 #define ACT_P_DELETED 1 -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock struct tcf_act_hdr { 
--------------040309070208000504050501-- From davem@davemloft.net Tue Dec 7 21:30:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 21:30:36 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB85UVCs031807 for ; Tue, 7 Dec 2004 21:30:31 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CbuN8-0001VE-00; Tue, 07 Dec 2004 21:28:18 -0800 Date: Tue, 7 Dec 2004 21:28:17 -0800 From: "David S. Miller" To: Mitchell Blank Jr Cc: kernel@linuxace.com, shemminger@osdl.org, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH] fix select() for SOCK_RAW sockets Message-Id: <20041207212817.1b74671b.davem@davemloft.net> In-Reply-To: <20041207150834.GA75700@gaz.sfgoth.com> References: <20041207003525.GA22933@linuxace.com> <20041207025218.GB61527@gaz.sfgoth.com> <20041207045302.GA23746@linuxace.com> <20041207054840.GD61527@gaz.sfgoth.com> <20041207150834.GA75700@gaz.sfgoth.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12555 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 7 Dec 2004 07:08:34 -0800 Mitchell Blank Jr wrote: > Davem: I only tested that this doesn't break UDP; if it works for Phil and > Stephen can verify that it doesn't break his bad-checksum UDP tests then > please push it for 
2.6.10. Looks good Mitchell, patch applied. Thanks. From davem@davemloft.net Tue Dec 7 21:33:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 21:33:29 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB85XNIC032269 for ; Tue, 7 Dec 2004 21:33:23 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CbuPd-0001Vp-00; Tue, 07 Dec 2004 21:30:53 -0800 Date: Tue, 7 Dec 2004 21:30:53 -0800 From: "David S. Miller" To: Patrick McHardy Cc: tgraf@suug.ch, hadi@cyberus.ca, akpm@osdl.org, tomc@compaqnet.fr, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-Id: <20041207213053.6bb602c1.davem@davemloft.net> In-Reply-To: <41B5E722.2080600@trash.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12556 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 07 Dec 2004 18:23:46 +0100 Patrick McHardy wrote: > Either one is fine with me, although I would prefer to see > the number of ifdefs in this area going down, not up :) I agree, therefore I 
applied Patrick's patch. From bunk@stusta.de Tue Dec 7 21:33:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 21:34:06 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iB85Xwq4032505 for ; Tue, 7 Dec 2004 21:33:58 -0800 Received: (qmail 13316 invoked from network); 8 Dec 2004 03:33:30 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 8 Dec 2004 03:33:30 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id A26C6BBA6A; Wed, 8 Dec 2004 04:33:26 +0100 (CET) Date: Wed, 8 Dec 2004 04:33:26 +0100 From: Adrian Bunk To: prism54-private@prism54.org, Andrew Morton Cc: netdev@oss.sgi.com, jgarzik@pobox.com, linux-net@vger.kernel.org Subject: [2.6 patch] prism54: small prismcompat cleanup (fwd) Message-ID: <20041208033326.GQ5496@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12557 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch forwarded below (slightly adapted for 2.6.10-rc2-mm4) still applies. Please apply. 
----- Forwarded message from Adrian Bunk ----- Date: Sat, 30 Oct 2004 07:38:00 +0200 From: Adrian Bunk To: Margit Schubert-While Cc: prism54-private@prism54.org, netdev@oss.sgi.com, jgarzik@pobox.com, linux-net@vger.kernel.org Subject: [2.6 patch] prism54: small prismcompat cleanup - the FW_LOADER is already guaranteed through the Kconfig file - prism54_synchronize_irq is also #define'd to synchronize_irq in prismcompat24.h, so there's no need for it diffstat output: drivers/net/wireless/prism54/islpci_dev.c | 2 +- drivers/net/wireless/prism54/prismcompat.h | 6 ------ 2 files changed, 1 insertion(+), 7 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_dev.c.old2 2004-10-30 07:23:07.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_dev.c 2004-10-30 07:23:19.000000000 +0200 @@ -420,7 +420,7 @@ * currently in progress by emptying the TX and RX queues. */ /* wait until interrupts have finished executing on other CPUs */ - prism54_synchronize_irq(priv->pdev->irq); + synchronize_irq(priv->pdev->irq); reg = readl(device_base + ISL38XX_CTRL_STAT_REG); reg &= ~(ISL38XX_CTRL_STAT_RESET | ISL38XX_CTRL_STAT_RAMBOOT); --- linux-2.6.10-rc2-mm4-full/drivers/net/wireless/prism54/prismcompat.h.old 2004-12-08 04:06:13.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/drivers/net/wireless/prism54/prismcompat.h 2004-12-08 04:06:25.000000000 +0100 @@ -34,16 +34,10 @@ #include #include -#if !defined(CONFIG_FW_LOADER) && !defined(CONFIG_FW_LOADER_MODULE) -#error Firmware Loading is not configured in the kernel ! 
-#endif - #ifndef __iomem #define __iomem #endif -#define prism54_synchronize_irq(irq) synchronize_irq(irq) - #define PRISM_FW_PDEV &priv->pdev->dev #endif /* _PRISM_COMPAT_H */ From davem@davemloft.net Tue Dec 7 21:34:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 21:34:49 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB85YhFg000624 for ; Tue, 7 Dec 2004 21:34:43 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CbuRG-0001Wk-00; Tue, 07 Dec 2004 21:32:34 -0800 Date: Tue, 7 Dec 2004 21:32:34 -0800 From: "David S. Miller" To: Thomas Graf Cc: netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: validate policer configuration TLVs Message-Id: <20041207213234.257fd0d9.davem@davemloft.net> In-Reply-To: <20041207172349.GG1371@postel.suug.ch> References: <20041207172349.GG1371@postel.suug.ch> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12558 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 7 Dec 2004 18:23:49 +0100 Thomas Graf wrote: > Adds TLV size sanity checks for policer configuration. Hmmm... 
> - if (tb[TCA_POLICE_RESULT-1]) > + if (tb[TCA_POLICE_RESULT-1]) { > + if (RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32)) > + goto failure; > p->result = *(int*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); > + } Either these things are int's or u32's, they cannot be both :-) I know that size wise it's identical, but at least make the code look consistent. From willy@w.ods.org Tue Dec 7 21:53:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 21:53:23 -0800 (PST) Received: from willy.net1.nerim.net (willy.net1.nerim.net [62.212.114.60]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB85rHM9001487 for ; Tue, 7 Dec 2004 21:53:18 -0800 Date: Wed, 8 Dec 2004 06:39:53 +0100 From: Willy Tarreau To: Karsten Desler Cc: P@draigBrady.com, "David S. Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041208053953.GC17946@alpha.home.local> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207112139.GA3610@soohrt.org> User-Agent: Mutt/1.4i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12559 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: willy@w.ods.org Precedence: bulk X-list: netdev On Tue, Dec 07, 2004 at 12:21:39PM +0100, Karsten Desler wrote: > > I also notice that a lot of time is spent allocating > > and freeing the packet buffers (and possible hidden > > time due to cache misses due to allocating on one > > CPU and freeing on another?). > > How many [RT]xDescriptors do you have configured by the way? > > 256. 
I increased them to 1024 shortly after the profiling run, but > didn't notice any change in the cpu usage (will try again with cyclesoak). Have you checked the interrupts rate ? I had an e1000 eating many CPU cycles because it would generate 50000 interrupts/s. Passing the module InterruptThrottleRate=5000 definitely calmed it down, and more than doubled the data rate. Regards Willy From buytenh@wantstofly.org Tue Dec 7 23:39:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 23:39:26 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB87dLgg005057 for ; Tue, 7 Dec 2004 23:39:22 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 217542B0ED; Wed, 8 Dec 2004 08:38:59 +0100 (MET) Date: Wed, 8 Dec 2004 08:38:58 +0100 From: Lennert Buytenhek To: Ben Greear Cc: Robert Olsson , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: inter-packet gap in pktgen Message-ID: <20041208073858.GA4027@xi.wantstofly.org> References: <20041207222522.GA30266@xi.wantstofly.org> <41B632F3.1090104@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B632F3.1090104@candelatech.com> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12560 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Tue, Dec 07, 2004 at 02:47:15PM -0800, Ben Greear wrote: > >By tweaking the 'ipg' parameter I can generate pretty much any packet rate > >I want, as long as I set ipg=(1e9/rate)-496 instead of something possibly > >more straightforward. > > That 496 will also change with load on the system, at least on average. 
I > dealt with this by having a user-space app sample the rate and adjust the > ipg to keep the average rate where I want it. > > So, I'd suggest leaving the ipg as it is, and use external tools to get > the exact pps that you are looking for.

Just because it's possible that way doesn't mean that it's the only way or even _the_ way of doing it. Another option is:

	next_tx = get_time_in_ns();
	while (--count) {
		tx_packet();
		next_tx += 1e9/intended_pps;
		nanospin(next_tx - get_time_in_ns());
	}

This should be relatively independent of system load. OK, I know, time to show some code.

--L
From yoshfuji@linux-ipv6.org Tue Dec 7 23:57:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 07 Dec 2004 23:57:52 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB87vkeL006170 for ; Tue, 7 Dec 2004 23:57:47 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 4C3A833CE5; Wed, 8 Dec 2004 16:58:59 +0900 (JST) Date: Wed, 08 Dec 2004 16:58:50 +0900 (JST) Message-Id: <20041208.165850.97660819.yoshfuji@linux-ipv6.org> To: davem@davemloft.net Cc: shemminger@osdl.org, mitch@sfgoth.com, kernel@linuxace.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH] fix select() for SOCK_RAW sockets (ipv6) From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20041207104815.3f7a4684.davem@davemloft.net> References: <20041208.023530.26430801.yoshfuji@linux-ipv6.org> <20041207100140.781f4c00@dxpl.pdx.osdl.net> <20041207104815.3f7a4684.davem@davemloft.net> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] 
$~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12561 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20041207104815.3f7a4684.davem@davemloft.net> (at Tue, 7 Dec 2004 10:48:15 -0800), "David S. Miller" says: > On Tue, 7 Dec 2004 10:01:40 -0800 > Stephen Hemminger wrote: > > > > Probably, we need to do the same for ipv6, don't we? > > > > diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c > > --- a/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 > > +++ b/net/ipv6/af_inet6.c 2004-12-07 10:02:50 -08:00 > > @@ -513,6 +513,27 @@ > > We didn't do the "UDP select() fix" on ipv6 because we don't > have the delayed checksumming optimization there. So the > ipv6 side really doesn't need this change. Ok, thanks for the explanation. --yoshfuji @ Strasbourg From yoshfuji@linux-ipv6.org Wed Dec 8 01:43:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 01:43:51 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB89hiPI013225 for ; Wed, 8 Dec 2004 01:43:44 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id F151933CE5; Wed, 8 Dec 2004 18:44:56 +0900 (JST) Date: Wed, 08 Dec 2004 10:44:56 +0100 (CET) Message-Id: <20041208.104456.103795781.yoshfuji@linux-ipv6.org> To: davem@davemloft.net Cc: netdev@oss.sgi.com Subject: [PATCH] [IPV6]: Fix check if we're router. 
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12562 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. Fix the check if we're router. against 2.6.10-rc3, for 2.6.10 queue. Signed-off-by: Hideaki YOSHIFUJI ===== net/ipv6/ndisc.c 1.105 vs edited ===== --- 1.105/net/ipv6/ndisc.c 2004-11-10 15:57:03 +09:00 +++ edited/net/ipv6/ndisc.c 2004-12-08 18:34:45 +09:00 @@ -943,7 +943,7 @@ } /* Don't accept RS if we're not in router mode */ - if (!idev->cnf.forwarding || idev->cnf.accept_ra) + if (!idev->cnf.forwarding) goto out; /* -- Hideaki YOSHIFUJI @ USAGI Project GPG FP: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA From Robert.Olsson@data.slu.se Wed Dec 8 02:22:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 02:22:29 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8AMMuH015083 for ; Wed, 8 Dec 2004 02:22:23 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB8ALsKO001387; Wed, 8 Dec 2004 11:21:54 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id CA3B7EC002; Wed, 8 Dec 2004 11:21:54 +0100 (CET) From: Robert Olsson MIME-Version: 
1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16822.54722.755218.745451@robur.slu.se> Date: Wed, 8 Dec 2004 11:21:54 +0100 To: Lennert Buytenhek Cc: Ben Greear , Robert Olsson , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: inter-packet gap in pktgen In-Reply-To: <20041208073858.GA4027@xi.wantstofly.org> References: <20041207222522.GA30266@xi.wantstofly.org> <41B632F3.1090104@candelatech.com> <20041208073858.GA4027@xi.wantstofly.org> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12563 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev

Lennert Buytenhek writes:
 >
 > Another option is:
 >
 >	next_tx = get_time_in_ns();
 >	while (--count) {
 >		tx_packet();
 >		next_tx += 1e9/intended_pps;
 >		nanospin(next_tx - get_time_in_ns());
 >	}

Hello!

I think this is what Ben is doing with his userland app, possibly adjusting the ipg delay at runtime? A kind of control system. Even the device stats could possibly be read.

 > This should be relatively independent of system load. OK, I know,
 > time to show some code.

Another idea might be to have some kind of option to use pktgen with the existing qdisc/tc infrastructure for this type of test. Ben, did you try this?
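For readers who want something they can compile, here is a minimal userspace sketch of the absolute-deadline loop quoted above. It is not pktgen code. It assumes POSIX clock_gettime(CLOCK_MONOTONIC) as the time source; get_time_in_ns() keeps the name from the pseudocode, while spin_until(), paced_tx() and the counting tx_packet() stub are hypothetical names introduced only for illustration. It also differs from the pseudocode in three small ways: it sends exactly 'count' packets, it uses an integer interval instead of 1e9 as a floating-point constant, and it compares against the absolute deadline so a late packet never produces a negative delay.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds (stand-in for get_time_in_ns() above). */
static uint64_t get_time_in_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Busy-wait until the absolute deadline; returns at once if already late. */
static void spin_until(uint64_t deadline_ns)
{
	while (get_time_in_ns() < deadline_ns)
		;
}

/* Hypothetical stand-in for the real transmit path: it only counts. */
static unsigned long tx_count;

static void tx_packet(void)
{
	tx_count++;
}

/*
 * Send 'count' packets at 'intended_pps' and return the elapsed time in ns.
 * The deadline advances by a fixed interval per packet, so time lost in one
 * iteration is made up in the following ones and the average rate holds.
 */
static uint64_t paced_tx(unsigned long count, unsigned long intended_pps)
{
	uint64_t start, next_tx, interval_ns;

	assert(intended_pps > 0 && intended_pps <= 1000000000ul);
	interval_ns = 1000000000ull / intended_pps;

	start = next_tx = get_time_in_ns();
	while (count--) {
		tx_packet();
		next_tx += interval_ns;
		spin_until(next_tx);
	}
	return get_time_in_ns() - start;
}
```

As a usage example, paced_tx(100, 10000) calls tx_packet() 100 times and takes at least 10 ms, since each of the 100 deadlines is 100000 ns after the previous one. Whether busy-waiting like this is acceptable inside pktgen itself is a separate question that this sketch does not answer.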
--ro From webvenza@libero.it Wed Dec 8 02:47:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 02:47:55 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-ull-208-134.44-151.net24.it [151.44.134.208]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8AllcV022245 for ; Wed, 8 Dec 2004 02:47:48 -0800 Date: Wed, 8 Dec 2004 11:47:21 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 0/5] sis900 printk and stack usage audit Message-ID: <20041208104721.GA31707@picchio.gall.it> Mail-Followup-To: NetDev , Jeff Garzik Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12564 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev This patchset brings the debug output of the sis900 driver up to date, making it actually useful. There is also a rework of the code to avoid unneeded calls to sis630_set_eq and to remove repeated calls to pci_read_config_byte. All the patches are made against the latest netdev-2.6 tree (Linus' sis900 + altimata PHY patch). I've tested all patches and the runtime debug parameter, and everything works as expected. As usual, all patches are available from http://teg.homeunix.org/sis900.html Thanks. 
-- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org From dev-null@krynn.se.axis.com Wed Dec 8 03:01:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:01:13 -0800 (PST) Received: from krynn.se.axis.com ([212.209.10.221]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8B15tU023973 for ; Wed, 8 Dec 2004 03:01:05 -0800 Received: from krynn.se.axis.com (localhost [127.0.0.1]) by krynn.se.axis.com (8.12.9/8.12.9/Debian-5local0.1) with ESMTP id iB8B0XAD012187 for ; Wed, 8 Dec 2004 12:00:33 +0100 Received: (from root@localhost) by krynn.se.axis.com (8.12.9/8.12.9/Debian-5local0.1) id iB8B0XLi012185; Wed, 8 Dec 2004 12:00:33 +0100 Message-Id: <200412081100.iB8B0XLi012185@krynn.se.axis.com> Content-Disposition: inline Content-Type: text/plain MIME-Version: 1.0 X-Mailer: MIME::Lite 2.117 (F2.6; A1.44; B2.12; Q2.03) Date: Wed, 8 Dec 2004 11:00:33 UT From: Return daemon S7KW8 To: netdev@oss.sgi.com Subject: Invalid address: ted.gulley Precedence: bulk X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iB8B15tU023973 X-archive-position: 12565 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dev-null@axis.com Precedence: bulk X-list: netdev The addresses and are not valid. ted gulley has no known new address. 
From webvenza@libero.it Wed Dec 8 03:02:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:02:26 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-ull-208-134.44-151.net24.it [151.44.134.208]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8B2I22025274 for ; Wed, 8 Dec 2004 03:02:19 -0800 Date: Wed, 8 Dec 2004 12:01:56 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 1/5] sis900 printk and stack usage audit Message-ID: <20041208110156.GB31707@picchio.gall.it> Mail-Followup-To: NetDev , Jeff Garzik References: <20041208104721.GA31707@picchio.gall.it> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="XOIedfhf+7KOe/yw" Content-Disposition: inline In-Reply-To: <20041208104721.GA31707@picchio.gall.it> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. 
X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12566 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --XOIedfhf+7KOe/yw Content-Type: multipart/mixed; boundary="huq684BweRXVnRxX" Content-Disposition: inline --huq684BweRXVnRxX Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Audit of current printk() calls Changed debug levels to 0,1,2,3 as follows: 0 No debug 1 load/probe/unload/suspend/resume stuff 2 rx/tx errors 3 rx/tx packets and every interrupt are logged (very verbose) Debug levels are incremental Removed double printing of version string in module_init and in sis900_probe Made the sis900_debug parameter modifiable at runtime Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --huq684BweRXVnRxX Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="sis900-debug1.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 39) +++ b/drivers/net/sis900.c (revision 40) @@ -183,10 +183,10 @@ module_param(multicast_filter_limit, int, 0444); module_param(max_interrupt_work, int, 0444); -module_param(debug, int, 0444); +module_param(debug, int, 0644); MODULE_PARM_DESC(multicast_filter_limit, "SiS 900/7016 maximum number of filtered multicast addresses"); MODULE_PARM_DESC(max_interrupt_work, "SiS 900/7016 maximum events handled per interrupt"); -MODULE_PARM_DESC(debug, "SiS 900/7016 debug level (2-4)"); +MODULE_PARM_DESC(debug, "SiS 900/7016 debug level (0-3)"); static int sis900_open(struct net_device *net_dev); static int sis900_mii_probe (struct net_device * net_dev); @@ -236,7 +236,7 @@ /* check to see if 
we have sane EEPROM */ signature = (u16) read_eeprom(ioaddr, EEPROMSignature); if (signature == 0xffff || signature == 0x0000) { - printk (KERN_INFO "%s: Error EERPOM read %x\n", + printk (KERN_WARNING "%s: Error: EEPROM signature is %x\n", net_dev->name, signature); return 0; } @@ -269,7 +269,7 @@ if (!isa_bridge) { isa_bridge = pci_find_device(PCI_VENDOR_ID_SI, 0x0018, isa_bridge); if (!isa_bridge) { - printk("%s: Can not find ISA bridge\n", net_dev->name); + printk(KERN_WARNING "%s: Can not find ISA bridge\n", net_dev->name); return 0; } } @@ -390,13 +390,6 @@ u8 revision; char *card_name = card_names[pci_id->driver_data]; -/* when built into the kernel, we only print version if device is found */ -#ifndef MODULE - static int printed_version; - if (!printed_version++) - printk(version); -#endif - /* setup various bits in PCI command register */ ret = pci_enable_device(pci_dev); if(ret) return ret; @@ -554,7 +547,7 @@ continue; if ((mii_phy = kmalloc(sizeof(struct mii_phy), GFP_KERNEL)) == NULL) { - printk(KERN_INFO "Cannot allocate mem for struct mii_phy\n"); + printk(KERN_WARNING "Cannot allocate mem for struct mii_phy\n"); mii_phy = sis_priv->first_mii; while (mii_phy) { struct mii_phy *phy; @@ -1015,8 +1008,8 @@ outl((i << RFADDR_shift), ioaddr + rfcr); outl(w, ioaddr + rfdr); - if (sis900_debug > 2) { - printk(KERN_INFO "%s: Receive Filter Addrss[%d]=%x\n", + if (sis900_debug > 0) { + printk(KERN_DEBUG "%s: Receive Filter Addrss[%d]=%x\n", net_dev->name, i, inl(ioaddr + rfdr)); } } @@ -1053,8 +1046,8 @@ /* load Transmit Descriptor Register */ outl(sis_priv->tx_ring_dma, ioaddr + txdp); - if (sis900_debug > 2) - printk(KERN_INFO "%s: TX descriptor register loaded with: %8.8x\n", + if (sis900_debug > 0) + printk(KERN_DEBUG "%s: TX descriptor register loaded with: %8.8x\n", net_dev->name, inl(ioaddr + txdp)); } @@ -1107,8 +1100,8 @@ /* load Receive Descriptor Register */ outl(sis_priv->rx_ring_dma, ioaddr + rxdp); - if (sis900_debug > 2) - printk(KERN_INFO "%s: 
RX descriptor register loaded with: %8.8x\n", + if (sis900_debug > 0) + printk(KERN_DEBUG "%s: RX descriptor register loaded with: %8.8x\n", net_dev->name, inl(ioaddr + rxdp)); } @@ -1557,8 +1550,8 @@ net_dev->trans_start = jiffies; - if (sis900_debug > 3) - printk(KERN_INFO "%s: Queued Tx packet at %p size %d " + if (sis900_debug > 2) + printk(KERN_DEBUG "%s: Queued Tx packet at %p size %d " "to slot %d.\n", net_dev->name, skb->data, (int)skb->len, entry); @@ -1617,8 +1610,8 @@ } } while (1); - if (sis900_debug > 3) - printk(KERN_INFO "%s: exiting interrupt, " + if (sis900_debug > 2) + printk(KERN_DEBUG "%s: exiting interrupt, " "interrupt status = 0x%#8.8x.\n", net_dev->name, inl(ioaddr + isr)); @@ -1643,8 +1636,8 @@ unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC; u32 rx_status = sis_priv->rx_ring[entry].cmdsts; - if (sis900_debug > 3) - printk(KERN_INFO "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d " + if (sis900_debug > 2) + printk(KERN_DEBUG "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d " "status:0x%8.8x\n", sis_priv->cur_rx, sis_priv->dirty_rx, rx_status); @@ -1655,8 +1648,8 @@ if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) { /* corrupted packet received */ - if (sis900_debug > 3) - printk(KERN_INFO "%s: Corrupted packet " + if (sis900_debug > 1) + printk(KERN_DEBUG "%s: Corrupted packet " "received, buffer status = 0x%8.8x.\n", net_dev->name, rx_status); sis_priv->stats.rx_errors++; @@ -1793,8 +1786,8 @@ if (tx_status & (ABORT | UNDERRUN | OWCOLL)) { /* packet unsuccessfully transmitted */ - if (sis900_debug > 3) - printk(KERN_INFO "%s: Transmit " + if (sis900_debug > 1) + printk(KERN_DEBUG "%s: Transmit " "error, Tx status %8.8x.\n", net_dev->name, tx_status); sis_priv->stats.tx_errors++; @@ -2047,12 +2040,10 @@ case IF_PORT_AUI: /* AUI */ case IF_PORT_100BASEFX: /* 100BaseFx */ /* These Modes are not supported (are they?)*/ - printk(KERN_INFO "Not supported"); return -EOPNOTSUPP; break; default: - printk(KERN_INFO "Invalid"); return -EINVAL; 
} } --huq684BweRXVnRxX-- --XOIedfhf+7KOe/yw Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBtt8j2rmHZCWzV+0RAmGtAKCQJtoDkJz7VUY3Ncq/bRxVLpjcSwCgsEd2 xsqL2EXJ+V8uQT8Ln6mT5uE= =r8bb -----END PGP SIGNATURE----- --XOIedfhf+7KOe/yw-- From webvenza@libero.it Wed Dec 8 03:04:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:04:58 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-ull-208-134.44-151.net24.it [151.44.134.208]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8B4oIZ026034 for ; Wed, 8 Dec 2004 03:04:51 -0800 Date: Wed, 8 Dec 2004 12:04:26 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 2/5] sis900 printk and stack usage audit Message-ID: <20041208110426.GC31707@picchio.gall.it> Mail-Followup-To: NetDev , Jeff Garzik References: <20041208104721.GA31707@picchio.gall.it> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="uxuisgdDHaNETlh8" Content-Disposition: inline In-Reply-To: <20041208104721.GA31707@picchio.gall.it> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. 
X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12567 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --uxuisgdDHaNETlh8 Content-Type: multipart/mixed; boundary="vOmOzSkFvhd7u8Ms" Content-Disposition: inline --vOmOzSkFvhd7u8Ms Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Added some initialization debug output Version bump Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --vOmOzSkFvhd7u8Ms Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="sis900-debug2.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 40) +++ b/drivers/net/sis900.c (revision 41) @@ -1,6 +1,6 @@ /* sis900.c: A SiS 900/7016 PCI Fast Ethernet driver for Linux. Copyright 1999 Silicon Integrated System Corporation - Revision: 1.08.06 Sep. 24 2002 + Revision: 1.08.08 Nov. 28 2004 Modified from the driver which is originally written by Donald Becker. @@ -18,6 +18,7 @@ preliminary Rev. 1.0 Jan. 18, 1998 http://www.sis.com.tw/support/databook.htm + Rev 1.08.08 Nov. 28 2004 Daniele Venzano audit debug code and printk() calls Rev 1.08.07 Nov. 2 2003 Daniele Venzano add suspend/resume support Rev 1.08.06 Sep. 24 2002 Mufasa Yang bug fix for Tx timeout & add SiS963 support Rev 1.08.05 Jun. 6 2002 Mufasa Yang bug fix for read_eeprom & Tx descriptor over-boundary @@ -74,7 +75,7 @@ #include "sis900.h" #define SIS900_MODULE_NAME "sis900" -#define SIS900_DRV_VERSION "v1.08.07 11/02/2003" +#define SIS900_DRV_VERSION "v1.08.08 Nov. 
28 2004" static char version[] __devinitdata = KERN_INFO "sis900.c: " SIS900_DRV_VERSION "\n"; @@ -107,8 +108,6 @@ }; MODULE_DEVICE_TABLE (pci, sis900_pci_tbl); -static void sis900_read_mode(struct net_device *net_dev, int *speed, int *duplex); - static struct mii_chip_info { const char * name; u16 phy_id0; @@ -216,6 +215,7 @@ static u16 sis900_reset_phy(struct net_device *net_dev, int phy_addr); static void sis900_auto_negotiate(struct net_device *net_dev, int phy_addr); static void sis900_set_mode (long ioaddr, int speed, int duplex); +static void sis900_read_mode(struct net_device *net_dev, int *speed, int *duplex); static struct ethtool_ops sis900_ethtool_ops; /** @@ -269,7 +269,7 @@ if (!isa_bridge) { isa_bridge = pci_find_device(PCI_VENDOR_ID_SI, 0x0018, isa_bridge); if (!isa_bridge) { - printk(KERN_WARNING "%s: Can not find ISA bridge\n", net_dev->name); + printk(KERN_WARNING "%s: Cannot find ISA bridge\n", net_dev->name); return 0; } } @@ -455,10 +455,15 @@ if (ret) goto err_unmap_rx; + pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); + + if(sis900_debug > 0) + printk(KERN_DEBUG "%s: detected revision %2.2x," + "trying to get MAC address...\n", + net_dev->name, revision); + /* Get Mac address according to the chip revision */ - pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); ret = 0; - if (revision == SIS630E_900_REV) ret = sis630e_get_mac_addr(pci_dev, net_dev); else if ((revision > 0x81) && (revision <= 0x90) ) @@ -469,6 +474,7 @@ ret = sis900_get_mac_addr(pci_dev, net_dev); if (ret == 0) { + printk(KERN_WARNING "%s: Cannot read MAC address.\n", net_dev->name); ret = -ENODEV; goto err_out_unregister; } @@ -479,6 +485,7 @@ /* probe for mii transceiver */ if (sis900_mii_probe(net_dev) == 0) { + printk(KERN_WARNING "%s: Error probing MII device.\n", net_dev->name); ret = -ENODEV; goto err_out_unregister; } @@ -542,9 +549,14 @@ for(i = 0; i < 2; i++) mii_status = mdio_read(net_dev, phy_addr, MII_STATUS); - if (mii_status == 0xffff 
|| mii_status == 0x0000) + if (mii_status == 0xffff || mii_status == 0x0000) { /* the mii is not accessible, try next one */ + if (sis900_debug > 0) + printk(KERN_DEBUG "%s: MII at address %d" + " not accessible\n", + net_dev->name, phy_addr); continue; + } if ((mii_phy = kmalloc(sizeof(struct mii_phy), GFP_KERNEL)) == NULL) { printk(KERN_WARNING "Cannot allocate mem for struct mii_phy\n"); @@ -568,14 +580,15 @@ for (i = 0; mii_chip_table[i].phy_id1; i++) if ((mii_phy->phy_id0 == mii_chip_table[i].phy_id0 ) && - ((mii_phy->phy_id1 & 0xFFF0) == mii_chip_table[i].phy_id1)){ + ((mii_phy->phy_id1 & 0xFFF0) == mii_chip_table[i].phy_id1)) + { mii_phy->phy_types = mii_chip_table[i].phy_types; if (mii_chip_table[i].phy_types == MIX) - mii_phy->phy_types = - (mii_status & (MII_STAT_CAN_TX_FDX | MII_STAT_CAN_TX)) ? LAN : HOME; + mii_phy->phy_types = (mii_status & (MII_STAT_CAN_TX_FDX | MII_STAT_CAN_TX)) ? LAN : HOME; printk(KERN_INFO "%s: %s transceiver found at address %d.\n", - net_dev->name, mii_chip_table[i].name, - phy_addr); + net_dev->name, + mii_chip_table[i].name, + phy_addr); break; } --vOmOzSkFvhd7u8Ms-- --uxuisgdDHaNETlh8 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBtt+62rmHZCWzV+0RAmfHAJwPWQeSrRvdBpuPFZS1ldYrON4b4wCglYiG o7Gxt7+sQQUgYjVLkDhV2JU= =Ikf7 -----END PGP SIGNATURE----- --uxuisgdDHaNETlh8-- From webvenza@libero.it Wed Dec 8 03:06:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:08:05 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-ull-208-134.44-151.net24.it [151.44.134.208]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8B6XKf026550 for ; Wed, 8 Dec 2004 03:06:33 -0800 Date: Wed, 8 Dec 2004 12:06:10 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 3/5] sis900 printk and stack usage audit Message-ID: <20041208110610.GD31707@picchio.gall.it> 
Mail-Followup-To: NetDev , Jeff Garzik References: <20041208104721.GA31707@picchio.gall.it> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="hUH5gZbnpyIv7Mn4" Content-Disposition: inline In-Reply-To: <20041208104721.GA31707@picchio.gall.it> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12568 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --hUH5gZbnpyIv7Mn4 Content-Type: multipart/mixed; boundary="9crTWz/Z+Zyzu20v" Content-Disposition: inline --9crTWz/Z+Zyzu20v Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Chip revision is now a member of sis_priv structure Kill all calls to pci_read_config_byte but one Change the code to use sis_priv->chipset_rev Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --9crTWz/Z+Zyzu20v Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="sis900-chipset-revision.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 41) +++ b/drivers/net/sis900.c (revision 42) @@ -173,6 +173,7 @@ unsigned int tx_full; /* The Tx queue is full. 
*/ u8 host_bridge_rev; + u8 chipset_rev; u32 pci_state[16]; }; @@ -387,7 +388,6 @@ void *ring_space; long ioaddr; int i, ret; - u8 revision; char *card_name = card_names[pci_id->driver_data]; /* setup various bits in PCI command register */ @@ -455,20 +455,20 @@ if (ret) goto err_unmap_rx; - pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); + pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &(sis_priv->chipset_rev)); if(sis900_debug > 0) printk(KERN_DEBUG "%s: detected revision %2.2x," "trying to get MAC address...\n", - net_dev->name, revision); + net_dev->name, sis_priv->chipset_rev); /* Get Mac address according to the chip revision */ ret = 0; - if (revision == SIS630E_900_REV) + if (sis_priv->chipset_rev == SIS630E_900_REV) ret = sis630e_get_mac_addr(pci_dev, net_dev); - else if ((revision > 0x81) && (revision <= 0x90) ) + else if ((sis_priv->chipset_rev > 0x81) && (sis_priv->chipset_rev <= 0x90)) ret = sis635_get_mac_addr(pci_dev, net_dev); - else if (revision == SIS96x_900_REV) + else if (sis_priv->chipset_rev == SIS96x_900_REV) ret = sis96x_get_mac_addr(pci_dev, net_dev); else ret = sis900_get_mac_addr(pci_dev, net_dev); @@ -480,7 +480,7 @@ } /* 630ET : set the mii access mode as software-mode */ - if (revision == SIS630ET_900_REV) + if (sis_priv->chipset_rev == SIS630ET_900_REV) outl(ACCESSMODE | inl(ioaddr + cr), ioaddr + cr); /* probe for mii transceiver */ @@ -535,7 +535,6 @@ u16 poll_bit = MII_STAT_LINK, status = 0; unsigned long timeout = jiffies + 5 * HZ; int phy_addr; - u8 revision; sis_priv->mii = NULL; @@ -632,8 +631,7 @@ } } - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - if (revision == SIS630E_900_REV) { + if (sis_priv->chipset_rev == SIS630E_900_REV) { /* SiS 630E has some bugs on default value of PHY registers */ mdio_write(net_dev, sis_priv->cur_phy, MII_ANADV, 0x05e1); mdio_write(net_dev, sis_priv->cur_phy, MII_CONFIG1, 0x22); @@ -948,15 +946,13 @@ { struct sis900_private *sis_priv = net_dev->priv; 
long ioaddr = net_dev->base_addr; - u8 revision; int ret; /* Soft reset the chip. */ sis900_reset(net_dev); /* Equalizer workaround Rule */ - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - sis630_set_eq(net_dev, revision); + sis630_set_eq(net_dev, sis_priv->chipset_rev); ret = request_irq(net_dev->irq, &sis900_interrupt, SA_SHIRQ, net_dev->name, net_dev); @@ -1224,7 +1220,6 @@ struct mii_phy *mii_phy = sis_priv->mii; static int next_tick = 5*HZ; u16 status; - u8 revision; if (!sis_priv->autong_complete){ int speed, duplex = 0; @@ -1232,9 +1227,7 @@ sis900_read_mode(net_dev, &speed, &duplex); if (duplex){ sis900_set_mode(net_dev->base_addr, speed, duplex); - pci_read_config_byte(sis_priv->pci_dev, - PCI_CLASS_REVISION, &revision); - sis630_set_eq(net_dev, revision); + sis630_set_eq(net_dev, sis_priv->chipset_rev); netif_start_queue(net_dev); } @@ -1268,9 +1261,7 @@ ((mii_phy->phy_id1 & 0xFFF0) == 0x8000)) sis900_reset_phy(net_dev, sis_priv->cur_phy); - pci_read_config_byte(sis_priv->pci_dev, - PCI_CLASS_REVISION, &revision); - sis630_set_eq(net_dev, revision); + sis630_set_eq(net_dev, sis_priv->chipset_rev); goto LookForLink; } @@ -2102,11 +2093,10 @@ u16 mc_filter[16] = {0}; /* 256/128 bits multicast hash table */ int i, table_entries; u32 rx_mode; - u8 revision; /* 635 Hash Table entires = 256(2^16) */ - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - if((revision >= SIS635A_900_REV) || (revision == SIS900B_900_REV)) + if((sis_priv->chipset_rev >= SIS635A_900_REV) || + (sis_priv->chipset_rev == SIS900B_900_REV)) table_entries = 16; else table_entries = 8; @@ -2132,7 +2122,7 @@ mclist && i < net_dev->mc_count; i++, mclist = mclist->next) { unsigned int bit_nr = - sis900_mcast_bitnr(mclist->dmi_addr, revision); + sis900_mcast_bitnr(mclist->dmi_addr, sis_priv->chipset_rev); mc_filter[bit_nr >> 4] |= (1 << (bit_nr & 0xf)); } } @@ -2178,7 +2168,6 @@ long ioaddr = net_dev->base_addr; int i = 0; u32 status = TxRCMP | 
RxRCMP; - u8 revision; outl(0, ioaddr + ier); outl(0, ioaddr + imr); @@ -2191,8 +2180,8 @@ status ^= (inl(isr + ioaddr) & status); } - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - if( (revision >= SIS635A_900_REV) || (revision == SIS900B_900_REV) ) + if( (sis_priv->chipset_rev >= SIS635A_900_REV) || + (sis_priv->chipset_rev == SIS900B_900_REV) ) outl(PESEL | RND_CNT, ioaddr + cfg); else outl(PESEL, ioaddr + cfg); --9crTWz/Z+Zyzu20v-- --hUH5gZbnpyIv7Mn4 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBtuAi2rmHZCWzV+0RAsOuAJ91SdqW9b0XjtWJKTsYPMw6nLq18wCgneuq 18Au9MtLx2C5hSy0CKy0aqg= =F0z5 -----END PGP SIGNATURE----- --hUH5gZbnpyIv7Mn4-- From webvenza@libero.it Wed Dec 8 03:08:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:09:04 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-ull-208-134.44-151.net24.it [151.44.134.208]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8B8wXf026901 for ; Wed, 8 Dec 2004 03:08:58 -0800 Date: Wed, 8 Dec 2004 12:08:35 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 4/5] sis900 printk and stack usage audit Message-ID: <20041208110835.GE31707@picchio.gall.it> Mail-Followup-To: NetDev , Jeff Garzik References: <20041208104721.GA31707@picchio.gall.it> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="jaoouwwPWoQSJZYp" Content-Disposition: inline In-Reply-To: <20041208104721.GA31707@picchio.gall.it> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. 
X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12569 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --jaoouwwPWoQSJZYp Content-Type: multipart/mixed; boundary="B8ONY/mu/bqBak9m" Content-Disposition: inline --B8ONY/mu/bqBak9m Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Don't do useless calls of sis630_set_eq Kill now unneeded parameter of sis630_set_eq Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --B8ONY/mu/bqBak9m Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="sis900-sis630-set-eq.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 42) +++ b/drivers/net/sis900.c (revision 43) @@ -209,7 +209,7 @@ static u16 sis900_mcast_bitnr(u8 *addr, u8 revision); static void set_rx_mode(struct net_device *net_dev); static void sis900_reset(struct net_device *net_dev); -static void sis630_set_eq(struct net_device *net_dev, u8 revision); +static void sis630_set_eq(struct net_device *net_dev); static int sis900_set_config(struct net_device *dev, struct ifmap *map); static u16 sis900_default_phy(struct net_device * net_dev); static void sis900_set_capability( struct net_device *net_dev ,struct mii_phy *phy); @@ -952,7 +952,11 @@ sis900_reset(net_dev); /* Equalizer workaround Rule */ - sis630_set_eq(net_dev, sis_priv->chipset_rev); + if (sis_priv->chipset_rev == SIS630E_900_REV || + sis_priv->chipset_rev == SIS630EA1_900_REV || + sis_priv->chipset_rev == SIS630A_900_REV || + sis_priv->chipset_rev == SIS630ET_900_REV) + sis630_set_eq(net_dev); ret = request_irq(net_dev->irq, &sis900_interrupt, 
SA_SHIRQ, net_dev->name, net_dev); @@ -1141,16 +1145,12 @@ * max >= 15 --> set equalizer to max+5 or set equalizer to max+6 if max == min */ -static void sis630_set_eq(struct net_device *net_dev, u8 revision) +static void sis630_set_eq(struct net_device *net_dev) { struct sis900_private *sis_priv = net_dev->priv; u16 reg14h, eq_value=0, max_value=0, min_value=0; int i, maxcount=10; - if ( !(revision == SIS630E_900_REV || revision == SIS630EA1_900_REV || - revision == SIS630A_900_REV || revision == SIS630ET_900_REV) ) - return; - if (netif_carrier_ok(net_dev)) { reg14h = mdio_read(net_dev, sis_priv->cur_phy, MII_RESV); mdio_write(net_dev, sis_priv->cur_phy, MII_RESV, @@ -1166,8 +1166,9 @@ eq_value : min_value; } /* 630E rule to determine the equalizer value */ - if (revision == SIS630E_900_REV || revision == SIS630EA1_900_REV || - revision == SIS630ET_900_REV) { + if (sis_priv->chipset_rev == SIS630E_900_REV || + sis_priv->chipset_rev == SIS630EA1_900_REV || + sis_priv->chipset_rev == SIS630ET_900_REV) { if (max_value < 5) eq_value = max_value; else if (max_value >= 5 && max_value < 15) @@ -1178,7 +1179,7 @@ max_value+6 : max_value+5; } /* 630B0&B1 rule to determine the equalizer value */ - if (revision == SIS630A_900_REV && + if (sis_priv->chipset_rev == SIS630A_900_REV && (sis_priv->host_bridge_rev == SIS630B0 || sis_priv->host_bridge_rev == SIS630B1)) { if (max_value == 0) @@ -1193,7 +1194,7 @@ mdio_write(net_dev, sis_priv->cur_phy, MII_RESV, reg14h); } else { reg14h = mdio_read(net_dev, sis_priv->cur_phy, MII_RESV); - if (revision == SIS630A_900_REV && + if (sis_priv->chipset_rev == SIS630A_900_REV && (sis_priv->host_bridge_rev == SIS630B0 || sis_priv->host_bridge_rev == SIS630B1)) mdio_write(net_dev, sis_priv->cur_phy, MII_RESV, @@ -1227,7 +1228,11 @@ sis900_read_mode(net_dev, &speed, &duplex); if (duplex){ sis900_set_mode(net_dev->base_addr, speed, duplex); - sis630_set_eq(net_dev, sis_priv->chipset_rev); + if (sis_priv->chipset_rev == SIS630E_900_REV || + 
sis_priv->chipset_rev == SIS630EA1_900_REV || + sis_priv->chipset_rev == SIS630A_900_REV || + sis_priv->chipset_rev == SIS630ET_900_REV) + sis630_set_eq(net_dev); netif_start_queue(net_dev); } @@ -1260,9 +1265,13 @@ if ((mii_phy->phy_id0 == 0x001D) && ((mii_phy->phy_id1 & 0xFFF0) == 0x8000)) sis900_reset_phy(net_dev, sis_priv->cur_phy); + + if (sis_priv->chipset_rev == SIS630E_900_REV || + sis_priv->chipset_rev == SIS630EA1_900_REV || + sis_priv->chipset_rev == SIS630A_900_REV || + sis_priv->chipset_rev == SIS630ET_900_REV) + sis630_set_eq(net_dev); - sis630_set_eq(net_dev, sis_priv->chipset_rev); - goto LookForLink; } } --B8ONY/mu/bqBak9m-- --jaoouwwPWoQSJZYp Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBtuCz2rmHZCWzV+0RAnSJAJ9XXuoldG7DCKmKGrTD9nDQjt/rBgCfebjo 3d5CGqu9Q38MsjW/XKFiMvE= =Dp2S -----END PGP SIGNATURE----- --jaoouwwPWoQSJZYp-- From webvenza@libero.it Wed Dec 8 03:10:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:10:37 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-ull-208-134.44-151.net24.it [151.44.134.208]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8BAXM9027551 for ; Wed, 8 Dec 2004 03:10:33 -0800 Date: Wed, 8 Dec 2004 12:10:10 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 5/5] sis900 printk and stack usage audit Message-ID: <20041208111010.GF31707@picchio.gall.it> Mail-Followup-To: NetDev , Jeff Garzik References: <20041208104721.GA31707@picchio.gall.it> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="oplxJGu+Ee5xywIT" Content-Disposition: inline In-Reply-To: <20041208104721.GA31707@picchio.gall.it> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. 
X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12570 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --oplxJGu+Ee5xywIT Content-Type: multipart/mixed; boundary="tSiBuZsJmMXpnp7T" Content-Disposition: inline --tSiBuZsJmMXpnp7T Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Change comment to reflect changes in parameters of sis630_set_eq Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --tSiBuZsJmMXpnp7T Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="sis900-sis630-docs.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 43) +++ b/drivers/net/sis900.c (revision 44) @@ -1121,7 +1121,6 @@ /** * sis630_set_eq - set phy equalizer value for 630 LAN * @net_dev: the net device to set equalizer value - * @revision: 630 LAN revision number * * 630E equalizer workaround rule(Cyrus Huang 08/15) * PHY register 14h(Test) --tSiBuZsJmMXpnp7T-- --oplxJGu+Ee5xywIT Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBtuES2rmHZCWzV+0RAv3BAJ9iap/qRLxiEoc0yRXdy5kSS7RdHwCeNBdH 3X4qLtiWfkFz1ADLgpxnUJc= =5Kxh -----END PGP SIGNATURE----- --oplxJGu+Ee5xywIT-- From buytenh@wantstofly.org Wed Dec 8 03:16:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 03:16:17 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with 
ESMTP id iB8BGCoJ028197 for ; Wed, 8 Dec 2004 03:16:12 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id AA1F52B0ED; Wed, 8 Dec 2004 12:15:49 +0100 (MET) Date: Wed, 8 Dec 2004 12:15:49 +0100 From: Lennert Buytenhek To: Robert Olsson Cc: Ben Greear , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: inter-packet gap in pktgen Message-ID: <20041208111549.GA5703@xi.wantstofly.org> References: <20041207222522.GA30266@xi.wantstofly.org> <41B632F3.1090104@candelatech.com> <20041208073858.GA4027@xi.wantstofly.org> <16822.54722.755218.745451@robur.slu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <16822.54722.755218.745451@robur.slu.se> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12571 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Wed, Dec 08, 2004 at 11:21:54AM +0100, Robert Olsson wrote: > > Another option is: > > > > next_tx = get_time_in_ns(); > > while (--count) { > > tx_packet(); > > next_tx += 1e9/intended_pps; > > nanospin(next_tx - get_time_in_ns()); > > } > > Hello! > > I think this what Ben is doing with his userland app. Ev. adjusting > the ipg delay in runtime? A kind of control system. I think what Ben is doing is just measuring the # of pps and then using a PI-controller or something like that to adjust the ipg. What I mean is something like this (warning: whitespace damaged.) But this gives me an IPG that is consistently too small. Are the nanospin() and pg_udelay() functions accurate? 
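A minimal userspace sketch of the two pacing schemes under discussion, assuming a simulated clock in place of real timers. It contrasts rescheduling relative to the current time with advancing an absolute deadline and resynchronising when that deadline has already passed. struct pacer and the last_start_* helpers are hypothetical names, not part of pktgen, and the sketch makes no claim about the accuracy of nanospin() or pg_udelay().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pacing state; field names are illustrative only. */
struct pacer {
    uint64_t next_tx_ns;    /* absolute deadline for the next packet */
    uint64_t ipg_ns;        /* intended inter-packet gap */
    uint64_t missed;        /* deadlines that had already passed */
};

static void pacer_init(struct pacer *p, uint64_t now_ns, uint64_t ipg_ns)
{
    assert(ipg_ns > 0);
    p->next_tx_ns = now_ns;
    p->ipg_ns = ipg_ns;
    p->missed = 0;
}

/* Advance the deadline by one gap; resynchronise if we fell behind. */
static void pacer_advance(struct pacer *p, uint64_t now_ns)
{
    p->next_tx_ns += p->ipg_ns;
    if (p->next_tx_ns < now_ns) {
        p->next_tx_ns = now_ns;
        p->missed++;
    }
}

/* Absolute deadlines: returns the simulated start time of the last packet. */
static uint64_t last_start_absolute(uint64_t count, uint64_t ipg_ns,
                                    uint64_t tx_cost_ns, uint64_t *missed)
{
    struct pacer p;
    uint64_t now = 0, start = 0, i;

    pacer_init(&p, now, ipg_ns);
    for (i = 0; i < count; i++) {
        if (now < p.next_tx_ns)
            now = p.next_tx_ns;     /* stands in for the busy-wait */
        start = now;
        now += tx_cost_ns;          /* stands in for the transmit itself */
        pacer_advance(&p, now);
    }
    if (missed)
        *missed = p.missed;
    return start;
}

/* Relative rescheduling: the next deadline is "now + gap" after each send. */
static uint64_t last_start_relative(uint64_t count, uint64_t ipg_ns,
                                    uint64_t tx_cost_ns)
{
    uint64_t now = 0, next = 0, start = 0, i;

    for (i = 0; i < count; i++) {
        if (now < next)
            now = next;
        start = now;
        now += tx_cost_ns;
        next = now + ipg_ns;        /* transmit cost leaks into the gap */
    }
    return start;
}
```

In this simulation, with a 1000 ns gap and a 200 ns transmit cost, the relative scheme starts packet i at i * 1200 ns while the absolute scheme starts it at i * 1000 ns. When the transmit cost exceeds the gap, the absolute scheme cannot keep up and the missed counter records each late deadline.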
--L --- pktgen.c.04111.orig 2004-12-08 11:57:03.627392497 +0100 +++ pktgen.c 2004-12-08 12:11:35.777150214 +0100 @@ -2794,7 +2794,13 @@ pkt_dev->last_ok = 0; pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ } - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; + if (now == 0) + now = getRelativeCurNs(); + pkt_dev->next_tx_ns += pkt_dev->ipg; + if (pkt_dev->next_tx_ns < now) { + pkt_dev->next_tx_ns = now; + pkt_dev->errors++; /* missed ipg deadline */ + } } else { /* Retry it next time */ From hadi@cyberus.ca Wed Dec 8 04:05:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 04:05:10 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8C52p1003289 for ; Wed, 8 Dec 2004 04:05:02 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cc0Ye-0000gM-ON for netdev@oss.sgi.com; Wed, 08 Dec 2004 07:04:36 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc0Yb-0004qI-7U; Wed, 08 Dec 2004 07:04:33 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: "David S. 
Miller" Cc: Patrick McHardy , tgraf@suug.ch, akpm@osdl.org, tomc@compaqnet.fr, linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <20041207213053.6bb602c1.davem@davemloft.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102507470.1051.27.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 07:04:30 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12572 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev I can almost guarantee that one or more of those tests i outlined would fail. So i would suggest a revert until the testing has been done. cheers, jamal On Wed, 2004-12-08 at 00:30, David S. Miller wrote: > On Tue, 07 Dec 2004 18:23:46 +0100 > Patrick McHardy wrote: > > > Either one is fine with me, although I would prefer to see > > the number of ifdefs in this area going down, not up :) > > I agree, therefore I applied Patrick's patch. 
> From hadi@cyberus.ca Wed Dec 8 04:32:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 04:32:25 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8CWKgk010308 for ; Wed, 8 Dec 2004 04:32:20 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cc0z5-000386-2m for netdev@oss.sgi.com; Wed, 08 Dec 2004 07:31:55 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc0z3-0008JD-V2; Wed, 08 Dec 2004 07:31:54 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" In-Reply-To: <41B68E5D.2080009@trash.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> <41B68E5D.2080009@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102509111.1051.54.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 07:31:51 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12573 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-08 at 00:17, Patrick McHardy wrote: > I think these tests are a waste of time. 
struct tcf_police is not > userspace-visible, so it's highly unlikely that the tc version matters. > Why an old kernel needs to be tested is beyond me. Regression testing. You need both backward and forward compatibility. Old kernels must continue to work with new tc for the policer using the old syntax. new kernels must continue to work with old tc for policer management using old syntax. Policer existed before any tc action code was written and has a very different layout of the structure. User tools and classifiers (accessed from user tools) do touch that code. These kind of tests constitute about 50% or more of my testing. > For possible in-kernel > breakage caused by the restructuring, without CONFIG_NET_CLS_ACT, > struct tcf_police is only used in police.c, without any casts or > assumptions about layout, so I can't see what could break. With > CONFIG_NET_CLS_ACT, the only place where it is used outside of > police.c is tcf_action_copy_stats, and this is exactly what this patch > (tested) fixes. > > If you still want to do these test, please use the attached patch. No rush now that its in (I also dont have time or equipment at the moment). Lets hope no more freezes reported. When i get time i will look into it. cheers, jamal From mumasankar@novell.com Wed Dec 8 04:46:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 04:46:24 -0800 (PST) Received: from lucius.provo.novell.com (lucius.provo.novell.com [137.65.81.172]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8CkHsM014107 for ; Wed, 8 Dec 2004 04:46:17 -0800 Received: from INET-PRV1-MTA by lucius.provo.novell.com with Novell_GroupWise; Wed, 08 Dec 2004 05:45:50 -0700 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.3 Date: Wed, 08 Dec 2004 05:45:35 -0700 From: "Umasankar Mukkara" To: Subject: IPSEC MIB instrumentatoin ?? 
Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12574 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mumasankar@novell.com Precedence: bulk X-list: netdev Hello, Native ipsec stack lacks proc file instrumentation for IPSEC statistics. Would a patch for this instrumentation be looked as a good one? Following can be the ipsec related counters in ESP4/AH4/ ESP6/AH6/IPCOMP -Packets Encrypted -Packets decypted -Packets discarded (Policy enforcement) -Replay packets etc etc. Or something closer to "draft-ietf-ipsec-monitor-mib-06.txt " which is an expired draft and can be found at http://www.tatsuyababa.com/internet-drafts/draft-ietf-ipsec-monitor-mib-06.txt - Uma. From ilya.pashkovsky@gmail.com Wed Dec 8 04:56:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 04:56:59 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.198]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8Cur7G019455 for ; Wed, 8 Dec 2004 04:56:53 -0800 Received: by rproxy.gmail.com with SMTP id b11so349543rne for ; Wed, 08 Dec 2004 04:56:31 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=XLw0tmuPx2Cy2M90VDsljvg22KkL1PseAhNw1HSguhEu0xqZHx5aoP0xFM6RRnL3aPfRRW2n0854UGaXbFRvgzuJthBuFADd5FbiHrgg6xpR6Lz4S5kTT6tiKCxnnmyfsBH3r6TYQIDeZt9ETGOg7nld4FH95Xz/lhOxRRIR30Q= Received: by 10.38.162.30 with SMTP id k30mr22740rne; Wed, 08 Dec 2004 04:56:31 -0800 (PST) Received: by 10.38.149.15 with HTTP; Wed, 8 Dec 2004 04:56:31 -0800 (PST) Message-ID: Date: Wed, 8 Dec 2004 14:56:31 +0200 From: Ilya Pashkovsky Reply-To: Ilya Pashkovsky To: davem@redhat.com, 
netdev@oss.sgi.com Subject: [PATCH] sk_reuse fixes Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12575 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ilya.pashkovsky@gmail.com Precedence: bulk X-list: netdev This fixes sk_reuse checks: 1) allow outgoing connections AND one listening socket bound to same source port. 2) remove > 1 check of a boolean variable http://puding.mine.nu/patches/patch-reuse-bool --- linux/net/ipv4/tcp_ipv4.c.orig 2004-12-07 14:54:12.597084704 +0200 +++ linux/net/ipv4/tcp_ipv4.c 2004-12-08 14:47:27.792827816 +0200 @@ -50,6 +50,8 @@ * YOSHIFUJI Hideaki @USAGI and: Support IPV6_V6ONLY socket option, which * Alexey Kuznetsov allow both IPv4 and IPv6 sockets to bind * a single port at the same time. + * Ilya Pashkovsky : fix TCP_LISTEN check on reuse + * remove (sk_reuse > 1) check in get_port */ #include @@ -184,7 +186,8 @@ static inline int tcp_bind_conflict(stru const u32 sk_rcv_saddr = tcp_v4_rcv_saddr(sk); struct sock *sk2; struct hlist_node *node; - int reuse = sk->sk_reuse; + unsigned char reuse = sk->sk_reuse; + unsigned char state = sk->sk_state; sk_for_each_bound(sk2, node, &tb->owners) { if (sk != sk2 && @@ -193,7 +196,7 @@ static inline int tcp_bind_conflict(stru !sk2->sk_bound_dev_if || sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) { if (!reuse || !sk2->sk_reuse || - sk2->sk_state == TCP_LISTEN) { + (state == TCP_LISTEN && sk2->sk_state == TCP_LISTEN)) { const u32 sk2_rcv_saddr = tcp_v4_rcv_saddr(sk2); if (!sk2_rcv_saddr || !sk_rcv_saddr || sk2_rcv_saddr == sk_rcv_saddr) @@ -259,8 +262,14 @@ static int tcp_v4_get_port(struct sock * goto tb_not_found; tb_found: if (!hlist_empty(&tb->owners)) { - if (sk->sk_reuse > 1) - goto success; + + /* + * sk_reuse is boolean + * + *if 
(sk->sk_reuse > 1) + * goto success; + */ + if (tb->fastreuse > 0 && sk->sk_reuse && sk->sk_state != TCP_LISTEN) { goto success; From kdesler@soohrt.org Wed Dec 8 05:09:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 05:09:17 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8D9B3d020211 for ; Wed, 8 Dec 2004 05:09:12 -0800 Received: (qmail 5857 invoked by uid 1000); 8 Dec 2004 13:08:45 -0000 Date: Wed, 8 Dec 2004 14:08:45 +0100 From: Karsten Desler To: Willy Tarreau Cc: P@draigBrady.com, "David S. Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041208130845.GA5036@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> <20041208053953.GC17946@alpha.home.local> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20041208053953.GC17946@alpha.home.local> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12576 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * Willy Tarreau wrote: > On Tue, Dec 07, 2004 at 12:21:39PM +0100, Karsten Desler wrote: > > > > I also notice that a lot of time is spent allocating > > > and freeing the packet buffers (and possible hidden > > > time due to cache misses due to allocating on one > > > CPU and freeing on another?). > > > How many [RT]xDescriptors do you have configured by the way? > > > > 256. 
I increased them to 1024 shortly after the profiling run, but > > didn't notice any change in the cpu usage (will try again with cyclesoak). > > Have you checked the interrupts rate ? I had an e1000 eating many CPU cycles > because it would generate 50000 interrupts/s. Passing the module > InterruptThrottleRate=5000 definitely calmed it down, and more than doubled > the data rate. I was running mit ITR=3000, but as a test to see if NAPI works, I disabled ITR on eth0 bringing the int/s rate up to 50k. Is that normal? I always though NAPI was supposed to kick in way earlier. Anyways, I'm going to try different ITR settings to see if they make any difference. Cheers, Karsten From yoshfuji@linux-ipv6.org Wed Dec 8 05:12:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 05:13:02 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8DCv9G020762 for ; Wed, 8 Dec 2004 05:12:58 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 3484033CE5; Wed, 8 Dec 2004 22:14:10 +0900 (JST) Date: Wed, 08 Dec 2004 14:14:09 +0100 (CET) Message-Id: <20041208.141409.103464255.yoshfuji@linux-ipv6.org> To: ilya.pashkovsky@gmail.com Cc: davem@redhat.com, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH] sk_reuse fixes From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: References: Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 
18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12577 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev Hello. In article (at Wed, 8 Dec 2004 14:56:31 +0200), Ilya Pashkovsky says: > This fixes sk_reuse checks: > 1) allow outgoing connections AND one listening socket bound to same > source port. > 2) remove > 1 check of a boolean variable Two comments: 1. This (boolean variable) is not the case on Linux. Behavior is differentiated with sk_reuse > 1. 2. Please provide tcp_ipv6 part as well. --yoshfuji From ilya.pashkovsky@gmail.com Wed Dec 8 05:13:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 05:13:28 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.204]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8DDLfV020960 for ; Wed, 8 Dec 2004 05:13:22 -0800 Received: by rproxy.gmail.com with SMTP id b11so351434rne for ; Wed, 08 Dec 2004 05:13:00 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:mime-version:content-type:content-transfer-encoding; b=fNpDhJCnnO9sjtwJ9ifhKTCei5wtx6VM8+m/I/jDdUcLi4qcc+2XvjaZIU3lXbTzdOch9tlOM+oF6D+6JaOWNdhaOH8fJBaS2dCjoNAJm+a5tN4ZtR8cfvMPebPSORcXoWpTbKxyJxtqYyQNhFuNHjwsqNst4OInzcuj0XW59zU= Received: by 10.38.12.13 with SMTP id 13mr364259rnl; Wed, 08 Dec 2004 05:12:59 -0800 (PST) Received: by 10.38.149.15 with HTTP; Wed, 8 Dec 2004 05:12:59 -0800 (PST) Message-ID: Date: Wed, 8 Dec 2004 15:12:59 +0200 From: Ilya Pashkovsky Reply-To: Ilya Pashkovsky To: davem@redhat.com, netdev@oss.sgi.com Subject: [PATCH] fixed patch for sk_reuse Cc: linux-kernel@vger.kernel.org Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 
0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12578 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ilya.pashkovsky@gmail.com Precedence: bulk X-list: netdev ah, it seems the sk_reuse > 1 is really used... do not try the previous patch. http://puding.mine.nu/patches/patch-reuse --- linux/net/ipv4/tcp_ipv4.c.orig2004-12-07 14:54:12.597084704 +0200 +++ linux/net/ipv4/tcp_ipv4.c2004-12-08 15:07:51.985722208 +0200 @@ -50,6 +50,7 @@ *YOSHIFUJI Hideaki @USAGI and:Support IPV6_V6ONLY socket option, which *Alexey Kuznetsovallow both IPv4 and IPv6 sockets to bind *a single port at the same time. + *Ilya Pashkovsky:fix TCP_LISTEN check on reuse */ #include @@ -184,7 +185,8 @@ static inline int tcp_bind_conflict(stru const u32 sk_rcv_saddr = tcp_v4_rcv_saddr(sk); struct sock *sk2; struct hlist_node *node; -int reuse = sk->sk_reuse; +unsigned char reuse = sk->sk_reuse; +unsigned char state = sk->sk_state; sk_for_each_bound(sk2, node, &tb->owners) { if (sk != sk2 && @@ -193,7 +195,7 @@ static inline int tcp_bind_conflict(stru !sk2->sk_bound_dev_if || sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) { if (!reuse || !sk2->sk_reuse || - sk2->sk_state == TCP_LISTEN) { + (state == TCP_LISTEN && sk2->sk_state == TCP_LISTEN)) { const u32 sk2_rcv_saddr = tcp_v4_rcv_saddr(sk2); if (!sk2_rcv_saddr || !sk_rcv_saddr || sk2_rcv_saddr == sk_rcv_saddr) From kdesler@soohrt.org Wed Dec 8 05:27:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 05:27:28 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8DRMrj022561 for ; Wed, 8 Dec 2004 05:27:23 -0800 Received: (qmail 9640 invoked by uid 1000); 8 Dec 2004 13:26:57 -0000 Date: Wed, 8 Dec 2004 14:26:57 +0100 From: Karsten Desler To: jamal , Robert Olsson Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, "David S. 
Miller" , P@draigBrady.com Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041208132657.GB5036@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041207211035.GA20286@quickstop.soohrt.org> <1102480318.1050.16.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <1102480318.1050.16.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12579 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev * jamal wrote: > On Tue, 2004-12-07 at 16:10, Karsten Desler wrote: > > I totally forgot to mention: There are approximately 100k concurrent > > flows. > > ;-> Aha. That would make a huge difference. I know of noone > who has actually done this level of testing. I have tried upto about 50K > flows myself in early 2.6.x and was eventually compute bound. > Really try compiling out totaly iptables/netfilter - it will make a > difference. Unfortunately I need netfilter (for now, I haven't had time to look into replacing the rules with tc yet). > You may also want to try something like LC trie algorithm that Robert > and co are playing with to see if it makes a difference with this many > flows. On a scale of one to ten, one being "will crash on the first packet", five being "will allow moderate load, but is probably going to crash under high load" and ten being "rock stable" how would the patch be rated? The announcement doesn't look too promising: | Locking. | Not yet done. Also when looking at the profiles, fib_* isn't showing up at (not even near) the top. Testworthy none the less? ip route|wc -l: 40 profile: 4 fib_validate_source 0.0064 39 fib_semantic_match 0.1875 [...] 
    76 ipt_route_hook               1.5833
   219 __kmalloc                    1.7109
   985 e1000_clean_tx_irq           1.8107
   157 kmem_cache_alloc             1.9625
   405 skb_release_data             2.5312
   633 eth_type_trans               2.6375
   394 handle_IRQ_event             3.5179
  1017 __kfree_skb                  3.9727
   989 alloc_skb                    4.1208
  1645 ip_route_input               4.6733
  5128 ipt_do_table                 6.1635
  1678 e1000_intr                   6.9917
   289 _read_unlock_bh             18.0625
   616 _read_lock_bh               19.2500
   418 _spin_lock                  26.1250
  2076 e1000_irq_enable            43.2500
   881 _spin_unlock_irqrestore     55.0625
 96895 default_idle              1513.9844

Cheers,
 Karsten

From hadi@cyberus.ca Wed Dec 8 05:27:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 05:27:54 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8DRn6N022657 for ; Wed, 8 Dec 2004 05:27:50 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Cc1qk-000146-NM for netdev@oss.sgi.com; Wed, 08 Dec 2004 08:27:22 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc1qi-0007C4-Sm; Wed, 08 Dec 2004 08:27:21 -0500 Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets From: jamal Reply-To: hadi@cyberus.ca To: Karsten Desler Cc: Willy Tarreau , P@draigBrady.com, "David S. 
Miller" , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041208130845.GA5036@soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041206134849.498bfc93.davem@davemloft.net> <20041206224107.GA8529@soohrt.org> <41B58A58.8010007@draigBrady.com> <20041207112139.GA3610@soohrt.org> <20041208053953.GC17946@alpha.home.local> <20041208130845.GA5036@soohrt.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102512438.1050.61.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 08:27:18 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12580 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-08 at 08:08, Karsten Desler wrote: > I was running mit ITR=3000, but as a test to see if NAPI works, I > disabled ITR on eth0 bringing the int/s rate up to 50k. > Is that normal? I always though NAPI was supposed to kick in way earlier. > Anyways, I'm going to try different ITR settings to see if they make any > difference. > The one time you would need this ITR crap is when you are running low traffic (relatively speaking: which could mean anything below 100Kpps depending on h/ware). NAPI will consume a little more CPU otherwise - given it does an extra IO (if it kicks in and out on every packet or two). Granted you are doing some new types of tests, so you may be seeing some things we havent experienced before. Can you put a printk in ->open() of e1000 with #if napi is defined printk("%s: NAPI is on\n",dev->name); #endif This should print on ifconfig up and indicate whether NAPI is on or not. 
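A small userspace sketch of the compile-time check suggested above, in case the shape of the guard is unclear. It assumes the driver's NAPI option is exposed as the preprocessor symbol CONFIG_E1000_NAPI; that symbol name and the helper `napi_report()` are assumptions for illustration, and in the driver itself the message would go through printk() in the open routine rather than into a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the message printed from the driver's
 * open routine. Which branch is compiled depends on whether
 * CONFIG_E1000_NAPI (assumed symbol name) is defined at build time.
 * Returns the number of characters written, as snprintf() does.
 */
static int napi_report(char *buf, size_t len, const char *ifname)
{
    assert(buf != NULL && ifname != NULL);

#ifdef CONFIG_E1000_NAPI
    return snprintf(buf, len, "%s: NAPI is on\n", ifname);
#else
    return snprintf(buf, len, "%s: NAPI is off\n", ifname);
#endif
}
```

Building with -DCONFIG_E1000_NAPI selects the first branch. Unlike the one-sided #if in the suggestion, this variant also reports the "off" case, so a missing message cannot be mistaken for a message that was never printed.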
cheers, jamal From tgraf@suug.ch Wed Dec 8 06:32:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 06:32:25 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8EWGbS026108 for ; Wed, 8 Dec 2004 06:32:17 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id AAEC1F; Wed, 8 Dec 2004 15:31:30 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 9FFC61C0EA; Wed, 8 Dec 2004 15:32:12 +0100 (CET) Date: Wed, 8 Dec 2004 15:32:12 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-ID: <20041208143212.GL1371@postel.suug.ch> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> <41B68E5D.2080009@trash.net> <1102509111.1051.54.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102509111.1051.54.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12581 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1102509111.1051.54.camel@jzny.localdomain> 2004-12-08 07:31 > On Wed, 2004-12-08 at 00:17, Patrick McHardy wrote: > > > I think these tests are a waste 
of time. struct tcf_police is not
> > userspace-visible, so it's highly unlikely that the tc version matters.
> > Why an old kernel needs to be tested is beyond me.
>
> Regression testing.
> You need both backward and forward compatibility.
> Old kernels must continue to work with new tc for the policer using the
> old syntax.
> new kernels must continue to work with old tc for policer management
> using old syntax.
> Policer existed before any tc action code was written and has a very
> different layout of the structure. User tools and classifiers (accessed
> from user tools) do touch that code.
> These kind of tests constitute about 50% or more of my testing.

I invested some time to ease testing, since this was primarily my fault
for overlooking the special case of tcf_police. I've put together a small
testsuite that makes it easy to run tests against multiple versions of
iproute2. It can be found at:
http://people.suug.ch/~tgr/iproute2/tc-testsuite.tar.gz

One simply extracts the various iproute2 versions into iproute2/ and sets
KERNEL_INCLUDE if needed for older versions. 'make compile' at the top
level compiles all the versions. The tests are defined in tests/; they are
simple shell scripts that get invoked for every iproute2 version in
iproute2/, with $TC and $IP set to the version currently being tested. The
output of every test run is stored in results/$TEST.$IPVERSION.out and
results/$TEST.$IPVERSION.dmesg. 'make clean' removes all the results again.
'make liststests' lists all the available tests. 'make alltests' runs all
the tests.

I've run all the tests on my patch with the following kernels and
iproute2 versions:
 - 2.6.10-rc2-bk13 (actions compiled in)
 - 2.6.10-rc2-bk13-no-act (old policer compiled in)
 - 2.4.28-rc1-bk1

 - iproute2-2.6.9-tgr (with all my patches in)
 - iproute2-2.4.7

iproute-2.6.9 was successful with all kernels. I couldn't test with the
old 2.4.7 iproute2 yet since the syntax has changed and I need to adapt
the tests first.
I will create better tests and run it on patrick's patch when I get home. From tgraf@suug.ch Wed Dec 8 06:59:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 06:59:33 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8ExRm7028341 for ; Wed, 8 Dec 2004 06:59:27 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 86C28F; Wed, 8 Dec 2004 15:58:42 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 9ED481C0EA; Wed, 8 Dec 2004 15:59:25 +0100 (CET) Date: Wed, 8 Dec 2004 15:59:25 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-ID: <20041208145925.GM1371@postel.suug.ch> References: <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> <41B68E5D.2080009@trash.net> <1102509111.1051.54.camel@jzny.localdomain> <20041208143212.GL1371@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041208143212.GL1371@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12582 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Thomas Graf <20041208143212.GL1371@postel.suug.ch> 2004-12-08 15:32 > iproute-2.6.9 was 
successful with all kernels. I couldn't test with the
> old 2.4.7 iproute2 yet since the syntax has changed and I need to adapt
> the tests first. I will create better tests and run it on Patrick's
> patch when I get home.

I've updated the tests and all were successful:

                          tc-2.6.9-tgr   tc-2.4.7
2.6.10-rc2-bk13*               Y             Y
2.6.10-rc2-bk13-no-act*        Y             Y
2.4.28-rc1-bk1                 Y             Y

* including tcf_police patch

From hadi@cyberus.ca Wed Dec 8 07:05:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 07:05:48 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8F5hWl029001 for ; Wed, 8 Dec 2004 07:05:44 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cc3NW-0006r6-Q3 for netdev@oss.sgi.com; Wed, 08 Dec 2004 10:05:18 -0500 Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc3NM-0005fu-B3; Wed, 08 Dec 2004 10:05:08 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. 
Miller" In-Reply-To: <20041208143212.GL1371@postel.suug.ch> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> <41B68E5D.2080009@trash.net> <1102509111.1051.54.camel@jzny.localdomain> <20041208143212.GL1371@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102518304.1023.6.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 10:05:05 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12583 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-08 at 09:32, Thomas Graf wrote: > I've put together a small testsuite allowing to easly run tests for > multiple versions of iproute2. It can be found at: > http://people.suug.ch/~tgr/iproute2/tc-testsuite.tar.gz > Good stuff. Hopefully we can run these tests everytime we make changes going forward. We cant compromise quality by handwaving on instinct. Famous last words: "that couldnt have possibly caused a bug down there". I will take a look at what you have and integrate my 20-30 testcases if they are not covered over time - or may be adpat what you have to follow how i do things or maybe i can send them to you and you can integrate them. > iproute-2.6.9 was sucessful with all kernels. I couldn't test with the > old 2.4.7 iproute2 yet since the syntax has changed and I need to adopt > the tests first. I will create better tests and run it on patrick's > patch when I get home. That would be appreaciated. 
cheers, jamal From hadi@cyberus.ca Wed Dec 8 07:07:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 07:07:18 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8F7EG4029475 for ; Wed, 8 Dec 2004 07:07:14 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cc3Oz-0007V8-GK for netdev@oss.sgi.com; Wed, 08 Dec 2004 10:06:49 -0500 Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc3Oy-0005xA-Ez; Wed, 08 Dec 2004 10:06:48 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" In-Reply-To: <20041208145925.GM1371@postel.suug.ch> References: <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> <41B68E5D.2080009@trash.net> <1102509111.1051.54.camel@jzny.localdomain> <20041208143212.GL1371@postel.suug.ch> <20041208145925.GM1371@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102518405.1025.8.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 10:06:45 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12584 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Thanks for your efforts Thomas. 
cheers,
jamal

On Wed, 2004-12-08 at 09:59, Thomas Graf wrote:
> * Thomas Graf <20041208143212.GL1371@postel.suug.ch> 2004-12-08 15:32
> > iproute-2.6.9 was successful with all kernels. I couldn't test with the
> > old 2.4.7 iproute2 yet since the syntax has changed and I need to adapt
> > the tests first. I will create better tests and run it on patrick's
> > patch when I get home.
>
> I've updated the tests and all were successful:
>
>                           tc-2.6.9-tgr  tc-2.4.7
> 2.6.10-rc2-bk13*               Y            Y
> 2.6.10-rc2-bk13-no-act*        Y            Y
> 2.4.28-rc1-bk1                 Y            Y
>
> * including tcf_police patch
>

From jmorris@redhat.com Wed Dec 8 08:56:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 08:56:40 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8GuXRf005766 for ; Wed, 8 Dec 2004 08:56:34 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iB8Gu3K4021881; Wed, 8 Dec 2004 11:56:03 -0500 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iB8Gu3r16911; Wed, 8 Dec 2004 11:56:03 -0500 Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id iB8Gu1ZV029721; Wed, 8 Dec 2004 11:56:01 -0500 Date: Wed, 8 Dec 2004 11:56:03 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Umasankar Mukkara cc: netdev@oss.sgi.com Subject: Re: IPSEC MIB instrumentatoin ?? 
In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12585 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev On Wed, 8 Dec 2004, Umasankar Mukkara wrote: > Or something closer to "draft-ietf-ipsec-monitor-mib-06.txt " which is > an expired draft and can be found at > > http://www.tatsuyababa.com/internet-drafts/draft-ietf-ipsec-monitor-mib-06.txt > Any idea what happened to the draft process? Is it likely to be re-issued and turned into an RFC? - James -- James Morris From kaber@trash.net Wed Dec 8 08:58:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 08:58:21 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8GwFGQ006097 for ; Wed, 8 Dec 2004 08:58:16 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cc57j-0002Ut-PZ; Wed, 08 Dec 2004 17:57:07 +0100 Message-ID: <41B73263.4040706@trash.net> Date: Wed, 08 Dec 2004 17:57:07 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: "David S. 
Miller" , tgraf@suug.ch, akpm@osdl.org, tomc@compaqnet.fr, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> In-Reply-To: <1102507470.1051.27.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12586 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: >I can almost guarantee that one or more of those tests i outlined would >fail. So i would suggest a revert until the testing has been done. > Please be more specific than an "almost guarantee" that "one or more tests" may fail when asking to revert a patch that fixes an easily triggerable crash. For example, point to the code that makes you think it might fail. 
From tgraf@suug.ch Wed Dec 8 09:30:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 09:30:38 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8HUUrn007635 for ; Wed, 8 Dec 2004 09:30:31 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 1826AF; Wed, 8 Dec 2004 18:29:46 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id BA6EA1C0EA; Wed, 8 Dec 2004 18:30:28 +0100 (CET) Date: Wed, 8 Dec 2004 18:30:28 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , Andrew Morton , Thomas Cataldo , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, "David S. Miller" Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-ID: <20041208173028.GN1371@postel.suug.ch> References: <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <1102480044.1050.9.camel@jzny.localdomain> <1102480913.1049.24.camel@jzny.localdomain> <41B68E5D.2080009@trash.net> <1102509111.1051.54.camel@jzny.localdomain> <20041208143212.GL1371@postel.suug.ch> <1102518304.1023.6.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102518304.1023.6.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12587 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1102518304.1023.6.camel@jzny.localdomain> 2004-12-08 10:05 > On Wed, 2004-12-08 at 09:32, Thomas Graf wrote: > > > I've put together a small testsuite 
allowing to easily run tests for
> > multiple versions of iproute2. It can be found at:
> > http://people.suug.ch/~tgr/iproute2/tc-testsuite.tar.gz
> >
>
> Good stuff. Hopefully we can run these tests every time we make changes
> going forward. We can't compromise quality by handwaving on instinct.
> Famous last words: "that couldn't have possibly caused a bug down there".
> I will take a look at what you have and integrate my 20-30 testcases if
> they are not covered over time - or maybe adapt what you have to follow
> how i do things or maybe i can send them to you and you can integrate
> them.

I've only put a small cbq test case and the policer test in there for
now. I'd be happy to integrate your test cases if you send them to me.
I will add more tests over the next few days, time permitting.

Our sysadmin had some cycles and set up a UML environment capable of
booting any kernel image by script, so we can easily test all changes to
iproute2 or the kernel on as many branches as we want. I've put in the
latest 2.4/2.6 main releases, the latest bk snapshots of both, and
earlier releases of both to ensure we keep some backward compatibility.
All 2.6 branches have 2 configs, one with the action code and the other
with the old policer.

I'm not quite sure which versions of iproute2 are being used by the
distributions, so I've put in the latest bk version and the ones from
Red Hat, SuSE and Debian. That's 36 kernel<->iproute2 combinations per
test; once we add a few dozen tests, it becomes hard to tell if
something went wrong. I'll probably need to add some more logic to it
besides just checking whether the test script has written anything to
stderr.

I'll be happy to put patches into the test cycle once they're posted to
netdev, or alternatively add Dave's bk tree to the list of branches,
whichever is more reasonable.
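A rough sketch of the kind of extra bookkeeping mentioned above: keep one
record per kernel<->iproute2 pair and report exactly which pairs failed,
rather than only noticing that something was written to stderr. This is
not taken from the tc-testsuite; the struct, the field names and both
functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of one test run for one kernel<->iproute2 pair. */
struct run_result {
	const char *kernel;   /* e.g. "2.6.10-rc2-bk13" */
	const char *tc;       /* e.g. "tc-2.4.7" */
	int exit_status;      /* exit status of the test script */
	size_t stderr_bytes;  /* number of bytes the script wrote to stderr */
};

/* A run counts as failed if it exited non-zero or wrote to stderr. */
static int run_failed(const struct run_result *r)
{
	return r->exit_status != 0 || r->stderr_bytes > 0;
}

/* Print one line per failing pair; return the number of failures. */
static size_t report_failures(const struct run_result *runs, size_t n,
			      FILE *out)
{
	size_t failures = 0;

	assert(runs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (!run_failed(&runs[i]))
			continue;
		fprintf(out, "FAIL %s / %s (exit=%d, stderr=%zu bytes)\n",
			runs[i].kernel, runs[i].tc,
			runs[i].exit_status, runs[i].stderr_bytes);
		failures++;
	}
	return failures;
}
```

With 36 combinations per test, a per-pair line like this makes it possible
to see at a glance which kernel and which tc binary disagreed.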
From greearb@candelatech.com Wed Dec 8 11:32:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 11:32:49 -0800 (PST) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8JWgJd012398 for ; Wed, 8 Dec 2004 11:32:43 -0800 Received: from [4.33.45.22] (evrtwa1-ar2-4-33-045-022.evrtwa1.dsl-verizon.net [4.33.45.22]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id iB8JieLH020056; Wed, 8 Dec 2004 11:44:40 -0800 Message-ID: <41B756BE.3050504@candelatech.com> Date: Wed, 08 Dec 2004 11:32:14 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041020 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Robert Olsson CC: Lennert Buytenhek , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: inter-packet gap in pktgen References: <20041207222522.GA30266@xi.wantstofly.org> <41B632F3.1090104@candelatech.com> <20041208073858.GA4027@xi.wantstofly.org> <16822.54722.755218.745451@robur.slu.se> In-Reply-To: <16822.54722.755218.745451@robur.slu.se> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12589 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Robert Olsson wrote: > Lennert Buytenhek writes: > > > > Another option is: > > > > next_tx = get_time_in_ns(); > > while (--count) { > > tx_packet(); > > next_tx += 1e9/intended_pps; > > nanospin(next_tx - get_time_in_ns()); > > } > > Hello! > > I think this what Ben is doing with his userland app. Ev. adjusting > the ipg delay in runtime? A kind of control system. Even the device > stats could possible be read. 
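A minimal userspace sketch of the loop quoted above, assuming a POSIX
system with clock_gettime(). It is not pktgen code: paced_send(),
spin_until(), monotonic_ns() and count_tx() are names made up for this
illustration, and count_tx() merely stands in for tx_packet(). The point
it shows is that each deadline is derived from the previous deadline, not
from the current time, so a late wakeup on one packet is made up on the
next instead of accumulating.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ULL

/* Monotonic clock in nanoseconds. */
static uint64_t monotonic_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * NSEC_PER_SEC + (uint64_t)ts.tv_nsec;
}

/* Busy-wait until 'deadline'; returns at once if we are already late. */
static void spin_until(uint64_t deadline)
{
	while (monotonic_ns() < deadline)
		;
}

/* Stand-in for tx_packet(): just counts how often it was called. */
static void count_tx(void *ctx)
{
	(*(uint64_t *)ctx)++;
}

/*
 * Send 'count' packets at roughly 'pps' packets per second.
 * The interval is truncated to whole nanoseconds.
 * Returns the last deadline that was waited for.
 */
static uint64_t paced_send(uint64_t count, uint64_t pps,
			   void (*tx)(void *ctx), void *ctx)
{
	uint64_t interval, next_tx;

	assert(pps > 0 && pps <= NSEC_PER_SEC);
	interval = NSEC_PER_SEC / pps;
	next_tx = monotonic_ns();

	while (count--) {
		tx(ctx);
		next_tx += interval;	/* absolute deadline, no drift */
		spin_until(next_tx);
	}
	return next_tx;
}
```

Unlike the quoted pseudo-code, which loops on --count, this version sends
exactly count packets.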
Actually, I do return virtually the entire pktgen_interface_info structure
through an IOCTL call. I adjust the ipg often, based on what my control
logic tells me I should do; 10 times a second, I believe.

I found I needed external control because the control is fairly complex,
and I didn't want it in the kernel. I also aim to allow X number of
interfaces to be used at once, and to allow a mixture of different speeds.
The traffic on the interfaces affects each other, so I found the external
control useful again...

All that said, when we do calculate the 'next-tx' timer in the kernel,
I see no reason not to use the method above to be as accurate as possible.

> > This should be relatively independent of system load. OK, I know,
> > time to show some code.
>
> Also an idea might be to have some kind of option to use pktgen w.
> existent qdisc/tc infrastructure for this type of tests. Ben did
> you try this?

Actually, yes. My pktgen no longer busy-spins. I probably add a bit
of jitter at a very low level, but overall system performance is much
better. I use a callback from netdevice.h to wake up the pktgen thread,
and I put it to sleep as soon as I have no more work to do, or I get
a hard_start_xmit error.

The netdevice.h code looks like this:

@@ -474,9 +482,18 @@
 	void (*poll_controller)(struct net_device *dev);
 #endif
+	/* Callback for when the queue is woken, used by pktgen currently */
+	int (*notify_queue_woken)(struct net_device *dev);
+	void* nqw_data; /* To be used by the method above as needed */
+
 	/* bridge stuff */
 	struct net_bridge_port *br_port;

static inline void netif_wake_queue(struct net_device *dev)
{
#ifdef CONFIG_NETPOLL_TRAP
	if (netpoll_trap())
		return;
#endif
	if (test_and_clear_bit(__LINK_STATE_XOFF, &dev->state)) {
		__netif_schedule(dev);
		if (dev->notify_queue_woken) {
			dev->notify_queue_woken(dev);
		}
	}
}

pktgen registers with devices' notify_queue_woken hook.
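The hook pattern can be tried out in userspace with a small mock. This is
only a sketch under stated assumptions: struct mock_device is not the
kernel's struct net_device, the plain int flag stands in for the atomic
test_and_clear_bit() on __LINK_STATE_XOFF, and every mock_* and pktgen_*
name below is hypothetical. Only the notify_queue_woken and nqw_data field
names are taken from the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the few device fields used here. */
struct mock_device {
	int xoff;	/* 1 while the TX queue is stopped */
	int (*notify_queue_woken)(struct mock_device *dev);
	void *nqw_data;	/* private data for the callback */
};

/* Stand-in for pktgen's per-device state. */
struct mock_pktgen {
	unsigned int wakeups;
};

/* Callback: real code would wake the sleeping pktgen thread here. */
static int pktgen_queue_woken(struct mock_device *dev)
{
	struct mock_pktgen *pg = dev->nqw_data;

	pg->wakeups++;
	return 0;
}

/* Attach the callback and its private data to a device. */
static void pktgen_register(struct mock_device *dev, struct mock_pktgen *pg)
{
	assert(dev != NULL && pg != NULL);
	dev->nqw_data = pg;
	dev->notify_queue_woken = pktgen_queue_woken;
}

/* Driver side: the TX ring is full, stop the queue. */
static void mock_stop_queue(struct mock_device *dev)
{
	dev->xoff = 1;
}

/* Driver side: notify only on a stopped -> running transition. */
static void mock_wake_queue(struct mock_device *dev)
{
	if (!dev->xoff)
		return;
	dev->xoff = 0;	/* not atomic, unlike test_and_clear_bit() */
	if (dev->notify_queue_woken)
		dev->notify_queue_woken(dev);
}
```

Because the callback fires only when the stopped flag was actually set,
repeated wake calls on a running queue do not produce spurious wakeups.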
For VLANs, I have some fairly complex logic in pktgen so that it still registers with the 'real' device since the VLAN won't have the netif_wake_queue called when the real device drains it's fifo. I'd be happy to have this included in the kernel, or be a basis for something similar, so let me know if you'd like to see the full patch. It is unlikely that davem will allow my pktgen-rx code in as well, but the tx stuff might still prove useful. Thanks, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From shemminger@osdl.org Wed Dec 8 11:32:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 11:32:04 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8JVx2F012260 for ; Wed, 8 Dec 2004 11:32:00 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB8JUE922854; Wed, 8 Dec 2004 11:30:14 -0800 Date: Wed, 8 Dec 2004 11:30:14 -0800 From: Stephen Hemminger To: Patrick McHardy Cc: hadi@cyberus.ca, "David S. 
Miller" , tgraf@suug.ch, akpm@osdl.org, tomc@compaqnet.fr, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-Id: <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> In-Reply-To: <41B73263.4040706@trash.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> <41B73263.4040706@trash.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12588 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev I don't know if this is related but something broke netem after 2.6.10-rc2 Now a simple request to delay 10ms ends up taking 1000ms. Still narrowing down the changeset, but it isn't a change in netem or tc utils since these didn't change. 
From hadi@cyberus.ca Wed Dec 8 11:45:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 11:45:08 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8Jj2Ok013604 for ; Wed, 8 Dec 2004 11:45:03 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Cc7jr-0005SJ-2j for netdev@oss.sgi.com; Wed, 08 Dec 2004 14:44:39 -0500 Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc7jm-00013Z-19; Wed, 08 Dec 2004 14:44:34 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: "David S. Miller" , tgraf@suug.ch, akpm@osdl.org, tomc@compaqnet.fr, linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <41B73263.4040706@trash.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> <41B73263.4040706@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102535069.1023.110.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 14:44:30 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12590 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-08 at 11:57, Patrick McHardy wrote: > jamal wrote: > > >I can almost guarantee that one or more of those tests i outlined would > >fail. 
So i would suggest a revert until the testing has been done. > > > Please be more specific than an "almost guarantee" that > "one or more tests" may fail when asking to revert a patch > that fixes an easily triggerable crash. For example, point > to the code that makes you think it might fail. I hope this is clear after you read the last email exchange i had with Thomas and that you are not intentionaly trying to be annoying. cheers, jamal From hadi@cyberus.ca Wed Dec 8 11:50:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 11:50:24 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8JoIqK014190 for ; Wed, 8 Dec 2004 11:50:19 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Cc7ox-0001GE-8f for netdev@oss.sgi.com; Wed, 08 Dec 2004 14:49:55 -0500 Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cc7ot-0001tK-T5; Wed, 08 Dec 2004 14:49:52 -0500 Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 From: jamal Reply-To: hadi@cyberus.ca To: Stephen Hemminger Cc: Patrick McHardy , "David S. 
Miller" , tgraf@suug.ch, akpm@osdl.org, tomc@compaqnet.fr, linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> <41B73263.4040706@trash.net> <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102535388.1022.118.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 08 Dec 2004 14:49:48 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12591 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Stephen, Can we incoporate the tests from Thomas as part of iproute2? I have about 20 or so i could add in. Maybe you could add some for netem as well. cheers, jamal On Wed, 2004-12-08 at 14:30, Stephen Hemminger wrote: > I don't know if this is related but something broke netem after 2.6.10-rc2 > Now a simple request to delay 10ms ends up taking 1000ms. > > Still narrowing down the changeset, but it isn't a change in netem > or tc utils since these didn't change. 
> From buytenh@wantstofly.org Wed Dec 8 12:11:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 12:11:44 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8KBdWa015642 for ; Wed, 8 Dec 2004 12:11:40 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id CB4D72B0ED; Wed, 8 Dec 2004 21:11:16 +0100 (MET) Date: Wed, 8 Dec 2004 21:11:16 +0100 From: Lennert Buytenhek To: Robert Olsson Cc: netdev@oss.sgi.com Subject: [PATCH] remove FCS from pktgen bandwidth calculation Message-ID: <20041208201116.GA10691@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12592 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev Hi, This patch stops pktgen from counting the FCS in the 'total number of bytes transmitted' and 'bits per second' calculations. 
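To put a number on the difference, here is a small standalone calculation
(a sketch, not pktgen code; the function names and FCS_LEN are chosen for
this example). For 60-byte packets at 1,000,000 pps, counting the 4-byte
FCS reports 512,000,000 bits per second, while leaving it out reports
480,000,000.

```c
#include <assert.h>
#include <stdint.h>

#define FCS_LEN 4	/* 32-bit Ethernet CRC, in bytes */

/* bits per second = packets per second * 8 * bytes per packet */
static uint64_t bits_per_second(uint64_t pps, uint32_t bytes_per_packet)
{
	assert(bytes_per_packet > 0);
	return pps * 8 * bytes_per_packet;
}

/* The calculation before the patch: packet size plus FCS. */
static uint64_t bps_with_fcs(uint64_t pps, uint32_t pkt_size)
{
	return bits_per_second(pps, pkt_size + FCS_LEN);
}

/* The calculation after the patch: packet size only. */
static uint64_t bps_without_fcs(uint64_t pps, uint32_t pkt_size)
{
	return bits_per_second(pps, pkt_size);
}
```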
cheers, Lennert --- pktgen.c.orig 2004-11-28 22:27:50.000000000 +0100 +++ pktgen.c 2004-12-08 21:09:00.511175266 +0100 @@ -2564,7 +2564,6 @@ static void show_results(struct pktgen_dev *pkt_dev, int nr_frags) { __u64 total_us, bps, mbps, pps, idle; - int size = pkt_dev->cur_pkt_size + 4; /* incl 32bit ethernet CRC */ char *p = pkt_dev->result; total_us = pkt_dev->stopped_at - pkt_dev->started_at; @@ -2576,7 +2575,7 @@ p += sprintf(p, "OK: %llu(c%llu+d%llu) usec, %llu (%dbyte,%dfrags)\n", total_us, (unsigned long long)(total_us - idle), idle, - pkt_dev->sofar, size, nr_frags); + pkt_dev->sofar, pkt_dev->cur_pkt_size, nr_frags); pps = pkt_dev->sofar * USEC_PER_SEC; @@ -2588,7 +2587,7 @@ do_div(pps, total_us); - bps = pps * 8 * size; + bps = pps * 8 * pkt_dev->cur_pkt_size; mbps = bps; do_div(mbps, 1000000); @@ -2777,7 +2776,7 @@ pkt_dev->last_ok = 1; pkt_dev->sofar++; pkt_dev->seq_num++; - pkt_dev->tx_bytes += (pkt_dev->cur_pkt_size + 4); /* count csum */ + pkt_dev->tx_bytes += pkt_dev->cur_pkt_size; } else if (ret == NETDEV_TX_LOCKED && (odev->features & NETIF_F_LLTX)) { From shemminger@osdl.org Wed Dec 8 12:31:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 12:31:48 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8KVgOc020059 for ; Wed, 8 Dec 2004 12:31:43 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB8KV3902885; Wed, 8 Dec 2004 12:31:03 -0800 Date: Wed, 8 Dec 2004 12:31:03 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: netem@osdl.org, netdev@oss.sgi.com Subject: [PATCH] netem: restart device after inserting packets Message-Id: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12593 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev The version of netem in 2.6.10 moves packets from the delayed queue to the qdisc in a timer interrupt. But it forgot to force the device to pick them up. Signed-off-by: Stephen Hemminger diff -Nru a/net/sched/sch_netem.c b/net/sched/sch_netem.c --- a/net/sched/sch_netem.c 2004-12-08 12:29:12 -08:00 +++ b/net/sched/sch_netem.c 2004-12-08 12:29:12 -08:00 @@ -258,12 +258,13 @@ { struct Qdisc *sch = (struct Qdisc *)arg; struct netem_sched_data *q = qdisc_priv(sch); + struct net_device *dev = sch->dev; struct sk_buff *skb; psched_time_t now; pr_debug("netem_watchdog: fired @%lu\n", jiffies); - spin_lock_bh(&sch->dev->queue_lock); + spin_lock_bh(&dev->queue_lock); PSCHED_GET_TIME(now); while ((skb = skb_peek(&q->delayed)) != NULL) { @@ -286,7 +287,8 @@ else sch->q.qlen++; } - spin_unlock_bh(&sch->dev->queue_lock); + qdisc_restart(dev); + spin_unlock_bh(&dev->queue_lock); } static void netem_reset(struct Qdisc *sch) From tgraf@suug.ch Wed Dec 8 12:39:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 12:39:52 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8KdiZ8020707 for ; Wed, 8 Dec 2004 12:39:45 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 
with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id AAE40F; Wed, 8 Dec 2004 21:38:59 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id DE6B41C0EA; Wed, 8 Dec 2004 21:39:42 +0100 (CET) Date: Wed, 8 Dec 2004 21:39:42 +0100 From: Thomas Graf To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: validate policer configuration TLVs Message-ID: <20041208203942.GO1371@postel.suug.ch> References: <20041207172349.GG1371@postel.suug.ch> <20041207213234.257fd0d9.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041207213234.257fd0d9.davem@davemloft.net> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12594 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev > Either these things are int's or u32's, they cannot be both :-) > I know that size wise it's identical, but at least make the code > look consistent. OK, I changed the dereferencing to use u32 as well and have it "casted" while assigning the value since changing the structure datatypes wouldn't make sense. 
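The idea behind the payload checks can be shown outside the kernel with a
simplified attribute type. This is a sketch under assumptions: struct
mock_attr and the mock_attr_* helpers are hypothetical and are not the
kernel's rtattr API. An attribute is accepted only if its payload is
exactly as large as the type about to be read from it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified attribute: a length/type header followed by payload. */
struct mock_attr {
	uint16_t len;		/* total length: header plus payload */
	uint16_t type;
	unsigned char data[8];
};

#define MOCK_ATTR_HDRLEN offsetof(struct mock_attr, data)

/* Payload length claimed by the attribute (0 if len is too small). */
static size_t mock_attr_payload(const struct mock_attr *a)
{
	return a->len >= MOCK_ATTR_HDRLEN ? a->len - MOCK_ATTR_HDRLEN : 0;
}

/*
 * Read a 32-bit value, but only if the payload is exactly 4 bytes.
 * Returns 0 on success, -1 if the attribute is missing or mis-sized;
 * *out is left untouched on failure.
 */
static int mock_attr_get_u32(const struct mock_attr *a, uint32_t *out)
{
	assert(out != NULL);
	if (a == NULL || mock_attr_payload(a) != sizeof(uint32_t))
		return -1;
	memcpy(out, a->data, sizeof(*out));
	return 0;
}
```

Without such a check, a short attribute would make the reader dereference
bytes past the end of what the sender actually supplied.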
Signed-off-by: Thomas Graf

--- linux-2.6.10-rc2-bk13.orig/net/sched/police.c 2004-11-30 14:01:12.000000000 +0100
+++ linux-2.6.10-rc2-bk13/net/sched/police.c 2004-12-08 19:45:36.000000000 +0100
@@ -180,7 +180,8 @@
 	if (rtattr_parse(tb, TCA_POLICE_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0)
 		return -1;
-	if (tb[TCA_POLICE_TBF-1] == NULL)
+	if (tb[TCA_POLICE_TBF-1] == NULL ||
+	    RTA_PAYLOAD(tb[TCA_POLICE_TBF-1]) != sizeof(*parm))
 		return -1;
 	parm = RTA_DATA(tb[TCA_POLICE_TBF-1]);
@@ -220,11 +221,17 @@
 			goto failure;
 		}
 	}
-	if (tb[TCA_POLICE_RESULT-1])
-		p->result = *(int*)RTA_DATA(tb[TCA_POLICE_RESULT-1]);
+	if (tb[TCA_POLICE_RESULT-1]) {
+		if (RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32))
+			goto failure;
+		p->result = *(u32*)RTA_DATA(tb[TCA_POLICE_RESULT-1]);
+	}
 #ifdef CONFIG_NET_ESTIMATOR
-	if (tb[TCA_POLICE_AVRATE-1])
+	if (tb[TCA_POLICE_AVRATE-1]) {
+		if (RTA_PAYLOAD(tb[TCA_POLICE_AVRATE-1]) != sizeof(u32))
+			goto failure;
 		p->ewma_rate = *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]);
+	}
 #endif
 	p->toks = p->burst = parm->burst;
 	p->mtu = parm->mtu;
@@ -424,7 +431,8 @@
 	if (rtattr_parse(tb, TCA_POLICE_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0)
 		return NULL;
-	if (tb[TCA_POLICE_TBF-1] == NULL)
+	if (tb[TCA_POLICE_TBF-1] == NULL ||
+	    RTA_PAYLOAD(tb[TCA_POLICE_TBF-1]) != sizeof(*parm))
 		return NULL;
 	parm = RTA_DATA(tb[TCA_POLICE_TBF-1]);
@@ -449,11 +457,17 @@
 		    (p->P_tab = qdisc_get_rtab(&parm->peakrate, tb[TCA_POLICE_PEAKRATE-1])) == NULL)
 			goto failure;
 	}
-	if (tb[TCA_POLICE_RESULT-1])
-		p->result = *(int*)RTA_DATA(tb[TCA_POLICE_RESULT-1]);
+	if (tb[TCA_POLICE_RESULT-1]) {
+		if (RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32))
+			goto failure;
+		p->result = *(u32*)RTA_DATA(tb[TCA_POLICE_RESULT-1]);
+	}
 #ifdef CONFIG_NET_ESTIMATOR
-	if (tb[TCA_POLICE_AVRATE-1])
+	if (tb[TCA_POLICE_AVRATE-1]) {
+		if (RTA_PAYLOAD(tb[TCA_POLICE_AVRATE-1]) != sizeof(u32))
+			goto failure;
 		p->ewma_rate = *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]);
+	}
 #endif
 	p->toks = p->burst = parm->burst;
 	p->mtu 
= parm->mtu; From davem@davemloft.net Wed Dec 8 12:57:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 12:57:40 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8KvUK8021749 for ; Wed, 8 Dec 2004 12:57:30 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cc8q2-0003SC-00; Wed, 08 Dec 2004 12:55:06 -0800 Date: Wed, 8 Dec 2004 12:55:06 -0800 From: "David S. Miller" To: yoshfuji@linux-ipv6.org Cc: netdev@oss.sgi.com Subject: Re: [PATCH] [IPV6]: Fix check if we're router. Message-Id: <20041208125506.4b810d51.davem@davemloft.net> In-Reply-To: <20041208.104456.103795781.yoshfuji@linux-ipv6.org> References: <20041208.104456.103795781.yoshfuji@linux-ipv6.org> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iB8KvUK8021749 X-archive-position: 12595 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 08 Dec 2004 10:44:56 +0100 (CET) YOSHIFUJI Hideaki / $B5HF#1QL@(B wrote: > Fix the check if we're router. against 2.6.10-rc3, for 2.6.10 queue. Applied, thank you. 
From kaber@trash.net Wed Dec 8 13:22:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 13:22:25 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8LMKIr022926 for ; Wed, 8 Dec 2004 13:22:21 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cc9Fy-0002zL-1s; Wed, 08 Dec 2004 22:21:54 +0100 Message-ID: <41B77072.2070200@trash.net> Date: Wed, 08 Dec 2004 22:21:54 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> <41B73263.4040706@trash.net> <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> In-Reply-To: <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12596 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Stephen Hemminger wrote: >I don't know if this is related but something broke netem after 2.6.10-rc2 >Now a simple request to delay 10ms ends up taking 1000ms. > >Still narrowing down the changeset, but it isn't a change in netem >or tc utils since these didn't change. > > > I can't find anything in the Changelog that looks related. Can you give me a testcase ? 
Regards Patrick From kdesler@soohrt.org Wed Dec 8 14:06:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 14:06:36 -0800 (PST) Received: from quickstop.soohrt.org (quickstop.soohrt.org [81.2.155.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8M6Snb025270 for ; Wed, 8 Dec 2004 14:06:29 -0800 Received: (qmail 23229 invoked by uid 1018); 8 Dec 2004 22:06:01 -0000 Date: Wed, 8 Dec 2004 23:06:01 +0100 From: Karsten Desler To: Robert Olsson Cc: Karsten Desler , netdev@oss.sgi.com, linux-kernel@vger.kernel.org, "David S. Miller" , jamal , P@draigBrady.com Subject: Re: _High_ CPU usage while routing (mostly) small UDP packets Message-ID: <20041208220601.GA18066@quickstop.soohrt.org> References: <20041206205305.GA11970@soohrt.org> <20041207211035.GA20286@quickstop.soohrt.org> <16822.12630.275389.575326@robur.slu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <16822.12630.275389.575326@robur.slu.se> User-Agent: Mutt/1.5.6+20040722i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12597 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kdesler@soohrt.org Precedence: bulk X-list: netdev Robert Olsson wrote: > rhash_entries= [KNL,NET] > Set number of hash buckets for route cache IP: routing cache hash table of 131072 buckets, 1024Kbytes A 3 percent improvement over the average cpu usage in the last 8 hours compared to the same time yesterday. Still in the noise. I'm going to try a day without netfilter next and then the realloc_skb patch. 
Cheers, Karsten From shemminger@osdl.org Wed Dec 8 14:07:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 14:07:45 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8M7cqW025599 for ; Wed, 8 Dec 2004 14:07:38 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iB8M77924979; Wed, 8 Dec 2004 14:07:07 -0800 Date: Wed, 8 Dec 2004 14:07:07 -0800 From: Stephen Hemminger To: Patrick McHardy Cc: netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 Message-Id: <20041208140707.61c2237c@dxpl.pdx.osdl.net> In-Reply-To: <41B77072.2070200@trash.net> References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> <41B73263.4040706@trash.net> <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> <41B77072.2070200@trash.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12598 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Wed, 08 Dec 2004 22:21:54 +0100 Patrick McHardy wrote: > Stephen Hemminger wrote: > > >I don't know if this is related but something broke netem after 2.6.10-rc2 > >Now a simple request to delay 10ms ends up taking 1000ms. 
> > > >Still narrowing down the changeset, but it isn't a change in netem > >or tc utils since these didn't change. > > > > > > > I can't find anything in the Changelog that looks related. > Can you give me a testcase ? I think the problem was a missing poke (qdisc_restart) in the netem timer routine, it probably worked earlier for me on other hardware because I was using different hardware that was waking up and checking tx in response to received packets. From buytenh@wantstofly.org Wed Dec 8 14:19:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 14:19:06 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8MJ0qh026622 for ; Wed, 8 Dec 2004 14:19:01 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 172B02B0ED; Wed, 8 Dec 2004 23:18:38 +0100 (MET) Date: Wed, 8 Dec 2004 23:18:38 +0100 From: Lennert Buytenhek To: robert.olsson@data.slu.se Cc: hadi@cyberus.ca, netdev@oss.sgi.com Subject: [PATCH] nanospin/pg_udelay oversleeping in pktgen Message-ID: <20041208221838.GA12022@xi.wantstofly.org> References: <20041208215611.GA11844@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041208215611.GA11844@xi.wantstofly.org> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12599 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Wed, Dec 08, 2004 at 10:56:11PM +0100, Lennert Buytenhek wrote: > One more reason why 'ipg' isn't really working very well for me: pktgen > tends to oversleep a lot. 
When using ipg=1000000 and attached patch, I > always get output like this: > > overslept by 173588 ns > overslept by 171822 ns > overslept by 171234 ns > overslept by 171888 ns > overslept by 171354 ns > overslept by 171974 ns > overslept by 171480 ns Below patch appears to fix the 'oversleeping' problem by always delaying in terms of nanoseconds instead of converting nanoseconds and cycles back and forth. It still oversleeps, but by ~50-~150ns per call instead of in the millisecond range. Comments welcome. cheers, Lennert --- pktgen.c.orig 2004-11-28 22:27:50.000000000 +0100 +++ pktgen.c 2004-12-08 23:09:53.397371873 +0100 @@ -239,7 +239,7 @@ */ __u64 started_at; /* micro-seconds */ __u64 stopped_at; /* micro-seconds */ - __u64 idle_acc; + __u64 idle_acc; /* nano-seconds */ __u32 seq_num; int clone_skb; /* Use multiple SKBs during packet gen. If this number @@ -717,8 +717,8 @@ stopped = now; /* not really stopped, more like last-running-at */ p += sprintf(p, "Current:\n pkts-sofar: %llu errors: %llu\n started: %lluus stopped: %lluus idle: %lluus\n", - pkt_dev->sofar, pkt_dev->errors, sa, stopped, - pg_div(pkt_dev->idle_acc, pg_cycles_per_us)); + pkt_dev->sofar, pkt_dev->errors, sa, stopped, + pg_div(pkt_dev->idle_acc, 1000)); p += sprintf(p, " seq_num: %d cur_dst_mac_offset: %d cur_src_mac_offset: %d\n", pkt_dev->seq_num, pkt_dev->cur_dst_mac_offset, pkt_dev->cur_src_mac_offset); @@ -1732,54 +1729,32 @@ pkt_dev->nflows = 0; } -/* ipg is in nano-seconds */ -static void nanospin(__u32 ipg, struct pktgen_dev *pkt_dev) +/* spin_until is in nano-seconds */ +static void spin(struct pktgen_dev *pkt_dev, __u64 spin_until) { - u64 idle_start = get_cycles(); - u64 idle; - - for (;;) { - barrier(); - idle = get_cycles() - idle_start; - if (idle * 1000 >= ipg * pg_cycles_per_us) - break; - } - pkt_dev->idle_acc += idle; -} - - -/* ipg is in micro-seconds (usecs) */ -static void pg_udelay(__u32 delay_us, struct pktgen_dev *pkt_dev) -{ - u64 start = getRelativeCurUs(); + u64 
start; u64 now; - - for (;;) { - do_softirq(); - now = getRelativeCurUs(); - if (start + delay_us <= (now - 10)) - break; - if (!pkt_dev->running) - return; - - if (need_resched()) - schedule(); - - now = getRelativeCurUs(); - if (start + delay_us <= (now - 10)) - break; - } + /* Try not to busy-spin if we have larger sleep times. + * TODO: Investigate better ways to do this. + */ + start = now = getRelativeCurNs(); + while (now < spin_until) { + if (spin_until - now > 10000) { + do_softirq(); + if (!pkt_dev->running) + return; + if (need_resched()) + schedule(); + } - pkt_dev->idle_acc += (1000 * (now - start)); + now = getRelativeCurNs(); + } - /* We can break out of the loop up to 10us early, so spend the rest of - * it spinning to increase accuracy. - */ - if (start + delay_us > now) - nanospin((start + delay_us) - now, pkt_dev); + pkt_dev->idle_acc += now - start; } + /* Returns: cycles per micro-second */ static int calc_mhz(void) { @@ -2693,33 +2667,16 @@ { struct net_device *odev = NULL; __u64 idle_start = 0; - u32 next_ipg = 0; - u64 now = 0; /* in nano-seconds */ int ret; odev = pkt_dev->odev; if (pkt_dev->ipg) { - now = getRelativeCurNs(); - if (now < pkt_dev->next_tx_ns) { - next_ipg = (u32)(pkt_dev->next_tx_ns - now); - - /* Try not to busy-spin if we have larger sleep times. - * TODO: Investigate better ways to do this. - */ + u64 now; /* in nano-seconds */ - /* 10 usecs or less */ - if (next_ipg < 10000) - nanospin(next_ipg, pkt_dev); - - /* 10ms or less */ - else if (next_ipg < 10000000) - pg_udelay(next_ipg / 1000, pkt_dev); - - /* fall asleep for a 10ms or more. 
*/ - else - pg_udelay(next_ipg / 1000, pkt_dev); - } + now = getRelativeCurNs(); + if (now < pkt_dev->next_tx_ns) + spin(pkt_dev, pkt_dev->next_tx_ns); /* This is max IPG, this has special meaning of * "never transmit" From kaber@trash.net Wed Dec 8 14:27:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 14:27:18 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8MRDVr027310 for ; Wed, 8 Dec 2004 14:27:13 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CcAGk-0003AW-Q4; Wed, 08 Dec 2004 23:26:46 +0100 Message-ID: <41B77FA6.7030707@trash.net> Date: Wed, 08 Dec 2004 23:26:46 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Stephen Hemminger CC: netdev@oss.sgi.com Subject: Re: Hard freeze with 2.6.10-rc3 and QoS, worked fine with 2.6.9 References: <1102380430.6103.6.camel@buffy> <20041206224441.628e7885.akpm@osdl.org> <1102422544.1088.98.camel@jzny.localdomain> <41B5E188.5050800@trash.net> <20041207170748.GF1371@postel.suug.ch> <41B5E722.2080600@trash.net> <20041207213053.6bb602c1.davem@davemloft.net> <1102507470.1051.27.camel@jzny.localdomain> <41B73263.4040706@trash.net> <20041208113014.3dcad5f5@dxpl.pdx.osdl.net> <41B77072.2070200@trash.net> <20041208140707.61c2237c@dxpl.pdx.osdl.net> In-Reply-To: <20041208140707.61c2237c@dxpl.pdx.osdl.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12600 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Stephen Hemminger wrote: >I think the problem was a missing poke (qdisc_restart) in 
the netem timer >routine, it probably worked earlier for me on other hardware because I was >using different hardware that was waking up and checking tx in response >to received packets. > > Yes, that seems likely. Since netem doesn't account for the delayed packets in sch->q.qlen it will only be woken up while non-delayed packets are queued. Regards Patrick From kaber@trash.net Wed Dec 8 15:07:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 15:08:02 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8N7soV029310 for ; Wed, 8 Dec 2004 15:07:54 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CcAu6-0004NO-8e; Thu, 09 Dec 2004 00:07:26 +0100 Message-ID: <41B7892E.1080706@trash.net> Date: Thu, 09 Dec 2004 00:07:26 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: netdev@oss.sgi.com, jamal Subject: [PATCH 2.6]: Fix oops in ipt action error path Content-Type: multipart/mixed; boundary="------------080100080308010505080402" X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12601 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------080100080308010505080402 Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit This patch fixes an oops when the ipt action is used with a non-existent iptables target. The error path tries to log t->u.kernel.target->name, but u.kernel.target is part of a union and, as long as the target wasn't successfully loaded, it still contains the name of the target; using it as a pointer results in a crash. 
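[Editor's sketch, not part of the original message.] The failure mode described above can be illustrated in user space. This is a minimal sketch under stated assumptions: the union layout is simplified, and demo_target, demo_entry, demo_find_target and demo_init_target are hypothetical names that do not exist in the kernel. Only the u.user.name / u.kernel.target distinction is taken from the message itself.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a loaded target; not the kernel's real type. */
struct demo_target {
	const char *name;
};

/* Simplified union: the same bytes hold either the requested name
 * (before lookup) or a pointer to the loaded target (after lookup). */
union demo_entry {
	struct { char name[32]; } user;
	struct { struct demo_target *target; } kernel;
};

static struct demo_target demo_accept = { "ACCEPT" };

/* Hypothetical lookup that only knows a single target. */
static struct demo_target *demo_find_target(const char *name)
{
	return strcmp(name, "ACCEPT") == 0 ? &demo_accept : NULL;
}

/* Writes the log line into buf; returns 0 on success, -1 on failure. */
static int demo_init_target(union demo_entry *t, char *buf, size_t len)
{
	struct demo_target *target = demo_find_target(t->user.name);

	if (!target) {
		/* kernel.target was never assigned, so those bytes still
		 * hold the name string; only user.name is safe to read. */
		snprintf(buf, len, "init_targ: Failed to find %s", t->user.name);
		return -1;
	}
	t->kernel.target = target;	/* user.name is overwritten from here on */
	snprintf(buf, len, "init_targ: loaded %s", t->kernel.target->name);
	return 0;
}
```

In the failure branch, reading t->kernel.target would reinterpret the first bytes of the name as a pointer, which is presumably why register values in the oops look like ASCII text. Logging the user-side name avoids the dereference.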
Oops captured in UML: EIP: 0023:[] CPU: 0 Not tainted ESP: 002b:a14b7514 EFLAGS: 00010297 Not tainted EAX: 414d4e56 EBX: 0000000a ECX: 414d4e56 EDX: fffffffe ESI: a036acba EDI: 00000000 EBP: a036b09f DS: 002b ES: 002b Call Trace: [] notifier_call_chain+0x2d/0x50 [] bust_spinlocks+0x46/0x50 [] panic+0x71/0x120 [] vsnprintf+0x331/0x4d0 [] segv+0x1fa/0x230 [] vsnprintf+0x331/0x4d0 [] sigemptyset+0x24/0x40 [] change_signals+0x65/0x90 [] segv_handler+0xe0/0xf0 [] vsnprintf+0x331/0x4d0 [] sig_handler_common_tt+0x8d/0x120 [] sig_handler+0x17/0x20 [] __restore+0x0/0x8 [] vsnprintf+0x331/0x4d0 [] vscnprintf+0x2b/0x40 [] vprintk+0xb2/0x320 [] printk+0x17/0x20 [] tcf_ipt_init+0x533/0x750 [] tcf_action_init_1+0x92/0x1a0 [] kmem_cache_alloc+0x39/0x60 [] sigemptyset+0x24/0x40 [] tcf_action_init+0xa7/0x140 ... Not very important right now since ipt support isn't merged in iproute yet, but still should be fixed for 2.6.10. Regards Patrick --------------080100080308010505080402 Content-Type: text/plain; name="x" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="x" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/08 23:59:19+01:00 kaber@coreworks.de # [PKT_SCHED]: Fix oops in ipt action error path # # Signed-off-by: Patrick McHardy # # net/sched/ipt.c # 2004/12/08 23:59:13+01:00 kaber@coreworks.de +1 -2 # [PKT_SCHED]: Fix oops in ipt action error path # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/ipt.c b/net/sched/ipt.c --- a/net/sched/ipt.c 2004-12-08 23:59:51 +01:00 +++ b/net/sched/ipt.c 2004-12-08 23:59:51 +01:00 @@ -63,8 +63,7 @@ target = __ipt_find_target_lock(t->u.user.name, &ret); if (!target) { - printk("init_targ: Failed to find %s\n", - t->u.kernel.target->name); + printk("init_targ: Failed to find %s\n", t->u.user.name); return -1; } --------------080100080308010505080402-- From rayl@mail.com Wed Dec 8 15:36:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 15:36:47 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8NafPH030651 for ; Wed, 8 Dec 2004 15:36:41 -0800 Received: by ray.lehtiniemi.com (Postfix, from userid 500) id 1CCAB14BD46; Wed, 8 Dec 2004 16:36:18 -0700 (MST) Date: Wed, 8 Dec 2004 16:36:18 -0700 From: Ray Lehtiniemi To: Martin Josefsson Cc: Lennert Buytenhek , Scott Feldman , jamal , Robert Olsson , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 (was: Re: [E1000-devel] Transmission limit) Message-ID: <20041208233617.GD8649@mail.com> References: <20041130134600.GA31515@xi.wantstofly.org> <1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 
Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12602 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev hello martin On Sun, Dec 05, 2004 at 04:42:34PM +0100, Martin Josefsson wrote: > > Here's the patch, not much more tested (it still gives some transmit > timeouts since it's scotts patch + prefetching and delayed TDT updating). > And it's not cleaned up, but hey, that's development :) > > The delayed TDT updating was a test and currently it delays the first tx'd > packet after a timerrun 1ms. > > Would be interesting to see what other people get with this thing. > Lennert? well, i'm brand new to gig ethernet, but i have access to some nice hardware right now, so i decided to give your patch a try. this is the average tx pps of 10 pktgen runs for each packet size: 60 1187589.1 64 601805.4 68 1115029.3 72 593096.4 76 1097761.1 80 587125.4 84 1098045.2 88 588159.1 92 1072124.8 96 582510.3 100 1008056.8 104 577898.0 108 946974.0 112 573719.2 116 892871.0 120 573072.5 124 844608.3 128 563685.7 any idea why the packet rates are cut in half for every other line? 
pktgen is running with eth0 bound to CPU0 on this box: NexGate NSA 2040G Dual Xeon 3.06 GHz, HT enabled 1 GB PC3200 DDR SDRAM Dual 82544EI - on PCI-X 64 bit 133 MHz bus - behind P64H2 bridge - on hub channel D of E7501 chipset thanks -- ---------------------------------------------------------------------- Ray L From rayl@mail.com Wed Dec 8 15:43:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 15:43:38 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB8NhXoG031249 for ; Wed, 8 Dec 2004 15:43:33 -0800 Received: by ray.lehtiniemi.com (Postfix, from userid 500) id 108B311F8C1; Wed, 8 Dec 2004 16:43:10 -0700 (MST) Date: Wed, 8 Dec 2004 16:43:10 -0700 From: Ray Lehtiniemi To: "Brandeburg, Jesse" Cc: Scott Feldman , netdev@oss.sgi.com Subject: Re: RE: how to tune a pair of e1000 cards on intel e7501-based system? Message-ID: <20041208234310.GE8649@mail.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12603 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev hi jesse On Tue, Dec 07, 2004 at 05:10:26PM -0800, Brandeburg, Jesse wrote: > > I'm not much of an expert, but one easy thing to try is to up your > receive stack resources, as they were particularly low on 2.4 series > kernels, leading to udp getting overrun pretty easily with gig nics. I > think if you make the value go too high it just ignores it, so if you > see no change, try 256kB instead. 
> > cat /proc/sys/net/core/rmem_default > cat /proc/sys/net/core/rmem_max > > echo -n 512000 > /proc/sys/net/core/rmem_default > echo -n 512000 > /proc/sys/net/core/rmem_max sounds reasonable, i'll try this early next week when i have access to that test box again. thanks! -- ---------------------------------------------------------------------- Ray L From romieu@fr.zoreil.com Wed Dec 8 16:51:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 16:51:41 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB90pUD7005129 for ; Wed, 8 Dec 2004 16:51:31 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iB90oIvr028308; Thu, 9 Dec 2004 01:50:18 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iB90oFfW028307; Thu, 9 Dec 2004 01:50:15 +0100 Date: Thu, 9 Dec 2004 01:50:15 +0100 From: Francois Romieu To: Nicholas Papadakos Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: realtek r8169 + kernel 2.4.24 (openmosix) Message-ID: <20041209005015.GA25868@electric-eye.fr.zoreil.com> References: <20041205122414.GA22383@electric-eye.fr.zoreil.com> <200412051317.iB5DHrOL026570@kane.otenet.gr> <20041205135132.GA23262@electric-eye.fr.zoreil.com> <20041206234131.GA12838@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041206234131.GA12838@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. 
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12604 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Francois Romieu : [unstable backport of the r8169 driver] Nicholas, can you give a try at the patch below against vanilla 2.4.28 ? http://www.fr.zoreil.com/people/francois/misc/20041209-2.4.28-r8169.c-test.patch This patch does not trivially crash and you should be able to set the mtu around 7200 bytes. The detail of the (43) patches is available at: http://www.fr.zoreil.com/linux/kernel/2.4.x/2.4.28/r8169. -- Ueimor From chas@cmf.nrl.navy.mil Wed Dec 8 17:03:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 17:03:40 -0800 (PST) Received: from ginger.cmf.nrl.navy.mil (ginger.cmf.nrl.navy.mil [134.207.10.161]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB913TKC006048 for ; Wed, 8 Dec 2004 17:03:32 -0800 Received: from cmf.nrl.navy.mil (thirdoffive.cmf.nrl.navy.mil [134.207.10.180]) by ginger.cmf.nrl.navy.mil (8.12.11/8.12.11) with ESMTP id iB912w8T007866; Wed, 8 Dec 2004 20:02:58 -0500 (EST) Message-Id: <200412090102.iB912w8T007866@ginger.cmf.nrl.navy.mil> To: netdev@oss.sgi.com Cc: davem@redhat.com, bunk@stusta.de Subject: [PATCH] [ATM]: small atm cleanups (from Adrian Bunk ) Date: Wed, 08 Dec 2004 20:02:59 -0500 From: "chas williams - CONTRACTOR" X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: NAI Completed X-Scanned-By: MIMEDefang 2.30 (www . roaringpenguin . com / mimedefang) X-Virus-Status: Clean X-archive-position: 12605 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chas@cmf.nrl.navy.mil Precedence: bulk X-list: netdev please apply to 2.6 -- thanks! 
Signed-off-by: Chas Williams # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/08 17:09:38-05:00 chas@relax.cmf.nrl.navy.mil # [ATM]: small atm cleanups (from Adrian Bunk ) # diff -Nru a/drivers/atm/ambassador.c b/drivers/atm/ambassador.c --- a/drivers/atm/ambassador.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/ambassador.c 2004-12-08 18:07:24 -05:00 @@ -1692,7 +1692,7 @@ }; -unsigned int command_successes [] = { +static unsigned int command_successes [] = { [host_memory_test] = COMMAND_PASSED_TEST, [read_adapter_memory] = COMMAND_READ_DATA_OK, [write_adapter_memory] = COMMAND_WRITE_DATA_OK, @@ -2088,7 +2088,7 @@ } // swap bits within byte to get Ethernet ordering -u8 bit_swap (u8 byte) +static u8 bit_swap (u8 byte) { const u8 swap[] = { 0x0, 0x8, 0x4, 0xc, diff -Nru a/drivers/atm/atmtcp.c b/drivers/atm/atmtcp.c --- a/drivers/atm/atmtcp.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/atmtcp.c 2004-12-08 18:07:24 -05:00 @@ -396,7 +396,7 @@ } -int atmtcp_attach(struct atm_vcc *vcc,int itf) +static int atmtcp_attach(struct atm_vcc *vcc,int itf) { struct atm_dev *dev; @@ -427,13 +427,13 @@ } -int atmtcp_create_persistent(int itf) +static int atmtcp_create_persistent(int itf) { return atmtcp_create(itf,1,NULL); } -int atmtcp_remove_persistent(int itf) +static int atmtcp_remove_persistent(int itf) { struct atm_dev *dev; struct atmtcp_dev_data *dev_data; diff -Nru a/drivers/atm/firestream.c b/drivers/atm/firestream.c --- a/drivers/atm/firestream.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/firestream.c 2004-12-08 18:07:24 -05:00 @@ -82,14 +82,14 @@ * would be interpreted. 
-- REW */ #define NP FS_NR_FREE_POOLS -int rx_buf_sizes[NP] = {128, 256, 512, 1024, 2048, 4096, 16384, 65520}; +static int rx_buf_sizes[NP] = {128, 256, 512, 1024, 2048, 4096, 16384, 65520}; /* log2: 7 8 9 10 11 12 14 16 */ #if 0 -int rx_pool_sizes[NP] = {1024, 1024, 512, 256, 128, 64, 32, 32}; +static int rx_pool_sizes[NP] = {1024, 1024, 512, 256, 128, 64, 32, 32}; #else /* debug */ -int rx_pool_sizes[NP] = {128, 128, 128, 64, 64, 64, 32, 32}; +static int rx_pool_sizes[NP] = {128, 128, 128, 64, 64, 64, 32, 32}; #endif /* log2: 10 10 9 8 7 6 5 5 */ /* sumlog2: 17 18 18 18 18 18 19 21 */ @@ -250,7 +250,7 @@ }; -struct reginit_item PHY_NTC_INIT[] __devinitdata = { +static struct reginit_item PHY_NTC_INIT[] __devinitdata = { { PHY_CLEARALL, 0x40 }, { 0x12, 0x0001 }, { 0x13, 0x7605 }, @@ -334,7 +334,7 @@ #define func_exit() fs_dprintk (FS_DEBUG_FLOW, "fs: exit %s\n", __FUNCTION__) -struct fs_dev *fs_boards = NULL; +static struct fs_dev *fs_boards = NULL; #ifdef DEBUG @@ -1921,7 +1921,7 @@ return -ENODEV; } -void __devexit firestream_remove_one (struct pci_dev *pdev) +static void __devexit firestream_remove_one (struct pci_dev *pdev) { int i; struct fs_dev *dev, *nxtdev; diff -Nru a/drivers/atm/he.c b/drivers/atm/he.c --- a/drivers/atm/he.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/he.c 2004-12-08 18:07:24 -05:00 @@ -147,13 +147,55 @@ /* globals */ -struct he_dev *he_devs = NULL; +static struct he_dev *he_devs = NULL; static int disable64 = 0; static short nvpibits = -1; static short nvcibits = -1; static short rx_skb_reserve = 16; static int irq_coalesce = 1; static int sdh = 0; + +/* Read from EEPROM = 0000 0011b */ +static unsigned int readtab[] = { + CS_HIGH | CLK_HIGH, + CS_LOW | CLK_LOW, + CLK_HIGH, /* 0 */ + CLK_LOW, + CLK_HIGH, /* 0 */ + CLK_LOW, + CLK_HIGH, /* 0 */ + CLK_LOW, + CLK_HIGH, /* 0 */ + CLK_LOW, + CLK_HIGH, /* 0 */ + CLK_LOW, + CLK_HIGH, /* 0 */ + CLK_LOW | SI_HIGH, + CLK_HIGH | SI_HIGH, /* 1 */ + CLK_LOW | SI_HIGH, + CLK_HIGH | SI_HIGH /* 1 */ 
+}; + +/* Clock to read from/write to the EEPROM */ +static unsigned int clocktab[] = { + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW, + CLK_HIGH, + CLK_LOW +}; static struct atmdev_ops he_ops = { diff -Nru a/drivers/atm/he.h b/drivers/atm/he.h --- a/drivers/atm/he.h 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/he.h 2004-12-08 18:07:24 -05:00 @@ -892,47 +892,4 @@ #define SI_HIGH ID_DIN /* HOST_CNTL_ID_PROM_DATA_IN */ #define EEPROM_DELAY 400 /* microseconds */ -/* Read from EEPROM = 0000 0011b */ -unsigned int readtab[] = { - CS_HIGH | CLK_HIGH, - CS_LOW | CLK_LOW, - CLK_HIGH, /* 0 */ - CLK_LOW, - CLK_HIGH, /* 0 */ - CLK_LOW, - CLK_HIGH, /* 0 */ - CLK_LOW, - CLK_HIGH, /* 0 */ - CLK_LOW, - CLK_HIGH, /* 0 */ - CLK_LOW, - CLK_HIGH, /* 0 */ - CLK_LOW | SI_HIGH, - CLK_HIGH | SI_HIGH, /* 1 */ - CLK_LOW | SI_HIGH, - CLK_HIGH | SI_HIGH /* 1 */ -}; - -/* Clock to read from/write to the EEPROM */ -unsigned int clocktab[] = { - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW, - CLK_HIGH, - CLK_LOW -}; - - #endif /* _HE_H_ */ diff -Nru a/drivers/atm/idt77105.c b/drivers/atm/idt77105.c --- a/drivers/atm/idt77105.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/idt77105.c 2004-12-08 18:07:24 -05:00 @@ -323,7 +323,7 @@ } -int idt77105_stop(struct atm_dev *dev) +static int idt77105_stop(struct atm_dev *dev) { struct idt77105_priv *walk, *prev; diff -Nru a/drivers/atm/idt77105.h b/drivers/atm/idt77105.h --- a/drivers/atm/idt77105.h 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/idt77105.h 2004-12-08 18:07:24 -05:00 @@ -77,7 +77,6 @@ #ifdef __KERNEL__ int idt77105_init(struct atm_dev *dev) __init; -int idt77105_stop(struct atm_dev *dev); #endif /* diff -Nru a/drivers/atm/idt77252.h b/drivers/atm/idt77252.h --- a/drivers/atm/idt77252.h 2004-12-08 
18:07:24 -05:00 +++ b/drivers/atm/idt77252.h 2004-12-08 18:07:24 -05:00 @@ -275,7 +275,7 @@ struct rsq_entry *next; struct rsq_entry *last; dma_addr_t paddr; -} rsq_info; +}; /*****************************************************************************/ diff -Nru a/drivers/atm/iphase.c b/drivers/atm/iphase.c --- a/drivers/atm/iphase.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/iphase.c 2004-12-08 18:07:24 -05:00 @@ -72,13 +72,13 @@ #define PRIV(dev) ((struct suni_priv *) dev->phy_data) static unsigned char ia_phy_get(struct atm_dev *dev, unsigned long addr); +static void desc_dbg(IADEV *iadev); static IADEV *ia_dev[8]; static struct atm_dev *_ia_dev[8]; static int iadev_count; static void ia_led_timer(unsigned long arg); static struct timer_list ia_timer = TIMER_INITIALIZER(ia_led_timer, 0, 0); -struct atm_vcc *vcc_close_que[100]; static int IA_TX_BUF = DFL_TX_BUFFERS, IA_TX_BUF_SZ = DFL_TX_BUF_SZ; static int IA_RX_BUF = DFL_RX_BUFFERS, IA_RX_BUF_SZ = DFL_RX_BUF_SZ; static uint IADebugFlag = /* IF_IADBG_ERR | IF_IADBG_CBR| IF_IADBG_INIT_ADAPTER @@ -147,7 +147,6 @@ u_short desc1; u_short tcq_wr; struct ia_vcc *iavcc_r = NULL; - extern void desc_dbg(IADEV *iadev); tcq_wr = readl(dev->seg_reg+TCQ_WR_PTR) & 0xffff; while (dev->host_tcq_wr != tcq_wr) { @@ -187,7 +186,6 @@ unsigned long delta; static unsigned long timer = 0; int ltimeout; - extern void desc_dbg(IADEV *iadev); ia_hack_tcq (dev); if(((jiffies - timer)>50)||((dev->ffL.tcq_rd==dev->host_tcq_wr))){ @@ -643,7 +641,7 @@ return 0; } -void ia_tx_poll (IADEV *iadev) { +static void ia_tx_poll (IADEV *iadev) { struct atm_vcc *vcc = NULL; struct sk_buff *skb = NULL, *skb1 = NULL; struct ia_vcc *iavcc; @@ -860,7 +858,7 @@ return; } -void ia_mb25_init (IADEV *iadev) +static void ia_mb25_init (IADEV *iadev) { volatile ia_mb25_t *mb25 = (ia_mb25_t*)iadev->phy; #if 0 @@ -875,7 +873,7 @@ return; } -void ia_suni_pm7345_init (IADEV *iadev) +static void ia_suni_pm7345_init (IADEV *iadev) { volatile suni_pm7345_t 
*suni_pm7345 = (suni_pm7345_t *)iadev->phy; if (iadev->phy_type & FE_DS3_PHY) @@ -958,9 +956,8 @@ /***************************** IA_LIB END *****************************/ -/* pwang_test debug utility */ -int tcnter = 0, rcnter = 0; -void xdump( u_char* cp, int length, char* prefix ) +static int tcnter = 0; +static void xdump( u_char* cp, int length, char* prefix ) { int col, count; u_char prntBuf[120]; @@ -1007,7 +1004,7 @@ /*-- some utilities and memory allocation stuff will come here -------------*/ -void desc_dbg(IADEV *iadev) { +static void desc_dbg(IADEV *iadev) { u_short tcq_wr_ptr, tcq_st_ptr, tcq_ed_ptr; u32 i; diff -Nru a/drivers/atm/iphase.h b/drivers/atm/iphase.h --- a/drivers/atm/iphase.h 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/iphase.h 2004-12-08 18:07:24 -05:00 @@ -1126,8 +1126,6 @@ #define FE_DS3_PHY 0x0080 /* DS3 */ #define FE_E3_PHY 0x0090 /* E3 */ -extern void ia_mb25_init (IADEV *); - /*********************** SUNI_PM7345 PHY DEFINE HERE *********************/ typedef struct _suni_pm7345_t { @@ -1325,8 +1323,6 @@ #define SUNI_DS3_COCAI 0x04 /* Corr. HCS errors detected */ #define SUNI_DS3_FOVRI 0x02 /* FIFO overrun */ #define SUNI_DS3_FUDRI 0x01 /* FIFO underrun */ - -extern void ia_suni_pm7345_init (IADEV *iadev); ///////////////////SUNI_PM7345 PHY DEFINE END ///////////////////////////// diff -Nru a/drivers/atm/nicstarmac.c b/drivers/atm/nicstarmac.c --- a/drivers/atm/nicstarmac.c 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/nicstarmac.c 2004-12-08 18:07:24 -05:00 @@ -35,6 +35,7 @@ #define SI_LOW 0x0000 /* Serial input data low */ /* Read Status Register = 0000 0101b */ +#if 0 static u_int32_t rdsrtab[] = { CS_HIGH | CLK_HIGH, @@ -55,6 +56,7 @@ CLK_LOW | SI_HIGH, CLK_HIGH | SI_HIGH /* 1 */ }; +#endif /* 0 */ /* Read from EEPROM = 0000 0011b */ @@ -117,7 +119,7 @@ * eeprom, then pull the result from bit 16 of the NicSTaR's General Purpose * register. 
*/ - +#if 0 u_int32_t nicstar_read_eprom_status( virt_addr_t base ) { @@ -153,6 +155,7 @@ osp_MicroDelay( CYCLE_DELAY ); return rbyte; } +#endif /* 0 */ /* diff -Nru a/drivers/atm/nicstarmac.h b/drivers/atm/nicstarmac.h --- a/drivers/atm/nicstarmac.h 2004-12-08 18:07:24 -05:00 +++ b/drivers/atm/nicstarmac.h 2004-12-08 18:07:24 -05:00 @@ -9,6 +9,5 @@ typedef void __iomem *virt_addr_t; -u_int32_t nicstar_read_eprom_status( virt_addr_t base ); void nicstar_init_eprom( virt_addr_t base ); void nicstar_read_eprom( virt_addr_t, u_int8_t, u_int8_t *, u_int32_t); From davem@davemloft.net Wed Dec 8 21:00:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 21:00:36 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB950T61018649 for ; Wed, 8 Dec 2004 21:00:30 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcGNC-0004Ph-00; Wed, 08 Dec 2004 20:57:50 -0800 Date: Wed, 8 Dec 2004 20:57:50 -0800 From: "David S. 
Miller" To: Patrick McHardy Cc: netdev@oss.sgi.com, hadi@cyberus.ca Subject: Re: [PATCH 2.6]: Fix oops in ipt action error path Message-Id: <20041208205750.56a790fa.davem@davemloft.net> In-Reply-To: <41B7892E.1080706@trash.net> References: <41B7892E.1080706@trash.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12606 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Thu, 09 Dec 2004 00:07:26 +0100 Patrick McHardy wrote: > This patch fixes an oops when the ipt action is used with a > non-existant iptables target. It tries to log > t->u.kernel.target->name, u.kernel.target is part of a union > and as long as the target wasn't successfully loaded contains > the name of the target, using it as a pointer results in a > crash. Applied, thanks Patrick. From davem@davemloft.net Wed Dec 8 21:03:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 21:03:06 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9530Ac019033 for ; Wed, 8 Dec 2004 21:03:00 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcGPo-0004Q6-00; Wed, 08 Dec 2004 21:00:32 -0800 Date: Wed, 8 Dec 2004 21:00:31 -0800 From: "David S. 
Miller" To: Stephen Hemminger Cc: netem@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] netem: restart device after inserting packets Message-Id: <20041208210031.63f0963f.davem@davemloft.net> In-Reply-To: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> References: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12607 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 8 Dec 2004 12:31:03 -0800 Stephen Hemminger wrote: > The version of netem in 2.6.10 moves packets from the delayed queue > to the qdisc in a timer interrupt. But it forgot to force the device to > pick them up. > > Signed-off-by: Stephen Hemminger Good spotting. Applied, thanks Stephen. From davem@davemloft.net Wed Dec 8 22:09:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 22:09:19 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB969D4b021935 for ; Wed, 8 Dec 2004 22:09:13 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcHRq-0004Xl-00; Wed, 08 Dec 2004 22:06:42 -0800 Date: Wed, 8 Dec 2004 22:06:42 -0800 From: "David S. Miller" To: Benjamin Herrenschmidt Cc: netdev@oss.sgi.com Subject: Re: netdev ioctl & dev_base_lock : bad idea ? 
Message-Id: <20041208220642.6984519f.davem@davemloft.net>
In-Reply-To: <1101458929.28048.9.camel@gaston>
References: <1101458929.28048.9.camel@gaston>
X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu)
X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12608
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev

On Fri, 26 Nov 2004 19:48:49 +1100
Benjamin Herrenschmidt wrote:

> I suppose there is a good reason we can't just use the rtnl_sem for
> these guys, though why isn't dev_base_lock a read/write semaphore
> instead of a spinlock ? At least on ppc, I don't think there's any
> overhead in the normal path, and this is not on a very critical path
> anyway, is it ?

It can't be a semaphore because it is taken in packet processing,
and thus softint handling, paths.

From davem@davemloft.net Wed Dec 8 22:11:01 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 22:11:07 -0800 (PST)
Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB96B1VT022292 for ; Wed, 8 Dec 2004 22:11:01 -0800
Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcHTO-0004YO-00; Wed, 08 Dec 2004 22:08:18 -0800
Date: Wed, 8 Dec 2004 22:08:18 -0800
From: "David S. 
Miller" To: Wensong Zhang Cc: netdev@oss.sgi.com Subject: Re: [PATCH] [IPVS] add a sysctl variable to expire quiescent template Message-Id: <20041208220818.05a5811b.davem@davemloft.net> In-Reply-To: References: X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12609 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Thu, 2 Dec 2004 00:48:26 +0800 (CST) Wensong Zhang wrote: > Here is the patch from Horms to add a sysctl > variable to expire quiescent templat. Please check and apply them to > kernel 2.4 and 2.6 respectively. Sorry for the delay, I'll apply this soon. From benh@kernel.crashing.org Wed Dec 8 22:23:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 22:23:22 -0800 (PST) Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB96NHMQ023214 for ; Wed, 8 Dec 2004 22:23:18 -0800 Received: from gaston (localhost [127.0.0.1]) by gate.crashing.org (8.12.8/8.12.8) with ESMTP id iB96IXgJ008600; Thu, 9 Dec 2004 00:18:34 -0600 Subject: Re: netdev ioctl & dev_base_lock : bad idea ? From: Benjamin Herrenschmidt To: "David S. 
Miller"
Cc: netdev@oss.sgi.com
In-Reply-To: <20041208220642.6984519f.davem@davemloft.net>
References: <1101458929.28048.9.camel@gaston> <20041208220642.6984519f.davem@davemloft.net>
Content-Type: text/plain
Date: Thu, 09 Dec 2004 17:22:13 +1100
Message-Id: <1102573333.16495.2.camel@gaston>
Mime-Version: 1.0
X-Mailer: Evolution 2.0.2
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12610
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: benh@kernel.crashing.org
Precedence: bulk
X-list: netdev

On Wed, 2004-12-08 at 22:06 -0800, David S. Miller wrote:
> On Fri, 26 Nov 2004 19:48:49 +1100
> Benjamin Herrenschmidt wrote:
>
> > I suppose there is a good reason we can't just use the rtnl_sem for
> > these guys, though why isn't dev_base_lock a read/write semaphore
> > instead of a spinlock ? At least on ppc, I don't think there's any
> > overhead in the normal path, and this is not on a very critical path
> > anyway, is it ?
>
> It can't be a semaphore because it is taken in packet processing,
> and thus softint handling, paths.

Right, and I missed the fact that we did indeed take the semaphore and
not the lock in the _set_ functions which is just fine, we can actually
schedule.... except in set_multicast...

Is there any reason we actually _need_ to get the xmit lock in this one
specifically ?

Ben.

From davem@davemloft.net Wed Dec 8 23:16:04 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 23:16:19 -0800 (PST)
Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB97G3uR025725 for ; Wed, 8 Dec 2004 23:16:04 -0800
Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcIUW-0004fZ-00; Wed, 08 Dec 2004 23:13:32 -0800
Date: Wed, 8 Dec 2004 23:13:31 -0800
From: "David S. Miller"
To: Benjamin Herrenschmidt
Cc: netdev@oss.sgi.com
Subject: Re: netdev ioctl & dev_base_lock : bad idea ?
Message-Id: <20041208231331.40cd98ad.davem@davemloft.net>
In-Reply-To: <1102573333.16495.2.camel@gaston>
References: <1101458929.28048.9.camel@gaston> <20041208220642.6984519f.davem@davemloft.net> <1102573333.16495.2.camel@gaston>
X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu)
X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12611
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev

On Thu, 09 Dec 2004 17:22:13 +1100
Benjamin Herrenschmidt wrote:

> Right, and I missed the fact that we did indeed take the semaphore and
> not the lock in the _set_ functions which is just fine, we can actually
> schedule.... except in set_multicast...
>
> Is there any reason we actually _need_ to get the xmit lock in this one
> specifically ?

Since we implement NETIF_F_LLTX, the core packet transmit
routines do no locking, the driver does it all.

So if we don't hold the tx lock in the set multicast routine,
any other cpu can come into our hard_start_xmit function
and poke at the hardware.

Upon further consideration, it seems that it may be OK to drop
that tx lock right after we do the netif_stop_queue(). But
we should regrab the tx lock when we do the subsequent
netif_wake_queue().

From davem@davemloft.net Wed Dec 8 23:58:01 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 08 Dec 2004 23:58:07 -0800 (PST)
Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB97w1gM027882 for ; Wed, 8 Dec 2004 23:58:01 -0800
Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcJ92-0004mk-00; Wed, 08 Dec 2004 23:55:24 -0800
Date: Wed, 8 Dec 2004 23:55:24 -0800
From: "David S. Miller"
To: Stephen Hemminger
Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com
Subject: Re: [PATCH] tcp: efficient port randomisation (rev 3)
Message-Id: <20041208235524.202ff3a1.davem@davemloft.net>
In-Reply-To: <20041206094234.34861c78@dxpl.pdx.osdl.net>
References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> <41B14E57.5080803@osdl.org> <200412060918.04441.michael.vittrup.larsen@ericsson.com> <20041206094234.34861c78@dxpl.pdx.osdl.net>
X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu)
X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12612
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: 
davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 6 Dec 2004 09:42:34 -0800 Stephen Hemminger wrote: > Third revision of the TCP port randomization patch. It randomizes > TCP ephemeral ports of incoming connections using variation of existing > sequence number hash. This one avoids the MD4 for the loopback case since > there is no reason to bother over loopback and it improves benchmark numbers. I don't think the loopback optimization is really necessary. And in any event, RTCF_LOCAL doesn't necessarily mean that the connection doesn't go "on the wire" especially when using Julian's "send to self" patch which I might add at some point. Anyways, please resend to me the version without the loopback hack and I'll add it to my 2.6.11 queue. Thanks Stephen and Michael. From ilya.pashkovsky@gmail.com Thu Dec 9 03:26:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 03:26:07 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.197]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9BQ0gg017829 for ; Thu, 9 Dec 2004 03:26:01 -0800 Received: by rproxy.gmail.com with SMTP id b11so489179rne for ; Thu, 09 Dec 2004 03:25:35 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=POkHUu59qsBOIYlVAVpc52h4BzB/V8/fnSA20pCeWzxNNaRxZzvgcL84K0uW5DHijJikcp/KVKHxHNXdNTvMvWRggxXKwcXreRp33DbvVjGqQfMYxQ5j9Lni45Rupug7FAjNMfvtchas996rXp8PXzyQEW4gTKG1QOs5ZM6bxyA= Received: by 10.38.76.13 with SMTP id y13mr402247rna; Thu, 09 Dec 2004 03:25:26 -0800 (PST) Received: by 10.38.149.15 with HTTP; Thu, 9 Dec 2004 03:25:26 -0800 (PST) Message-ID: Date: Thu, 9 Dec 2004 13:25:26 +0200 From: Ilya Pashkovsky Reply-To: Ilya Pashkovsky To: netdev@oss.sgi.com, =?UTF-8?Q?YOSHIFUJI_Hideaki_/_=E5=90=89=E8=97=A4=E8=8B=B1=E6=98=8E?= , davem@redhat.com, linux-kernel@vger.kernel.org Subject: [PATCH] port_reuse listen fix (allow 
simultaneous single listen + outgoing connects from same port) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12613 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ilya.pashkovsky@gmail.com Precedence: bulk X-list: netdev This is the latest patch with removed bool > 1 check and ipv6 support. http://puding.mine.nu/patches/ http://puding.mine.nu/patches/patch-reuse-bool-ipv6 to check, you can use netcat (sets SO_REUSEADDR by default). on one host (host A): nc -v -l -p 9999 on another/same host (host B): nc -v -l -p 9000 on host A: nc -v -p 9999 host.B.ip.addr 9000 on host B: nc -v host.A.ip.addr 9999 nothing should fail. --- linux/net/ipv4/tcp_ipv4.c.orig2004-12-07 14:54:12.597084704 +0200 +++ linux/net/ipv4/tcp_ipv4.c2004-12-08 16:20:32.018896416 +0200 @@ -50,6 +50,8 @@ *YOSHIFUJI Hideaki @USAGI and:Support IPV6_V6ONLY socket option, which *Alexey Kuznetsovallow both IPv4 and IPv6 sockets to bind *a single port at the same time. 
+ *Ilya Pashkovsky:fix TCP_LISTEN check on reuse + *sk_reuse boolean fix */ #include @@ -184,7 +186,8 @@ static inline int tcp_bind_conflict(stru const u32 sk_rcv_saddr = tcp_v4_rcv_saddr(sk); struct sock *sk2; struct hlist_node *node; -int reuse = sk->sk_reuse; +unsigned char reuse = sk->sk_reuse; +unsigned char state = sk->sk_state; sk_for_each_bound(sk2, node, &tb->owners) { if (sk != sk2 && @@ -193,7 +196,7 @@ static inline int tcp_bind_conflict(stru !sk2->sk_bound_dev_if || sk->sk_bound_dev_if == sk2->sk_bound_dev_if)) { if (!reuse || !sk2->sk_reuse || - sk2->sk_state == TCP_LISTEN) { + (state == TCP_LISTEN && sk2->sk_state == TCP_LISTEN)) { const u32 sk2_rcv_saddr = tcp_v4_rcv_saddr(sk2); if (!sk2_rcv_saddr || !sk_rcv_saddr || sk2_rcv_saddr == sk_rcv_saddr) @@ -259,8 +262,11 @@ static int tcp_v4_get_port(struct sock * goto tb_not_found; tb_found: if (!hlist_empty(&tb->owners)) { -if (sk->sk_reuse > 1) -goto success; +/* + * sk_reuse is boolean + * if (sk->sk_reuse > 1) + *goto success; + */ if (tb->fastreuse > 0 && sk->sk_reuse && sk->sk_state != TCP_LISTEN) { goto success; --- linux/net/ipv6/tcp_ipv6.c.orig2004-12-09 01:35:33.162353104 +0200 +++ linux/net/ipv6/tcp_ipv6.c2004-12-09 01:34:38.162714320 +0200 @@ -111,7 +111,7 @@ static inline int tcp_v6_bind_conflict(s !sk2->sk_bound_dev_if || sk->sk_bound_dev_if == sk2->sk_bound_dev_if) && (!sk->sk_reuse || !sk2->sk_reuse || - sk2->sk_state == TCP_LISTEN) && + (sk->sk_state == TCP_LISTEN && sk2->sk_state == TCP_LISTEN)) && ipv6_rcv_saddr_equal(sk, sk2)) break; } From hadi@cyberus.ca Thu Dec 9 04:15:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 04:15:09 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9CF20L023997 for ; Thu, 9 Dec 2004 04:15:02 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CcNBt-0007Fn-7b for netdev@oss.sgi.com; Thu, 09 Dec 2004 
07:14:37 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CcNBr-0005Tb-0c; Thu, 09 Dec 2004 07:14:35 -0500 Subject: Re: [PATCH 2.6]: Fix oops in ipt action error path From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <41B7892E.1080706@trash.net> References: <41B7892E.1080706@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102594472.1048.50.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 09 Dec 2004 07:14:32 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12614 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-08 at 18:07, Patrick McHardy wrote: > This patch fixes an oops when the ipt action is used with a > non-existant iptables target. It tries to log > t->u.kernel.target->name, u.kernel.target is part of a union > and as long as the target wasn't successfully loaded contains > the name of the target, using it as a pointer results in a > crash. 
>
> Oops captured in UML:
>
> EIP: 0023:[] CPU: 0 Not tainted ESP: 002b:a14b7514 EFLAGS:
> 00010297
> Not tainted
> EAX: 414d4e56 EBX: 0000000a ECX: 414d4e56 EDX: fffffffe
> ESI: a036acba EDI: 00000000 EBP: a036b09f DS: 002b ES: 002b
> Call Trace:
> [] notifier_call_chain+0x2d/0x50
> [] bust_spinlocks+0x46/0x50
> [] panic+0x71/0x120
> [] vsnprintf+0x331/0x4d0
> [] segv+0x1fa/0x230
> [] vsnprintf+0x331/0x4d0
> [] sigemptyset+0x24/0x40
> [] change_signals+0x65/0x90
> [] segv_handler+0xe0/0xf0
> [] vsnprintf+0x331/0x4d0
> [] sig_handler_common_tt+0x8d/0x120
> [] sig_handler+0x17/0x20
> [] __restore+0x0/0x8
> [] vsnprintf+0x331/0x4d0
> [] vscnprintf+0x2b/0x40
> [] vprintk+0xb2/0x320
> [] printk+0x17/0x20
> [] tcf_ipt_init+0x533/0x750
> [] tcf_action_init_1+0x92/0x1a0
> [] kmem_cache_alloc+0x39/0x60
> [] sigemptyset+0x24/0x40
> [] tcf_action_init+0xa7/0x140
> ...
>
> Not very important right now since ipt support isn't merged in iproute
> yet, but still should be fixed for 2.6.10.
>

I think it is valid to apply now; Thanks Patrick.
If you have more cleanups on ipt, please shoot them in as well.
I am going to resend the ipt iproute2 patch to Stephen now that he is
awake.

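The failure mode Patrick describes can be illustrated in user space. What follows is only a sketch, not the ipt action code: struct sketch_target, struct sketch_entry and both helper functions are hypothetical names, and the layout is simplified to what the quoted description implies (a union that holds either the target's name or a pointer to the loaded target).

```c
#include <string.h>

/* Hypothetical stand-in for a successfully loaded target. */
struct sketch_target {
	const char *name;
};

/* Hypothetical, simplified entry: the union holds EITHER the name supplied
 * by the user OR a pointer to the loaded target, never both at once. */
struct sketch_entry {
	union {
		struct { char name[32]; } user;
		struct { struct sketch_target *target; } kernel;
	} u;
};

/* Store the requested target name, as it would look before any lookup. */
static void sketch_entry_set_name(struct sketch_entry *e, const char *name)
{
	memset(e, 0, sizeof(*e));
	strncpy(e->u.user.name, name, sizeof(e->u.user.name) - 1);
}

/* Return the name through the union member that is actually valid.
 * Reading u.kernel.target while the bytes still hold a name would treat
 * ASCII text as an address, which is the kind of crash described above. */
static const char *sketch_target_name(const struct sketch_entry *e, int loaded)
{
	return loaded ? e->u.kernel.target->name : e->u.user.name;
}
```

On an error path where the target was never loaded, logging sketch_target_name(e, 0) prints the requested name without touching the pointer member.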
cheers, jamal From Robert.Olsson@data.slu.se Thu Dec 9 05:49:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 05:49:23 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9DnHrZ031232 for ; Thu, 9 Dec 2004 05:49:18 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB9DmoOB032231; Thu, 9 Dec 2004 14:48:50 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id E1550EC001; Thu, 9 Dec 2004 14:48:50 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16824.22466.897897.238619@robur.slu.se> Date: Thu, 9 Dec 2004 14:48:50 +0100 To: Lennert Buytenhek Cc: Robert Olsson , netdev@oss.sgi.com Subject: [PATCH] remove FCS from pktgen bandwidth calculation In-Reply-To: <20041208201116.GA10691@xi.wantstofly.org> References: <20041208201116.GA10691@xi.wantstofly.org> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12615 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev OK! We strictly define statistics to be based on the data we deliver to hard_xmit. Feel free to change the current kernel version... while we wait for 2.7 :-) --ro Lennert Buytenhek writes: > Hi, > > This patch stops pktgen from counting the FCS in the 'total number of > bytes transmitted' and 'bits per second' calculations. 
>
>
> cheers,
> Lennert
>
>
> --- pktgen.c.orig 2004-11-28 22:27:50.000000000 +0100
> +++ pktgen.c 2004-12-08 21:09:00.511175266 +0100
> @@ -2564,7 +2564,6 @@
>  static void show_results(struct pktgen_dev *pkt_dev, int nr_frags)
>  {
>          __u64 total_us, bps, mbps, pps, idle;
> -        int size = pkt_dev->cur_pkt_size + 4; /* incl 32bit ethernet CRC */
>          char *p = pkt_dev->result;
>
>          total_us = pkt_dev->stopped_at - pkt_dev->started_at;
> @@ -2576,7 +2575,7 @@
>
>          p += sprintf(p, "OK: %llu(c%llu+d%llu) usec, %llu (%dbyte,%dfrags)\n",
>                       total_us, (unsigned long long)(total_us - idle), idle,
> -                     pkt_dev->sofar, size, nr_frags);
> +                     pkt_dev->sofar, pkt_dev->cur_pkt_size, nr_frags);
>
>          pps = pkt_dev->sofar * USEC_PER_SEC;
>
> @@ -2588,7 +2587,7 @@
>
>          do_div(pps, total_us);
>
> -        bps = pps * 8 * size;
> +        bps = pps * 8 * pkt_dev->cur_pkt_size;
>
>          mbps = bps;
>          do_div(mbps, 1000000);
> @@ -2777,7 +2776,7 @@
>                  pkt_dev->last_ok = 1;
>                  pkt_dev->sofar++;
>                  pkt_dev->seq_num++;
> -                pkt_dev->tx_bytes += (pkt_dev->cur_pkt_size + 4); /* count csum */
> +                pkt_dev->tx_bytes += pkt_dev->cur_pkt_size;
>
>          } else if (ret == NETDEV_TX_LOCKED
>                     && (odev->features & NETIF_F_LLTX)) {
>

From wensong@linux-vs.org Thu Dec 9 07:11:57 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 07:12:04 -0800 (PST)
Received: from lb1.ctrip.com ([218.244.111.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9FBtNc002209 for ; Thu, 9 Dec 2004 07:11:57 -0800
Received: from penguin.linux-vs.org ([61.149.149.239]) by lb1.ctrip.com (8.12.10/8.12.10) with ESMTP id iB9FAnMh017364 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 9 Dec 2004 23:10:53 +0800
Received: from penguin.linux-vs.org (localhost.localdomain [127.0.0.1]) by penguin.linux-vs.org (8.12.8/8.12.8) with ESMTP id iB9FA0Mq001040; Thu, 9 Dec 2004 23:10:00 +0800
Received: from localhost (wensong@localhost) by penguin.linux-vs.org (8.12.8/8.12.8/Submit) with ESMTP id iB9F9mr4001036; Thu, 9 Dec 2004 23:09:53 +0800
X-Authentication-Warning: penguin.linux-vs.org: wensong owned process doing -bs Date: Thu, 9 Dec 2004 23:09:48 +0800 (CST) From: Wensong Zhang To: "David S. Miller" cc: netdev@oss.sgi.com Subject: Re: [PATCH] [IPVS] add a sysctl variable to expire quiescent template In-Reply-To: <20041208220818.05a5811b.davem@davemloft.net> Message-ID: References: <20041208220818.05a5811b.davem@davemloft.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12616 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wensong@linux-vs.org Precedence: bulk X-list: netdev On Wed, 8 Dec 2004, David S. Miller wrote: > On Thu, 2 Dec 2004 00:48:26 +0800 (CST) > Wensong Zhang wrote: > >> Here is the patch from Horms to add a sysctl >> variable to expire quiescent templat. Please check and apply them to >> kernel 2.4 and 2.6 respectively. > > Sorry for the delay, I'll apply this soon. > No problem. Thanks for applying it. 
Wensong From ross.biro@gmail.com Thu Dec 9 07:36:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 07:36:36 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.193]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9FaVfA004063 for ; Thu, 9 Dec 2004 07:36:31 -0800 Received: by rproxy.gmail.com with SMTP id q1so520552rnf for ; Thu, 09 Dec 2004 07:36:09 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=az0uWqSjg5OpimMv+wrUdr0H7bI4NGVDognwZd/CqJoatyoNQDQxVjEgGplbjjXhXUptJ40YsfbDrAZwfpufOxBg4Wa/vMPH02u1y25UdeAmcdQnsDYTQU11kCqIJgNdH+7BPQOq7Tlwyli/ZSAPogZmHJbupFgwjGTkoLGAzDU= Received: by 10.38.208.3 with SMTP id f3mr1423464rng; Thu, 09 Dec 2004 07:36:08 -0800 (PST) Received: by 10.38.66.65 with HTTP; Thu, 9 Dec 2004 07:36:08 -0800 (PST) Message-ID: <8783be6604120907367db1fda5@mail.gmail.com> Date: Thu, 9 Dec 2004 10:36:08 -0500 From: Ross Biro Reply-To: Ross Biro To: Ilya Pashkovsky Subject: Re: [PATCH] port_reuse listen fix (allow simultaneous single listen + outgoing connects from same port) Cc: netdev@oss.sgi.com, =?UTF-8?Q?YOSHIFUJI_Hideaki_/_=E5=90=89=E8=97=A4=E8=8B=B1=E6=98=8E?= , davem@redhat.com, linux-kernel@vger.kernel.org In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12617 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ross.biro@gmail.com Precedence: bulk X-list: netdev On Thu, 9 Dec 2004 13:25:26 +0200, Ilya Pashkovsky wrote: > This is the latest patch with removed bool > 1 check and ipv6 support. 
> http://puding.mine.nu/patches/
> http://puding.mine.nu/patches/patch-reuse-bool-ipv6
>
> to check, you can use netcat (sets SO_REUSEADDR by default).
> on one host (host A): nc -v -l -p 9999
> on another/same host (host B): nc -v -l -p 9000
> on host A: nc -v -p 9999 host.B.ip.addr 9000
> on host B: nc -v host.A.ip.addr 9999

What happens if on host B you do
nc -v -p 9000 host.A.ip.addr 9999?

Seems to me you will break the rule that a connection is uniquely
identified by (srcip, destip, srcport, destport).

Ross

From Robert.Olsson@data.slu.se Thu Dec 9 08:07:41 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 08:07:45 -0800 (PST)
Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9G7dSY005672 for ; Thu, 9 Dec 2004 08:07:40 -0800
Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iB9G7GOB017264 for ; Thu, 9 Dec 2004 17:07:16 +0100
Received: by robur.slu.se (Postfix, from userid 1000) id 272AEEC001; Thu, 9 Dec 2004 17:07:17 +0100 (CET)
From: Robert Olsson
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16824.30773.125766.220826@robur.slu.se>
Date: Thu, 9 Dec 2004 17:07:17 +0100
To: netdev@oss.sgi.com
Cc: Robert.Olsson@data.slu.se
Subject: [PATCH] e1000 poll behavior
X-Mailer: VM 7.18 under Emacs 21.3.1
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12618
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Robert.Olsson@data.slu.se
Precedence: bulk
X-list: netdev

Hello!

Seems e1000 never gets into poll mode when tx_cleaned is false.
Compare the irqs on the RX interfaces eth0, eth2 in the forwarding test below.

Vanilla:
-------
Iface   MTU Met   RX-OK  RX-ERR  RX-DRP  RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flags
eth0   1500   0 3983334 8374749 8374749 6016848     111      0      0      0 BRU
eth1   1500   0       1       0       0       0 3982841      0      0      0 BRU
eth2   1500   0 4002156 8507930 8507930 5997844       5      0      0      0 BRU
eth3   1500   0       1       0       0       0 4001653      0      0      0 BRU

           CPU0
 26:      66366   IO-APIC-level  eth0
 27:      75200   IO-APIC-level  eth1
 28:      66705   IO-APIC-level  eth2
 29:      75132   IO-APIC-level  eth3

--- drivers/net/e1000/e1000_main.c.orig 2004-12-09 17:49:56.000000000 +0100
+++ drivers/net/e1000/e1000_main.c 2004-12-09 19:05:07.000000000 +0100
@@ -2179,8 +2179,8 @@
         *budget -= work_done;
         netdev->quota -= work_done;
 
-        /* if no Rx and Tx cleanup work was done, exit the polling mode */
-        if(!tx_cleaned || (work_done < work_to_do) ||
+        /* if no Tx and not enough Rx work done, exit the polling mode */
+        if((!tx_cleaned && (work_done < work_to_do)) ||
            !netif_running(netdev)) {
                 netif_rx_complete(netdev);
                 e1000_irq_enable(adapter);

Patched:
-------
Iface   MTU Met   RX-OK  RX-ERR  RX-DRP  RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flags
eth0   1500   0 4193457 8283968 8283968 5806693      99      0      0      0 BRU
eth1   1500   0       1       0       0       0 4192308      0      0      0 BRU
eth2   1500   0 4190698 8441200 8441200 5809302       5      0      0      0 BRU
eth3   1500   0       1       0       0       0 4190171      0      0      0 BRU

           CPU0
 26:        336   IO-APIC-level  eth0
 27:      58159   IO-APIC-level  eth1
 28:         64   IO-APIC-level  eth2
 29:      58228   IO-APIC-level  eth3

--ro

From holt@oss.sgi.com Thu Dec 9 08:25:37 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 08:25:41 -0800 (PST)
Received: from oss.sgi.com (localhost [127.0.0.1]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9GPbmn010256; Thu, 9 Dec 2004 08:25:37 -0800
Received: (from holt@localhost) by oss.sgi.com (8.13.0/8.12.8/Submit) id iB9GPbN8010255; Thu, 9 Dec 2004 08:25:37 -0800
Date: Thu, 9 Dec 2004 08:25:37 -0800
From: Robin Holt
To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Limit the route hash size.
Message-ID: <20041209082537.A1262@oss.sgi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12619
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: holt@oss.sgi.com
Precedence: bulk
X-list: netdev

I got the following from the boot of one of our really large machines:

IP: routing cache hash table of 33554432 buckets, 524288Kbytes

I have done a lot of testing with the rt_hash_table. I would like to
propose that for the overwhelming majority of machines, the default
size is wrong. It is currently based on num_physpages.

I would suggest that the majority of machines will never need more than
a single page of memory for this hash. In my testing, I found a single
16k page would only get an 11% fill in a fairly heavily used production
machine on a large network.

The only place where the large route cache seems to make sense is for
larger servers that are servicing internet connections from many sites.

Since the cache is completely flushed every 10 minutes by default, the
above machine would have to be adding 55,924 routes per second that were
ideally distributed throughout the hash space to even fill every bucket.

The patch I am proposing is as follows. For the sites that need larger
route hashes, they can use the rhash_entries command line option to set
it as desired.

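The figures above can be cross-checked with a little integer arithmetic. The helpers below are a hypothetical user-space sketch (none of these names exist in the kernel). They assume the default 10 minute flush is exactly 600 seconds and that new entries hash perfectly evenly.

```c
#include <stdint.h>

/* Bytes per bucket implied by the boot message: table size / bucket count. */
static uint64_t sketch_bytes_per_bucket(uint64_t table_kbytes, uint64_t buckets)
{
	return table_kbytes * 1024 / buckets;
}

/* Number of buckets that fit in a table of the given size. */
static uint64_t sketch_buckets_in(uint64_t table_bytes, uint64_t bucket_bytes)
{
	return table_bytes / bucket_bytes;
}

/* New routes per second needed to land one entry in every bucket
 * between two flushes (rounded down). */
static uint64_t sketch_fill_rate(uint64_t buckets, uint64_t flush_seconds)
{
	return buckets / flush_seconds;
}
```

With the numbers from the boot message, sketch_bytes_per_bucket(524288, 33554432) gives 16, sketch_buckets_in(16384, 16) gives 1024 buckets for a single 16k page, and sketch_fill_rate(33554432, 600) gives 55924, which matches the rate quoted above.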
Signed-off-by: Robin Holt

diff -Naur linux-orig/net/ipv4/route.c linux/net/ipv4/route.c
--- linux-orig/net/ipv4/route.c 2004-12-09 09:00:06 -06:00
+++ linux/net/ipv4/route.c 2004-12-09 08:56:33 -06:00
@@ -2728,7 +2728,7 @@
         if (!ipv4_dst_ops.kmem_cachep)
                 panic("IP: failed to allocate ip_dst_cache\n");
 
-        goal = num_physpages >> (26 - PAGE_SHIFT);
+        goal = 0;
         if (rhash_entries)
                 goal = (rhash_entries * sizeof(struct rt_hash_bucket)) >> PAGE_SHIFT;
         for (order = 0; (1UL << order) < goal; order++)

From P@draigBrady.com Thu Dec 9 09:13:08 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 09:13:19 -0800 (PST)
Received: from corvil.com (gate.corvil.net [213.94.219.177]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9HD3JF021392 for ; Thu, 9 Dec 2004 09:13:04 -0800
Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id iB9HCUwS058459; Thu, 9 Dec 2004 17:12:31 GMT (envelope-from P@draigBrady.com)
Message-ID: <41B8877E.2080106@draigBrady.com>
Date: Thu, 09 Dec 2004 17:12:30 +0000
From: P@draigBrady.com
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Ray Lehtiniemi
CC: netdev@oss.sgi.com
Subject: Re: 1.03Mpps on e1000
References: <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041208233617.GD8649@mail.com> <41B825A5.2000009@draigBrady.com> <20041209161825.GA32454@mail.com>
In-Reply-To: <20041209161825.GA32454@mail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12620
X-ecartis-version: Ecartis v1.0.0
Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: P@draigBrady.com Precedence: bulk X-list: netdev

Ray Lehtiniemi wrote:
> On Thu, Dec 09, 2004 at 10:15:01AM +0000, P@draigBrady.com wrote:
>
>>That is very interesting!
>>I'm guessing it's due to some alignment bug?
>>
>>Can you repeat for 60-68 ?
>
> certainly. here are the raw results, and a summary oprofile for
> 60-68.
>
> looking at the disassembly, it seems that the 'rdtsc' opcode
> at 0x46f3 is causing the problem?

Well that wasn't obvious to me :-)
I did some manipulating with sort/join and came up with
the following percentage changes.
Note the % diff col adds to 37%

address   % @ 60b  % @ 64b  % diff
000046f5  14.6006  22.3856  7.785000  #instruction after rdtsc
00004737  15.0990  20.2242  5.125200  #instruction after rdtsc
0000474b  11.3857  12.9496  1.563900
00004726  1.5419   2.5867   1.044800
000046f7  0.6258   1.1922   0.566400
00004751  4.9377   5.4016   0.463900
000047a1  0.0118   0.4675   0.455700
00004739  1.2614   1.6962   0.434800
000044f7  1.0592   1.4506   0.391400
00004749  0.5467   0.9253   0.378600
0000475d  0.0879   0.1769   0.089000
0000445f  0.3785   0.4599   0.081400
000047c3  0.1003   0.1652   0.064900
000045cf  0.0804   0.1316   0.051200
000047aa  0.0048   0.0194   0.014600
000047bd  5.5e-04  0.0142   0.013650
000047b3  0.0106   0.0200   0.009400
00004598  0.0061   0.0147   0.008600
000045e9  0.0026   0.0103   0.007700
00004640  0.0692   0.0701   0.000900
00004465  0.0014   0.0020   0.000600
0000481b  4.3e-04  7.3e-04  0.000300
0000470e  6.1e-05  2.4e-04  0.000179
0000458d  1.2e-04  2.7e-04  0.000150
00004a47  1.8e-04  3.0e-04  0.000120
00004735  0.0085   0.0086   0.000100
00004745  1.2e-04  2.2e-04  0.000100
000047dd  0.0032   0.0033   0.000100
00004a49  0.0037   0.0038   0.000100
00004663  1.8e-04  2.7e-04  0.000090
0000489a  8.0e-04  8.9e-04  0.000090
00004514  9.2e-04  0.0010   0.000080
00004a61  6.1e-05  1.4e-04  0.000079
000046d4  6.1e-05  1.1e-04  0.000049
00004789  6.1e-05  1.1e-04  0.000049
00004683  1.2e-04  1.6e-04  0.000040
00004a51  1.8e-04  2.2e-04  0.000040
000047cc  9.2e-04  9.5e-04  0.000030
000045ba  6.8e-04  7.0e-04  0.000020
00004a36  6.1e-05  8.1e-05  0.000020
00004620  1.8e-04  1.9e-04  0.000010
0000474f  0.0042   0.0042   0.000000
0000466d  6.1e-05  5.4e-05  -0.000007
00004817  1.2e-04  1.1e-04  -0.000010
0000470c  4.9e-04  4.6e-04  -0.000030
000045eb  6.1e-05  2.7e-05  -0.000034
00004616  6.1e-05  2.7e-05  -0.000034
00004a1e  6.1e-05  2.7e-05  -0.000034
00004652  1.2e-04  8.1e-05  -0.000039
000047ee  1.2e-04  8.1e-05  -0.000039
00004685  1.2e-04  5.4e-05  -0.000066
00004894  3.1e-04  2.4e-04  -0.000070
00004714  6.1e-04  5.2e-04  -0.000090
00004524  1.2e-04  2.7e-05  -0.000093
0000467b  1.2e-04  2.7e-05  -0.000093
000046bb  1.2e-04  2.7e-05  -0.000093
00004446  0.0010   8.9e-04  -0.000110
0000488b  2.5e-04  1.4e-04  -0.000110
00004522  4.3e-04  2.7e-04  -0.000160
00004508  3.1e-04  1.4e-04  -0.000170
00004634  6.1e-04  4.3e-04  -0.000180
00004587  8.0e-04  6.0e-04  -0.000200
000047ae  0.0032   0.0030   -0.000200
00004440  5.5e-04  3.3e-04  -0.000220
00004459  0.0012   9.8e-04  -0.000220
00004506  9.2e-04  6.5e-04  -0.000270
000049ff  0.0021   0.0018   -0.000300
0000451c  0.0013   9.8e-04  -0.000320
000046c7  3.7e-04  2.7e-05  -0.000343
00004673  4.9e-04  1.1e-04  -0.000380
0000478f  4.9e-04  1.1e-04  -0.000380
00004450  0.0012   8.1e-04  -0.000390
00004541  6.1e-04  2.2e-04  -0.000390
000045a9  7.4e-04  3.5e-04  -0.000390
00004777  5.5e-04  1.6e-04  -0.000390
000047d0  6.8e-04  2.7e-04  -0.000410
00004457  0.0084   0.0079   -0.000500
000047ba  0.0018   0.0013   -0.000500
00004a6b  0.0031   0.0026   -0.000500
00004612  5.5e-04  2.7e-05  -0.000523
00004681  6.8e-04  1.4e-04  -0.000540
0000477b  7.4e-04  1.9e-04  -0.000550
00004503  0.0017   0.0011   -0.000600
000047df  0.0020   0.0014   -0.000600
000045b6  0.0010   3.8e-04  -0.000620
00004781  0.0010   3.8e-04  -0.000620
00004667  0.0012   5.2e-04  -0.000680
00004885  0.0015   8.1e-04  -0.000690
000045a3  0.0017   0.0010   -0.000700
000047da  0.0014   7.0e-04  -0.000700
00004747  8.6e-04  8.1e-05  -0.000779
0000446f  0.0151   0.0143   -0.000800
00004702  0.0019   0.0011   -0.000800
00004718  0.0157   0.0149   -0.000800
000047b6  0.0022   0.0014   -0.000800
00004a25  0.0054   0.0046   -0.000800
00004a65  0.0026   0.0018   -0.000800
0000477e  9.8e-04  1.4e-04  -0.000840
000045c8  0.0015   5.7e-04  -0.000930
00004543  0.0049   0.0039   -0.001000
00004604  0.0013   3.0e-04  -0.001000
00004787  0.0026   0.0016   -0.001000
00004a02  0.0018   7.6e-04  -0.001040
0000450e  0.0063   0.0052   -0.001100
0000465d  0.0022   0.0011   -0.001100
0000459d  0.0014   1.9e-04  -0.001210
0000464a  0.0017   3.8e-04  -0.001320
000047cf  0.0020   6.8e-04  -0.001320
00004a13  0.0016   1.1e-04  -0.001490
0000461e  0.0017   1.6e-04  -0.001540
000044ff  0.0040   0.0024   -0.001600
00004628  0.0020   3.5e-04  -0.001650
000045d5  0.0076   0.0055   -0.002100
00004638  0.0049   0.0027   -0.002200
00004650  0.0045   0.0021   -0.002400
00004632  0.0052   0.0026   -0.002600
00004769  0.0059   0.0033   -0.002600
00004444  0.0957   0.0930   -0.002700
00004610  0.0034   6.5e-04  -0.002750
000046fb  0.0097   0.0069   -0.002800
0000487f  0.0175   0.0146   -0.002900
000044f4  0.0071   0.0039   -0.003200
00004757  0.0068   0.0032   -0.003600
00004583  0.0176   0.0136   -0.004000
0000472d  0.0178   0.0138   -0.004000
00004624  0.0049   6.5e-04  -0.004250
00004700  0.0074   0.0029   -0.004500
00004763  0.0110   0.0059   -0.005100
00004755  0.0091   0.0037   -0.005400
000047b0  0.0201   0.0138   -0.006300
0000459b  0.0102   0.0035   -0.006700
000046fd  0.0146   0.0078   -0.006800
00004797  0.0253   0.0181   -0.007200
0000473f  0.0226   0.0153   -0.007300
0000476d  0.0253   0.0180   -0.007300
0000474d  0.0236   0.0152   -0.008400
000044f0  0.0191   0.0094   -0.009700
00004471  0.0332   0.0222   -0.011000
000046f3  0.0224   0.0112   -0.011200
0000472f  0.0221   0.0105   -0.011600
00004743  0.0146   0.0025   -0.012100
00004753  0.0311   0.0185   -0.012600
000044f9  0.0232   0.0100   -0.013200
000045f2  0.0781   0.0638   -0.014300
000045c0  0.0796   0.0632   -0.016400
000047a4  0.1020   0.0851   -0.016900
00004455  0.0468   0.0282   -0.018600
0000472a  0.0331   0.0140   -0.019100
00004720  0.0420   0.0228   -0.019200
00004741  0.0520   0.0255   -0.026500
0000460a  0.0296   6.8e-04  -0.028920
00004469  0.0696   0.0391   -0.030500
000047b8  0.0485   0.0164   -0.032100
00004771  0.0479   0.0151   -0.032800
000047d6  0.0634   0.0270   -0.036400
000045c2  0.1763   0.0500   -0.126300
0000488e  0.2228   0.0961   -0.126700
0000458f  0.2212   0.0932   -0.128000
00004709  0.8817   0.7529   -0.128800
0000479b  0.2469   0.1158   -0.131100
000047c6  0.2489   0.1103   -0.138600
00004775  0.2514   0.1124   -0.139000
00004657  0.2502   0.1105   -0.139700
0000444c  0.2555   0.1107   -0.144800
000045df  0.1822   0.0357   -0.146500
00004608  0.2596   0.1117   -0.147900
00004618  0.2635   0.1153   -0.148200
00004679  0.2580   0.1094   -0.148600
0000462c  0.2630   0.1134   -0.149600
00004594  0.2494   0.0958   -0.153600
000045f8  0.1934   0.0369   -0.156500
0000471a  0.8706   0.6718   -0.198800
000045e6  0.4986   0.2189   -0.279700
00004644  0.4393   0.1515   -0.287800
0000463c  0.5214   0.2247   -0.296700
000045fe  0.5160   0.2022   -0.313800
00004622  3.5942   1.5668   -2.027400
0000461c  3.6298   1.5695   -2.060300
00004716  19.2425  16.4027  -2.839800
00004600  5.2128   2.2837   -2.929100
000045b0  7.8500   3.3027   -4.547300

>
>
> it is worth noting that my box has become quite unstable since
> i started to use oprofile and pktgen together. sshd stops responding,
> and the network seems to go down. not sure what is happening there...
> this instability seems to be persisting across reboots, unfortunately...
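For anyone wanting to reproduce the percentage-change table above without sort/join, here is a minimal sketch in Python. It assumes each per-address profile line has the form "address samples percent", as in the detailed oprofile breakdowns quoted in this thread; the function names and the tiny sample inputs (a few rows copied from those breakdowns) are only illustrative, and this is not the exact pipeline that produced the table. Like join, it keeps only addresses present in both profiles.

```python
def parse_profile(text):
    """Map address -> percentage from 'address samples percent' lines."""
    profile = {}
    for line in text.strip().splitlines():
        address, _samples, percent = line.split()
        profile[address] = float(percent)
    return profile


def percent_diff(profile_a, profile_b):
    """Inner join on address, like join(1): addresses missing from either
    profile are dropped. Rows are (address, a, b, b - a), largest diff first."""
    common = profile_a.keys() & profile_b.keys()
    rows = [
        (addr, profile_a[addr], profile_b[addr],
         round(profile_b[addr] - profile_a[addr], 6))
        for addr in common
    ]
    rows.sort(key=lambda row: (-row[3], row[0]))
    return rows


# A few rows from the 60 byte and 64 byte breakdowns in this thread.
SAMPLE_60B = """
000045b0 127711 7.8500
0000461a 3 1.8e-04
000046f5 237535 14.6006
00004737 245644 15.0990
"""
SAMPLE_64B = """
000045b0 121838 3.3027
000046f5 825806 22.3856
00004737 746069 20.2242
"""

if __name__ == "__main__":
    rows = percent_diff(parse_profile(SAMPLE_60B), parse_profile(SAMPLE_64B))
    print("address   % @ 60b  % @ 64b  % diff")
    for addr, pct_a, pct_b, diff in rows:
        print(f"{addr}  {pct_a:<7.4f}  {pct_b:<7.4f}  {diff:.6f}")
```

Address 0000461a appears only in the 60 byte sample, so it is dropped, which matches its absence from the table above.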
> > > > > > > 60 bytes > -------- > > 60 1195259 > 60 1206652 > 60 1139822 > 60 1206650 > 60 1206654 > 60 1136447 > 60 1206651 > 60 1148050 > 60 1206504 > 60 1206653 > > CPU: P4 / Xeon with 2 hyper-threads, speed 3067.25 MHz (estimated) > Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 > vma samples % image name app name symbol name > 00004337 1626886 57.5170 pktgen.ko pktgen pktgen_thread_worker > c02f389d 282974 10.0043 vmlinux vmlinux _spin_lock > c021adc0 219795 7.7706 vmlinux vmlinux e1000_clean_tx > c02f3904 164371 5.8112 vmlinux vmlinux _spin_lock_bh > c0219c74 160383 5.6702 vmlinux vmlinux e1000_xmit_frame > c02f3870 124564 4.4038 vmlinux vmlinux _spin_trylock > 000041d1 48511 1.7151 pktgen.ko pktgen next_to_run > c02f399a 46205 1.6335 vmlinux vmlinux _spin_unlock_irqrestore > c010c7d9 20876 0.7381 vmlinux vmlinux mark_offset_tsc > c011fdb2 13116 0.4637 vmlinux vmlinux local_bh_enable > c0107248 8166 0.2887 vmlinux vmlinux timer_interrupt > c0103970 5607 0.1982 vmlinux vmlinux apic_timer_interrupt > c010123a 5368 0.1898 vmlinux vmlinux default_idle > c02f39a5 4256 0.1505 vmlinux vmlinux _spin_unlock_bh > c0103c08 4042 0.1429 vmlinux vmlinux page_fault > 0804ae00 3930 0.1389 oprofiled oprofiled sfile_find > 0804aa10 3573 0.1263 oprofiled oprofiled get_file > > > > 64 bytes > -------- > > 64 606104 > 64 597737 > 64 594927 > 64 595531 > 64 606876 > 64 594751 > 64 595709 > 64 595070 > 64 606876 > 64 595600 > > CPU: P4 / Xeon with 2 hyper-threads, speed 3067.25 MHz (estimated) > Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 > vma samples % image name app name symbol name > 00004337 3688998 68.9133 pktgen.ko pktgen pktgen_thread_worker > c02f389d 519536 9.7053 vmlinux vmlinux _spin_lock > c021adc0 271791 5.0773 vmlinux vmlinux e1000_clean_tx > c0219c74 214428 4.0057 vmlinux vmlinux e1000_xmit_frame > 
c02f3904 166334 3.1072 vmlinux vmlinux _spin_lock_bh > c02f3870 127623 2.3841 vmlinux vmlinux _spin_trylock > 000041d1 111650 2.0857 pktgen.ko pktgen next_to_run > c02f399a 47428 0.8860 vmlinux vmlinux _spin_unlock_irqrestore > c010c7d9 39586 0.7395 vmlinux vmlinux mark_offset_tsc > c0107248 14671 0.2741 vmlinux vmlinux timer_interrupt > c011fdb2 12926 0.2415 vmlinux vmlinux local_bh_enable > c0103970 11778 0.2200 vmlinux vmlinux apic_timer_interrupt > c010123a 9282 0.1734 vmlinux vmlinux default_idle > 0804ae00 7449 0.1392 oprofiled oprofiled sfile_find > 0804aa10 6387 0.1193 oprofiled oprofiled get_file > 0804ac30 6234 0.1165 oprofiled oprofiled sfile_log_sample > 0804f4b0 5852 0.1093 oprofiled oprofiled odb_insert > > > > 68 bytes > -------- > > 68 1124822 > 68 1124805 > 68 1090006 > 68 1124822 > 68 1089775 > 68 1124812 > 68 1123305 > 68 1091796 > 68 1124820 > 68 1087043 > > CPU: P4 / Xeon with 2 hyper-threads, speed 3067.25 MHz (estimated) > Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 > vma samples % image name app name symbol name > 00004337 1753028 58.4510 pktgen.ko pktgen pktgen_thread_worker > c02f389d 301835 10.0641 vmlinux vmlinux _spin_lock > c021adc0 223405 7.4490 vmlinux vmlinux e1000_clean_tx > c02f3904 167118 5.5722 vmlinux vmlinux _spin_lock_bh > c0219c74 166016 5.5355 vmlinux vmlinux e1000_xmit_frame > c02f3870 131516 4.3851 vmlinux vmlinux _spin_trylock > 000041d1 56334 1.8783 pktgen.ko pktgen next_to_run > c02f399a 46860 1.5624 vmlinux vmlinux _spin_unlock_irqrestore > c010c7d9 26188 0.8732 vmlinux vmlinux mark_offset_tsc > c011fdb2 12199 0.4068 vmlinux vmlinux local_bh_enable > c0107248 10399 0.3467 vmlinux vmlinux timer_interrupt > c010123a 8799 0.2934 vmlinux vmlinux default_idle > c0103970 8194 0.2732 vmlinux vmlinux apic_timer_interrupt > c0117346 4822 0.1608 vmlinux vmlinux find_busiest_group > 0804ae00 4214 0.1405 oprofiled oprofiled sfile_find > c02f39a5 
3955 0.1319 vmlinux vmlinux _spin_unlock_bh > 0804aa10 3745 0.1249 oprofiled oprofiled get_file > > > > here is the detailed breakdown for the 60 byte pktgen: > > CPU: P4 / Xeon with 2 hyper-threads, speed 3067.25 MHz (estimated) > Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 > vma samples % image name app name symbol name > 00004337 1626886 57.5170 pktgen.ko pktgen pktgen_thread_worker > 00004440 9 5.5e-04 > 00004444 1557 0.0957 > 00004446 17 0.0010 > 0000444c 4156 0.2555 > 00004450 19 0.0012 > 00004455 762 0.0468 > 00004457 136 0.0084 > 00004459 20 0.0012 > 0000445f 6157 0.3785 > 00004465 23 0.0014 > 00004469 1133 0.0696 > 0000446f 246 0.0151 > 00004471 540 0.0332 > 000044f0 310 0.0191 > 000044f4 115 0.0071 > 000044f7 17232 1.0592 > 000044f9 377 0.0232 > 000044ff 65 0.0040 > 00004503 28 0.0017 > 00004506 15 9.2e-04 > 00004508 5 3.1e-04 > 0000450e 102 0.0063 > 00004514 15 9.2e-04 > 0000451c 21 0.0013 > 00004522 7 4.3e-04 > 00004524 2 1.2e-04 > 00004541 10 6.1e-04 > 00004543 79 0.0049 > 00004583 287 0.0176 > 00004587 13 8.0e-04 > 0000458d 2 1.2e-04 > 0000458f 3598 0.2212 > 00004594 4057 0.2494 > 00004598 100 0.0061 > 0000459b 166 0.0102 > 0000459d 22 0.0014 > 000045a3 28 0.0017 > 000045a9 12 7.4e-04 > 000045b0 127711 7.8500 > 000045b6 17 0.0010 > 000045ba 11 6.8e-04 > 000045c0 1295 0.0796 > 000045c2 2869 0.1763 > 000045c8 24 0.0015 > 000045cf 1308 0.0804 > 000045d5 123 0.0076 > 000045df 2964 0.1822 > 000045e6 8111 0.4986 > 000045e9 42 0.0026 > 000045eb 1 6.1e-05 > 000045f2 1271 0.0781 > 000045f8 3146 0.1934 > 000045fe 8395 0.5160 > 00004600 84807 5.2128 > 00004604 21 0.0013 > 00004608 4223 0.2596 > 0000460a 481 0.0296 > 00004610 55 0.0034 > 00004612 9 5.5e-04 > 00004616 1 6.1e-05 > 00004618 4287 0.2635 > 0000461a 3 1.8e-04 > 0000461c 59052 3.6298 > 0000461e 28 0.0017 > 00004620 3 1.8e-04 > 00004622 58473 3.5942 > 00004624 79 0.0049 > 00004628 33 0.0020 > 0000462c 4279 0.2630 > 
00004632 84 0.0052 > 00004634 10 6.1e-04 > 00004638 80 0.0049 > 0000463c 8483 0.5214 > 00004640 1126 0.0692 > 00004644 7147 0.4393 > 0000464a 27 0.0017 > 00004650 73 0.0045 > 00004652 2 1.2e-04 > 00004657 4070 0.2502 > 0000465d 36 0.0022 > 00004663 3 1.8e-04 > 00004665 2 1.2e-04 > 00004667 20 0.0012 > 0000466d 1 6.1e-05 > 00004673 8 4.9e-04 > 00004679 4197 0.2580 > 0000467b 2 1.2e-04 > 00004681 11 6.8e-04 > 00004683 2 1.2e-04 > 00004685 2 1.2e-04 > 000046bb 2 1.2e-04 > 000046c1 2 1.2e-04 > 000046c7 6 3.7e-04 > 000046d4 1 6.1e-05 > 000046f3 365 0.0224 > 000046f5 237535 14.6006 > 000046f7 10181 0.6258 > 000046fb 157 0.0097 > 000046fd 238 0.0146 > 00004700 120 0.0074 > 00004702 31 0.0019 > 00004709 14344 0.8817 > 0000470c 8 4.9e-04 > 0000470e 1 6.1e-05 > 00004714 10 6.1e-04 > 00004716 313053 19.2425 > 00004718 255 0.0157 > 0000471a 14164 0.8706 > 00004720 683 0.0420 > 00004726 25085 1.5419 > 0000472a 538 0.0331 > 0000472d 290 0.0178 > 0000472f 359 0.0221 > 00004735 139 0.0085 > 00004737 245644 15.0990 > 00004739 20521 1.2614 > 0000473f 368 0.0226 > 00004741 846 0.0520 > 00004743 237 0.0146 > 00004745 2 1.2e-04 > 00004747 14 8.6e-04 > 00004749 8894 0.5467 > 0000474b 185233 11.3857 > 0000474d 384 0.0236 > 0000474f 69 0.0042 > 00004751 80331 4.9377 > 00004753 506 0.0311 > 00004755 148 0.0091 > 00004757 111 0.0068 > 0000475d 1430 0.0879 > 00004763 179 0.0110 > 00004769 96 0.0059 > 0000476d 411 0.0253 > 00004771 780 0.0479 > 00004775 4090 0.2514 > 00004777 9 5.5e-04 > 0000477b 12 7.4e-04 > 0000477e 16 9.8e-04 > 00004781 17 0.0010 > 00004787 43 0.0026 > 00004789 1 6.1e-05 > 0000478f 8 4.9e-04 > 00004797 412 0.0253 > 0000479b 4016 0.2469 > 000047a1 192 0.0118 > 000047a4 1660 0.1020 > 000047aa 78 0.0048 > 000047ae 52 0.0032 > 000047b0 327 0.0201 > 000047b3 173 0.0106 > 000047b6 35 0.0022 > 000047b8 789 0.0485 > 000047ba 29 0.0018 > 000047bd 9 5.5e-04 > 000047c3 1632 0.1003 > 000047c6 4049 0.2489 > 000047cc 15 9.2e-04 > 000047cf 33 0.0020 > 000047d0 11 6.8e-04 > 000047d6 1032 
0.0634 > 000047da 22 0.0014 > 000047dd 52 0.0032 > 000047df 33 0.0020 > 000047ea 1 6.1e-05 > 000047ee 2 1.2e-04 > 000047f6 1 6.1e-05 > 000047ff 1 6.1e-05 > 00004809 1 6.1e-05 > 0000480e 1 6.1e-05 > 00004817 2 1.2e-04 > 0000481b 7 4.3e-04 > 0000487f 284 0.0175 > 00004885 24 0.0015 > 0000488b 4 2.5e-04 > 0000488e 3625 0.2228 > 00004894 5 3.1e-04 > 0000489a 13 8.0e-04 > 000049ff 34 0.0021 > 00004a02 30 0.0018 > 00004a04 4 2.5e-04 > 00004a0f 3 1.8e-04 > 00004a13 26 0.0016 > 00004a1e 1 6.1e-05 > 00004a25 88 0.0054 > 00004a36 1 6.1e-05 > 00004a47 3 1.8e-04 > 00004a49 60 0.0037 > 00004a51 3 1.8e-04 > 00004a61 1 6.1e-05 > 00004a65 42 0.0026 > 00004a6b 50 0.0031 > > > > here is the detailed breakdown for the 64 byte pktgen: > > CPU: P4 / Xeon with 2 hyper-threads, speed 3067.25 MHz (estimated) > Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000 > vma samples % image name app name symbol name > 00004337 3688998 68.9133 pktgen.ko pktgen pktgen_thread_worker > 00004440 12 3.3e-04 > 00004444 3431 0.0930 > 00004446 33 8.9e-04 > 0000444c 4082 0.1107 > 00004450 30 8.1e-04 > 00004455 1041 0.0282 > 00004457 292 0.0079 > 00004459 36 9.8e-04 > 0000445f 16964 0.4599 > 00004465 73 0.0020 > 00004469 1442 0.0391 > 0000446f 528 0.0143 > 00004471 818 0.0222 > 000044f0 347 0.0094 > 000044f4 145 0.0039 > 000044f7 53514 1.4506 > 000044f9 369 0.0100 > 000044ff 90 0.0024 > 00004503 41 0.0011 > 00004506 24 6.5e-04 > 00004508 5 1.4e-04 > 0000450e 192 0.0052 > 00004514 37 0.0010 > 00004516 5 1.4e-04 > 0000451c 36 9.8e-04 > 00004522 10 2.7e-04 > 00004524 1 2.7e-05 > 00004541 8 2.2e-04 > 00004543 144 0.0039 > 00004583 503 0.0136 > 00004587 22 6.0e-04 > 0000458d 10 2.7e-04 > 0000458f 3437 0.0932 > 00004594 3533 0.0958 > 00004598 541 0.0147 > 0000459b 129 0.0035 > 0000459d 7 1.9e-04 > 000045a3 38 0.0010 > 000045a9 13 3.5e-04 > 000045b0 121838 3.3027 > 000045b6 14 3.8e-04 > 000045ba 26 7.0e-04 > 000045c0 2330 0.0632 > 
000045c2 1843 0.0500 > 000045c8 21 5.7e-04 > 000045cf 4855 0.1316 > 000045d5 203 0.0055 > 000045df 1317 0.0357 > 000045e6 8076 0.2189 > 000045e9 381 0.0103 > 000045eb 1 2.7e-05 > 000045f2 2355 0.0638 > 000045f8 1362 0.0369 > 000045fe 7460 0.2022 > 00004600 84246 2.2837 > 00004604 11 3.0e-04 > 00004608 4122 0.1117 > 0000460a 25 6.8e-04 > 00004610 24 6.5e-04 > 00004612 1 2.7e-05 > 00004614 1 2.7e-05 > 00004616 1 2.7e-05 > 00004618 4254 0.1153 > 0000461c 57898 1.5695 > 0000461e 6 1.6e-04 > 00004620 7 1.9e-04 > 00004622 57801 1.5668 > 00004624 24 6.5e-04 > 00004628 13 3.5e-04 > 0000462c 4185 0.1134 > 00004632 97 0.0026 > 00004634 16 4.3e-04 > 00004638 99 0.0027 > 0000463c 8288 0.2247 > 00004640 2585 0.0701 > 00004644 5590 0.1515 > 0000464a 14 3.8e-04 > 00004650 77 0.0021 > 00004652 3 8.1e-05 > 00004657 4077 0.1105 > 0000465d 41 0.0011 > 00004663 10 2.7e-04 > 00004667 19 5.2e-04 > 0000466d 2 5.4e-05 > 00004673 4 1.1e-04 > 00004679 4035 0.1094 > 0000467b 1 2.7e-05 > 00004681 5 1.4e-04 > 00004683 6 1.6e-04 > 00004685 2 5.4e-05 > 000046bb 1 2.7e-05 > 000046c7 1 2.7e-05 > 000046d4 4 1.1e-04 > 000046f3 415 0.0112 > 000046f5 825806 22.3856 > 000046f7 43980 1.1922 > 000046fb 256 0.0069 > 000046fd 286 0.0078 > 00004700 108 0.0029 > 00004702 41 0.0011 > 00004705 5 1.4e-04 > 00004709 27774 0.7529 > 0000470c 17 4.6e-04 > 0000470e 9 2.4e-04 > 00004714 19 5.2e-04 > 00004716 605096 16.4027 > 00004718 548 0.0149 > 0000471a 24782 0.6718 > 00004720 842 0.0228 > 00004726 95423 2.5867 > 0000472a 516 0.0140 > 0000472d 510 0.0138 > 0000472f 389 0.0105 > 00004735 316 0.0086 > 00004737 746069 20.2242 > 00004739 62574 1.6962 > 0000473f 565 0.0153 > 00004741 941 0.0255 > 00004743 91 0.0025 > 00004745 8 2.2e-04 > 00004747 3 8.1e-05 > 00004749 34135 0.9253 > 0000474b 477712 12.9496 > 0000474d 561 0.0152 > 0000474f 155 0.0042 > 00004751 199265 5.4016 > 00004753 684 0.0185 > 00004755 137 0.0037 > 00004757 119 0.0032 > 0000475d 6527 0.1769 > 00004763 217 0.0059 > 00004769 120 0.0033 > 0000476d 665 
0.0180 > 00004771 558 0.0151 > 00004775 4148 0.1124 > 00004777 6 1.6e-04 > 0000477b 7 1.9e-04 > 0000477e 5 1.4e-04 > 00004781 14 3.8e-04 > 00004787 60 0.0016 > 00004789 4 1.1e-04 > 0000478f 4 1.1e-04 > 00004797 669 0.0181 > 0000479b 4271 0.1158 > 000047a1 17245 0.4675 > 000047a4 3138 0.0851 > 000047aa 716 0.0194 > 000047ae 112 0.0030 > 000047b0 508 0.0138 > 000047b3 736 0.0200 > 000047b6 53 0.0014 > 000047b8 604 0.0164 > 000047ba 47 0.0013 > 000047bd 525 0.0142 > 000047c3 6094 0.1652 > 000047c6 4068 0.1103 > 000047cc 35 9.5e-04 > 000047cf 25 6.8e-04 > 000047d0 10 2.7e-04 > 000047d6 995 0.0270 > 000047da 26 7.0e-04 > 000047dd 120 0.0033 > 000047df 50 0.0014 > 000047ee 3 8.1e-05 > 000047fa 1 2.7e-05 > 00004817 4 1.1e-04 > 0000481b 27 7.3e-04 > 0000487f 539 0.0146 > 00004885 30 8.1e-04 > 0000488b 5 1.4e-04 > 0000488e 3544 0.0961 > 00004894 9 2.4e-04 > 0000489a 33 8.9e-04 > 000049ff 67 0.0018 > 00004a02 28 7.6e-04 > 00004a11 1 2.7e-05 > 00004a13 4 1.1e-04 > 00004a18 3 8.1e-05 > 00004a1e 1 2.7e-05 > 00004a25 168 0.0046 > 00004a36 3 8.1e-05 > 00004a47 11 3.0e-04 > 00004a49 139 0.0038 > 00004a51 8 2.2e-04 > 00004a59 1 2.7e-05 > 00004a61 5 1.4e-04 > 00004a65 67 0.0018 > 00004a6b 97 0.0026 > > > and finally, here's the disasm of the threadworker function: > > > 00004337 : > 4337: 55 push %ebp > 4338: 57 push %edi > 4339: 56 push %esi > 433a: 53 push %ebx > 433b: bb 00 e0 ff ff mov $0xffffe000,%ebx > 4340: 21 e3 and %esp,%ebx > 4342: 83 ec 2c sub $0x2c,%esp > 4345: 89 44 24 28 mov %eax,0x28(%esp) > 4349: 8b b0 bc 02 00 00 mov 0x2bc(%eax),%esi > 434f: c7 44 24 20 00 00 00 movl $0x0,0x20(%esp) > 4356: 00 > 4357: c7 04 24 c2 06 00 00 movl $0x6c2,(%esp) > 435e: 89 74 24 04 mov %esi,0x4(%esp) > 4362: e8 fc ff ff ff call 4363 > 4367: 8b 03 mov (%ebx),%eax > 4369: 8b 80 90 04 00 00 mov 0x490(%eax),%eax > 436f: 05 04 05 00 00 add $0x504,%eax > 4374: e8 fc ff ff ff call 4375 > 4379: 8b 03 mov (%ebx),%eax > 437b: c7 80 94 04 00 00 ff movl $0xfffbbeff,0x494(%eax) > 4382: be fb ff > 
4385: c7 80 98 04 00 00 ff movl $0xffffffff,0x498(%eax) > 438c: ff ff ff > 438f: e8 fc ff ff ff call 4390 > 4394: 8b 03 mov (%ebx),%eax > 4396: 8b 80 90 04 00 00 mov 0x490(%eax),%eax > 439c: 05 04 05 00 00 add $0x504,%eax > 43a1: e8 fc ff ff ff call 43a2 > 43a6: 89 f1 mov %esi,%ecx > 43a8: ba 01 00 00 00 mov $0x1,%edx > 43ad: d3 e2 shl %cl,%edx > 43af: 8b 03 mov (%ebx),%eax > 43b1: e8 fc ff ff ff call 43b2 > 43b6: 39 73 10 cmp %esi,0x10(%ebx) > 43b9: 74 08 je 43c3 > 43bb: 0f 0b ud2a > 43bd: 27 daa > 43be: 0b b0 06 00 00 8b or 0x8b000006(%eax),%esi > 43c4: 44 inc %esp > 43c5: 24 28 and $0x28,%al > 43c7: 8b 54 24 28 mov 0x28(%esp),%edx > 43cb: 05 c0 02 00 00 add $0x2c0,%eax > 43d0: 89 44 24 1c mov %eax,0x1c(%esp) > 43d4: c7 82 c0 02 00 00 01 movl $0x1,0x2c0(%edx) > 43db: 00 00 00 > 43de: 89 d0 mov %edx,%eax > 43e0: 8b 4c 24 1c mov 0x1c(%esp),%ecx > 43e4: 05 c4 02 00 00 add $0x2c4,%eax > 43e9: 89 41 04 mov %eax,0x4(%ecx) > 43ec: 89 41 08 mov %eax,0x8(%ecx) > 43ef: 83 a2 b4 02 00 00 f0 andl $0xfffffff0,0x2b4(%edx) > 43f6: 8b 03 mov (%ebx),%eax > 43f8: 8b 80 a8 00 00 00 mov 0xa8(%eax),%eax > 43fe: 89 82 b8 02 00 00 mov %eax,0x2b8(%edx) > 4404: 8b 03 mov (%ebx),%eax > 4406: 8b 80 a8 00 00 00 mov 0xa8(%eax),%eax > 440c: 89 74 24 04 mov %esi,0x4(%esp) > 4410: c7 04 24 a0 06 00 00 movl $0x6a0,(%esp) > 4417: 89 44 24 08 mov %eax,0x8(%esp) > 441b: e8 fc ff ff ff call 441c > 4420: 8b 44 24 28 mov 0x28(%esp),%eax > 4424: 8b 80 b0 02 00 00 mov 0x2b0(%eax),%eax > 442a: 89 44 24 24 mov %eax,0x24(%esp) > 442e: 8b 03 mov (%ebx),%eax > 4430: c7 00 01 00 00 00 movl $0x1,(%eax) > 4436: f0 83 44 24 00 00 lock addl $0x0,0x0(%esp) > 443c: 89 5c 24 18 mov %ebx,0x18(%esp) > 4440: 8b 54 24 18 mov 0x18(%esp),%edx > 4444: 8b 02 mov (%edx),%eax > 4446: c7 00 00 00 00 00 movl $0x0,(%eax) > 444c: 8b 44 24 28 mov 0x28(%esp),%eax > 4450: e8 7c fd ff ff call 41d1 > 4455: 85 c0 test %eax,%eax > 4457: 89 c6 mov %eax,%esi > 4459: 0f 84 aa 03 00 00 je 4809 > 445f: 8b 88 44 04 00 00 mov 0x444(%eax),%ecx 
> 4465: 89 4c 24 14 mov %ecx,0x14(%esp) > 4469: 8b b8 90 02 00 00 mov 0x290(%eax),%edi > 446f: 85 ff test %edi,%edi > 4471: 74 7d je 44f0 > 4473: 0f 31 rdtsc > 4475: 89 44 24 0c mov %eax,0xc(%esp) > 4479: 89 54 24 10 mov %edx,0x10(%esp) > 447d: 85 d2 test %edx,%edx > 447f: 8b 1d 1c 00 00 00 mov 0x1c,%ebx > 4485: 89 d1 mov %edx,%ecx > 4487: 89 c5 mov %eax,%ebp > 4489: 74 08 je 4493 > 448b: 89 d0 mov %edx,%eax > 448d: 31 d2 xor %edx,%edx > 448f: f7 f3 div %ebx > 4491: 89 c1 mov %eax,%ecx > 4493: 89 e8 mov %ebp,%eax > 4495: f7 f3 div %ebx > 4497: 89 ca mov %ecx,%edx > 4499: 89 d3 mov %edx,%ebx > 449b: 8b 96 b8 02 00 00 mov 0x2b8(%esi),%edx > 44a1: 39 d3 cmp %edx,%ebx > 44a3: 89 c1 mov %eax,%ecx > 44a5: 8b 86 b4 02 00 00 mov 0x2b4(%esi),%eax > 44ab: 77 37 ja 44e4 > 44ad: 72 04 jb 44b3 > 44af: 39 c1 cmp %eax,%ecx > 44b1: 73 31 jae 44e4 > 44b3: 8b be b4 02 00 00 mov 0x2b4(%esi),%edi > 44b9: 29 cf sub %ecx,%edi > 44bb: 81 ff 0f 27 00 00 cmp $0x270f,%edi > 44c1: 0f 86 ee 04 00 00 jbe 49b5 > 44c7: b9 d3 4d 62 10 mov $0x10624dd3,%ecx > 44cc: 89 f8 mov %edi,%eax > 44ce: f7 e1 mul %ecx > 44d0: 89 d1 mov %edx,%ecx > 44d2: 89 f2 mov %esi,%edx > 44d4: c1 e9 06 shr $0x6,%ecx > 44d7: 89 c8 mov %ecx,%eax > 44d9: e8 10 e4 ff ff call 28ee > 44de: 8b be 90 02 00 00 mov 0x290(%esi),%edi > 44e4: 81 ff ff ff ff 7f cmp $0x7fffffff,%edi > 44ea: 0f 84 d5 04 00 00 je 49c5 > 44f0: 8b 54 24 14 mov 0x14(%esp),%edx > 44f4: 8b 42 24 mov 0x24(%edx),%eax > 44f7: a8 01 test $0x1,%al > 44f9: 0f 85 f4 01 00 00 jne 46f3 > 44ff: 8b 4c 24 18 mov 0x18(%esp),%ecx > 4503: 8b 41 08 mov 0x8(%ecx),%eax > 4506: a8 08 test $0x8,%al > 4508: 0f 85 e5 01 00 00 jne 46f3 > 450e: 8b 86 c8 02 00 00 mov 0x2c8(%esi),%eax > 4514: 85 c0 test %eax,%eax > 4516: 0f 85 63 03 00 00 jne 487f > 451c: 8b 96 40 04 00 00 mov 0x440(%esi),%edx > 4522: 85 d2 test %edx,%edx > 4524: 75 5d jne 4583 > 4526: 8b 86 c4 02 00 00 mov 0x2c4(%esi),%eax > 452c: 83 c0 01 add $0x1,%eax > 452f: 3b 86 e8 02 00 00 cmp 0x2e8(%esi),%eax > 4535: 89 86 c4 
02 00 00 mov %eax,0x2c4(%esi) > 453b: 0f 83 5f 03 00 00 jae 48a0 > 4541: 85 d2 test %edx,%edx > 4543: 75 3e jne 4583 > 4545: f6 86 81 02 00 00 02 testb $0x2,0x281(%esi) > 454c: 0f 84 87 03 00 00 je 48d9 > 4552: 89 f2 mov %esi,%edx > 4554: 8b 44 24 14 mov 0x14(%esp),%eax > 4558: e8 fc f1 ff ff call 3759 > 455d: 85 c0 test %eax,%eax > 455f: 89 86 40 04 00 00 mov %eax,0x440(%esi) > 4565: 0f 84 87 03 00 00 je 48f2 > 456b: 83 86 bc 02 00 00 01 addl $0x1,0x2bc(%esi) > 4572: c7 86 c4 02 00 00 00 movl $0x0,0x2c4(%esi) > 4579: 00 00 00 > 457c: 83 96 c0 02 00 00 00 adcl $0x0,0x2c0(%esi) > 4583: 8b 7c 24 14 mov 0x14(%esp),%edi > 4587: 81 c7 2c 01 00 00 add $0x12c,%edi > 458d: 89 f8 mov %edi,%eax > 458f: e8 fc ff ff ff call 4590 > 4594: 8b 54 24 14 mov 0x14(%esp),%edx > 4598: 8b 42 24 mov 0x24(%edx),%eax > 459b: a8 01 test $0x1,%al > 459d: 0f 85 6c 03 00 00 jne 490f > 45a3: 8b 86 40 04 00 00 mov 0x440(%esi),%eax > 45a9: f0 ff 80 94 00 00 00 lock incl 0x94(%eax) > 45b0: 8b 86 40 04 00 00 mov 0x440(%esi),%eax > 45b6: 8b 54 24 14 mov 0x14(%esp),%edx > 45ba: ff 92 6c 01 00 00 call *0x16c(%edx) > 45c0: 85 c0 test %eax,%eax > 45c2: 0f 85 37 04 00 00 jne 49ff > 45c8: 83 86 9c 02 00 00 01 addl $0x1,0x29c(%esi) > 45cf: 8b 86 2c 04 00 00 mov 0x42c(%esi),%eax > 45d5: c7 86 c8 02 00 00 01 movl $0x1,0x2c8(%esi) > 45dc: 00 00 00 > 45df: 83 96 a0 02 00 00 00 adcl $0x0,0x2a0(%esi) > 45e6: 83 c0 04 add $0x4,%eax > 45e9: 31 d2 xor %edx,%edx > 45eb: 83 86 e4 02 00 00 01 addl $0x1,0x2e4(%esi) > 45f2: 01 86 a4 02 00 00 add %eax,0x2a4(%esi) > 45f8: 11 96 a8 02 00 00 adc %edx,0x2a8(%esi) > 45fe: 0f 31 rdtsc > 4600: 89 44 24 0c mov %eax,0xc(%esp) > 4604: 89 54 24 10 mov %edx,0x10(%esp) > 4608: 85 d2 test %edx,%edx > 460a: 8b 1d 1c 00 00 00 mov 0x1c,%ebx > 4610: 89 d1 mov %edx,%ecx > 4612: 89 c5 mov %eax,%ebp > 4614: 74 08 je 461e > 4616: 89 d0 mov %edx,%eax > 4618: 31 d2 xor %edx,%edx > 461a: f7 f3 div %ebx > 461c: 89 c1 mov %eax,%ecx > 461e: 89 e8 mov %ebp,%eax > 4620: f7 f3 div %ebx > 4622: 89 ca 
mov %ecx,%edx > 4624: 89 54 24 10 mov %edx,0x10(%esp) > 4628: 89 44 24 0c mov %eax,0xc(%esp) > 462c: 8b 8e 90 02 00 00 mov 0x290(%esi),%ecx > 4632: 31 db xor %ebx,%ebx > 4634: 01 4c 24 0c add %ecx,0xc(%esp) > 4638: 11 5c 24 10 adc %ebx,0x10(%esp) > 463c: 8b 54 24 0c mov 0xc(%esp),%edx > 4640: 8b 4c 24 10 mov 0x10(%esp),%ecx > 4644: 89 96 b4 02 00 00 mov %edx,0x2b4(%esi) > 464a: 89 8e b8 02 00 00 mov %ecx,0x2b8(%esi) > 4650: 89 f8 mov %edi,%eax > 4652: e8 fc ff ff ff call 4653 > 4657: 8b 96 98 02 00 00 mov 0x298(%esi),%edx > 465d: 8b 86 94 02 00 00 mov 0x294(%esi),%eax > 4663: 89 d1 mov %edx,%ecx > 4665: 09 c1 or %eax,%ecx > 4667: 0f 84 f6 00 00 00 je 4763 > 466d: 8b 9e a0 02 00 00 mov 0x2a0(%esi),%ebx > 4673: 8b 8e 9c 02 00 00 mov 0x29c(%esi),%ecx > 4679: 39 d3 cmp %edx,%ebx > 467b: 0f 82 e2 00 00 00 jb 4763 > 4681: 77 08 ja 468b > 4683: 39 c1 cmp %eax,%ecx > 4685: 0f 82 d8 00 00 00 jb 4763 > 468b: 8b 8e 40 04 00 00 mov 0x440(%esi),%ecx > 4691: 8b 81 94 00 00 00 mov 0x94(%ecx),%eax > 4697: 83 f8 01 cmp $0x1,%eax > 469a: 74 4e je 46ea > 469c: 0f 31 rdtsc > 469e: 89 c7 mov %eax,%edi > 46a0: 8b 81 94 00 00 00 mov 0x94(%ecx),%eax > 46a6: 83 f8 01 cmp $0x1,%eax > 46a9: 89 d5 mov %edx,%ebp > 46ab: 74 2b je 46d8 > 46ad: bb 00 e0 ff ff mov $0xffffe000,%ebx > 46b2: 21 e3 and %esp,%ebx > 46b4: eb 16 jmp 46cc > 46b6: e8 fc ff ff ff call 46b7 > 46bb: 8b 86 40 04 00 00 mov 0x440(%esi),%eax > 46c1: 8b 80 94 00 00 00 mov 0x94(%eax),%eax > 46c7: 83 f8 01 cmp $0x1,%eax > 46ca: 74 0c je 46d8 > 46cc: 8b 03 mov (%ebx),%eax > 46ce: 8b 40 04 mov 0x4(%eax),%eax > 46d1: 8b 40 08 mov 0x8(%eax),%eax > 46d4: a8 04 test $0x4,%al > 46d6: 74 de je 46b6 > 46d8: 0f 31 rdtsc > 46da: 29 f8 sub %edi,%eax > 46dc: 19 ea sbb %ebp,%edx > 46de: 01 86 dc 02 00 00 add %eax,0x2dc(%esi) > 46e4: 11 96 e0 02 00 00 adc %edx,0x2e0(%esi) > 46ea: 89 f0 mov %esi,%eax > 46ec: e8 2f fa ff ff call 4120 > 46f1: eb 70 jmp 4763 > 46f3: 0f 31 rdtsc > 46f5: 89 d5 mov %edx,%ebp > 46f7: 8b 54 24 14 mov 0x14(%esp),%edx > 
46fb: 89 c7 mov %eax,%edi > 46fd: 8b 42 24 mov 0x24(%edx),%eax > 4700: a8 02 test $0x2,%al > 4702: 2e 74 e5 je,pn 46ea > 4705: 8b 4c 24 18 mov 0x18(%esp),%ecx > 4709: 8b 41 08 mov 0x8(%ecx),%eax > 470c: a8 08 test $0x8,%al > 470e: 0f 85 e1 02 00 00 jne 49f5 > 4714: 0f 31 rdtsc > 4716: 29 f8 sub %edi,%eax > 4718: 19 ea sbb %ebp,%edx > 471a: 01 86 dc 02 00 00 add %eax,0x2dc(%esi) > 4720: 11 96 e0 02 00 00 adc %edx,0x2e0(%esi) > 4726: 8b 54 24 14 mov 0x14(%esp),%edx > 472a: 8b 42 24 mov 0x24(%edx),%eax > 472d: a8 01 test $0x1,%al > 472f: 0f 84 d9 fd ff ff je 450e > 4735: 0f 31 rdtsc > 4737: 85 d2 test %edx,%edx > 4739: 8b 1d 1c 00 00 00 mov 0x1c,%ebx > 473f: 89 d1 mov %edx,%ecx > 4741: 89 c7 mov %eax,%edi > 4743: 74 08 je 474d > 4745: 89 d0 mov %edx,%eax > 4747: 31 d2 xor %edx,%edx > 4749: f7 f3 div %ebx > 474b: 89 c1 mov %eax,%ecx > 474d: 89 f8 mov %edi,%eax > 474f: f7 f3 div %ebx > 4751: 89 ca mov %ecx,%edx > 4753: 89 c1 mov %eax,%ecx > 4755: 89 d3 mov %edx,%ebx > 4757: 89 8e b4 02 00 00 mov %ecx,0x2b4(%esi) > 475d: 89 9e b8 02 00 00 mov %ebx,0x2b8(%esi) > 4763: 8b 96 c8 02 00 00 mov 0x2c8(%esi),%edx > 4769: 8b 4c 24 24 mov 0x24(%esp),%ecx > 476d: 01 54 24 20 add %edx,0x20(%esp) > 4771: 39 4c 24 20 cmp %ecx,0x20(%esp) > 4775: 76 20 jbe 4797 > 4777: 8b 54 24 18 mov 0x18(%esp),%edx > 477b: 8b 42 10 mov 0x10(%edx),%eax > 477e: c1 e0 07 shl $0x7,%eax > 4781: 8b b8 00 00 00 00 mov 0x0(%eax),%edi > 4787: 85 ff test %edi,%edi > 4789: 0f 85 1c 02 00 00 jne 49ab > 478f: c7 44 24 20 00 00 00 movl $0x0,0x20(%esp) > 4796: 00 > 4797: 8b 4c 24 28 mov 0x28(%esp),%ecx > 479b: 8b 91 b4 02 00 00 mov 0x2b4(%ecx),%edx > 47a1: f6 c2 01 test $0x1,%dl > 47a4: 0f 85 7c 00 00 00 jne 4826 > 47aa: 8b 4c 24 18 mov 0x18(%esp),%ecx > 47ae: 8b 01 mov (%ecx),%eax > 47b0: 8b 40 04 mov 0x4(%eax),%eax > 47b3: 8b 40 08 mov 0x8(%eax),%eax > 47b6: a8 04 test $0x4,%al > 47b8: 75 6c jne 4826 > 47ba: f6 c2 02 test $0x2,%dl > 47bd: 0f 85 c7 01 00 00 jne 498a > 47c3: f6 c2 04 test $0x4,%dl > 47c6: 0f 85 9d 
01 00 00 jne 4969 > 47cc: 80 e2 08 and $0x8,%dl > 47cf: 90 nop > 47d0: 0f 85 7a 01 00 00 jne 4950 > 47d6: 8b 54 24 18 mov 0x18(%esp),%edx > 47da: 8b 42 08 mov 0x8(%edx),%eax > 47dd: a8 08 test $0x8,%al > 47df: 0f 84 5b fc ff ff je 4440 > 47e5: e8 fc ff ff ff call 47e6 > 47ea: 8b 54 24 18 mov 0x18(%esp),%edx > 47ee: 8b 02 mov (%edx),%eax > 47f0: c7 00 00 00 00 00 movl $0x0,(%eax) > 47f6: 8b 44 24 28 mov 0x28(%esp),%eax > 47fa: e8 d2 f9 ff ff call 41d1 > 47ff: 85 c0 test %eax,%eax > 4801: 89 c6 mov %eax,%esi > 4803: 0f 85 56 fc ff ff jne 445f > 4809: ba 64 00 00 00 mov $0x64,%edx > 480e: 8b 44 24 1c mov 0x1c(%esp),%eax > 4812: e8 fc ff ff ff call 4813 > 4817: 8b 4c 24 28 mov 0x28(%esp),%ecx > 481b: 8b 91 b4 02 00 00 mov 0x2b4(%ecx),%edx > 4821: f6 c2 01 test $0x1,%dl > 4824: 74 84 je 47aa > 4826: 8b 5c 24 28 mov 0x28(%esp),%ebx > 482a: c7 04 24 c8 06 00 00 movl $0x6c8,(%esp) > 4831: 83 c3 0c add $0xc,%ebx > 4834: 89 5c 24 04 mov %ebx,0x4(%esp) > 4838: e8 fc ff ff ff call 4839 > 483d: 8b 44 24 28 mov 0x28(%esp),%eax > 4841: e8 f6 f9 ff ff call 423c > 4846: 89 5c 24 04 mov %ebx,0x4(%esp) > 484a: c7 04 24 e8 06 00 00 movl $0x6e8,(%esp) > 4851: e8 fc ff ff ff call 4852 > 4856: 8b 44 24 28 mov 0x28(%esp),%eax > 485a: e8 1b fa ff ff call 427a > 485f: 89 5c 24 04 mov %ebx,0x4(%esp) > 4863: c7 04 24 cc 06 00 00 movl $0x6cc,(%esp) > 486a: e8 fc ff ff ff call 486b > 486f: 8b 44 24 28 mov 0x28(%esp),%eax > 4873: 83 c4 2c add $0x2c,%esp > 4876: 5b pop %ebx > 4877: 5e pop %esi > 4878: 5f pop %edi > 4879: 5d pop %ebp > 487a: e9 27 fa ff ff jmp 42a6 > 487f: 8b 96 40 04 00 00 mov 0x440(%esi),%edx > 4885: 8b 86 c4 02 00 00 mov 0x2c4(%esi),%eax > 488b: 83 c0 01 add $0x1,%eax > 488e: 3b 86 e8 02 00 00 cmp 0x2e8(%esi),%eax > 4894: 89 86 c4 02 00 00 mov %eax,0x2c4(%esi) > 489a: 0f 82 a1 fc ff ff jb 4541 > 48a0: 85 d2 test %edx,%edx > 48a2: 0f 84 9d fc ff ff je 4545 > 48a8: 8b 82 94 00 00 00 mov 0x94(%edx),%eax > 48ae: 83 f8 01 cmp $0x1,%eax > 48b1: 74 12 je 48c5 > 48b3: f0 ff 8a 94 00 00 
00 lock decl 0x94(%edx) > 48ba: 0f 94 c0 sete %al > 48bd: 84 c0 test %al,%al > 48bf: 0f 84 80 fc ff ff je 4545 > 48c5: 89 d0 mov %edx,%eax > 48c7: e8 fc ff ff ff call 48c8 > 48cc: f6 86 81 02 00 00 02 testb $0x2,0x281(%esi) > 48d3: 0f 85 79 fc ff ff jne 4552 > 48d9: 89 f2 mov %esi,%edx > 48db: 8b 44 24 14 mov 0x14(%esp),%eax > 48df: e8 22 e7 ff ff call 3006 > 48e4: 85 c0 test %eax,%eax > 48e6: 89 86 40 04 00 00 mov %eax,0x440(%esi) > 48ec: 0f 85 79 fc ff ff jne 456b > 48f2: c7 04 24 08 07 00 00 movl $0x708,(%esp) > 48f9: e8 fc ff ff ff call 48fa > 48fe: e8 fc ff ff ff call 48ff > 4903: 83 ae c4 02 00 00 01 subl $0x1,0x2c4(%esi) > 490a: e9 54 fe ff ff jmp 4763 > 490f: c7 86 c8 02 00 00 00 movl $0x0,0x2c8(%esi) > 4916: 00 00 00 > 4919: 0f 31 rdtsc > 491b: 89 44 24 0c mov %eax,0xc(%esp) > 491f: 89 54 24 10 mov %edx,0x10(%esp) > 4923: 85 d2 test %edx,%edx > 4925: 8b 1d 1c 00 00 00 mov 0x1c,%ebx > 492b: 89 d1 mov %edx,%ecx > 492d: 89 c5 mov %eax,%ebp > 492f: 74 08 je 4939 > 4931: 89 d0 mov %edx,%eax > 4933: 31 d2 xor %edx,%edx > 4935: f7 f3 div %ebx > 4937: 89 c1 mov %eax,%ecx > 4939: 89 e8 mov %ebp,%eax > 493b: f7 f3 div %ebx > 493d: 89 ca mov %ecx,%edx > 493f: 89 86 b4 02 00 00 mov %eax,0x2b4(%esi) > 4945: 89 96 b8 02 00 00 mov %edx,0x2b8(%esi) > 494b: e9 00 fd ff ff jmp 4650 > 4950: 8b 44 24 28 mov 0x28(%esp),%eax > 4954: e8 21 f9 ff ff call 427a > 4959: 8b 44 24 28 mov 0x28(%esp),%eax > 495d: 83 a0 b4 02 00 00 f7 andl $0xfffffff7,0x2b4(%eax) > 4964: e9 6d fe ff ff jmp 47d6 > 4969: 8b 44 24 28 mov 0x28(%esp),%eax > 496d: e8 d7 f2 ff ff call 3c49 > 4972: 8b 4c 24 28 mov 0x28(%esp),%ecx > 4976: 8b 91 b4 02 00 00 mov 0x2b4(%ecx),%edx > 497c: 83 e2 fb and $0xfffffffb,%edx > 497f: 89 91 b4 02 00 00 mov %edx,0x2b4(%ecx) > 4985: e9 42 fe ff ff jmp 47cc > 498a: 8b 44 24 28 mov 0x28(%esp),%eax > 498e: e8 a9 f8 ff ff call 423c > 4993: 8b 44 24 28 mov 0x28(%esp),%eax > 4997: 8b 90 b4 02 00 00 mov 0x2b4(%eax),%edx > 499d: 83 e2 fd and $0xfffffffd,%edx > 49a0: 89 90 b4 02 00 00 
mov %edx,0x2b4(%eax) > 49a6: e9 18 fe ff ff jmp 47c3 > 49ab: e8 fc ff ff ff call 49ac > 49b0: e9 da fd ff ff jmp 478f > 49b5: 89 f2 mov %esi,%edx > 49b7: 89 f8 mov %edi,%eax > 49b9: e8 cc de ff ff call 288a > 49be: 89 f6 mov %esi,%esi > 49c0: e9 19 fb ff ff jmp 44de > 49c5: 0f 31 rdtsc > 49c7: 85 d2 test %edx,%edx > 49c9: 8b 1d 1c 00 00 00 mov 0x1c,%ebx > 49cf: 89 d1 mov %edx,%ecx > 49d1: 89 c7 mov %eax,%edi > 49d3: 74 08 je 49dd > 49d5: 89 d0 mov %edx,%eax > 49d7: 31 d2 xor %edx,%edx > 49d9: f7 f3 div %ebx > 49db: 89 c1 mov %eax,%ecx > 49dd: 89 f8 mov %edi,%eax > 49df: f7 f3 div %ebx > 49e1: 89 ca mov %ecx,%edx > 49e3: 89 c1 mov %eax,%ecx > 49e5: 89 d3 mov %edx,%ebx > 49e7: 81 c1 ff ff ff 7f add $0x7fffffff,%ecx > 49ed: 83 d3 00 adc $0x0,%ebx > 49f0: e9 62 fd ff ff jmp 4757 > 49f5: e8 fc ff ff ff call 49f6 > 49fa: e9 15 fd ff ff jmp 4714 > 49ff: 83 f8 ff cmp $0xffffffff,%eax > 4a02: 75 14 jne 4a18 > 4a04: 8b 4c 24 14 mov 0x14(%esp),%ecx > 4a08: f6 81 59 01 00 00 10 testb $0x10,0x159(%ecx) > 4a0f: 74 07 je 4a18 > 4a11: f3 90 pause > 4a13: e9 98 fb ff ff jmp 45b0 > 4a18: 8b 86 40 04 00 00 mov 0x440(%esi),%eax > 4a1e: f0 ff 88 94 00 00 00 lock decl 0x94(%eax) > 4a25: 8b 2d 08 00 00 00 mov 0x8,%ebp > 4a2b: 85 ed test %ebp,%ebp > 4a2d: 75 4f jne 4a7e > 4a2f: 83 86 ac 02 00 00 01 addl $0x1,0x2ac(%esi) > 4a36: c7 86 c8 02 00 00 00 movl $0x0,0x2c8(%esi) > 4a3d: 00 00 00 > 4a40: 83 96 b0 02 00 00 00 adcl $0x0,0x2b0(%esi) > 4a47: 0f 31 rdtsc > 4a49: 89 44 24 0c mov %eax,0xc(%esp) > 4a4d: 89 54 24 10 mov %edx,0x10(%esp) > 4a51: 85 d2 test %edx,%edx > 4a53: 8b 1d 1c 00 00 00 mov 0x1c,%ebx > 4a59: 89 d1 mov %edx,%ecx > 4a5b: 89 c5 mov %eax,%ebp > 4a5d: 74 08 je 4a67 > 4a5f: 89 d0 mov %edx,%eax > 4a61: 31 d2 xor %edx,%edx > 4a63: f7 f3 div %ebx > 4a65: 89 c1 mov %eax,%ecx > 4a67: 89 e8 mov %ebp,%eax > 4a69: f7 f3 div %ebx > 4a6b: 89 ca mov %ecx,%edx > 4a6d: 89 86 b4 02 00 00 mov %eax,0x2b4(%esi) > 4a73: 89 96 b8 02 00 00 mov %edx,0x2b8(%esi) > 4a79: e9 80 fb ff ff jmp 45fe > 
4a7e: e8 fc ff ff ff call 4a7f > 4a83: 85 c0 test %eax,%eax > 4a85: 74 a8 je 4a2f > 4a87: c7 04 24 e9 06 00 00 movl $0x6e9,(%esp) > 4a8e: e8 fc ff ff ff call 4a8f > 4a93: eb 9a jmp 4a2f > -- Pádraig Brady - http://www.pixelbeat.org -- From rayl@mail.com Thu Dec 9 13:48:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 13:48:14 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9Lm77L001326 for ; Thu, 9 Dec 2004 13:48:08 -0800 Received: by ray.lehtiniemi.com (Postfix, from userid 500) id E529714D8BF; Thu, 9 Dec 2004 14:47:44 -0700 (MST) Resent-From: rayl@mail.com Resent-Date: Thu, 9 Dec 2004 14:47:44 -0700 Resent-Message-ID: <20041209214744.GC32454@mail.com> Resent-To: netdev@oss.sgi.com X-Original-To: rayl@localhost Delivered-To: rayl@localhost.lehtiniemi.com Received: from localhost (localhost [127.0.0.1]) by ray.lehtiniemi.com (Postfix) with ESMTP id 4D20D14D8F6 for ; Thu, 9 Dec 2004 10:27:42 -0700 (MST) Received: from mail.nucleus.com [207.34.93.23] by localhost with IMAP (fetchmail-6.2.5) for rayl@localhost (single-drop); Thu, 09 Dec 2004 10:27:42 -0700 (MST) Received: from spf6.us4.outblaze.com (unverified [205.158.62.33]) by mail.nucleus.com (Vircom SMTPRS 4.0.330.8) with ESMTP id for ; Thu, 9 Dec 2004 10:23:31 -0700 Received: from corvil.com (gate.corvil.net [213.94.219.177]) by spf6.us4.outblaze.com (Postfix) with ESMTP id 352C453907 for ; Thu, 9 Dec 2004 17:19:57 +0000 (GMT) Received: from draigBrady.com (pixelbeat.local.corvil.com [172.18.1.170]) by corvil.com (8.12.9/8.12.5) with ESMTP id iB9HJtwS059653 for ; Thu, 9 Dec 2004 17:19:55 GMT (envelope-from P@draigBrady.com) Message-ID: <41B8893B.5030407@draigBrady.com> Date: Thu, 09 Dec 2004 17:19:55 +0000 From: P@draigBrady.com User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040124 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Ray Lehtiniemi Subject: Re: 1.03Mpps 
on e1000 References: <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <20041201182943.GA14470@xi.wantstofly.org> <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041208233617.GD8649@mail.com> <41B825A5.2000009@draigBrady.com> <20041209161825.GA32454@mail.com> <20041209164820.GB32454@mail.com> In-Reply-To: <20041209164820.GB32454@mail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12621 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev Ray Lehtiniemi wrote: > On Thu, Dec 09, 2004 at 09:18:25AM -0700, Ray Lehtiniemi wrote: > >>it is worth noting that my box has become quite unstable since >>i started to use oprofile and pktgen together. sshd stops responding, >>and the network seems to go down. not sure what is happening there... >>this instability seems to be persisting across reboots, unfortunately... > > > > ok, it seems that this is related to martin's e1000 patch, and i > just hadn't noticed it before. rolling back the 1.2 Mpps patch > seems to cure the problem. > > symptoms are a total freezeup of the e1000 interfaces. netstat > -an shows a tcp connection for my ssh login to the box, with about > 53K in the send-Q. /proc/net/tcp is empty, however.... i can > reproduce this at will by doing > > # objdump -d /lib/modules/2.6.10-rc3-mp-rayl/kernel/net/core/pktgen.ko > > on that machine with the e1000-patched kernel running. > > > if there's any diagnostic output i can generate that might tell > me what's going wrong, let me know and i'll try to generate it. can you send this to again to netdev. thanks. 
-- Pádraig Brady - http://www.pixelbeat.org -- From benh@kernel.crashing.org Thu Dec 9 14:15:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 14:15:49 -0800 (PST) Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9MFegd002743 for ; Thu, 9 Dec 2004 14:15:41 -0800 Received: from gaston (localhost [127.0.0.1]) by gate.crashing.org (8.12.8/8.12.8) with ESMTP id iB9MAqgJ025863; Thu, 9 Dec 2004 16:10:53 -0600 Subject: Re: netdev ioctl & dev_base_lock : bad idea ? From: Benjamin Herrenschmidt To: "David S. Miller" Cc: netdev@oss.sgi.com In-Reply-To: <20041208231331.40cd98ad.davem@davemloft.net> References: <1101458929.28048.9.camel@gaston> <20041208220642.6984519f.davem@davemloft.net> <1102573333.16495.2.camel@gaston> <20041208231331.40cd98ad.davem@davemloft.net> Content-Type: text/plain Date: Fri, 10 Dec 2004 09:14:35 +1100 Message-Id: <1102630475.22746.7.camel@gaston> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12622 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: benh@kernel.crashing.org Precedence: bulk X-list: netdev On Wed, 2004-12-08 at 23:13 -0800, David S. Miller wrote: > On Thu, 09 Dec 2004 17:22:13 +1100 > Benjamin Herrenschmidt wrote: > > > Right, and I missed the fact that we did indeed take the semaphore and > > not the lock in the _set_ functions which is just fine, we can actually > > schedule.... except in set_multicast... > > > > Is there any reason we actually _need_ to get the xmit lock in this one > > specifically ? > > Since we implement NETIF_F_LLTX, the core packet transmit routines do > no locking, the driver does it all. 
> > So if we don't hold the tx lock in the set multicast routine, any other > cpu can come into our hard_start_xmit function and poke at the hardware. > > Upon further consideration, it seems that it may be OK to drop that tx > lock right after we do the netif_stop_queue(). But we should regrab > the tx lock when we do the subsequent netif_wake_queue(). Yes. In fact, I think it should be driver local locking policy, and not enforced by net/core/*. For example, for things like USB based networking (or other "remote" busses like that), it's both very useful to be able to schedule in set_multicast, and there is no need for any synchronisation with the xmit code. For things like sungem, I already have a driver local lock that can be used if necessary. Also, the lack of ability to schedule means we can't suspend and resume NAPI polling, which basically forces us to take a lock in the NAPI poll side of the driver... I'm aiming at limiting the amount of locks we take in sungem along with moving as much as I can to task level so I can do a bit better power management without having big u/mdelay's all over. Also, why would we need the xmit lock when calling netif_wake_queue() ? I'm not sure I get that one (but I'm not too familiar with the net core neither). Ben. From davem@davemloft.net Thu Dec 9 15:21:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 15:22:01 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9NLtDm005298 for ; Thu, 9 Dec 2004 15:21:55 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcXYy-0006TB-00; Thu, 09 Dec 2004 15:19:08 -0800 Date: Thu, 9 Dec 2004 15:19:08 -0800 From: "David S. Miller" To: Benjamin Herrenschmidt Cc: netdev@oss.sgi.com Subject: Re: netdev ioctl & dev_base_lock : bad idea ? 
Message-Id: <20041209151908.7f84eaf6.davem@davemloft.net> In-Reply-To: <1102630475.22746.7.camel@gaston> References: <1101458929.28048.9.camel@gaston> <20041208220642.6984519f.davem@davemloft.net> <1102573333.16495.2.camel@gaston> <20041208231331.40cd98ad.davem@davemloft.net> <1102630475.22746.7.camel@gaston> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12623 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 09:14:35 +1100 Benjamin Herrenschmidt wrote: > Also, why would we need the xmit lock when calling netif_wake_queue() ? > I'm not sure I get that one (but I'm not too familiar with the net core > neither). Hmmm, you're right, it seems it is not needed. 
From rayl@mail.com Thu Dec 9 15:26:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 15:26:19 -0800 (PST) Received: from ray.lehtiniemi.com (dsl-crow-209-5-162-130-cgy.nucleus.com [209.5.162.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9NQD23005821 for ; Thu, 9 Dec 2004 15:26:13 -0800 Received: by ray.lehtiniemi.com (Postfix, from userid 500) id 99FE814D8EE; Thu, 9 Dec 2004 16:25:50 -0700 (MST) Date: Thu, 9 Dec 2004 16:25:50 -0700 From: Ray Lehtiniemi To: netdev@oss.sgi.com Subject: Re: 1.03Mpps on e1000 Message-ID: <20041209232550.GE32454@mail.com> References: <20041201213550.GF14470@xi.wantstofly.org> <1101967983.4782.9.camel@localhost.localdomain> <20041205145051.GA647@xi.wantstofly.org> <20041208233617.GD8649@mail.com> <41B825A5.2000009@draigBrady.com> <20041209161825.GA32454@mail.com> <20041209164820.GB32454@mail.com> <41B8893B.5030407@draigBrady.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <41B8893B.5030407@draigBrady.com> User-Agent: Mutt/1.5.6i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12624 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rayl@mail.com Precedence: bulk X-list: netdev hi all my apologies if this gets received twice... i originally sent a copy of this using mutt's 'bounce' function, but i don't think that's what i wanted to do..... this is a bug report re: martin e1000 patch. i'm seeing some lockups under normal traffic loads that seem to go away if i revert the patch. details below.. 
thanks On Thu, Dec 09, 2004 at 05:19:55PM +0000, P@draigBrady.com wrote: > Ray Lehtiniemi wrote: > >On Thu, Dec 09, 2004 at 09:18:25AM -0700, Ray Lehtiniemi wrote: > > > >>it is worth noting that my box has become quite unstable since > >>i started to use oprofile and pktgen together. sshd stops responding, > >>and the network seems to go down. not sure what is happening there... > >>this instability seems to be persisting across reboots, unfortunately... > > > > > > > >ok, it seems that this is related to martin's e1000 patch, and i > >just hadn't noticed it before. rolling back the 1.2 Mpps patch > >seems to cure the problem. > > > >symptoms are a total freezeup of the e1000 interfaces. netstat > >-an shows a tcp connection for my ssh login to the box, with about > >53K in the send-Q. /proc/net/tcp is empty, however.... i can > >reproduce this at will by doing > > > > # objdump -d /lib/modules/2.6.10-rc3-mp-rayl/kernel/net/core/pktgen.ko > > > >on that machine with the e1000-patched kernel running. > > > > > >if there's any diagnostic output i can generate that might tell > >me what's going wrong, let me know and i'll try to generate it. > > can you send this to again to netdev. > > thanks. 
> > -- > Pádraig Brady - http://www.pixelbeat.org > -- -- ---------------------------------------------------------------------- Ray L From ilya.pashkovsky@gmail.com Thu Dec 9 15:40:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 15:40:25 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.193]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9NeIKo006923 for ; Thu, 9 Dec 2004 15:40:19 -0800 Received: by rproxy.gmail.com with SMTP id b11so567755rne for ; Thu, 09 Dec 2004 15:39:53 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=qvbv7YJg1GiTs5BD72nAvB6TWmN6s2cREhfwibQfMGQHG25yfPZBk5GiBW1PdNVWj+R9nv09actZGi2TG3OltHopaOFWOpdsHu25U3Q4L4a1nt0nVeRGdKmTV33ttV+iJg4IsSuh24rFkTM6Cka2wTX1w2wEM8j7FeZ8oEcOCBI= Received: by 10.38.12.30 with SMTP id 30mr576403rnl; Thu, 09 Dec 2004 15:39:53 -0800 (PST) Received: by 10.38.149.15 with HTTP; Thu, 9 Dec 2004 15:39:53 -0800 (PST) Message-ID: Date: Fri, 10 Dec 2004 01:39:53 +0200 From: Ilya Pashkovsky Reply-To: Ilya Pashkovsky To: netdev@oss.sgi.com Subject: [PATCH] port_reuse listen fix (allow simultaneous single listen + outgoing connects from same port) In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <8783be6604120907367db1fda5@mail.gmail.com> <8783be66041209103567bb3310@mail.gmail.com> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12625 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ilya.pashkovsky@gmail.com Precedence: bulk X-list: netdev if the SYN of clientA is accepted before clientB called connect and clientB is listening on that port, the connection will be accepted no matter what, and 
this is the expected and good behavior. In process of calling connect(), clientB will get an EADDRINUSE error and will stop connecting. In case the calls are already underway to connect (ports bound) then the new packets will get into the new cross-connection by default and not into the listening socket, since the new cross-connection tuple exists. This is guaranteed by setting the connection state flag before calling get_port. Can still see no added ambiguities in this patch yet...If you can help find some, it would be very nice of you indeed. Thanks for comments up to now. On Thu, 9 Dec 2004 13:35:54 -0500, Ross Biro wrote: > But what if the tuple is not taken. This code exposes a race condition. > > Imagine if first you bind the servers and listne. > > Then you bind the clients. > > Then the clients send the syn packets. > > If the syn's cross on the wire, then the clients will connect to each > other. If one of the syns arrives before the other machine calls > connect, then one machine will have a minisocket for the server, but > the other will still be able to send a syn, which will cause a bogus > reset and kill one of the connections. I'm not 100% sure which one, > but my guess would be the new one. > > In any event, you have a bunch of bad behaviour at the boundary and > need to do something about it. > > Ross > > On Thu, 9 Dec 2004 20:10:27 +0200, Ilya Pashkovsky > > > wrote: > > if this tuple (srcip,destip,srcport,destport) is already taken, you'll > > get an EADDRINUSE error as you should. The fix only fixes the > > behaviour of not allowing even a single listener to coexist with > > outgoing connections from same port. In fact, SO_REUSEADDR on linux > > should and does implement the behaviour of SO_REUSEPORT of BSD, except > > for listener preemption (since its not useful and would require > > several security checks). 
> > The current fix allows piercing firewalls for the needing and maybe > > TCP NAT traversal in the future (if some vendor produces a Full-cone > > TCP NAT). > > > > > > > > > > On Thu, 9 Dec 2004 10:36:08 -0500, Ross Biro wrote: > > > On Thu, 9 Dec 2004 13:25:26 +0200, Ilya Pashkovsky > > > > > > > > > wrote: > > > > This is the latest patch with removed bool > 1 check and ipv6 support. > > > > http://puding.mine.nu/patches/ > > > > http://puding.mine.nu/patches/patch-reuse-bool-ipv6 > > > > > > > > to check, you can use netcat (sets SO_REUSEADDR by default). > > > > on one host (host A): nc -v -l -p 9999 > > > > on another/same host (host B): nc -v -l -p 9000 > > > > on host A: nc -v -p 9999 host.B.ip.addr 9000 > > > > on host B: nc -v host.A.ip.addr 9999 > > > > > > What happens if on host B you do > > > > > > nc -v -p 9000 host.A.ip.addr 9999? > > > > > > Seems to me you will break the rule that a connection is uniquely > > > identified by (srcpip, destip, srcport, destport). > > > > > > Ross > > > > > > From kaber@trash.net Thu Dec 9 15:47:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 15:47:22 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iB9NlILx007539 for ; Thu, 9 Dec 2004 15:47:18 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CcXzF-0002eW-Qs; Fri, 10 Dec 2004 00:46:17 +0100 Message-ID: <41B8E3C9.5060002@trash.net> Date: Fri, 10 Dec 2004 00:46:17 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH 2.6]: Fix oops in ipt action error path References: <41B7892E.1080706@trash.net> <1102594472.1048.50.camel@jzny.localdomain> In-Reply-To: <1102594472.1048.50.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12626 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: >Thanks Patrick. If you have more cleanups on ipt, please shoot them in >as well. > > I didn't get to it yet, but I'll have more free time at christmas. Regards Patrick From davem@davemloft.net Thu Dec 9 16:02:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 16:02:35 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA02Tsw008489 for ; Thu, 9 Dec 2004 16:02:29 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcYC8-0006Za-00; Thu, 09 Dec 2004 15:59:36 -0800 Date: Thu, 9 Dec 2004 15:59:36 -0800 From: "David S. 
Miller" To: Robert Olsson Cc: buytenh@wantstofly.org, Robert.Olsson@data.slu.se, netdev@oss.sgi.com Subject: Re: [PATCH] remove FCS from pktgen bandwidth calculation Message-Id: <20041209155936.69313e70.davem@davemloft.net> In-Reply-To: <16824.22466.897897.238619@robur.slu.se> References: <20041208201116.GA10691@xi.wantstofly.org> <16824.22466.897897.238619@robur.slu.se> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12627 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev

On Thu, 9 Dec 2004 14:48:50 +0100 Robert Olsson wrote:

> OK! We strictly define statistics to be based on the data we deliver
> to hard_xmit. Feel free to change the current kernel version... while
> we wait for 2.7 :-)

Robert, can you collect together the various pktgen patches that have been floating about over the past few days and send me what you think should be applied?

There is this FCS patch, and the nanospin one from Lennert as well. There may have been others.
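
The FCS question comes down to whether the 4 bytes the NIC appends to each Ethernet frame count toward the reported rate. A minimal sketch of that arithmetic follows, assuming the rate is derived from a packet count, a per-packet size and an elapsed time. The function tx_bits_per_sec and its parameters are hypothetical names chosen for illustration; they are not taken from pktgen.c.

```c
#include <assert.h>
#include <stdint.h>

/* The Ethernet frame check sequence (FCS) is 4 bytes long. */
#define ETH_FCS_BYTES 4u

/*
 * Hypothetical helper, not pktgen code: bits per second for `packets`
 * frames of `pkt_size` bytes handed to the driver over `elapsed_us`
 * microseconds. If count_fcs is nonzero, the 4-byte FCS that the
 * hardware appends is added to every frame.
 *
 * Note: the intermediate product can overflow 64 bits for very large
 * packet counts; a real implementation would need to guard against that.
 */
static uint64_t tx_bits_per_sec(uint64_t packets, uint32_t pkt_size,
                                uint64_t elapsed_us, int count_fcs)
{
    uint64_t bytes_per_pkt = (uint64_t)pkt_size +
                             (count_fcs ? ETH_FCS_BYTES : 0u);

    assert(elapsed_us > 0);
    return packets * bytes_per_pkt * 8u * 1000000u / elapsed_us;
}
```

For 60-byte frames sent at one million packets per second, this gives 480 Mbit/s when only the data delivered to hard_xmit is counted and 512 Mbit/s when the FCS is included, so the choice shifts the reported figure by a little under 7% at that frame size.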
From kevins@callplus.co.nz Thu Dec 9 17:00:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 17:00:12 -0800 (PST) Received: from inetsrv.callplus.co.nz (mail.callplus.co.nz [202.180.64.194]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA104mf014362 for ; Thu, 9 Dec 2004 17:00:05 -0800 Received: from [192.168.18.64] (Not Verified[192.168.18.64]) by inetsrv.callplus.co.nz with NetIQ MailMarshal (v6,0,3,8) id ; Fri, 10 Dec 2004 13:59:41 +1300 Message-ID: <41B8F4FD.6080407@callplus.co.nz> Date: Fri, 10 Dec 2004 13:59:41 +1300 From: Kevin Stewart User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: One IP on 2 nics Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12628 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kevins@callplus.co.nz Precedence: bulk X-list: netdev

What I am after is the ability to keep one IP and MAC address when I remove my laptop from my docking station (I have an 802.11b AP on the same segment as my other nic).

How do I create a virtual interface and configure my eth0 (10/100) and eth1 (802.11b) so I can route traffic out the correct interface depending on which one is up?

This is the closest I have come, but my laptop becoming part of spanning tree sounds .... bad; bridging and not being part of spanning tree .... worse.

iface br0 inet static
    address 192.168.18.64
    netmask 255.255.240.0
    network 192.168.16.0
    broadcast 192.168.31.255
    gateway 192.168.18.1
    bridge_ports eth0 eth1
    pre-up ifconfig eth0 0.0.0.0 up
    pre-up ifconfig eth1 0.0.0.0 up
    pre-up brctl addbr br0
    pre-up brctl addif br0 eth0
    pre-up brctl addif br0 eth1
    pre-up brctl stp br0 on
    post-down ifconfig eth0 0.0.0.0 down
    post-down ifconfig eth1 0.0.0.0 down
    post-down brctl delif br0 eth0
    post-down brctl delif br0 eth1
    post-down brctl delbr br0

My thoughts were a dummy interface and 2 routes to 192.168.16.0 with different metrics out the 2 interfaces, plus a default route, but I am unable to get it working.

--
Kevin Stewart
Network Engineer
DDI ++ 64 9 919 6120
Mobile ++ 64 21 885 505
Email kevins@callplus.co.nz
-----------------------------------------------------------------------------------------------
This message and any attachments contain privileged and confidential information. If you are not the intended recipient of this message, you are hereby notified that any use, dissemination, distribution or reproduction of this message is prohibited. If you have received this message in error please notify the sender immediately via email and then destroy this message and any attachments.

From tgraf@suug.ch Thu Dec 9 17:43:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 17:43:37 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA1hRv5015702 for ; Thu, 9 Dec 2004 17:43:28 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 5D849F; Fri, 10 Dec 2004 02:42:41 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 5628B1C0EA; Fri, 10 Dec 2004 02:43:22 +0100 (CET) Date: Fri, 10 Dec 2004 02:43:22 +0100 From: Thomas Graf To: "David S.
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: Prevent destroying via RTM_DELTFILTER while classifying Message-ID: <20041210014322.GS1371@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12629 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

The classify function of every cls is invoked under BH and
dev->queue_lock spinlock from dev_xmit. Hence to serialize destroying
the destroy function must be called under the spinlock as well. There
are 2 paths in which a classifier can be destroyed:

1) via the qdisc destroy cb calling tcf_destroy under
   qdisc_tree_lock from __qdisc_destroy (rcu callback)

2) via RTM_DELTFILTER in cls_api.c under no locks at all.

The first path seems ok, the initial qdisc destroy attempt is called
under the spinlock and thus serialized with the classify while
the list unlinking takes place and dev_xmit takes care of
the RCU callback, hence classify and all the callbacks needed
from process context cannot be found anymore.

The second path needs the spinlock to avoid destroying while
a classification is in progress. 2.4 probably needs the same fix,
I will cook one up if so.
Signed-off-by: Thomas Graf

--- linux-2.6.10-rc2-bk13.orig/net/sched/cls_api.c 2004-11-30 14:01:12.000000000 +0100
+++ linux-2.6.10-rc2-bk13/net/sched/cls_api.c 2004-12-10 02:16:29.000000000 +0100
@@ -257,7 +257,9 @@
 			qdisc_unlock_tree(dev);
 			tfilter_notify(skb, n, tp, fh, RTM_DELTFILTER);
+			qdisc_lock_tree(dev);
 			tcf_destroy(tp);
+			qdisc_unlock_tree(dev);
 			err = 0;
 			goto errout;
 		}

From tgraf@suug.ch Thu Dec 9 17:49:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 17:49:26 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA1nKVd016323 for ; Thu, 9 Dec 2004 17:49:21 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 6DA9AF; Fri, 10 Dec 2004 02:48:36 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id DF63F1C0EA; Fri, 10 Dec 2004 02:49:18 +0100 (CET) Date: Fri, 10 Dec 2004 02:49:18 +0100 From: Thomas Graf To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: Fix double locking in tcindex destroy path Message-ID: <20041210014918.GT1371@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12630 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

tcindex's destroy uses its own delete functions to destroy its
configuration. The delete function (correctly) takes the qdisc_tree_lock
to prevent list walkings from happening while removing from the list.
The qdisc_tree_lock is already held if we're coming via the destroy
path and thus a double locking takes place.

Patch not needed for 2.4 since both destroy paths are unlocked but will
be needed if we add them.

Signed-off-by: Thomas Graf

--- linux-2.6.10-rc2-bk13.orig/net/sched/cls_tcindex.c 2004-11-30 14:01:12.000000000 +0100
+++ linux-2.6.10-rc2-bk13/net/sched/cls_tcindex.c 2004-12-10 02:20:51.000000000 +0100
@@ -160,7 +160,8 @@
 }
-static int tcindex_delete(struct tcf_proto *tp, unsigned long arg)
+static int
+__tcindex_delete(struct tcf_proto *tp, unsigned long arg, int already_locked)
 {
 	struct tcindex_data *p = PRIV(tp);
 	struct tcindex_filter_result *r = (struct tcindex_filter_result *) arg;
@@ -182,9 +183,11 @@
 found:
 		f = *walk;
-		tcf_tree_lock(tp);
+		if (!already_locked)
+			tcf_tree_lock(tp);
 		*walk = f->next;
-		tcf_tree_unlock(tp);
+		if (!already_locked)
+			tcf_tree_unlock(tp);
 	}
 	tcf_unbind_filter(tp, &r->res);
 #ifdef CONFIG_NET_CLS_POLICE
@@ -195,6 +198,10 @@
 	return 0;
 }
+static int tcindex_delete(struct tcf_proto *tp, unsigned long arg)
+{
+	return __tcindex_delete(tp, arg, 0);
+}
 /*
  * There are no parameters for tcindex_init, so we overload tcindex_change
@@ -384,7 +391,7 @@
 static int tcindex_destroy_element(struct tcf_proto *tp, unsigned long arg, struct tcf_walker *walker)
 {
-	return tcindex_delete(tp,arg);
+	return __tcindex_delete(tp,arg, 1);
 }

From kaber@trash.net Thu Dec 9 18:08:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 18:08:45 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA28dPv017266 for ; Thu, 9 Dec 2004 18:08:40 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CcaCY-0003II-Lp; Fri, 10 Dec 2004 03:08:10 +0100 Message-ID: <41B9050A.1060401@trash.net> Date: Fri, 10 Dec 2004 03:08:10 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S.
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Prevent destroying via RTM_DELTFILTER while classifying References: <20041210014322.GS1371@postel.suug.ch> In-Reply-To: <20041210014322.GS1371@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12631 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: >The classify function of every cls is invoked under BH and >dev->queue_lock spinlock from dev_xmit. Hence to serialize destroying >the destroy function must be called under the spinlock as well. There >are 2 paths in which a classifier can be destroyed: > >1) via the qdisc destroy cb calling tcf_destroy under > qdisc_tree_lock from __qdisc_destroy (rcu callback) > >2) via RTM_DELTFILTER in cls_api.c under no locks at all. > >The first path seems ok, the initial qdisc destroy attempt is called >under the spinlock and thus serialized with the classify while >the list unlinking takes place and dev_xmit takes care of >the RCU callback, hence classify and all the callbacks needed >from process context cannot be found anymore. > >The second path needs the spinlock to avoid destroying while >a classification is in progress. 2.4 probably needs the same fix, >I will cook one up if so. > > This is not correct. The classifier is unlinked in tc_ctl_tfilter before it is destroyed, so it is not visible for classification anymore. 
if (fh == 0) { if (n->nlmsg_type == RTM_DELTFILTER && t->tcm_handle == 0) { qdisc_lock_tree(dev); *back = tp->next; qdisc_unlock_tree(dev); tfilter_notify(skb, n, tp, fh, RTM_DELTFILTER); tcf_destroy(tp); err = 0; goto errout; } Regards Patrick From kaber@trash.net Thu Dec 9 18:36:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 18:36:18 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA2aDTF019062 for ; Thu, 9 Dec 2004 18:36:14 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CcadF-0003KV-CA; Fri, 10 Dec 2004 03:35:45 +0100 Message-ID: <41B90B81.1020102@trash.net> Date: Fri, 10 Dec 2004 03:35:45 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix double locking in tcindex destroy path References: <20041210014918.GT1371@postel.suug.ch> In-Reply-To: <20041210014918.GT1371@postel.suug.ch> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12632 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: >tcindex's destroy uses its own delete functions to destroy its >configuration. The delete function (correctly) takes the qdisc_tree_lock >to prevent list walkings from happening while removing from the list. >The qdisc_tree_lock is already held if we're comming via the destroy >path and thus a double locking takes place. > >Patch not needed for 2.4 since both destroy paths are unlocked but will >be needed if we add them. 
> > Looks correct, but 2.4 does need this. qdisc_destroy in 2.4 always happens under dev->queue_lock. For example dev_shutdown from 2.4: write_lock(&qdisc_tree_lock); spin_lock_bh(&dev->queue_lock); ... qdisc_destroy(qdisc); But please rename "already_locked" to "lock" to make it look less like a hack to avoid deadlock. >+ return __tcindex_delete(tp,arg, 1); > > And a space is missing :) Regards Patrick From kaber@trash.net Thu Dec 9 19:34:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 19:34:12 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA3XwIu024727 for ; Thu, 9 Dec 2004 19:33:59 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CcbWz-0005OA-Kc; Fri, 10 Dec 2004 04:33:21 +0100 Message-ID: <41B91901.3070304@trash.net> Date: Fri, 10 Dec 2004 04:33:21 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: "David S. Miller" CC: Stephen Hemminger , netem@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] netem: restart device after inserting packets References: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> <20041208210031.63f0963f.davem@davemloft.net> In-Reply-To: <20041208210031.63f0963f.davem@davemloft.net> Content-Type: multipart/mixed; boundary="------------050406000709040107090906" X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12633 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050406000709040107090906 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit David S. 
Miller wrote: >On Wed, 8 Dec 2004 12:31:03 -0800 >Stephen Hemminger wrote: > > >>The version of netem in 2.6.10 moves packets from the delayed queue >>to the qdisc in a timer interrupt. But it forgot to force the device to >>pick them up. >> >>Signed-off-by: Stephen Hemminger >> > >Good spotting. Applied, thanks Stephen. > The patch is incomplete, netem may dequeue multiple packets from the delayed queue at once and feed them to the inner queue, but qdisc_restart will only dequeue one packet from the inner queue. This patch moves qdisc_run back to include/net/pkt_sched.h and replaces qdisc_restart by qdisc_run in netem_watchdog. Regards Patrick --------------050406000709040107090906 Content-Type: text/plain; name="x" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="x" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/10 04:24:22+01:00 kaber@coreworks.de # [PKT_SCHED]: Keep netem queue running until inner qdisc is empty # # Signed-off-by: Patrick McHardy # # net/sched/sch_netem.c # 2004/12/10 04:24:13+01:00 kaber@coreworks.de +1 -1 # [PKT_SCHED]: Keep netem queue running until inner qdisc is empty # # Signed-off-by: Patrick McHardy # # net/core/dev.c # 2004/12/10 04:24:13+01:00 kaber@coreworks.de +0 -7 # [PKT_SCHED]: Keep netem queue running until inner qdisc is empty # # Signed-off-by: Patrick McHardy # # include/net/pkt_sched.h # 2004/12/10 04:24:13+01:00 kaber@coreworks.de +6 -0 # [PKT_SCHED]: Keep netem queue running until inner qdisc is empty # # Signed-off-by: Patrick McHardy # diff -Nru a/include/net/pkt_sched.h b/include/net/pkt_sched.h --- a/include/net/pkt_sched.h 2004-12-10 04:24:54 +01:00 +++ b/include/net/pkt_sched.h 2004-12-10 04:24:54 +01:00 @@ -228,6 +228,12 @@ extern int qdisc_restart(struct net_device *dev); +static inline void qdisc_run(struct net_device *dev) +{ + while (!netif_queue_stopped(dev) && qdisc_restart(dev) < 0) + /* NOTHING */; +} + extern int tc_classify(struct sk_buff *skb, struct 
tcf_proto *tp, struct tcf_result *res); diff -Nru a/net/core/dev.c b/net/core/dev.c --- a/net/core/dev.c 2004-12-10 04:24:54 +01:00 +++ b/net/core/dev.c 2004-12-10 04:24:54 +01:00 @@ -1202,13 +1202,6 @@ } \ } -static inline void qdisc_run(struct net_device *dev) -{ - while (!netif_queue_stopped(dev) && - qdisc_restart(dev)<0) - /* NOTHING */; -} - /** * dev_queue_xmit - transmit a buffer * @skb: buffer to transmit diff -Nru a/net/sched/sch_netem.c b/net/sched/sch_netem.c --- a/net/sched/sch_netem.c 2004-12-10 04:24:54 +01:00 +++ b/net/sched/sch_netem.c 2004-12-10 04:24:54 +01:00 @@ -287,7 +287,7 @@ else sch->q.qlen++; } - qdisc_restart(dev); + qdisc_run(dev); spin_unlock_bh(&dev->queue_lock); } --------------050406000709040107090906-- From shemminger@osdl.org Thu Dec 9 21:15:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 21:15:47 -0800 (PST) Received: from fire-1.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA5FfSu031107 for ; Thu, 9 Dec 2004 21:15:42 -0800 Received: from [192.168.0.106] (063-170-215-071.dslnorthwest.net [63.170.215.71]) (authenticated bits=0) by fire-1.osdl.org (8.12.8/8.12.8) with ESMTP id iBA5F6Q3018757 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Thu, 9 Dec 2004 21:15:07 -0800 Message-ID: <41B930D2.70002@osdl.org> Date: Thu, 09 Dec 2004 21:14:58 -0800 From: Stephen Hemminger User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Kevin Stewart CC: netdev@oss.sgi.com Subject: Re: One IP on 2 nics References: <41B8F4FD.6080407@callplus.co.nz> In-Reply-To: <41B8F4FD.6080407@callplus.co.nz> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.97 $ X-Scanned-By: MIMEDefang 2.36 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12634 X-ecartis-version: Ecartis 
v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev NetworkManager does what you want more easily, all from user space. http://people.redhat.com/dcbw/NetworkManager/ From mumasankar@novell.com Thu Dec 9 21:28:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 21:28:30 -0800 (PST) Received: from lucius.provo.novell.com (lucius.provo.novell.com [137.65.81.172]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA5SPqr031879 for ; Thu, 9 Dec 2004 21:28:26 -0800 Received: from INET-PRV1-MTA by lucius.provo.novell.com with Novell_GroupWise; Thu, 09 Dec 2004 22:27:58 -0700 Message-Id: X-Mailer: Novell GroupWise Internet Agent 6.5.3 Date: Thu, 09 Dec 2004 22:27:48 -0700 From: "Umasankar Mukkara" To: Cc: Subject: Re: IPSEC MIB instrumentatoin ?? Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12635 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mumasankar@novell.com Precedence: bulk X-list: netdev >>Any idea what happened to the draft process? Is it likely to be re-issued and turned into an RFC? I think the draft did not proceed further (as confirmed by the author). I have posted a message to the main IPSEC mailing list and am awaiting feedback. -Uma. 
From buytenh@wantstofly.org Thu Dec 9 23:26:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 09 Dec 2004 23:26:28 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBA7QCAg004024 for ; Thu, 9 Dec 2004 23:26:13 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 34C3E2B0ED; Fri, 10 Dec 2004 08:25:50 +0100 (MET) Date: Fri, 10 Dec 2004 08:25:50 +0100 From: Lennert Buytenhek To: "David S. Miller" Cc: Robert Olsson , netdev@oss.sgi.com Subject: Re: [PATCH] remove FCS from pktgen bandwidth calculation Message-ID: <20041210072549.GA30446@xi.wantstofly.org> References: <20041208201116.GA10691@xi.wantstofly.org> <16824.22466.897897.238619@robur.slu.se> <20041209155936.69313e70.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041209155936.69313e70.davem@davemloft.net> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12636 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Thu, Dec 09, 2004 at 03:59:36PM -0800, David S. Miller wrote: > > OK! We strictly define statistics to be based on the data we deliver > > to hard_xmit. Feel free to change the current kernel version... while > > we wait for 2.7 :-) > > Robert can you collect together the various pktgen patches > that have been floating about over the past few days and > send to me what you think should be applied. > > There is this FCS patch, and the naonspin one from Lennert > as well. There may have been others. At least the nanospin patch won't apply cleanly against the pktgen version currently in mainline. 
I'm reworking it anyway since it's actually buggy -- give me a day or two and I'll post updated patches against 2.6 as well as Robert's devel version. (See ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/ for Robert's devel versions.) cheers, Lennert From cranium2003@yahoo.com Fri Dec 10 01:56:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 01:56:29 -0800 (PST) Received: from web41407.mail.yahoo.com (web41407.mail.yahoo.com [66.218.93.73]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBA9uOAR013518 for ; Fri, 10 Dec 2004 01:56:24 -0800 Received: (qmail 68650 invoked by uid 60001); 10 Dec 2004 09:55:57 -0000 Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=O6d3t0K4Kgz2iorv7yAy3ChKzsNk1o8OvZWj8DFwsT0L6eqv6iosHvNDlytjV2ICTtzUT47MfGJ4b/v5eNt7Louj+HyXkkiGmRio0liHbR3GaKDa71kDCa8a0i2qJn5SNq30HPbyUdkwfq3lHpHkgZCkcaigAbk1uFZucAVO1/0= ; Message-ID: <20041210095557.68648.qmail@web41407.mail.yahoo.com> Received: from [203.199.141.99] by web41407.mail.yahoo.com via HTTP; Fri, 10 Dec 2004 01:55:57 PST Date: Fri, 10 Dec 2004 01:55:57 -0800 (PST) From: cranium2003 Subject: analysing ip_rcv code problems To: net dev MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12637 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cranium2003@yahoo.com Precedence: bulk X-list: netdev Hello, I know that in the 2.4 kernel series the function used to receive IP packets is ip_rcv, but I cannot work out where exactly the IP header is removed. Do the following lines from ip_rcv remove the IP header? if (!pskb_may_pull(skb, sizeof(struct iphdr))) goto inhdr_error; iph = skb->nh.iph; ... ... ... ... 
The following lines from the same function might also be what removes the header: if (!pskb_may_pull(skb, iph->ihl*4)) goto inhdr_error; iph = skb->nh.iph; Given that the IP header length is 20 bytes, which function actually removes the IP header? Also, why is the statement iph = skb->nh.iph used twice in ip_rcv? regards, cranium. From tgraf@suug.ch Fri Dec 10 04:30:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 04:30:06 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBACU0gA007450 for ; Fri, 10 Dec 2004 04:30:01 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id B34BBF; Fri, 10 Dec 2004 13:29:15 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id C09B51C0EA; Fri, 10 Dec 2004 13:29:57 +0100 (CET) Date: Fri, 10 Dec 2004 13:29:57 +0100 From: Thomas Graf To: Patrick McHardy Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Prevent destroying via RTM_DELTFILTER while classifying Message-ID: <20041210122957.GU1371@postel.suug.ch> References: <20041210014322.GS1371@postel.suug.ch> <41B9050A.1060401@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B9050A.1060401@trash.net> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12638 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev > >The second path needs the spinlock to avoid destroying while > >a classification is in progress. 2.4 probably needs the same fix, > >I will cook one up if so. > > > > > This is not correct. The classifier is unlinked in tc_ctl_tfilter > before it is destroyed, so it is not visible for classification anymore. You're right. Thanks. From tgraf@suug.ch Fri Dec 10 04:44:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 04:44:54 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBACilE2008282 for ; Fri, 10 Dec 2004 04:44:47 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id EE5A7F; Fri, 10 Dec 2004 13:44:02 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id E3BAB1C0EA; Fri, 10 Dec 2004 13:44:45 +0100 (CET) Date: Fri, 10 Dec 2004 13:44:45 +0100 From: Thomas Graf To: Patrick McHardy Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix double locking in tcindex destroy path Message-ID: <20041210124445.GV1371@postel.suug.ch> References: <20041210014918.GT1371@postel.suug.ch> <41B90B81.1020102@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B90B81.1020102@trash.net> X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12639 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41B90B81.1020102@trash.net> 2004-12-10 03:35 > Thomas Graf wrote: > > >tcindex's destroy uses its own delete functions to destroy its > >configuration. The delete function (correctly) takes the qdisc_tree_lock > >to prevent list walkings from happening while removing from the list. > >The qdisc_tree_lock is already held if we're comming via the destroy > >path and thus a double locking takes place. > > > >Patch not needed for 2.4 since both destroy paths are unlocked but will > >be needed if we add them. > > > > > Looks correct, but 2.4 does need this. qdisc_destroy in 2.4 always > happens under dev->queue_lock. For example dev_shutdown from 2.4: Not 100% correct since cls_api.c drops the lock before calling tcf_destroy but the patch is indeed needed and it's not a problem if dev->queue_lock is not taken since it is already unlinked as you correctly stated in your previous mail. Thanks Patrick. Patch also applies to 2.4 with some fuzz. 
Signed-off-by: Thomas Graf --- linux-2.6.10-rc2-bk13.orig/net/sched/cls_tcindex.c 2004-11-30 14:01:12.000000000 +0100 +++ linux-2.6.10-rc2-bk13/net/sched/cls_tcindex.c 2004-12-10 13:35:03.000000000 +0100 @@ -160,7 +160,8 @@ } -static int tcindex_delete(struct tcf_proto *tp, unsigned long arg) +static int +__tcindex_delete(struct tcf_proto *tp, unsigned long arg, int lock) { struct tcindex_data *p = PRIV(tp); struct tcindex_filter_result *r = (struct tcindex_filter_result *) arg; @@ -182,9 +183,11 @@ found: f = *walk; - tcf_tree_lock(tp); + if (lock) + tcf_tree_lock(tp); *walk = f->next; - tcf_tree_unlock(tp); + if (lock) + tcf_tree_unlock(tp); } tcf_unbind_filter(tp, &r->res); #ifdef CONFIG_NET_CLS_POLICE @@ -195,6 +198,10 @@ return 0; } +static int tcindex_delete(struct tcf_proto *tp, unsigned long arg) +{ + return __tcindex_delete(tp, arg, 1); +} /* * There are no parameters for tcindex_init, so we overload tcindex_change @@ -384,7 +391,7 @@ static int tcindex_destroy_element(struct tcf_proto *tp, unsigned long arg, struct tcf_walker *walker) { - return tcindex_delete(tp,arg); + return __tcindex_delete(tp, arg, 0); } From paul@clubi.ie Fri Dec 10 07:41:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 07:41:51 -0800 (PST) Received: from hibernia.jakma.org (hibernia.jakma.org [212.17.55.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAFfgIi014550 for ; Fri, 10 Dec 2004 07:41:44 -0800 Received: from hibernia.jakma.org (IDENT:paul@hibernia.jakma.org [192.168.0.3]) by hibernia.jakma.org (8.12.11/8.12.11) with ESMTP id iBAFbFUu010062; Fri, 10 Dec 2004 15:37:32 GMT Date: Fri, 10 Dec 2004 15:37:15 +0000 (GMT) From: Paul Jakma X-X-Sender: paul@hibernia.jakma.org To: Herbert Xu cc: Hasso Tepper , hadi@cyberus.ca, thomas.spatzier@de.ibm.com, jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [patch 4/10] s390: network driver. 
In-Reply-To: Message-ID: References: Mail-Followup-To: paul@hibernia.jakma.org X-NSA: arafat al aqsar jihad musharef jet-A1 avgas ammonium qran inshallah allah al-akbar martyr iraq saddam hammas hisballah rabin ayatollah korea vietnam revolt mustard gas british airways washington MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/614/Wed Dec 1 15:44:43 2004 clamav-milter version 0.80j on hibernia.jakma.org X-Virus-Status: Clean X-Virus-Status: Clean X-archive-position: 12640 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paul@clubi.ie Precedence: bulk X-list: netdev On Tue, 7 Dec 2004, Herbert Xu wrote: > I don't see why this should be happening. Thomas' original patch was to address this problem. I wonder could he recap the kernel side of this problem? > Can you please provide a minimal program that reproduces this > blocking problem? For example, something that sends a packet to a > downed interface and then sends one to an interface that's up? I will try get this for you soon. regards -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Who was that masked man? 
From brande@novolab.de Fri Dec 10 07:47:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 07:47:12 -0800 (PST) Received: from aquin (www.novolab.de [212.222.128.68]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAFl5tB015133 for ; Fri, 10 Dec 2004 07:47:07 -0800 Received: from [212.222.128.131] (helo=[192.168.0.117]) by aquin with asmtp (Exim 3.36 #1 (Debian)) id 1Ccmyb-0006Bs-00; Fri, 10 Dec 2004 16:46:37 +0100 Message-ID: <41B9C4E0.40407@novolab.de> Date: Fri, 10 Dec 2004 16:46:40 +0100 From: Brande User-Agent: Mozilla Thunderbird 0.8 (X11/20040926) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Brande CC: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Prism / Hostap Bridge Problems... References: <41B9B6E9.7010600@novolab.de> In-Reply-To: <41B9B6E9.7010600@novolab.de> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12641 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: brande@novolab.de Precedence: bulk X-list: netdev Hi all, maybe there is someone who can help... Problem: I have a meshcube (www.meshcube.org) with two Prism 2.5 mini-PCI WLAN cards... wlan0 should connect as a client (managed mode) to a normal access point which is connected to the internet. wlan1 should work as an access point (master mode). I would like to bridge between wlan0 and wlan1 so that I can connect with my Laptop Client2 to the AccessPoint on wlan1 and go, through the wlan0 client connection to the normal AccessPoint, to the internet... 
So here is the hardware setup:

Internet ----- AccessPoint (192.168.0.1) -------- Client/AP Bridge (192.168.0.100, 2 cards in one box)
                    |                                      |
             Laptop Client 1                      my Laptop Client 2
             (192.168.0.33)                        (192.168.0.117)

The data of the WLAN devices:

hostap_diag wlan0
Host AP driver diagnostics information for 'wlan0'
NICID: id=0x8013 v1.0.0 (PRISM II (2.5) Mini-PCI (SST parallel flash))
PRIID: id=0x0015 v1.1.1
STAID: id=0x001f v1.7.4 (station firmware)

hostap_diag wlan1
Host AP driver diagnostics information for 'wlan1'
NICID: id=0x8013 v1.0.0 (PRISM II (2.5) Mini-PCI (SST parallel flash))
PRIID: id=0x0015 v1.1.1
STAID: id=0x001f v1.7.4 (station firmware)

I have written the following bridge script:

ETHER0=wlan0
ETHER1=wlan1
BRIDGE=br0
BRIDGEIP=192.168.0.100
BRIDGEGW=192.168.0.1
BRIDGENM=255.255.255.0
BRIDGESTP=off ### must be "on" with more than one bridge

### stop configure ###
echo -n "stopping firewall: "
iptables -F
iptables -F -t nat
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
echo "*** network is insecure now *** "
echo "done."

case "$1" in
start)
echo "Starting service bridge br0"
echo "Bridge IP will be: $BRIDGEIP"
ifconfig $ETHER0 promisc up
ifconfig $ETHER1 promisc up
sleep 2
brctl addbr $BRIDGE
brctl setbridgeprio $BRIDGE 0
ifconfig $ETHER0 0.0.0.0
ifconfig $ETHER1 0.0.0.0
brctl addif $BRIDGE $ETHER0
brctl addif $BRIDGE $ETHER1
#brctl stp $BRIDGE $BRIDGESTP
#brctl sethello $BRIDGE 1
#brctl setmaxage $BRIDGE 4
#brctl setfd $BRIDGE 4
echo "1" > /proc/sys/net/ipv4/ip_forward # I know it's not really necessary
ifconfig $BRIDGE $BRIDGEIP netmask $BRIDGENM up # but it was a test
route add default gw $BRIDGEGW $BRIDGE
echo -e "Bridge needs 30 sec. to learn table!\n(depends on kernel version...)\n"
;;

If I start the script the bridge goes up and I can ping the bridge (192.168.0.100) from outside with the Laptop Client 1.
I can also ping my Laptop Client2 from outside. But from my Laptop Client2 I cannot ping the gateway behind the bridge (192.168.0.1) or the Laptop Client1, although I can ping the bridge interface from my Laptop Client2, which is connected to the WLAN1 AccessPoint in the bridge...

With tcpdump I can see an ARP request from my Laptop Client2 on the bridge interface asking who has 192.168.0.1 when I try to ping e.g. 192.168.0.1, but I get no reply from the bridge. On my laptop I get the message "Host unreachable".

It looks as if the AccessPoint or the client in the bridge replaces the MAC address within the ARP request from my Laptop Client1 with that of one of the corresponding interfaces inside the bridge, and that this is the reason why I can't receive the answer to my ARP request. Might this be possible? And if so, do you know a solution to that problem? Or do you have another solution with the same effect, but without WDS please ;)

Thanks for your time, have fun, Brande

From magnus.damm@gmail.com Fri Dec 10 07:55:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 07:55:24 -0800 (PST) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.193]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAFtJ9s019106 for ; Fri, 10 Dec 2004 07:55:20 -0800 Received: by wproxy.gmail.com with SMTP id 40so25547wri for ; Fri, 10 Dec 2004 07:54:52 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type; b=bN18wGoi0sOMHNS1WQkqwVdXKELZql8eK1OvvuzRhA0wMwGrLwFw/t8ax9zbnkmrd6bd4mnkcpKLYHfZVhpX5wIVPvnqg2mYtP2BdRMhX9+NwM5M87R9RBjajeIthTu/kjMizNL6I3PhiTdryq4KOz3AUTjfRBGmGDJjKeDHqiM= Received: by 10.54.20.67 with SMTP id 67mr1459969wrt; Fri, 10 Dec 2004 07:54:51 -0800 (PST) Received: by 10.54.17.4 with HTTP; Fri, 10 Dec 2004 07:54:51 -0800 (PST) Message-ID: Date: Fri, 10 Dec 2004 16:54:51 +0100 From: Magnus Damm Reply-To: Magnus Damm To: netdev@oss.sgi.com 
Subject: [PATCH] kaweth - Psion Gold Port Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_937_10448295.1102694091939" X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12642 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: magnus.damm@gmail.com Precedence: bulk X-list: netdev ------=_Part_937_10448295.1102694091939 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline

Hello, This little patch adds support for "Psion Dacom Gold Port Ethernet" to kaweth. Please apply. / magnus

------=_Part_937_10448295.1102694091939 Content-Type: text/x-patch; name="linux-2.6.9-goldport.patch" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="linux-2.6.9-goldport.patch"

--- linux-2.6.9/drivers/usb/net/kaweth.c	2004-12-10 16:31:51.677492360 +0100
+++ linux-2.6.9-goldport/drivers/usb/net/kaweth.c	2004-12-10 16:31:49.517820680 +0100
@@ -160,6 +160,7 @@
 	{ USB_DEVICE(0x1342, 0x0204) }, /* Mobility USB-Ethernet Adapter */
 	{ USB_DEVICE(0x13d2, 0x0400) }, /* Shark Pocket Adapter */
 	{ USB_DEVICE(0x1485, 0x0001) },	/* Silicom U2E */
+	{ USB_DEVICE(0x1485, 0x0002) }, /* Psion Dacom Gold Port Ethernet */
 	{ USB_DEVICE(0x1645, 0x0005) }, /* Entrega E45 */
 	{ USB_DEVICE(0x1645, 0x0008) }, /* Entrega USB Ethernet Adapter */
 	{ USB_DEVICE(0x1645, 0x8005) }, /* PortGear Ethernet Adapter */
------=_Part_937_10448295.1102694091939--

From gandalf@wlug.westbo.se Fri Dec 10 08:20:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 08:20:08 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAGK2HA025925 for ; Fri, 10 Dec 2004 08:20:03 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id 3A9A82C002C; 
Fri, 10 Dec 2004 17:19:37 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id C4D052C0047; Fri, 10 Dec 2004 17:19:36 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 0B8312C002C; Fri, 10 Dec 2004 17:19:36 +0100 (CET) Received: by tux.rsn.bth.se (Postfix, from userid 501) id 74EFD3FA7; Fri, 10 Dec 2004 17:19:36 +0100 (CET) Subject: Re: [PATCH] e1000 poll behavior From: Martin Josefsson To: Robert Olsson Cc: netdev@oss.sgi.com In-Reply-To: <16824.30773.125766.220826@robur.slu.se> References: <16824.30773.125766.220826@robur.slu.se> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-uCfZOX+FjD9Ul1SAZcD0" Message-Id: <1102695576.12078.22.camel@tux.rsn.bth.se> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 Date: Fri, 10 Dec 2004 17:19:36 +0100 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-Virus-Status: Clean X-archive-position: 12643 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev --=-uCfZOX+FjD9Ul1SAZcD0 Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Thu, 2004-12-09 at 17:07, Robert Olsson wrote: > Hello! > > Seems e1000 never gets into poll mode when tx_cleaned is false. Compare > irq's on RX interfaces eth0, eth2 in the forwarding test below. 
Yes, we discussed this in private some time ago, unfortunately I haven't had any proper hardware to test my changes on :( > --- drivers/net/e1000/e1000_main.c.orig 2004-12-09 17:49:56.000000000 +0100 > +++ drivers/net/e1000/e1000_main.c 2004-12-09 19:05:07.000000000 +0100 > @@ -2179,8 +2179,8 @@ > *budget -= work_done; > netdev->quota -= work_done; > > - /* if no Rx and Tx cleanup work was done, exit the polling mode */ > - if(!tx_cleaned || (work_done < work_to_do) || > + /* if no Tx and not enough Rx work done, exit the polling mode */ > + if((!tx_cleaned && (work_done < work_to_do)) || > !netif_running(netdev)) { > netif_rx_complete(netdev); > e1000_irq_enable(adapter); This patch is broken. What about the case where tx_cleaned is true but (work_done < work_to_do) is true. Then the statement is false and we continue, later we return (work_done >= work_to_do) which equals 0 (not seen in patch). We are not allowed to return 0 when still on poll_list. That will sort of continue polling in a degraded way (no increase in quota) but will screw up device refcounting badly. 
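The case described above can be checked with a small model of the exit test and the final return. This is only a sketch under stated assumptions: the function names are hypothetical, netif_running() is reduced to a boolean argument, and nothing here is actual e1000 driver code. It shows that with the patched condition and the unchanged final return, the poll routine can return 0 without having left the poll list.

```c
#include <stdbool.h>

/* Exit test as written in the patch under review. "running" stands in
 * for netif_running(netdev); all names here are illustrative only. */
static bool patched_exits_polling(bool tx_cleaned, int work_done,
                                  int work_to_do, bool running)
{
    return (!tx_cleaned && (work_done < work_to_do)) || !running;
}

/* Return value of the poll routine when the final statement is still
 * "return (work_done >= work_to_do);". A return of 0 is only valid
 * after the device has been taken off the poll list. */
static int poll_return_unfixed(bool tx_cleaned, int work_done,
                               int work_to_do, bool running)
{
    if (patched_exits_polling(tx_cleaned, work_done, work_to_do, running))
        return 0;   /* netif_rx_complete() path: left the poll list */
    return work_done >= work_to_do;
}

/* Same routine with the final statement changed to "return 1;". */
static int poll_return_fixed(bool tx_cleaned, int work_done,
                             int work_to_do, bool running)
{
    if (patched_exits_polling(tx_cleaned, work_done, work_to_do, running))
        return 0;   /* netif_rx_complete() path: left the poll list */
    return 1;       /* still on the poll list, so ask to be polled again */
}
```

With tx_cleaned true and work_done below work_to_do, the exit test is false, yet poll_return_unfixed() yields 0; poll_return_fixed() yields 1 for the same input.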
That final return must be changed into "return 1;" -- /Martin --=-uCfZOX+FjD9Ul1SAZcD0 Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQBBucyXWm2vlfa207ERAoHvAJ95AJaO5ynq4ESESF+hIWjMpGJcaQCdHaFM hO7YNHXsk8PWhJHEtqJSl6I= =KFtZ -----END PGP SIGNATURE----- --=-uCfZOX+FjD9Ul1SAZcD0-- From gandalf@wlug.westbo.se Fri Dec 10 08:24:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 08:24:45 -0800 (PST) Received: from null.rsn.bth.se (postfix@null.rsn.bth.se [194.47.142.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAGObHL026470 for ; Fri, 10 Dec 2004 08:24:37 -0800 Received: by null.rsn.bth.se (Postfix, from userid 65534) id AEFD02C002C; Fri, 10 Dec 2004 17:24:13 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by null.rsn.bth.se (Postfix) with ESMTP id 2BB5B2C006B; Fri, 10 Dec 2004 17:24:13 +0100 (CET) Received: from tux.rsn.bth.se (tux.rsn.bth.se [194.47.143.135]) by null.rsn.bth.se (Postfix) with ESMTP id 5E6412C002C; Fri, 10 Dec 2004 17:24:12 +0100 (CET) Received: by tux.rsn.bth.se (Postfix, from userid 501) id 081343FA7; Fri, 10 Dec 2004 17:24:13 +0100 (CET) Subject: Re: [E1000-devel] Transmission limit From: Martin Josefsson To: Robert Olsson Cc: sfeldma@pobox.com, Lennert Buytenhek , jamal , P@draigBrady.com, mellia@prezzemolo.polito.it, e1000-devel@lists.sourceforge.net, Jorge Manuel Finochietto , Giulio Galante , netdev@oss.sgi.com In-Reply-To: <16815.23964.93437.411404@robur.slu.se> References: <1101467291.24742.70.camel@mellia.lipar.polito.it> <41A73826.3000109@draigBrady.com> <16807.20052.569125.686158@robur.slu.se> <1101484740.24742.213.camel@mellia.lipar.polito.it> <41A76085.7000105@draigBrady.com> <1101499285.1079.45.camel@jzny.localdomain> <16811.8052.678955.795327@robur.slu.se> <1101821501.1043.43.camel@jzny.localdomain> <20041130134600.GA31515@xi.wantstofly.org> 
<1101824754.1044.126.camel@jzny.localdomain> <20041201001107.GE4203@xi.wantstofly.org> <1101863399.4663.54.camel@sfeldma-mobl.dsl-verizon.net> <16813.58484.343629.570703@robur.slu.se> <1101919791.5198.15.camel@localhost.localdomain> <16815.23964.93437.411404@robur.slu.se> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-GXGZyDMmSDaXyqjouBHS" Message-Id: <1102695852.12078.27.camel@tux.rsn.bth.se> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 Date: Fri, 10 Dec 2004 17:24:12 +0100 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new-20030616-p10 on null.rsn.bth.se X-Virus-Status: Clean X-archive-position: 12644 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: gandalf@wlug.westbo.se Precedence: bulk X-list: netdev --=-GXGZyDMmSDaXyqjouBHS Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Thu, 2004-12-02 at 19:23, Robert Olsson wrote: > Hello! > > Below is little patch to clean skb at xmit. It's old jungle trick Jamal > and I used w. tulip. Note we can now even decrease the size of TX ring. Just a small unimportant note. 
> --- drivers/net/e1000/e1000_main.c.orig 2004-12-01 13:59:36.000000000 +0100 > +++ drivers/net/e1000/e1000_main.c 2004-12-02 20:37:40.000000000 +0100 > @@ -1820,6 +1820,10 @@ > return NETDEV_TX_LOCKED; > } > > + > + if( adapter->tx_ring.next_to_use - adapter->tx_ring.next_to_clean > 80 ) > + e1000_clean_tx_ring(adapter); > + > /* need: count + 2 desc gap to keep tail from touching > * head, otherwise try next time */ > if(E1000_DESC_UNUSED(&adapter->tx_ring) < count + 2) { This patch is pretty broken. I doubt you want to call e1000_clean_tx_ring(); I think you want some variant of e1000_clean_tx_irq() :) e1000_clean_tx_irq() takes adapter->tx_lock, which e1000_xmit_frame() also does, so it will need some modification. And it should use E1000_DESC_UNUSED as Scott pointed out. -- /Martin --=-GXGZyDMmSDaXyqjouBHS Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQBBuc2sWm2vlfa207ERAjcUAJ90ZQvLNaEHUD+XZa1vk4fQWxLUhQCfev61 Ls4tqaYcwJJpjd1D8ojBLHs= =8w/r -----END PGP SIGNATURE----- --=-GXGZyDMmSDaXyqjouBHS-- From Robert.Olsson@data.slu.se Fri Dec 10 09:29:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 09:29:17 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAHTBld029042 for ; Fri, 10 Dec 2004 09:29:12 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iBAHSmOB028753; Fri, 10 Dec 2004 18:28:48 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id 3DBD9EC001; Fri, 10 Dec 2004 18:28:48 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16825.56528.212904.417227@robur.slu.se> Date: Fri, 10 Dec 2004 18:28:48 +0100 To: Martin Josefsson Cc: Robert Olsson , netdev@oss.sgi.com 
Subject: Re: [PATCH] e1000 poll behavior In-Reply-To: <1102695576.12078.22.camel@tux.rsn.bth.se> References: <16824.30773.125766.220826@robur.slu.se> <1102695576.12078.22.camel@tux.rsn.bth.se> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12645 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Martin Josefsson writes: > What about the case where tx_cleaned is true but (work_done < > work_to_do) is true. Then the statement is false and we continue, later > we return (work_done >= work_to_do) which equals 0 (not seen in patch). > We are not allowed to return 0 when still on poll_list. That will sort > of continue polling in a degraded way (no increase in quota) but will > screw up device refcounting badly. > > That final return must be changed into "return 1;" Ohoh thanks yes... --- drivers/net/e1000/e1000_main.c.orig 2004-12-09 17:49:56.000000000 +0100 +++ drivers/net/e1000/e1000_main.c 2004-12-10 20:13:57.000000000 +0100 @@ -2179,15 +2179,15 @@ *budget -= work_done; netdev->quota -= work_done; - /* if no Rx and Tx cleanup work was done, exit the polling mode */ - if(!tx_cleaned || (work_done < work_to_do) || + /* if no Tx and not enough Rx work done, exit the polling mode */ + if((!tx_cleaned && (work_done < work_to_do)) || !netif_running(netdev)) { netif_rx_complete(netdev); e1000_irq_enable(adapter); return 0; } - return (work_done >= work_to_do); + return 1; } #endif Same experiment again. 
Iface   MTU Met    RX-OK   RX-ERR   RX-DRP   RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flags
eth0   1500   0  4351378  8252365  8252365  5648874      155      0      0      0 BRU
eth1   1500   0        1        0        0        0  4350771      0      0      0 BRU
eth2   1500   0  4352289  8383841  8383841  5647711        5      0      0      0 BRU
eth3   1500   0        1        0        0        0  4352126      0      0      0 BRU

           CPU0
 26:        434   IO-APIC-level  eth0
 27:        109   IO-APIC-level  eth1
 28:        112   IO-APIC-level  eth2
 29:        112   IO-APIC-level  eth3

--ro
From holt@lnx-holt.americas.sgi.com Fri Dec 10 11:02:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 11:02:22 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAJ293i002466 for ; Fri, 10 Dec 2004 11:02:10 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iBAKMp3x030386 for ; Fri, 10 Dec 2004 12:22:51 -0800 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iBAJ0cCK4065186; Fri, 10 Dec 2004 13:00:38 -0600 (CST) Received: from lnx-holt.americas.sgi.com (IDENT:U2FsdGVkX1/knrXRHH66qiR+7oIOsI1w8U8Dl8gKfEU@lnx-holt.americas.sgi.com [128.162.233.109]) by thistle-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iBAJ0atC17149881; Fri, 10 Dec 2004 13:00:36 -0600 (CST) Received: from lnx-holt.americas.sgi.com (localhost.localdomain [127.0.0.1]) by lnx-holt.americas.sgi.com (8.13.1/8.12.11) with ESMTP id iBAJ0Yj8022332; Fri, 10 Dec 2004 13:00:34 -0600 Received: (from holt@localhost) by lnx-holt.americas.sgi.com (8.13.1/8.13.1/Submit) id iBAJ0PDV022330; Fri, 10 Dec 2004 13:00:25 -0600 Date: Fri, 10 Dec 2004 13:00:25 -0600 From: Robin Holt To: yoshfuji@linux-ipv6.org, akpm@osdl.org, hirofumi@parknet.co.jp, davem@davemloft.net, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com Cc: netdev@oss.sgi.com, 
linux-kernel@vger.kernel.org, ^Greg Banks Subject: [RFC] Limit the size of the IPV4 route hash. Message-ID: <20041210190025.GA21116@lnx-holt.americas.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12646 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: holt@sgi.com Precedence: bulk X-list: netdev I have sent a couple emails concerning the IPv4 route hash size in the past week with no response. I am now sending to everyone that has made changes to the net/ipv4/route.c file in the last six months to hopefully get some direction. Sorry for the wide net, but I do not know how to proceed. The first post was asking for direction on the maximum size for the route cache. The link is here: (NOTE: I never saw this come back from the netdev list) What is a reasonable upper limit to the rt_hash_table. http://marc.theaimsgroup.com/?l=linux-kernel&m=110244057617765&w=2 I then did some testing/experimenting with systems that are in production, determined the size calculation is definitely too large and then came to the following conclusion: Limit the route hash size. http://marc.theaimsgroup.com/?l=linux-kernel&m=110260977405809&w=2 In the second, I included the patch, but did not intend this to be a patch submission. Sorry for the Signed-off-by. Where do I go from here? I hate to just submit this as a patch without any other discussion. I have checked route cache size on many machines and they have all been in the 30-100 range except for some on ISP machines that are serving web pages where I have seen three machines with a cache size of up to 800 entries. And one university email server where they have set the secret_interval to 86,400 which has peaked at 18,434 entries. 
With those sizes noted, the cache size of one page does not appear to have any negative impact for any except the email server. For that machine, they have already reviewed the code and decided to adjust tunable values so I can not believe they would be upset with having to provide an rhash_entries= append on the boot line. Are there any benchmarks I am supposed to run prior to asking for this patch to be incorporated? Any guidance would be greatly appreciated. Thank you for your attention. Again, sorry for the wide net. Robin Holt From davem@davemloft.net Fri Dec 10 11:55:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 11:55:21 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAJtCTh004739 for ; Fri, 10 Dec 2004 11:55:12 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Ccqkf-00012a-00; Fri, 10 Dec 2004 11:48:29 -0800 Date: Fri, 10 Dec 2004 11:48:29 -0800 From: "David S. Miller" To: Robin Holt Cc: yoshfuji@linux-ipv6.org, akpm@osdl.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. 
Message-Id: <20041210114829.034e02eb.davem@davemloft.net> In-Reply-To: <20041210190025.GA21116@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12647 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 13:00:25 -0600 Robin Holt wrote: > I then did some testing/experimenting with systems that are in production, > determined the size calculation is definitely too large and then came > to the following conclusion: > > Limit the route hash size. > http://marc.theaimsgroup.com/?l=linux-kernel&m=110260977405809&w=2 > > In the second, I included the patch, but did not intend this to be a > patch submission. Sorry for the Signed-off-by. > > Where do I go from here? I hate to just submit this as a patch without > any other discussion. Sometimes we have to just sit and be content with the fact that nobody is inspired by our work enough to respond. :-) It usually means that people aren't too thrilled with your patch, but don't feel any impetus to mention why. I can definitely say that just forcing it to use 1 page is wrong. Even ignoring your tests, your test was on a system that has 16K PAGE_SIZE. Other systems use 4K and 8K (and other) PAGE_SIZE values. This is why we make our calculations relative to PAGE_SHIFT. Also, 1 page even in your case is (assuming you are on a 64-bit platform, you didn't mention) going to give us 1024 hash chains. 
A reasonably busy web server will easily be talking to more than 1K unique hosts at a given point in time. This is especially true as slow long distance connections bunch up. Alexey Kuznetsov needed to use more than one page on his lowly i386 router in Russia, and this was circa 7 or 8 years ago. People are pretty happy with the current algorithm, and in fact most people ask us to increase its value, not decrease it :-) From holt@lnx-holt.americas.sgi.com Fri Dec 10 13:00:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 13:00:47 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAL0cGB010425 for ; Fri, 10 Dec 2004 13:00:39 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iBAMLK2c024395 for ; Fri, 10 Dec 2004 14:21:21 -0800 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iBAL0FCM4071809; Fri, 10 Dec 2004 15:00:16 -0600 (CST) Received: from lnx-holt.americas.sgi.com (IDENT:U2FsdGVkX18ClokZbTqIBl40Zi/Tox8rjeGMutqDFrk@lnx-holt.americas.sgi.com [128.162.233.109]) by thistle-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iBAL0EtC17173912; Fri, 10 Dec 2004 15:00:14 -0600 (CST) Received: from lnx-holt.americas.sgi.com (localhost.localdomain [127.0.0.1]) by lnx-holt.americas.sgi.com (8.13.1/8.12.11) with ESMTP id iBAL0Ciq023591; Fri, 10 Dec 2004 15:00:12 -0600 Received: (from holt@localhost) by lnx-holt.americas.sgi.com (8.13.1/8.13.1/Submit) id iBAL06Ai023590; Fri, 10 Dec 2004 15:00:06 -0600 Date: Fri, 10 Dec 2004 15:00:06 -0600 From: Robin Holt To: "David S. 
Miller" Cc: Robin Holt , yoshfuji@linux-ipv6.org, akpm@osdl.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. Message-ID: <20041210210006.GB23222@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041210114829.034e02eb.davem@davemloft.net> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12648 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: holt@sgi.com Precedence: bulk X-list: netdev > > I can definitely say that just forcing it to use 1 page is wrong. > Even ignoring your tests, your test was on a system that has 16K > PAGE_SIZE. Other systems use 4K and 8K (and other) PAGE_SIZE > values. This is why we make our calculations relative to PAGE_SHIFT. I picked 1 page because it was not some magic small number that I would need to justify. I also hoped that it would incite someone to respond. > > Also, 1 page even in your case is (assuming you are on a 64-bit platform, > you didn't mention) going to give us 1024 hash chains. A reasonably > busy web server will easily be talking to more than 1K unique hosts at > a given point in time. This is especially true as slow long distance > connections bunch up. But 1k hosts is not the limit with a 16k page. There are 1k buckets, but each is a list. A reasonably well designed hash will scale to greater than one item per bucket. 
Additionally, for the small percentage of web servers with enough network traffic that they will be affected by the depth of the entries, they can set rhash_entries for their specific needs. > > Alexey Kuznetsov needed to use more than one page on his lowly > i386 router in Russia, and this was circa 7 or 8 years ago. And now he, for his special case, could set rhash_entries to increase the size. I realize I have a special case which highlighted the problem. My case shows that not putting an upper limit or at least a drastically aggressive non-linear growth cap does cause issues. For the really large system, we were seeing a size of 512MB for the hash which was limited because that was the largest amount of memory available on a single node. I can not ever imagine this being a reasonable limit. Not with 512 cpus and 1024 network adapters could I envision that this level of hashing would actually be advantageous given all the other lock contention that will be seen. Can we agree that a linear calculation based on num_physpages is probably not the best algorithm. If so, should we make it a linear to a limit or a logarithmically decreasing size to a limit? How do we determine that limit point? 
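To make the sizing question concrete, here is a minimal sketch comparing three policies: purely linear in memory, linear up to a cap, and logarithmic in the number of pages. All function names and constants are illustrative assumptions and are not taken from net/ipv4/route.c; in particular, the shift value used in the note below was picked only so that a hypothetical 2 TiB machine reproduces the 512MB figure mentioned above.

```c
#include <assert.h>
#include <stdint.h>

/* Linear policy: one byte of hash table per 2^shift bytes of RAM.
 * The shift is an illustrative parameter, not the kernel's constant. */
static uint64_t linear_table_bytes(uint64_t mem_bytes, unsigned shift)
{
    assert(shift < 64);
    return mem_bytes >> shift;
}

/* Linear growth that stops at a fixed cap. */
static uint64_t capped_table_bytes(uint64_t mem_bytes, unsigned shift,
                                   uint64_t cap_bytes)
{
    uint64_t bytes = linear_table_bytes(mem_bytes, shift);

    return bytes > cap_bytes ? cap_bytes : bytes;
}

/* floor(log2(x)) for x > 0. */
static unsigned floor_log2(uint64_t x)
{
    unsigned r = 0;

    assert(x > 0);
    while ((x >>= 1) != 0)
        r++;
    return r;
}

/* Logarithmic policy: base_pages plus pages_per_doubling pages for each
 * doubling of the number of physical pages. */
static uint64_t log_table_bytes(uint64_t mem_bytes, uint64_t page_size,
                                uint64_t base_pages,
                                uint64_t pages_per_doubling)
{
    uint64_t num_pages = mem_bytes / page_size;

    assert(num_pages > 0);
    return (base_pages + pages_per_doubling * floor_log2(num_pages))
           * page_size;
}
```

With a shift of 12, 2 TiB of memory gives a 512 MiB table under the linear policy, 32 MiB with a 32 MiB cap, and 28 pages of 16 KiB under the logarithmic policy with one base page and one page per doubling. A 1 GiB machine gets 256 KiB under both linear variants, so a cap only changes behaviour on very large systems.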
From akpm@osdl.org Fri Dec 10 13:11:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 13:11:20 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBALBC3C011162 for ; Fri, 10 Dec 2004 13:11:15 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iBALA2908635; Fri, 10 Dec 2004 13:10:02 -0800 Date: Fri, 10 Dec 2004 13:09:47 -0800 From: Andrew Morton To: Robin Holt Cc: davem@davemloft.net, holt@sgi.com, yoshfuji@linux-ipv6.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. Message-Id: <20041210130947.1d945422.akpm@osdl.org> In-Reply-To: <20041210210006.GB23222@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> <20041210210006.GB23222@lnx-holt.americas.sgi.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12649 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Robin Holt wrote: > > I realize I have a special case which highlighted the problem. My case > shows that not putting an upper limit or at least a drastically aggressive > non-linear growth cap does cause issues. For the really large system, > we were seeing a size of 512MB for the hash which was limited because > that was the largest amount of memory available on a single node. 
I can > not ever imagine this being a reasonable limit. Not with 512 cpus and > 1024 network adapters could I envision that this level of hashing would > actually be advantageous given all the other lock contention that will > be seen. Half a gig for the hashtable does seem a bit nutty. > Can we agree that a linear calculation based on num_physpages is probably > not the best algorithm. If so, should we make it a linear to a limit or > a logarithmically decreasing size to a limit? How do we determine that > limit point? An initial default of N + M * log2(num_physpages) would probably give a saner result. The big risk is that someone has a too-small table for some specific application and their machine runs more slowly than it should, but they never notice. I wonder if it would be possible to put a little once-only printk into the routing code: "warning route-cache chain exceeded 100 entries: consider using the rhash_entries boot option". From davem@davemloft.net Fri Dec 10 13:13:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 13:13:11 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBALD5qM011666 for ; Fri, 10 Dec 2004 13:13:06 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CcryF-0001Qu-00; Fri, 10 Dec 2004 13:06:35 -0800 Date: Fri, 10 Dec 2004 13:06:34 -0800 From: "David S. Miller" To: Robin Holt Cc: holt@sgi.com, yoshfuji@linux-ipv6.org, akpm@osdl.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. 
Message-Id: <20041210130634.251c46f9.davem@davemloft.net> In-Reply-To: <20041210210006.GB23222@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> <20041210210006.GB23222@lnx-holt.americas.sgi.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12650 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 15:00:06 -0600 Robin Holt wrote: > > Also, 1 page even in your case is (assuming you are on a 64-bit platform, > > you didn't mention) going to give us 1024 hash chains. A reasonably > > busy web server will easily be talking to more than 1K unique hosts at > > a given point in time. This is especially true as slow long distance > > connections bunch up. > > But 1k hosts is not the limit with a 16k page. There are 1k buckets, > but each is a list. A reasonably well designed hash will scale to greater > than one item per bucket. Additionally, for the small percentage of web > servers with enough network traffic that they will be affected by the > depth of the entries, they can set rhash_entries for their specific needs. We want to aim for a depth of 1 in each chain, so that, assuming the hash is decent, we'll achieve O(1) lookup complexity. That is why we want the number of chains to be at least as large as the number of active routing cache entries we'll work with. > I realize I have a special case which highlighted the problem. 
My case > shows that not putting an upper limit or at least a drastically aggressive > non-linear growth cap does cause issues. For the really large system, > we were seeing a size of 512MB for the hash which was limited because > that was the largest amount of memory available on a single node. That's true, 512MB is just too much. So let's define some reasonable default cap like 16MB or 32MB, and, as is currently the case, it is overridable via rhash_entries. From buytenh@wantstofly.org Fri Dec 10 14:21:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 14:21:28 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAMLMbx013943 for ; Fri, 10 Dec 2004 14:21:23 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 36C822B0EC; Fri, 10 Dec 2004 23:20:59 +0100 (MET) Date: Fri, 10 Dec 2004 23:20:58 +0100 From: Lennert Buytenhek To: robert.olsson@data.slu.se Cc: netdev@oss.sgi.com Subject: [PATCH,RFC] pktgen sleeping/timing rework Message-ID: <20041210222058.GA5984@xi.wantstofly.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12651 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev Hi Robert, Below is a patch against your latest pktgen devel version which does the following: - Remove usage of get_cycles(), mhz calibration, pg_cycles_per_*, getRelativeCur[UM]s(), and everything related to reading the CPU cycle counter directly. For one, it can't work correctly on machines that can change clock frequency on the fly (such as the Via EPIA-based machine I'm working on right now), and since you only determine 'clocks per usec' (i.e. 
determining the mhz rating of the CPU), it's not even any more accurate than do_gettimeofday already is. - Fix up handling of inter-packet gap. There was some code for enforcing a certain IPG, but the actual (CPU) cost of the sending of the packet was not subtracted from that, so the effective IPG ended up being IPG+random_overhead. This now works somewhat better and the effective pps rate does seem to end up being ~ 1/ipg, but this needs some more testing and tweaking. - Call schedule_timeout whenever we have to wait longer than a jiffy. For me this nicely reduces the CPU overhead when sending out 100pps to almost nothing. TODO: - When ipg is set close to the HW limit, behavior becomes a bit wonky. On my EPIA board I can generate ~110kpps when I set ipg=0, but when I set ipg=10000 (which would give 100kpps), I only get ~85kpps. - We should rename 'ipg' to something different since in our case it never was and never will be the same as what the ethernet folks mean with it. Comments welcome. 
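The inter-packet gap point in the list above can be illustrated with a small model. This is a sketch under stated assumptions, not code from the patch: the struct and function names are hypothetical, time is a plain nanosecond counter, and the per-packet send cost is treated as constant. Only the split of the nanosecond value into a microsecond part and a nanosecond remainder mirrors what the patch below does with ipg_us and ipg_ns.

```c
#include <stdint.h>

/* A gap given in nanoseconds, split into whole microseconds plus a
 * nanosecond remainder (illustrative type, not from pktgen). */
struct ipg_parts {
    uint32_t us;
    uint32_t ns;
};

static struct ipg_parts split_ipg(uint32_t ipg_total_ns)
{
    struct ipg_parts p = { ipg_total_ns / 1000, ipg_total_ns % 1000 };

    return p;
}

/* Uncompensated pacing: the gap is counted from the moment the previous
 * send finished, so every period costs send_cost_ns + ipg_ns.
 * Returns the time at which packet number n (0-based) starts. */
static uint64_t nth_start_uncompensated(unsigned n, uint64_t ipg_ns,
                                        uint64_t send_cost_ns)
{
    uint64_t t = 0;

    for (unsigned i = 0; i < n; i++)
        t += send_cost_ns + ipg_ns;
    return t;
}

/* Compensated pacing: each deadline is the previous deadline plus
 * ipg_ns, so the send cost is absorbed while it stays below the gap. */
static uint64_t nth_start_compensated(unsigned n, uint64_t ipg_ns,
                                      uint64_t send_cost_ns)
{
    uint64_t now = 0;
    uint64_t deadline = 0;

    for (unsigned i = 0; i < n; i++) {
        if (now < deadline)
            now = deadline;      /* wait for the scheduled slot */
        now += send_cost_ns;     /* time spent sending the packet */
        deadline += ipg_ns;      /* next slot is relative to this slot */
    }
    return now > deadline ? now : deadline;
}
```

With a 10000 ns gap and a 2000 ns send cost, the fourth packet starts at 36000 ns without compensation and at 30000 ns with it, which matches the "IPG+random_overhead" versus "~ 1/ipg" behaviour described above. When the send cost exceeds the gap, the compensated variant simply runs at the rate the send cost allows.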
cheers, Lennert --- pktgen.c.041209 2004-12-11 21:21:26.000000000 +0100 +++ pktgen.c 2004-12-11 22:57:51.221340048 +0100 @@ -222,14 +222,16 @@ int min_pkt_size; /* = ETH_ZLEN; */ int max_pkt_size; /* = ETH_ZLEN; */ int nfrags; - __u32 ipg; /* Default Interpacket gap in nsec */ + __u32 ipg_us; /* Default Interpacket gap */ + __u32 ipg_ns; __u64 count; /* Default No packets to send */ __u64 sofar; /* How many pkts we've sent so far */ __u64 tx_bytes; /* How many bytes we've transmitted */ __u64 errors; /* Errors when trying to transmit, pkts will be re-sent */ /* runtime counters relating to clone_skb */ - __u64 next_tx_ns; /* timestamp of when to tx next, in nano-seconds */ + __u64 next_tx_us; /* timestamp of when to tx next */ + __u32 next_tx_ns; __u64 allocated_skbs; __u32 clone_count; @@ -239,7 +241,7 @@ */ __u64 started_at; /* micro-seconds */ __u64 stopped_at; /* micro-seconds */ - __u64 idle_acc; + __u64 idle_acc; /* micro-seconds */ __u32 seq_num; int clone_skb; /* Use multiple SKBs during packet gen. If this number @@ -346,10 +348,6 @@ #define REMOVE 1 #define FIND 0 -static u32 pg_cycles_per_ns; -static u32 pg_cycles_per_us; -static u32 pg_cycles_per_ms; - /* This code works around the fact that do_div cannot handle two 64-bit numbers, and regular 64-bit division doesn't work on x86 kernels. --Ben @@ -452,14 +450,6 @@ #endif } -/* Fast, not horribly accurate, since the machine started. */ -static inline __u64 getRelativeCurMs(void) { - return pg_div(get_cycles(), pg_cycles_per_ms); -} - -/* Since the epoc. More precise over long periods of time than - * getRelativeCurMs - */ static inline __u64 getCurMs(void) { struct timeval tv; @@ -467,9 +457,6 @@ return tv_to_ms(&tv); } -/* Since the epoc. More precise over long periods of time than - * getRelativeCurMs - */ static inline __u64 getCurUs(void) { struct timeval tv; @@ -477,18 +464,6 @@ return tv_to_us(&tv); } -/* Since the machine booted. 
*/ -static inline __u64 getRelativeCurUs(void) -{ - return pg_div(get_cycles(), pg_cycles_per_us); -} - -/* Since the machine booted. */ -static inline __u64 getRelativeCurNs(void) -{ - return pg_div(get_cycles(), pg_cycles_per_ns); -} - static inline __u64 tv_diff(const struct timeval* a, const struct timeval* b) { return tv_to_us(a) - tv_to_us(b); @@ -523,11 +498,6 @@ static unsigned int scan_ip6(const char *s,char ip[16]); static unsigned int fmt_ip6(char *s,const char ip[16]); -/* cycles per micro-second */ -static u32 pg_cycles_per_ns; -static u32 pg_cycles_per_us; -static u32 pg_cycles_per_ms; - /* Module parameters, defaults. */ static int pg_count_d = 1000; /* 1000 pkts by default */ static int pg_ipg_d = 0; @@ -646,7 +616,7 @@ pkt_dev->count, pkt_dev->min_pkt_size, pkt_dev->max_pkt_size); p += sprintf(p, " frags: %d ipg: %u clone_skb: %d ifname: %s\n", - pkt_dev->nfrags, pkt_dev->ipg, pkt_dev->clone_skb, pkt_dev->ifname); + pkt_dev->nfrags, 1000*pkt_dev->ipg_us+pkt_dev->ipg_ns, pkt_dev->clone_skb, pkt_dev->ifname); p += sprintf(p, " flows: %u flowlen: %u\n", pkt_dev->cflows, pkt_dev->lflow); @@ -718,7 +688,7 @@ p += sprintf(p, "Current:\n pkts-sofar: %llu errors: %llu\n started: %lluus stopped: %lluus idle: %lluus\n", pkt_dev->sofar, pkt_dev->errors, sa, stopped, - pg_div(pkt_dev->idle_acc, pg_cycles_per_us)); + pkt_dev->idle_acc); p += sprintf(p, " seq_num: %d cur_dst_mac_offset: %d cur_src_mac_offset: %d\n", pkt_dev->seq_num, pkt_dev->cur_dst_mac_offset, pkt_dev->cur_src_mac_offset); @@ -932,11 +902,14 @@ len = num_arg(&user_buffer[i], 10, &value); if (len < 0) { return len; } i += len; - pkt_dev->ipg = value; - if ((getRelativeCurNs() + pkt_dev->ipg) > pkt_dev->next_tx_ns) { - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; - } - sprintf(pg_result, "OK: ipg=%u", pkt_dev->ipg); + if (value == 0x7FFFFFFF) { + pkt_dev->ipg_us = 0x7FFFFFFF; + pkt_dev->ipg_ns = 0; + } else { + pkt_dev->ipg_us = value / 1000; + pkt_dev->ipg_ns = value % 1000; + } + 
sprintf(pg_result, "OK: ipg=%u", 1000*pkt_dev->ipg_us+pkt_dev->ipg_ns); return count; } if (!strcmp(name, "udp_src_min")) { @@ -1732,108 +1705,32 @@ pkt_dev->nflows = 0; } -/* ipg is in nano-seconds */ -static void nanospin(__u32 ipg, struct pktgen_dev *pkt_dev) +static void spin(struct pktgen_dev *pkt_dev, __u64 spin_until_us) { - u64 idle_start = get_cycles(); - u64 idle; + __u64 start; + __u64 now; - for (;;) { - barrier(); - idle = get_cycles() - idle_start; - if (idle * 1000 >= ipg * pg_cycles_per_us) - break; - } - pkt_dev->idle_acc += idle; -} - - -/* ipg is in micro-seconds (usecs) */ -static void pg_udelay(__u32 delay_us, struct pktgen_dev *pkt_dev) -{ - u64 start = getRelativeCurUs(); - u64 now; - - for (;;) { - do_softirq(); - now = getRelativeCurUs(); - if (start + delay_us <= (now - 10)) - break; + start = now = getCurUs(); + printk(KERN_INFO "sleeping for %d\n", (int)(spin_until_us - now)); + while (now < spin_until_us) { + /* TODO: optimise sleeping behavior */ + if (spin_until_us - now > (1000000/HZ)+1) { + current->state = TASK_INTERRUPTIBLE; + schedule_timeout(1); + } else if (spin_until_us - now > 100) { + do_softirq(); + if (!pkt_dev->running) + return; + if (need_resched()) + schedule(); + } - if (!pkt_dev->running) - return; - - if (need_resched()) - schedule(); - - now = getRelativeCurUs(); - if (start + delay_us <= (now - 10)) - break; + now = getCurUs(); } - pkt_dev->idle_acc += (1000 * (now - start)); - - /* We can break out of the loop up to 10us early, so spend the rest of - * it spinning to increase accuracy. 
- */ - if (start + delay_us > now) - nanospin((start + delay_us) - now, pkt_dev); + pkt_dev->idle_acc += now - start; } -/* Returns: cycles per micro-second */ -static int calc_mhz(void) -{ - struct timeval start, stop; - u64 start_s; - u64 t1, t2; - u32 elapsed; - u32 clock_time = 0; - - do_gettimeofday(&start); - start_s = get_cycles(); - /* Spin for 50,000,000 cycles */ - do { - barrier(); - elapsed = (u32)(get_cycles() - start_s); - if (elapsed == 0) - return 0; - } while (elapsed < 50000000); - - do_gettimeofday(&stop); - - t1 = tv_to_us(&start); - t2 = tv_to_us(&stop); - - clock_time = (u32)(t2 - t1); - if (clock_time == 0) { - printk("pktgen: ERROR: clock_time was zero..things may not work right, t1: %u t2: %u ...\n", - (u32)(t1), (u32)(t2)); - return 0x7FFFFFFF; - } - return elapsed / clock_time; -} - -/* Calibrate cycles per micro-second */ -static void cycles_calibrate(void) -{ - int i; - - for (i = 0; i < 3; i++) { - u32 res = calc_mhz(); - if (res > pg_cycles_per_us) - pg_cycles_per_us = res; - } - - /* Set these up too, only need to calculate these once. */ - pg_cycles_per_ns = pg_cycles_per_us / 1000; - if (pg_cycles_per_ns == 0) - pg_cycles_per_ns = 1; - - pg_cycles_per_ms = pg_cycles_per_us * 1000; - - printk("pktgen: cycles_calibrate, cycles_per_ns: %d per_us: %d per_ms: %d\n", - pg_cycles_per_ns, pg_cycles_per_us, pg_cycles_per_ms); -} /* Increment/randomize headers according to flags and current values * for IP src/dest, UDP src/dst port, MAC-Addr src/dst @@ -2455,7 +2352,8 @@ pkt_dev->running = 1; /* Cranke yeself! 
*/ pkt_dev->skb = NULL; pkt_dev->started_at = getCurUs(); - pkt_dev->next_tx_ns = 0; /* Transmit immediately */ + pkt_dev->next_tx_us = getCurUs(); /* Transmit immediately */ + pkt_dev->next_tx_ns = 0; strcpy(pkt_dev->result, "Starting"); started++; @@ -2568,17 +2466,13 @@ total_us = pkt_dev->stopped_at - pkt_dev->started_at; - BUG_ON(pg_cycles_per_us == 0); - idle = pkt_dev->idle_acc; - do_div(idle, pg_cycles_per_us); p += sprintf(p, "OK: %llu(c%llu+d%llu) usec, %llu (%dbyte,%dfrags)\n", total_us, (unsigned long long)(total_us - idle), idle, pkt_dev->sofar, pkt_dev->cur_pkt_size, nr_frags); pps = pkt_dev->sofar * USEC_PER_SEC; - while ((total_us >> 32) != 0) { pps >>= 1; @@ -2626,7 +2520,7 @@ for(next=t->if_list; next ; next=next->next) { if(!next->running) continue; if(best == NULL) best=next; - else if ( next->next_tx_ns < best->next_tx_ns) + else if ( next->next_tx_us < best->next_tx_us) best = next; } if_unlock(t); @@ -2692,46 +2586,29 @@ { struct net_device *odev = NULL; __u64 idle_start = 0; - u32 next_ipg = 0; - u64 now = 0; /* in nano-seconds */ int ret; odev = pkt_dev->odev; - if (pkt_dev->ipg) { - now = getRelativeCurNs(); - if (now < pkt_dev->next_tx_ns) { - next_ipg = (u32)(pkt_dev->next_tx_ns - now); - - /* Try not to busy-spin if we have larger sleep times. - * TODO: Investigate better ways to do this. - */ + if (pkt_dev->ipg_us || pkt_dev->ipg_ns) { + u64 now; - /* 10 usecs or less */ - if (next_ipg < 10000) - nanospin(next_ipg, pkt_dev); - - /* 10ms or less */ - else if (next_ipg < 10000000) - pg_udelay(next_ipg / 1000, pkt_dev); - - /* fall asleep for a 10ms or more. 
*/ - else - pg_udelay(next_ipg / 1000, pkt_dev); - } + now = getCurUs(); + if (now < pkt_dev->next_tx_us) + spin(pkt_dev, pkt_dev->next_tx_us); /* This is max IPG, this has special meaning of * "never transmit" */ - if (pkt_dev->ipg == 0x7FFFFFFF) { - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; + if (pkt_dev->ipg_us == 0x7FFFFFFF) { + pkt_dev->next_tx_us = getCurUs() + pkt_dev->ipg_us; + pkt_dev->next_tx_ns = pkt_dev->ipg_ns; goto out; } } if (netif_queue_stopped(odev) || need_resched()) { - - idle_start = get_cycles(); + idle_start = getCurUs(); if (!netif_running(odev)) { pktgen_stop_device(pkt_dev); @@ -2740,10 +2617,11 @@ if (need_resched()) schedule(); - pkt_dev->idle_acc += get_cycles() - idle_start; + pkt_dev->idle_acc += getCurUs() - idle_start; if (netif_queue_stopped(odev)) { - pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ + pkt_dev->next_tx_us = getCurUs(); /* TODO */ + pkt_dev->next_tx_ns = 0; goto out; /* Try the next interface */ } } @@ -2768,7 +2646,8 @@ spin_lock_bh(&odev->xmit_lock); if (!netif_queue_stopped(odev)) { - + u64 now; + atomic_inc(&(pkt_dev->skb->users)); retry_now: ret = odev->hard_start_xmit(pkt_dev->skb, odev); @@ -2789,16 +2668,32 @@ if (debug && net_ratelimit()) printk(KERN_INFO "pktgen: Hard xmit error\n"); - pkt_dev->errors++; - pkt_dev->last_ok = 0; - pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ + pkt_dev->errors++; + pkt_dev->last_ok = 0; + pkt_dev->next_tx_us = getCurUs(); /* TODO */ + pkt_dev->next_tx_ns = 0; + } + + pkt_dev->next_tx_us += pkt_dev->ipg_us; + pkt_dev->next_tx_ns += pkt_dev->ipg_ns; + if (pkt_dev->next_tx_ns > 1000) { + pkt_dev->next_tx_us++; + pkt_dev->next_tx_ns -= 1000; + } + + now = getCurUs(); + if (now > pkt_dev->next_tx_us) { + /* TODO: this code is slightly wonky. 
*/ + pkt_dev->errors++; + pkt_dev->next_tx_us = now - pkt_dev->ipg_us; + pkt_dev->next_tx_ns = 0; } - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; } else { /* Retry it next time */ pkt_dev->last_ok = 0; - pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ + pkt_dev->next_tx_us = getCurUs(); /* TODO */ + pkt_dev->next_tx_ns = 0; } spin_unlock_bh(&odev->xmit_lock); @@ -2806,14 +2701,14 @@ /* If pkt_dev->count is zero, then run forever */ if ((pkt_dev->count != 0) && (pkt_dev->sofar >= pkt_dev->count)) { if (atomic_read(&(pkt_dev->skb->users)) != 1) { - idle_start = get_cycles(); + idle_start = getCurUs(); while (atomic_read(&(pkt_dev->skb->users)) != 1) { if (signal_pending(current)) { break; } schedule(); } - pkt_dev->idle_acc += get_cycles() - idle_start; + pkt_dev->idle_acc += getCurUs() - idle_start; } /* Done with this */ @@ -3006,7 +2901,8 @@ pkt_dev->max_pkt_size = ETH_ZLEN; pkt_dev->nfrags = 0; pkt_dev->clone_skb = pg_clone_skb_d; - pkt_dev->ipg = pg_ipg_d; + pkt_dev->ipg_us = pg_ipg_d / 1000; + pkt_dev->ipg_ns = pg_ipg_d % 1000; pkt_dev->count = pg_count_d; pkt_dev->sofar = 0; pkt_dev->udp_src_min = 9; /* sink port */ @@ -3169,12 +3065,6 @@ module_fname[0] = 0; - cycles_calibrate(); - if (pg_cycles_per_us == 0) { - printk("pktgen: ERROR: your machine does not have working cycle counter.\n"); - return -EINVAL; - } - create_proc_dir(); sprintf(module_fname, "net/%s/pgctrl", PG_PROC_DIR); From buytenh@wantstofly.org Fri Dec 10 14:29:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 14:29:40 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAMTZ6n014592 for ; Fri, 10 Dec 2004 14:29:36 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 7BC542B0EC; Fri, 10 Dec 2004 23:29:13 +0100 (MET) Date: Fri, 10 Dec 2004 23:29:13 +0100 From: Lennert Buytenhek To: Ben Greear Cc: Robert Olsson , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: 
inter-packet gap in pktgen Message-ID: <20041210222913.GC5984@xi.wantstofly.org> References: <20041207222522.GA30266@xi.wantstofly.org> <41B632F3.1090104@candelatech.com> <20041208073858.GA4027@xi.wantstofly.org> <16822.54722.755218.745451@robur.slu.se> <41B756BE.3050504@candelatech.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B756BE.3050504@candelatech.com> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12652 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Wed, Dec 08, 2004 at 11:32:14AM -0800, Ben Greear wrote: > Actually, yes. My pktgen no longer busy-spins. [...] If you post a patch we can all perhaps have a look at it? --L From greearb@candelatech.com Fri Dec 10 15:02:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 15:02:21 -0800 (PST) Received: from www.lanforge.com (ns1.lanforge.com [66.165.47.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAN2DOq015891 for ; Fri, 10 Dec 2004 15:02:16 -0800 Received: from [4.33.45.22] (evrtwa1-ar2-4-33-045-022.evrtwa1.dsl-verizon.net [4.33.45.22]) (authenticated bits=0) by www.lanforge.com (8.12.8/8.12.8) with ESMTP id iBANEQLH026025; Fri, 10 Dec 2004 15:14:26 -0800 Message-ID: <41BA2AD9.7040404@candelatech.com> Date: Fri, 10 Dec 2004 15:01:45 -0800 From: Ben Greear Organization: Candela Technologies User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041020 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Lennert Buytenhek CC: Robert Olsson , hadi@cyberus.ca, netdev@oss.sgi.com Subject: Re: inter-packet gap in pktgen References: <20041207222522.GA30266@xi.wantstofly.org> <41B632F3.1090104@candelatech.com> <20041208073858.GA4027@xi.wantstofly.org> 
<16822.54722.755218.745451@robur.slu.se> <41B756BE.3050504@candelatech.com> <20041210222913.GC5984@xi.wantstofly.org> In-Reply-To: <20041210222913.GC5984@xi.wantstofly.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12653 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: greearb@candelatech.com Precedence: bulk X-list: netdev Lennert Buytenhek wrote: > On Wed, Dec 08, 2004 at 11:32:14AM -0800, Ben Greear wrote: > > >>Actually, yes. My pktgen no longer busy-spins. [...] > > > If you post a patch we can all perhaps have a look at it? My consolidated patch, which includes pktgen (with rx-pktgen logic), macvlans, my send-to-self patch, some e100 and e1000 diagnostics work, etc, can be found here: http://www.candelatech.com/oss/candela_2.6.9.patch The pktgen diff is about the same size as the pktgen.c file, so I also uploaded the pktgen files here: http://www.candelatech.com/oss/pktgen.h http://www.candelatech.com/oss/pktgen.c Enjoy, Ben -- Ben Greear Candela Technologies Inc http://www.candelatech.com From Robert.Olsson@data.slu.se Fri Dec 10 15:07:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 15:08:02 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBAN7uHt016519 for ; Fri, 10 Dec 2004 15:07:57 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iBAN7XOB011591; Sat, 11 Dec 2004 00:07:33 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id D816AEC001; Sat, 11 Dec 2004 00:07:33 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16826.11317.848178.621626@robur.slu.se> 
Date: Sat, 11 Dec 2004 00:07:33 +0100 To: Lennert Buytenhek Cc: robert.olsson@data.slu.se, netdev@oss.sgi.com Subject: [PATCH,RFC] pktgen sleeping/timing rework In-Reply-To: <20041210222058.GA5984@xi.wantstofly.org> References: <20041210222058.GA5984@xi.wantstofly.org> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/533/Sat Oct 16 18:09:44 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12654 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Lennert Buytenhek writes: > > Below is a patch against your latest pktgen devel version which does > the following: > - Remove usage of get_cycles(), mhz calibration, pg_cycles_per_*, > getRelativeCur[UM]s(), and everything related to reading the CPU > cycle counter directly. For one, it can't work correctly on > machines that can change clock frequency on the fly Yep, it's for fixed frequency. If it can be done for variable frequency without too much pain, that's fine. I haven't even thought about that. > - Fix up handling of inter-packet gap. There was some code for > enforcing a certain IPG, but the actual (CPU) cost of the sending > of the packet was not subtracted from that, so the effective IPG > ended up being IPG+random_overhead. This now works somewhat better Currently ipg is just a delay before hard_xmit, and tuning it is left to the tester. > - When ipg is set close to the HW limit, behavior becomes a bit wonky. > On my EPIA board I can generate ~110kpps when I set ipg=0, but when I > set ipg=10000 (which would give 100kpps), I only get ~85kpps. > - We should rename 'ipg' to something different since in our case it > never was and never will be the same as what the ethernet folks mean > with it. > > Comments welcome. Let me play with it. Hopefully at the beginning of the week. 
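A minimal userspace sketch of the timing arithmetic discussed above, assuming the patch's approach of keeping an absolute next-transmit deadline split into whole microseconds plus a nanosecond remainder. The struct and function names (deadline, deadline_advance, ipg_to_pps) are hypothetical and do not appear in pktgen.c. Because the deadline advances by a fixed gap instead of being recomputed from the current time after each send, the CPU cost of sending is absorbed into the gap rather than added on top of it.

```c
#include <stdint.h>

/* Hypothetical stand-in for the patch's (next_tx_us, next_tx_ns) pair. */
struct deadline {
    uint64_t us;   /* whole microseconds */
    uint32_t ns;   /* remainder, kept in the range 0..999 */
};

/* Advance the absolute deadline by one gap given in nanoseconds. */
void deadline_advance(struct deadline *d, uint32_t ipg_nsec)
{
    d->us += ipg_nsec / 1000;
    d->ns += ipg_nsec % 1000;
    if (d->ns >= 1000) {   /* carry the overflow into the microsecond part */
        d->us++;
        d->ns -= 1000;
    }
}

/* Packets per second implied by a gap in nanoseconds; 0 means no pacing. */
uint64_t ipg_to_pps(uint32_t ipg_nsec)
{
    return ipg_nsec ? 1000000000ULL / ipg_nsec : 0;
}
```

With a 10500 ns gap, three calls to deadline_advance() move a zeroed deadline to 31 us + 500 ns, and ipg_to_pps(10000) gives 100000, which matches the 100kpps figure quoted above.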
--ro From holt@lnx-holt.americas.sgi.com Fri Dec 10 15:27:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 15:29:25 -0800 (PST) Received: from omx1.americas.sgi.com (omx1-ext.sgi.com [192.48.179.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBANRtHq017498 for ; Fri, 10 Dec 2004 15:27:56 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx1.americas.sgi.com (8.12.10/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iBANRXxT024986 for ; Fri, 10 Dec 2004 17:27:33 -0600 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iBANRWCM4079955; Fri, 10 Dec 2004 17:27:33 -0600 (CST) Received: from lnx-holt.americas.sgi.com (IDENT:U2FsdGVkX181Tgxz0kroXoEysxQzL+FuCeFXjTV/GEU@lnx-holt.americas.sgi.com [128.162.233.109]) by thistle-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iBANRVtC17207257; Fri, 10 Dec 2004 17:27:31 -0600 (CST) Received: from lnx-holt.americas.sgi.com (localhost.localdomain [127.0.0.1]) by lnx-holt.americas.sgi.com (8.13.1/8.12.11) with ESMTP id iBANRSOL025160; Fri, 10 Dec 2004 17:27:28 -0600 Received: (from holt@localhost) by lnx-holt.americas.sgi.com (8.13.1/8.13.1/Submit) id iBANRMc8025158; Fri, 10 Dec 2004 17:27:22 -0600 Date: Fri, 10 Dec 2004 17:27:22 -0600 From: Robin Holt To: Andrew Morton Cc: Robin Holt , davem@davemloft.net, yoshfuji@linux-ipv6.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. 
Message-ID: <20041210232722.GC24468@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> <20041210210006.GB23222@lnx-holt.americas.sgi.com> <20041210130947.1d945422.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041210130947.1d945422.akpm@osdl.org> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12655 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: holt@sgi.com Precedence: bulk X-list: netdev On Fri, Dec 10, 2004 at 01:09:47PM -0800, Andrew Morton wrote: > Robin Holt wrote: > > Can we agree that a linear calculation based on num_physpages is probably > > not the best algorithm. If so, should we make it a linear to a limit or > > a logarithmically decreasing size to a limit? How do we determine that > > limit point? > > An initial default of N + M * log2(num_physpages) would probably give a > saner result. > > The big risk is that someone has a too-small table for some specific > application and their machine runs more slowly than it should, but they > never notice. I wonder if it would be possible to put a little once-only > printk into the routing code: "warning route-cache chain exceeded 100 > entries: consider using the rhash_entries boot option". Since the hash gets flushed every 10 seconds, what if we kept track of the maximum depth reached and when we reach a certain threshold, just allocate a larger hash and replace the old with the new. I do like the printk idea so the admin can prevent inconsistent performance early in the run cycle for the system. We could even scale the hash size up based upon demand. 
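To make the sizing options above concrete, here is a minimal sketch of the "N + M * log2(num_physpages)" rule with an upper limit. It assumes the formula yields a number of hash entries. The function names, the clamp parameter, and any values chosen for N and M are hypothetical; nothing here is taken from the routing code.

```c
#include <stdint.h>

/* floor(log2(x)) by counting shifts; returns 0 for x == 0 or x == 1. */
unsigned int ilog2_u64(uint64_t x)
{
    unsigned int r = 0;

    while (x >>= 1)
        r++;
    return r;
}

/*
 * Hypothetical default: n + m * log2(num_physpages) entries, clamped
 * to max_entries so very large machines do not get an oversized table.
 */
uint64_t rhash_default_entries(uint64_t num_physpages, uint64_t n,
                               uint64_t m, uint64_t max_entries)
{
    uint64_t entries = n + m * ilog2_u64(num_physpages);

    return entries > max_entries ? max_entries : entries;
}
```

With n = 1024 and m = 512, a machine with 2^18 pages gets 10240 entries, while one with 2^28 pages (1024 times the memory) gets only 15360, so the table grows far more slowly than a linear rule would.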
Thanks, Robin From akpm@osdl.org Fri Dec 10 15:35:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 15:35:34 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBANZT2J018262 for ; Fri, 10 Dec 2004 15:35:29 -0800 Received: from akpm.pao.digeo.com (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iBANYQ907041; Fri, 10 Dec 2004 15:34:26 -0800 Date: Fri, 10 Dec 2004 15:38:48 -0800 From: Andrew Morton To: Robin Holt Cc: holt@sgi.com, davem@davemloft.net, yoshfuji@linux-ipv6.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. Message-Id: <20041210153848.5acacd0a.akpm@osdl.org> In-Reply-To: <20041210232722.GC24468@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> <20041210210006.GB23222@lnx-holt.americas.sgi.com> <20041210130947.1d945422.akpm@osdl.org> <20041210232722.GC24468@lnx-holt.americas.sgi.com> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i586-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12656 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Robin Holt wrote: > > > The big risk is that someone has a too-small table for some specific > > application and their machine runs more slowly than it should, but they > > never notice. 
I wonder if it would be possible to put a little once-only > > printk into the routing code: "warning route-cache chain exceeded 100 > > entries: consider using the rhash_entries boot option". > > Since the hash gets flushed every 10 seconds, what if we kept track of > the maximum depth reached and when we reach a certain threshold, just > allocate a larger hash and replace the old with the new. I do like the > printk idea so the admin can prevent inconsistent performance early in > the run cycle for the system. We could even scale the hash size up based > upon demand. Once the system has been running for a while, the possibility of allocating a decent number of physically-contiguous pages is basically zero. If we were to dynamically size it we'd need to either use new data structure (slower) or use vmalloc() (slower and can fragment vmalloc space). From holt@lnx-holt.americas.sgi.com Fri Dec 10 15:37:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 15:37:34 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBANbRU3018712 for ; Fri, 10 Dec 2004 15:37:27 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iBB0w92g007376 for ; Fri, 10 Dec 2004 16:58:10 -0800 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iBANb4CM4081691; Fri, 10 Dec 2004 17:37:04 -0600 (CST) Received: from lnx-holt.americas.sgi.com (IDENT:U2FsdGVkX1/h+s1xHfTAHxWfKHMHh6DBeMe3kSElBpc@lnx-holt.americas.sgi.com [128.162.233.109]) by thistle-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iBANb3tC14978380; Fri, 10 Dec 2004 17:37:03 -0600 (CST) Received: from lnx-holt.americas.sgi.com (localhost.localdomain [127.0.0.1]) by lnx-holt.americas.sgi.com 
(8.13.1/8.12.11) with ESMTP id iBANb10E025595; Fri, 10 Dec 2004 17:37:01 -0600 Received: (from holt@localhost) by lnx-holt.americas.sgi.com (8.13.1/8.13.1/Submit) id iBANb0XU025594; Fri, 10 Dec 2004 17:37:00 -0600 Date: Fri, 10 Dec 2004 17:37:00 -0600 From: Robin Holt To: Andrew Morton Cc: Robin Holt , davem@davemloft.net, yoshfuji@linux-ipv6.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. Message-ID: <20041210233700.GA25582@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> <20041210210006.GB23222@lnx-holt.americas.sgi.com> <20041210130947.1d945422.akpm@osdl.org> <20041210232722.GC24468@lnx-holt.americas.sgi.com> <20041210153848.5acacd0a.akpm@osdl.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041210153848.5acacd0a.akpm@osdl.org> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12657 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: holt@sgi.com Precedence: bulk X-list: netdev On Fri, Dec 10, 2004 at 03:38:48PM -0800, Andrew Morton wrote: > Robin Holt wrote: > > > > > The big risk is that someone has a too-small table for some specific > > > application and their machine runs more slowly than it should, but they > > > never notice. I wonder if it would be possible to put a little once-only > > > printk into the routing code: "warning route-cache chain exceeded 100 > > > entries: consider using the rhash_entries boot option". 
> > > > Since the hash gets flushed every 10 seconds, what if we kept track of > > the maximum depth reached and when we reach a certain threshold, just > > allocate a larger hash and replace the old with the new. I do like the > > printk idea so the admin can prevent inconsistent performance early in > > the run cycle for the system. We could even scale the hash size up based > > upon demand. > > Once the system has been running for a while, the possibility of allocating > a decent number of physically-contiguous pages is basically zero. > > If we were to dynamically size it we'd need to either use new data > structure (slower) or use vmalloc() (slower and can fragment vmalloc > space). Why do they need to be physically contiguous? It is a hash correct? From holt@lnx-holt.americas.sgi.com Fri Dec 10 15:41:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 15:41:08 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBANf38Y019286 for ; Fri, 10 Dec 2004 15:41:03 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iBB11kQ6008761 for ; Fri, 10 Dec 2004 17:01:46 -0800 Received: from thistle-e236.americas.sgi.com (thistle-e236.americas.sgi.com [128.162.236.204]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iBANeeCM4080636; Fri, 10 Dec 2004 17:40:41 -0600 (CST) Received: from lnx-holt.americas.sgi.com (IDENT:U2FsdGVkX1+r3ihCDxlwTiXNkh8vmDQQBQTS5sJSXcQ@lnx-holt.americas.sgi.com [128.162.233.109]) by thistle-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iBANedtC17322020; Fri, 10 Dec 2004 17:40:39 -0600 (CST) Received: from lnx-holt.americas.sgi.com (localhost.localdomain [127.0.0.1]) by lnx-holt.americas.sgi.com (8.13.1/8.12.11) with ESMTP id iBANecHQ025636; Fri, 10 Dec 2004 17:40:38 -0600 Received: (from holt@localhost) 
by lnx-holt.americas.sgi.com (8.13.1/8.13.1/Submit) id iBANebU1025634; Fri, 10 Dec 2004 17:40:37 -0600 Date: Fri, 10 Dec 2004 17:40:37 -0600 From: Robin Holt To: Robin Holt Cc: Andrew Morton , davem@davemloft.net, yoshfuji@linux-ipv6.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. Message-ID: <20041210234037.GB25582@lnx-holt.americas.sgi.com> References: <20041210190025.GA21116@lnx-holt.americas.sgi.com> <20041210114829.034e02eb.davem@davemloft.net> <20041210210006.GB23222@lnx-holt.americas.sgi.com> <20041210130947.1d945422.akpm@osdl.org> <20041210232722.GC24468@lnx-holt.americas.sgi.com> <20041210153848.5acacd0a.akpm@osdl.org> <20041210233700.GA25582@lnx-holt.americas.sgi.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041210233700.GA25582@lnx-holt.americas.sgi.com> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12658 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: holt@sgi.com Precedence: bulk X-list: netdev On Fri, Dec 10, 2004 at 05:37:00PM -0600, Robin Holt wrote: > On Fri, Dec 10, 2004 at 03:38:48PM -0800, Andrew Morton wrote: > > Robin Holt wrote: > > > > > > > The big risk is that someone has a too-small table for some specific > > > > application and their machine runs more slowly than it should, but they > > > > never notice. I wonder if it would be possible to put a little once-only > > > > printk into the routing code: "warning route-cache chain exceeded 100 > > > > entries: consider using the rhash_entries boot option". 
> > > > > > Since the hash gets flushed every 10 seconds, what if we kept track of > > > the maximum depth reached and when we reach a certain threshold, just > > > allocate a larger hash and replace the old with the new. I do like the > > > printk idea so the admin can prevent inconsistent performance early in > > > the run cycle for the system. We could even scale the hash size up based > > > upon demand. > > > > Once the system has been running for a while, the possibility of allocating > > a decent number of physically-contiguous pages is basically zero. > > > > If we were to dynamically size it we'd need to either use new data > > structure (slower) or use vmalloc() (slower and can fragment vmalloc > > space). > > Why do they need to be physically contiguous? It is a hash correct? Sorry, I was asleep at the wheel. I failed to even grok your second paragraph. I will fall back to agreeing with the printk to let the admin know that something is amiss. Should we possibly modify the output of /proc/net/rt_cache (or whatever its name is) to include the hash bucket so people can watch to see how many bucket collisions their system has? From shemminger@osdl.org Fri Dec 10 17:09:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 17:09:47 -0800 (PST) Received: from fire-1.osdl.org (fire.osdl.org [65.172.181.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBB19eoe025261 for ; Fri, 10 Dec 2004 17:09:40 -0800 Received: from localhost.localdomain (063-170-215-071.dslnorthwest.net [63.170.215.71]) (authenticated bits=0) by fire-1.osdl.org (8.12.8/8.12.8) with ESMTP id iBB192Q3029508 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Fri, 10 Dec 2004 17:09:03 -0800 Date: Fri, 10 Dec 2004 17:09:00 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com Subject: Re: [PATCH] tcp: efficient port randomistion (rev 3) Message-ID: <20041210170900.11d41d56.shemminger@osdl.org> In-Reply-To: <20041208235524.202ff3a1.davem@davemloft.net> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> <41B14E57.5080803@osdl.org> <200412060918.04441.michael.vittrup.larsen@ericsson.com> <20041206094234.34861c78@dxpl.pdx.osdl.net> <20041208235524.202ff3a1.davem@davemloft.net> X-Mailer: Sylpheed-Claws 0.9.12cvs146.13 (GTK+ 2.4.13; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-MIMEDefang-Filter: osdl$Revision: 1.97 $ X-Scanned-By: MIMEDefang 2.36 X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12659 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Okay, here is the revised version. Testing shows that it is more consistent and just as fast as the existing code, probably because it gets rid of tcp_portalloc_lock and distributes ports better. Signed-off-by: Stephen Hemminger diff -Nru a/drivers/char/random.c b/drivers/char/random.c --- a/drivers/char/random.c 2004-12-09 23:08:07 -08:00 +++ b/drivers/char/random.c 2004-12-09 23:08:07 -08:00 @@ -2347,6 +2347,24 @@ return halfMD4Transform(hash, keyptr->secret); } +/* Generate secure starting point for ephemeral TCP port search */ +u32 secure_tcp_port_ephemeral(__u32 saddr, __u32 daddr, __u16 dport) +{ + struct keydata *keyptr = get_keyptr(); + u32 hash[4]; + + /* + * Pick a unique starting offset for each ephemeral port search + * (saddr, daddr, dport) and 48bits of random data. 
+ */ + hash[0] = saddr; + hash[1] = daddr; + hash[2] = dport ^ keyptr->secret[10]; + hash[3] = keyptr->secret[11]; + + return halfMD4Transform(hash, keyptr->secret); +} + #ifdef CONFIG_SYN_COOKIES /* * Secure SYN cookie computation. This is the algorithm worked out by diff -Nru a/include/linux/random.h b/include/linux/random.h --- a/include/linux/random.h 2004-12-09 23:08:07 -08:00 +++ b/include/linux/random.h 2004-12-09 23:08:07 -08:00 @@ -52,6 +52,7 @@ void generate_random_uuid(unsigned char uuid_out[16]); extern __u32 secure_ip_id(__u32 daddr); +extern u32 secure_tcp_port_ephemeral(__u32 saddr, __u32 daddr, __u16 dport); extern __u32 secure_tcp_sequence_number(__u32 saddr, __u32 daddr, __u16 sport, __u16 dport); extern __u32 secure_tcp_syn_cookie(__u32 saddr, __u32 daddr, diff -Nru a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c --- a/net/ipv4/tcp_ipv4.c 2004-12-09 23:08:07 -08:00 +++ b/net/ipv4/tcp_ipv4.c 2004-12-09 23:08:07 -08:00 @@ -636,10 +636,18 @@ return -EADDRNOTAVAIL; } +static inline u32 connect_port_offset(const struct sock *sk) +{ + const struct inet_opt *inet = inet_sk(sk); + + return secure_tcp_port_ephemeral(inet->rcv_saddr, inet->daddr, + inet->dport); +} + /* * Bind a port for a connect operation and hash it. */ -static int tcp_v4_hash_connect(struct sock *sk) +static inline int tcp_v4_hash_connect(struct sock *sk) { unsigned short snum = inet_sk(sk)->num; struct tcp_bind_hashbucket *head; @@ -647,36 +655,20 @@ int ret; if (!snum) { - int rover; int low = sysctl_local_port_range[0]; int high = sysctl_local_port_range[1]; - int remaining = (high - low) + 1; + int range = high - low; + int i; + int port; + static u32 hint; + u32 offset = hint + connect_port_offset(sk); struct hlist_node *node; struct tcp_tw_bucket *tw = NULL; local_bh_disable(); - - /* TODO. Actually it is not so bad idea to remove - * tcp_portalloc_lock before next submission to Linus. - * As soon as we touch this place at all it is time to think. 
- * - * Now it protects single _advisory_ variable tcp_port_rover, - * hence it is mostly useless. - * Code will work nicely if we just delete it, but - * I am afraid in contented case it will work not better or - * even worse: another cpu just will hit the same bucket - * and spin there. - * So some cpu salt could remove both contention and - * memory pingpong. Any ideas how to do this in a nice way? - */ - spin_lock(&tcp_portalloc_lock); - rover = tcp_port_rover; - - do { - rover++; - if ((rover < low) || (rover > high)) - rover = low; - head = &tcp_bhash[tcp_bhashfn(rover)]; + for (i = 1; i <= range; i++) { + port = low + (i + offset) % range; + head = &tcp_bhash[tcp_bhashfn(port)]; spin_lock(&head->lock); /* Does not bother with rcv_saddr checks, @@ -684,19 +676,19 @@ * unique enough. */ tb_for_each(tb, node, &head->chain) { - if (tb->port == rover) { + if (tb->port == port) { BUG_TRAP(!hlist_empty(&tb->owners)); if (tb->fastreuse >= 0) goto next_port; if (!__tcp_v4_check_established(sk, - rover, + port, &tw)) goto ok; goto next_port; } } - tb = tcp_bucket_create(head, rover); + tb = tcp_bucket_create(head, port); if (!tb) { spin_unlock(&head->lock); break; @@ -706,22 +698,18 @@ next_port: spin_unlock(&head->lock); - } while (--remaining > 0); - tcp_port_rover = rover; - spin_unlock(&tcp_portalloc_lock); - + } local_bh_enable(); return -EADDRNOTAVAIL; ok: - /* All locks still held and bhs disabled */ - tcp_port_rover = rover; - spin_unlock(&tcp_portalloc_lock); + hint += i; - tcp_bind_hash(sk, tb, rover); + /* Head lock still held and bh's disabled */ + tcp_bind_hash(sk, tb, port); if (sk_unhashed(sk)) { - inet_sk(sk)->sport = htons(rover); + inet_sk(sk)->sport = htons(port); __tcp_v4_hash(sk, 0); } spin_unlock(&head->lock); From ravinandan.arakali@s2io.com Fri Dec 10 18:01:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 18:01:44 -0800 (PST) Received: from ns1.s2io.com (ns1.s2io.com [142.46.200.198]) by oss.sgi.com (8.13.0/8.13.0) with 
ESMTP id iBB21eW7026882 for ; Fri, 10 Dec 2004 18:01:40 -0800 Received: from guinness.s2io.com (sentry.s2io.com [142.46.200.199]) by ns1.s2io.com (8.12.10/8.12.10) with ESMTP id iBB21Bje018008 for ; Fri, 10 Dec 2004 21:01:11 -0500 (EST) Received: from rarakali ([10.16.16.58]) by guinness.s2io.com (8.12.6/8.12.6) with SMTP id iBB21739019238; Fri, 10 Dec 2004 21:01:07 -0500 (EST) From: "Ravinandan Arakali" To: Cc: "Leonid. Grossman \(E-mail\)" , "Raghavendra. Koushik \(E-mail\)" , Subject: Question about Bonding driver Date: Fri, 10 Dec 2004 18:01:32 -0800 Message-ID: <002201c4df25$56afc3c0$3a10100a@pc.s2io.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook CWS, Build 9.0.2416 (9.0.2911.0) x-mimeole: Produced By Microsoft MimeOLE V6.00.2900.2180 Importance: Normal X-Scanned-By: MIMEDefang 2.34 X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12660 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ravinandan.arakali@s2io.com Precedence: bulk X-list: netdev Hi, We are facing the following problem with bonding driver with two 10-gigabit ethernet cards. Any help is appreciated. Server: AMD Opteron system with two 10-gigabit ethernet cards Switch: Foundry FastIron 800 Clients: Two Opteron systems, each with one 10-gigabit card. The two cards on the server were aggregated in 802.3ad mode(mode=4). On the switch, the two ports are aggregated in LACP mode. With single stream traffic per client, transmit throughput to one of the clients is 3Gbps and the other, 5.5Gbps. When the two cards are aggregated we get a throughput of 7.9 Gbps(which is 93% of combined throughputs). BTW, throughput is measured using nttcp tool. 
The MAC address on one of the clients is set to an even address and the MAC address on the other client to odd. This ensures that the traffic to the two clients is being distributed by the bonding driver to the two cards in the bond. We also verified this using ping. However, the problem is, when we have multiple streams per client, throughput drops and we notice continuous TCP retransmissions from the server. Note that without bonding driver, even if we run multiple TCP streams per client(each card on server in different subnet), there are no retransmissions and the throughput is very good. Thanks, Ravi From jm@jm.kir.nu Fri Dec 10 20:17:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 10 Dec 2004 20:17:26 -0800 (PST) Received: from jm.kir.nu (dsl017-049-110.sfo4.dsl.speakeasy.net [69.17.49.110]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBB4HKAV031056 for ; Fri, 10 Dec 2004 20:17:20 -0800 Received: from jm by jm.kir.nu with local (Exim 4.42) id 1CcygJ-0005XA-50; Fri, 10 Dec 2004 20:16:31 -0800 Date: Fri, 10 Dec 2004 20:16:31 -0800 From: Jouni Malinen To: Brande Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Prism / Hostap Bridge Problems... Message-ID: <20041211041631.GB7015@jm.kir.nu> References: <41B9B6E9.7010600@novolab.de> <41B9C4E0.40407@novolab.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41B9C4E0.40407@novolab.de> User-Agent: Mutt/1.5.6i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12661 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jkmaline@cc.hut.fi Precedence: bulk X-list: netdev On Fri, Dec 10, 2004 at 04:46:40PM +0100, Brande wrote: > I have a meshcube (www.meshcube.org) with two prim 2.5 mini-pci WLAN > cards... 
> wlan0 should connect as client (managed mode) to a normal access point > which is connected to the internet. > wlan1 should work as accesspoint (master mode). > I would like to bridge between wlan0 and wlan1 so that I can connect IEEE 802.11 protocol does not support this kind of configuration. The card in client mode can only send packet with its own layer 2 MAC address whereas bridge functionality would require that the other MAC addresses can also be used as source address. If it is enough to handle only some IP protocols, Proxy ARP or IP routing could be used to setup this kind of network. Another option would be to use WDS link between these APs (assuming the other AP supports WDS) and bridge packet through it. -- Jouni Malinen PGP id EFC895FA From jgarzik@pobox.com Sat Dec 11 11:04:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 11:04:38 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBBJ4Xp2002679 for ; Sat, 11 Dec 2004 11:04:34 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CdCXJ-0008SF-RA; Sat, 11 Dec 2004 19:04:10 +0000 Message-ID: <41BB44A8.7000906@pobox.com> Date: Sat, 11 Dec 2004 14:04:08 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniele Venzano CC: NetDev Subject: Re: [PATCH 1/5] sis900 printk and stack usage audit References: <20041208104721.GA31707@picchio.gall.it> <20041208110156.GB31707@picchio.gall.it> In-Reply-To: <20041208110156.GB31707@picchio.gall.it> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12662 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Daniele Venzano wrote: > Audit of current printk() calls > Changed debug levels to 0,1,2,3 as follows: > 0 No debug > 1 load/probe/unload/suspend/resume stuff > 2 rx/tx errors > 3 rx/tx packets and every interrupt are logged (very verbose) > Debug levels are incremental > Removed double printing of version string in module_init and in sis900_probe > Made the sis900_debug parameter modifiable at runtime Debug levels are standardized, please follow the standard. All messages are bitmapped. See NETIF_MSG_* and netif_msg_* in include/linux/netdevice.h. Given the standard 'debug' module parameter, which takes a value from 0-N (where N==10 or so), the driver uses netif_msg_init() to construct the bitmap of messages to be enabled. Jeff From jgarzik@pobox.com Sat Dec 11 11:06:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 11:06:50 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBBJ6jvc002999 for ; Sat, 11 Dec 2004 11:06:46 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CdCZS-0000BS-UI; Sat, 11 Dec 2004 19:06:23 +0000 Message-ID: <41BB452D.5040507@pobox.com> Date: Sat, 11 Dec 2004 14:06:21 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniele Venzano CC: NetDev Subject: Re: [PATCH 1/5] sis900 printk and stack usage audit References: <20041208104721.GA31707@picchio.gall.it> <20041208110156.GB31707@picchio.gall.it> In-Reply-To: <20041208110156.GB31707@picchio.gall.it> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 
0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12663 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Daniele Venzano wrote: > Removed double printing of version string in module_init and in sis900_probe > @@ -390,13 +390,6 @@ > u8 revision; > char *card_name = card_names[pci_id->driver_data]; > > -/* when built into the kernel, we only print version if device is found */ > -#ifndef MODULE > - static int printed_version; > - if (!printed_version++) > - printk(version); > -#endif > - > /* setup various bits in PCI command register */ > ret = pci_enable_device(pci_dev); > if(ret) return ret; There is no double-printing. One is #ifndef MODULE, one is #ifdef MODULE. Did you read the comment included in the code you deleted??? Jeff From jgarzik@pobox.com Sat Dec 11 11:09:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 11:09:42 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBBJ9bDh003640 for ; Sat, 11 Dec 2004 11:09:38 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CdCcE-0000Tp-3t; Sat, 11 Dec 2004 19:09:14 +0000 Message-ID: <41BB45D6.2090608@pobox.com> Date: Sat, 11 Dec 2004 14:09:10 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniele Venzano CC: NetDev Subject: Re: [PATCH 2/5] sis900 printk and stack usage audit References: <20041208104721.GA31707@picchio.gall.it> <20041208110426.GC31707@picchio.gall.it> In-Reply-To: <20041208110426.GC31707@picchio.gall.it> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 
7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12664 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Daniele Venzano wrote: > @@ -455,10 +455,15 @@ > if (ret) > goto err_unmap_rx; > > + pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); > + > + if(sis900_debug > 0) > + printk(KERN_DEBUG "%s: detected revision %2.2x," > + "trying to get MAC address...\n", > + net_dev->name, revision); > + > /* Get Mac address according to the chip revision */ > - pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); > ret = 0; > - use netif_msg_* bitmap > @@ -542,9 +549,14 @@ > for(i = 0; i < 2; i++) > mii_status = mdio_read(net_dev, phy_addr, MII_STATUS); > > - if (mii_status == 0xffff || mii_status == 0x0000) > + if (mii_status == 0xffff || mii_status == 0x0000) { > /* the mii is not accessible, try next one */ > + if (sis900_debug > 0) > + printk(KERN_DEBUG "%s: MII at address %d" > + " not accessible\n", > + net_dev->name, phy_addr); > continue; > + } > ditto > @@ -568,14 +580,15 @@ > > for (i = 0; mii_chip_table[i].phy_id1; i++) > if ((mii_phy->phy_id0 == mii_chip_table[i].phy_id0 ) && > - ((mii_phy->phy_id1 & 0xFFF0) == mii_chip_table[i].phy_id1)){ > + ((mii_phy->phy_id1 & 0xFFF0) == mii_chip_table[i].phy_id1)) > + { CodingStyle: the brace following an 'if' does not exist on a line by itself. 
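The "use netif_msg_* bitmap" remarks above can be illustrated with a small user-space sketch. It is not the kernel code: the demo_* names are hypothetical, the bit values are what I recall from include/linux/netdevice.h and should be checked against the tree, and demo_msg_init() only mirrors the behaviour described for netif_msg_init() (a negative or out-of-range 'debug' selects the driver default, 0 disables output, N enables the low N message classes).

```c
#include <stdint.h>

/* Hypothetical stand-ins for NETIF_MSG_* bits; values assumed, verify
 * against include/linux/netdevice.h before relying on them. */
enum {
	DEMO_MSG_DRV    = 0x0001,
	DEMO_MSG_PROBE  = 0x0002,
	DEMO_MSG_LINK   = 0x0004,
	DEMO_MSG_RX_ERR = 0x0040,
	DEMO_MSG_TX_ERR = 0x0080,
};

/* Turn the 'debug' module parameter into a message bitmap. */
static uint32_t demo_msg_init(int debug_value, uint32_t default_bits)
{
	if (debug_value < 0 || debug_value >= 32)
		return default_bits;		/* parameter not set: use default */
	if (debug_value == 0)
		return 0;			/* no output at all */
	return (UINT32_C(1) << debug_value) - 1; /* enable the low N bits */
}

/* Stand-in for a netif_msg_probe()-style test on the bitmap. */
static int demo_msg_probe(uint32_t msg_enable)
{
	return (msg_enable & DEMO_MSG_PROBE) != 0;
}
```

A driver would then guard each printk with the matching class test (for the two hunks above, probably the probe class) instead of comparing sis900_debug against a number.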
Jeff From jgarzik@pobox.com Sat Dec 11 11:10:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 11:10:21 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBBJAGUf003980 for ; Sat, 11 Dec 2004 11:10:17 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CdCcs-0000U8-0Z; Sat, 11 Dec 2004 19:09:54 +0000 Message-ID: <41BB4600.3010409@pobox.com> Date: Sat, 11 Dec 2004 14:09:52 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniele Venzano CC: NetDev Subject: Re: [PATCH 3/5] sis900 printk and stack usage audit References: <20041208104721.GA31707@picchio.gall.it> <20041208110610.GD31707@picchio.gall.it> In-Reply-To: <20041208110610.GD31707@picchio.gall.it> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12665 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Daniele Venzano wrote: > Chip revision is now a member of sis_priv structure > Kill all calls to pci_read_config_byte but one > Change the code to use sis_priv->chipset_rev OK (but not applied, since previous patches in series were not applied) From jgarzik@pobox.com Sat Dec 11 11:11:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 11:11:59 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBBJBsIv004759 for ; Sat, 11 Dec 2004 11:11:55 -0800 Received: from 
rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CdCeS-0000du-8N; Sat, 11 Dec 2004 19:11:32 +0000 Message-ID: <41BB4662.7010105@pobox.com> Date: Sat, 11 Dec 2004 14:11:30 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniele Venzano CC: NetDev Subject: Re: [PATCH 5/5] sis900 printk and stack usage audit References: <20041208104721.GA31707@picchio.gall.it> <20041208111010.GF31707@picchio.gall.it> In-Reply-To: <20041208111010.GF31707@picchio.gall.it> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12667 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Daniele Venzano wrote: > Change comment to reflect changes in parameters od sis630_set_eq OK, but not applied due to comments on previous patches From jgarzik@pobox.com Sat Dec 11 11:11:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 11:11:42 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBBJBb2K004631 for ; Sat, 11 Dec 2004 11:11:38 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CdCeB-0000dj-31; Sat, 11 Dec 2004 19:11:15 +0000 Message-ID: <41BB4651.4040106@pobox.com> Date: Sat, 11 Dec 2004 14:11:13 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniele Venzano CC: NetDev Subject: Re: [PATCH 4/5] sis900 
printk and stack usage audit References: <20041208104721.GA31707@picchio.gall.it> <20041208110835.GE31707@picchio.gall.it> In-Reply-To: <20041208110835.GE31707@picchio.gall.it> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12666 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Daniele Venzano wrote: > @@ -952,7 +952,11 @@ > sis900_reset(net_dev); > > /* Equalizer workaround Rule */ > - sis630_set_eq(net_dev, sis_priv->chipset_rev); > + if (sis_priv->chipset_rev == SIS630E_900_REV || > + sis_priv->chipset_rev == SIS630EA1_900_REV || > + sis_priv->chipset_rev == SIS630A_900_REV || > + sis_priv->chipset_rev == SIS630ET_900_REV) > + sis630_set_eq(net_dev); > > ret = request_irq(net_dev->irq, &sis900_interrupt, SA_SHIRQ, > net_dev->name, net_dev); > @@ -1141,16 +1145,12 @@ > * max >= 15 --> set equalizer to max+5 or set equalizer to max+6 if max == min > */ > > -static void sis630_set_eq(struct net_device *net_dev, u8 revision) > +static void sis630_set_eq(struct net_device *net_dev) > { > struct sis900_private *sis_priv = net_dev->priv; > u16 reg14h, eq_value=0, max_value=0, min_value=0; > int i, maxcount=10; > > - if ( !(revision == SIS630E_900_REV || revision == SIS630EA1_900_REV || > - revision == SIS630A_900_REV || revision == SIS630ET_900_REV) ) > - return; > - > if (netif_carrier_ok(net_dev)) { > reg14h = mdio_read(net_dev, sis_priv->cur_phy, MII_RESV); > mdio_write(net_dev, sis_priv->cur_phy, MII_RESV, This is a step backwards. You are _adding_ multiple copies of the same piece of code, just to avoid calling a function. 
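A minimal sketch of the alternative, assuming a hypothetical demo_dev structure and placeholder revision values (the real SIS630*_900_REV constants live in the driver header and are not reproduced here): the revision test stays in one place inside the function, so every call site remains a single unconditional call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder revision IDs, for illustration only. */
enum {
	DEMO_REV_630A   = 0x80,
	DEMO_REV_630E   = 0x81,
	DEMO_REV_630EA1 = 0x83,
	DEMO_REV_630ET  = 0x84,
};

/* Hypothetical private data: cached chip revision plus a counter that
 * stands in for the real equalizer programming. */
struct demo_dev {
	uint8_t chipset_rev;
	int eq_calls;
};

static bool demo_rev_needs_eq(uint8_t rev)
{
	return rev == DEMO_REV_630E || rev == DEMO_REV_630EA1 ||
	       rev == DEMO_REV_630A || rev == DEMO_REV_630ET;
}

/* The guard lives in the callee, so callers never repeat the check. */
static void demo_set_eq(struct demo_dev *dev)
{
	if (!demo_rev_needs_eq(dev->chipset_rev))
		return;
	dev->eq_calls++;	/* real code would program the equalizer here */
}
```

With the revision cached in the private structure, the u8 argument can still be dropped from the function without moving the test out to the callers.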
Jeff From linux-netdev@gmane.org Sat Dec 11 20:26:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 11 Dec 2004 20:27:06 -0800 (PST) Received: from main.gmane.org (main.gmane.org [80.91.229.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBC4Qw1F029794 for ; Sat, 11 Dec 2004 20:26:59 -0800 Received: from list by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 1CdLJa-0006k8-00 for ; Sun, 12 Dec 2004 05:26:34 +0100 Received: from dsl027-161-081.atl1.dsl.speakeasy.net ([216.27.161.81]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 12 Dec 2004 05:26:34 +0100 Received: from lunz by dsl027-161-081.atl1.dsl.speakeasy.net with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sun, 12 Dec 2004 05:26:34 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: netdev@oss.sgi.com From: Jason Lunz Subject: [patch] remove unused CONFIG_E100_NAPI Date: Sun, 12 Dec 2004 04:26:31 +0000 (UTC) Organization: PBR Streetgang Lines: 54 Message-ID: X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: dsl027-161-081.atl1.dsl.speakeasy.net User-Agent: slrn/0.9.8.1 (Debian) X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12668 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lunz@falooley.org Precedence: bulk X-list: netdev I have an old PII nfs media server running 2.6.10-rc3 with an intel e100. Its interrupt is shared with both controllers of an siimage PCI ide card: [dunbar](1) # cat /proc/interrupts CPU0 0: 16835371 XT-PIC timer 2: 0 XT-PIC cascade 8: 4 XT-PIC rtc 10: 75 XT-PIC uhci_hcd 11: 1685559 XT-PIC ide2, ide3, eth0 15: 0 XT-PIC ide1 NMI: 0 LOC: 0 ERR: 0 With the eepro100 driver, a single tcp stream goes wirespeed in either direction (11.2MB/s). With e100, the speed fluctuates wildly without ever exceeding 2.5MB/s or so, and it averages around 1. 
I thought it might be due to an interaction of napi and the shared interrupt, so I tried disabling napi in the driver. Which was frustrating, because it's no longer optional, despite what Kconfig says. Jason Signed-off-by: Jason Lunz --- linux-2.6.10-rc3/drivers/net/Kconfig.pre 2004-12-11 23:11:49.000000000 -0500 +++ linux-2.6.10-rc3/drivers/net/Kconfig 2004-12-11 23:12:13.000000000 -0500 @@ -1435,23 +1435,6 @@ . The module will be called e100. -config E100_NAPI - bool "Use Rx Polling (NAPI)" - depends on E100 - help - NAPI is a new driver API designed to reduce CPU and interrupt load - when the driver is receiving lots of packets from the card. It is - still somewhat experimental and thus not yet enabled by default. - - If your estimated Rx load is 10kpps or more, or if the card will be - deployed on potentially unfriendly networks (e.g. in a firewall), - then say Y here. - - See for more - information. - - If in doubt, say N. - config LNE390 tristate "Mylex EISA LNE390A/B support (EXPERIMENTAL)" depends on NET_PCI && EISA && EXPERIMENTAL From webvenza@libero.it Sun Dec 12 00:15:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 00:15:56 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBC8FYa2004118 for ; Sun, 12 Dec 2004 00:15:35 -0800 Date: Sun, 12 Dec 2004 09:15:09 +0100 From: Daniele Venzano To: Jeff Garzik Cc: Daniele Venzano , NetDev Subject: Re: [PATCH 1/5] sis900 printk and stack usage audit Message-ID: <20041212081509.GA3084@picchio.gall.it> Mail-Followup-To: Jeff Garzik , Daniele Venzano , NetDev References: <20041208104721.GA31707@picchio.gall.it> <20041208110156.GB31707@picchio.gall.it> <41BB452D.5040507@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BB452D.5040507@pobox.com> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or 
publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12669 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev On Sat, Dec 11, 2004 at 02:06:21PM -0500, Jeff Garzik wrote: > >@@ -390,13 +390,6 @@ > > u8 revision; > > char *card_name = card_names[pci_id->driver_data]; > > > >-/* when built into the kernel, we only print version if device is found */ > >-#ifndef MODULE > >- static int printed_version; > >- if (!printed_version++) > >- printk(version); > >-#endif > >- > > /* setup various bits in PCI command register */ > > ret = pci_enable_device(pci_dev); > > if(ret) return ret; > > There is no double-printing. One is #ifndef MODULE, one is #ifdef MODULE. > > Did you read the comment included in the code you deleted??? My bad, dropped. 
-- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org From webvenza@libero.it Sun Dec 12 00:19:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 00:19:35 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBC8JSPN007911 for ; Sun, 12 Dec 2004 00:19:28 -0800 Date: Sun, 12 Dec 2004 09:19:04 +0100 From: Daniele Venzano To: Jeff Garzik Cc: Daniele Venzano , NetDev Subject: Re: [PATCH 2/5] sis900 printk and stack usage audit Message-ID: <20041212081904.GB3084@picchio.gall.it> Mail-Followup-To: Jeff Garzik , Daniele Venzano , NetDev References: <20041208104721.GA31707@picchio.gall.it> <20041208110426.GC31707@picchio.gall.it> <41BB45D6.2090608@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BB45D6.2090608@pobox.com> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12670 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev On Sat, Dec 11, 2004 at 02:09:10PM -0500, Jeff Garzik wrote: > >@@ -568,14 +580,15 @@ > > > > for (i = 0; mii_chip_table[i].phy_id1; i++) > > if ((mii_phy->phy_id0 == mii_chip_table[i].phy_id0 ) > > && > >- ((mii_phy->phy_id1 & 0xFFF0) == > >mii_chip_table[i].phy_id1)){ > >+ ((mii_phy->phy_id1 & 0xFFF0) == > >mii_chip_table[i].phy_id1)) > >+ { > > CodingStyle: the brace following an 'if' does not exist on a line by > itself. 
The line is too long, and it is difficult to see where the if block begins. I know it is an exception to the Golden Rules, but I thought a line wasted on code readability was well wasted. However, I don't think it really matters, so I will drop it. -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org From webvenza@libero.it Sun Dec 12 00:24:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 00:24:05 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBC8NxeQ008500 for ; Sun, 12 Dec 2004 00:24:00 -0800 Date: Sun, 12 Dec 2004 09:23:34 +0100 From: Daniele Venzano To: Jeff Garzik Cc: NetDev Subject: Re: [PATCH 4/5] sis900 printk and stack usage audit Message-ID: <20041212082334.GC3084@picchio.gall.it> Mail-Followup-To: Jeff Garzik , NetDev References: <20041208104721.GA31707@picchio.gall.it> <20041208110835.GE31707@picchio.gall.it> <41BB4651.4040106@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BB4651.4040106@pobox.com> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12671 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev On Sat, Dec 11, 2004 at 02:11:13PM -0500, Jeff Garzik wrote: > This is a step backwards. > > You are _adding_ multiple copies of the same piece of code, just to > avoid calling a function. Ok, I'm learning what is good and what is bad. Patch dropped.
-- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org From webvenza@libero.it Sun Dec 12 00:27:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 00:27:48 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBC8Rhoe009060 for ; Sun, 12 Dec 2004 00:27:44 -0800 Date: Sun, 12 Dec 2004 09:27:20 +0100 From: Daniele Venzano To: Jeff Garzik Cc: NetDev Subject: Re: [PATCH 5/5] sis900 printk and stack usage audit Message-ID: <20041212082720.GD3084@picchio.gall.it> Mail-Followup-To: Jeff Garzik , NetDev References: <20041208104721.GA31707@picchio.gall.it> <20041208111010.GF31707@picchio.gall.it> <41BB4662.7010105@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BB4662.7010105@pobox.com> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12672 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev On Sat, Dec 11, 2004 at 02:11:30PM -0500, Jeff Garzik wrote: > Daniele Venzano wrote: > >Change comment to reflect changes in parameters od sis630_set_eq > > OK, but not applied due to comments on previous patches Dropped, since I dropped patch 4/5. In the end I'll do new patches for the debugging stuff following netif standard (I didn't know of those :-( ) and rediff the chipset revision stuff. I hope to send these two or three patches later on today. Thanks, bye. 
-- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org From webvenza@libero.it Sun Dec 12 04:19:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 04:19:24 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCCJHrT023667 for ; Sun, 12 Dec 2004 04:19:18 -0800 Date: Sun, 12 Dec 2004 13:18:52 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 0/5] sis900 debugging and revision code Message-ID: <20041212121852.GA16325@gateway.milesteg.arr> Mail-Followup-To: NetDev , Jeff Garzik Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/04w6evG8XlLl3ft" Content-Disposition: inline X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12673 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --/04w6evG8XlLl3ft Content-Type: text/plain; charset=us-ascii Content-Disposition: inline New set of patches, following Jeff comments on my previous posts. 
Patches are tested and are against latest netdev-2.6 tree After applying all these patches you get: - sis900 debug output filtered with netif_msg macros - ethtool support for changing debug output - Correct printk() logging level - simplification of chip revision code As usual they are available also from: http://teg.homeunix.org/sis900.html -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --/04w6evG8XlLl3ft Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBvDcs2rmHZCWzV+0RArG+AJ9IqbeS7uY0SCP9cqxH60eQQKnY3wCgr3nU 7y1VayxoWHUAdOLTib1nKpk= =3y+R -----END PGP SIGNATURE----- --/04w6evG8XlLl3ft-- From webvenza@libero.it Sun Dec 12 04:30:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 04:31:02 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCCUsAi029412 for ; Sun, 12 Dec 2004 04:30:55 -0800 Date: Sun, 12 Dec 2004 13:30:31 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 1/5] sis900 debugging and revision code Message-ID: <20041212123031.GB16325@gateway.milesteg.arr> Mail-Followup-To: NetDev , Jeff Garzik References: <20041212121852.GA16325@gateway.milesteg.arr> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="NU0Ex4SbNnrxsi6C" Content-Disposition: inline In-Reply-To: <20041212121852.GA16325@gateway.milesteg.arr> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. 
X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12674 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --NU0Ex4SbNnrxsi6C Content-Type: multipart/mixed; boundary="1UWUbFP1cBYEclgG" Content-Disposition: inline --1UWUbFP1cBYEclgG Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Infrastructure needed for standard netif messages - add msg_level to sis900_private - define default msg level - set default value for sis900_debug Update module parameter description Ethtool support for debugging output level Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --1UWUbFP1cBYEclgG Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="patch_60.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 59) +++ b/drivers/net/sis900.c (revision 60) @@ -82,9 +82,14 @@ static int max_interrupt_work = 40; static int multicast_filter_limit = 128; -#define sis900_debug debug -static int sis900_debug; +static int sis900_debug = -1; /* Use SIS900_DEF_MSG as value */ +#define SIS900_DEF_MSG \ + (NETIF_MSG_DRV | \ + NETIF_MSG_LINK | \ + NETIF_MSG_RX_ERR | \ + NETIF_MSG_TX_ERR) + /* Time in jiffies before concluding the transmitter is hung. */ #define TX_TIMEOUT (4*HZ) /* SiS 900 is capable of 32 bits BM DMA */ @@ -160,6 +165,8 @@ struct timer_list timer; /* Link status detection timer. 
*/ u8 autong_complete; /* 1: auto-negotiate complete */ + u32 msg_enable; + unsigned int cur_rx, dirty_rx; /* producer/comsumer pointers for Tx/Rx ring */ unsigned int cur_tx, dirty_tx; @@ -183,10 +190,10 @@ module_param(multicast_filter_limit, int, 0444); module_param(max_interrupt_work, int, 0444); -module_param(debug, int, 0444); +module_param(sis900_debug, int, 0444); MODULE_PARM_DESC(multicast_filter_limit, "SiS 900/7016 maximum number of filtered multicast addresses"); MODULE_PARM_DESC(max_interrupt_work, "SiS 900/7016 maximum events handled per interrupt"); -MODULE_PARM_DESC(debug, "SiS 900/7016 debug level (2-4)"); +MODULE_PARM_DESC(sis900_debug, "SiS 900/7016 bitmapped debugging message level"); static int sis900_open(struct net_device *net_dev); static int sis900_mii_probe (struct net_device * net_dev); @@ -457,6 +464,11 @@ net_dev->tx_timeout = sis900_tx_timeout; net_dev->watchdog_timeo = TX_TIMEOUT; net_dev->ethtool_ops = &sis900_ethtool_ops; + + if (sis900_debug > 0) + sis_priv->msg_enable = sis900_debug; + else + sis_priv->msg_enable = SIS900_DEF_MSG; ret = register_netdev(net_dev); if (ret) @@ -1905,8 +1917,22 @@ strcpy (info->bus_info, pci_name(sis_priv->pci_dev)); } +static u32 sis900_get_msglevel(struct net_device *net_dev) +{ + struct sis900_private *sis_priv = net_dev->priv; + return sis_priv->msg_enable; +} + +static void sis900_set_msglevel(struct net_device *net_dev, u32 value) +{ + struct sis900_private *sis_priv = net_dev->priv; + sis_priv->msg_enable = value; +} + static struct ethtool_ops sis900_ethtool_ops = { - .get_drvinfo = sis900_get_drvinfo, + .get_drvinfo = sis900_get_drvinfo, + .get_msglevel = sis900_get_msglevel, + .set_msglevel = sis900_set_msglevel, }; /** --1UWUbFP1cBYEclgG-- --NU0Ex4SbNnrxsi6C Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) 
iD8DBQFBvDnn2rmHZCWzV+0RAvilAJ9v02B0ef2GT+U9lxckHAZd9vGFOQCgm4U3 qB5B5edZOzRJG48ybGFZkCQ= =t0zt -----END PGP SIGNATURE----- --NU0Ex4SbNnrxsi6C-- From webvenza@libero.it Sun Dec 12 04:31:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 04:31:57 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCCVn1k029623 for ; Sun, 12 Dec 2004 04:31:50 -0800 Date: Sun, 12 Dec 2004 13:31:25 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 2/5] sis900 debugging and revision code Message-ID: <20041212123125.GC16325@gateway.milesteg.arr> Mail-Followup-To: NetDev , Jeff Garzik References: <20041212121852.GA16325@gateway.milesteg.arr> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="bi5JUZtvcfApsciF" Content-Disposition: inline In-Reply-To: <20041212121852.GA16325@gateway.milesteg.arr> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. 
X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12675 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --bi5JUZtvcfApsciF Content-Type: multipart/mixed; boundary="M38YqGLZlgb6RLPS" Content-Disposition: inline --M38YqGLZlgb6RLPS Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Version bump Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --M38YqGLZlgb6RLPS Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="patch_61.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 60) +++ b/drivers/net/sis900.c (revision 61) @@ -1,6 +1,6 @@ /* sis900.c: A SiS 900/7016 PCI Fast Ethernet driver for Linux. Copyright 1999 Silicon Integrated System Corporation - Revision: 1.08.06 Sep. 24 2002 + Revision: 1.08.08 Dec. 12 2004 Modified from the driver which is originally written by Donald Becker. @@ -16,8 +16,8 @@ preliminary Rev. 1.0 Nov. 10, 1998 SiS 7014 Single Chip 100BASE-TX/10BASE-T Physical Layer Solution, preliminary Rev. 1.0 Jan. 18, 1998 - http://www.sis.com.tw/support/databook.htm + Rev 1.08.08 Dec. 12 2004 Daniele Venzano use netif_msg for debugging messages Rev 1.08.07 Nov. 2 2003 Daniele Venzano add suspend/resume support Rev 1.08.06 Sep. 24 2002 Mufasa Yang bug fix for Tx timeout & add SiS963 support Rev 1.08.05 Jun. 6 2002 Mufasa Yang bug fix for read_eeprom & Tx descriptor over-boundary @@ -74,7 +74,7 @@ #include "sis900.h" #define SIS900_MODULE_NAME "sis900" -#define SIS900_DRV_VERSION "v1.08.07 11/02/2003" +#define SIS900_DRV_VERSION "v1.08.08 Dec. 
12 2004" static char version[] __devinitdata = KERN_INFO "sis900.c: " SIS900_DRV_VERSION "\n"; --M38YqGLZlgb6RLPS-- --bi5JUZtvcfApsciF Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBvDoc2rmHZCWzV+0RAn21AJ0UObwcBUaP/s74EXdowPtAn+dvTACfcEOA CC/qu52ZHxaBo8nMKYieoT0= =CC6M -----END PGP SIGNATURE----- --bi5JUZtvcfApsciF-- From webvenza@libero.it Sun Dec 12 04:32:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 04:32:46 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCCWbi0030215 for ; Sun, 12 Dec 2004 04:32:38 -0800 Date: Sun, 12 Dec 2004 13:32:14 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 3/5] sis900 debugging and revision code Message-ID: <20041212123214.GD16325@gateway.milesteg.arr> Mail-Followup-To: NetDev , Jeff Garzik References: <20041212121852.GA16325@gateway.milesteg.arr> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="fCcDWlUEdh43YKr8" Content-Disposition: inline In-Reply-To: <20041212121852.GA16325@gateway.milesteg.arr> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. 
X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12676 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --fCcDWlUEdh43YKr8 Content-Type: multipart/mixed; boundary="/aVve/J9H4Wl5yVO" Content-Disposition: inline --/aVve/J9H4Wl5yVO Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Add sis900: prefix to all messages to do simpler greps in logs Change priority of printk to KERN_DEBUG where appropriate Remove two cryptic and useless printk Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --/aVve/J9H4Wl5yVO Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="patch_62.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 61) +++ b/drivers/net/sis900.c (revision 62) @@ -76,12 +76,8 @@ #define SIS900_MODULE_NAME "sis900" #define SIS900_DRV_VERSION "v1.08.08 Dec. 12 2004" -static char version[] __devinitdata = -KERN_INFO "sis900.c: " SIS900_DRV_VERSION "\n"; - static int max_interrupt_work = 40; static int multicast_filter_limit = 128; - static int sis900_debug = -1; /* Use SIS900_DEF_MSG as value */ #define SIS900_DEF_MSG \ @@ -90,6 +86,11 @@ NETIF_MSG_RX_ERR | \ NETIF_MSG_TX_ERR) +#define PFX "sis900: " /* Prefix common to all sis900 output messages */ + +static char version[] __devinitdata = +KERN_INFO PFX SIS900_DRV_VERSION "\n"; + /* Time in jiffies before concluding the transmitter is hung. 
*/ #define TX_TIMEOUT (4*HZ) /* SiS 900 is capable of 32 bits BM DMA */ @@ -243,7 +244,7 @@ /* check to see if we have sane EEPROM */ signature = (u16) read_eeprom(ioaddr, EEPROMSignature); if (signature == 0xffff || signature == 0x0000) { - printk (KERN_INFO "%s: Error EERPOM read %x\n", + printk (KERN_WARNING PFX "%s: Error EERPOM read %x\n", net_dev->name, signature); return 0; } @@ -276,7 +277,7 @@ if (!isa_bridge) { isa_bridge = pci_find_device(PCI_VENDOR_ID_SI, 0x0018, isa_bridge); if (!isa_bridge) { - printk("%s: Can not find ISA bridge\n", net_dev->name); + printk(KERN_WARNING PFX "%s: Cannot find ISA bridge\n", net_dev->name); return 0; } } @@ -410,7 +411,7 @@ i = pci_set_dma_mask(pci_dev, SIS900_DMA_MASK); if(i){ - printk(KERN_ERR "sis900.c: architecture does not support" + printk(KERN_ERR PFX "sis900.c: architecture does not support" "32bit PCI busmaster DMA\n"); return i; } @@ -508,7 +509,7 @@ pci_read_config_byte(dev, PCI_CLASS_REVISION, &sis_priv->host_bridge_rev); /* print some information about our NIC */ - printk(KERN_INFO "%s: %s at %#lx, IRQ %d, ", net_dev->name, + printk(KERN_INFO PFX "%s: %s at %#lx, IRQ %d, ", net_dev->name, card_name, ioaddr, net_dev->irq); for (i = 0; i < 5; i++) printk("%2.2x:", (u8)net_dev->dev_addr[i]); @@ -566,7 +567,7 @@ continue; if ((mii_phy = kmalloc(sizeof(struct mii_phy), GFP_KERNEL)) == NULL) { - printk(KERN_INFO "Cannot allocate mem for struct mii_phy\n"); + printk(KERN_WARNING PFX "Cannot allocate mem for struct mii_phy\n"); mii_phy = sis_priv->first_mii; while (mii_phy) { struct mii_phy *phy; @@ -592,21 +593,21 @@ if (mii_chip_table[i].phy_types == MIX) mii_phy->phy_types = (mii_status & (MII_STAT_CAN_TX_FDX | MII_STAT_CAN_TX)) ? 
LAN : HOME; - printk(KERN_INFO "%s: %s transceiver found at address %d.\n", + printk(KERN_INFO PFX "%s: %s transceiver found at address %d.\n", net_dev->name, mii_chip_table[i].name, phy_addr); break; } if( !mii_chip_table[i].phy_id1 ) { - printk(KERN_INFO "%s: Unknown PHY transceiver found at address %d.\n", + printk(KERN_INFO PFX "%s: Unknown PHY transceiver found at address %d.\n", net_dev->name, phy_addr); mii_phy->phy_types = UNKNOWN; } } if (sis_priv->mii == NULL) { - printk(KERN_INFO "%s: No MII transceivers found!\n", + printk(KERN_INFO PFX "%s: No MII transceivers found!\n", net_dev->name); return 0; } @@ -631,7 +632,7 @@ poll_bit ^= (mdio_read(net_dev, sis_priv->cur_phy, MII_STATUS) & poll_bit); if (time_after_eq(jiffies, timeout)) { - printk(KERN_WARNING "%s: reset phy and link down now\n", + printk(KERN_WARNING PFX "%s: reset phy and link down now\n", net_dev->name); return -ETIME; } @@ -701,7 +702,7 @@ if (sis_priv->mii != default_phy) { sis_priv->mii = default_phy; sis_priv->cur_phy = default_phy->phy_addr; - printk(KERN_INFO "%s: Using transceiver found at address %d as default\n", + printk(KERN_INFO PFX "%s: Using transceiver found at address %d as default\n", net_dev->name,sis_priv->cur_phy); } @@ -1028,7 +1029,7 @@ outl(w, ioaddr + rfdr); if (sis900_debug > 2) { - printk(KERN_INFO "%s: Receive Filter Addrss[%d]=%x\n", + printk(KERN_DEBUG PFX "%s: Receive Filter Addrss[%d]=%x\n", net_dev->name, i, inl(ioaddr + rfdr)); } } @@ -1066,7 +1067,7 @@ /* load Transmit Descriptor Register */ outl(sis_priv->tx_ring_dma, ioaddr + txdp); if (sis900_debug > 2) - printk(KERN_INFO "%s: TX descriptor register loaded with: %8.8x\n", + printk(KERN_DEBUG PFX "%s: TX descriptor register loaded with: %8.8x\n", net_dev->name, inl(ioaddr + txdp)); } @@ -1120,7 +1121,7 @@ /* load Receive Descriptor Register */ outl(sis_priv->rx_ring_dma, ioaddr + rxdp); if (sis900_debug > 2) - printk(KERN_INFO "%s: RX descriptor register loaded with: %8.8x\n", + printk(KERN_DEBUG PFX "%s: 
RX descriptor register loaded with: %8.8x\n", net_dev->name, inl(ioaddr + rxdp)); } @@ -1267,7 +1268,7 @@ /* Link ON -> OFF */ if (!(status & MII_STAT_LINK)){ netif_carrier_off(net_dev); - printk(KERN_INFO "%s: Media Link Off\n", net_dev->name); + printk(KERN_INFO PFX "%s: Media Link Off\n", net_dev->name); /* Change mode issue */ if ((mii_phy->phy_id0 == 0x001D) && @@ -1382,7 +1383,7 @@ status = mdio_read(net_dev, phy_addr, MII_STATUS); if (!(status & MII_STAT_LINK)){ - printk(KERN_INFO "%s: Media Link Off\n", net_dev->name); + printk(KERN_INFO PFX "%s: Media Link Off\n", net_dev->name); sis_priv->autong_complete = 1; netif_carrier_off(net_dev); return; @@ -1444,7 +1445,7 @@ *speed = HW_SPEED_100_MBPS; } - printk(KERN_INFO "%s: Media Link On %s %s-duplex \n", + printk(KERN_INFO PFX "%s: Media Link On %s %s-duplex \n", net_dev->name, *speed == HW_SPEED_100_MBPS ? "100mbps" : "10mbps", @@ -1467,7 +1468,7 @@ unsigned long flags; int i; - printk(KERN_INFO "%s: Transmit timeout, status %8.8x %8.8x \n", + printk(KERN_INFO PFX "%s: Transmit timeout, status %8.8x %8.8x \n", net_dev->name, inl(ioaddr + cr), inl(ioaddr + isr)); /* Disable interrupts by clearing the interrupt mask. */ @@ -1570,7 +1571,7 @@ net_dev->trans_start = jiffies; if (sis900_debug > 3) - printk(KERN_INFO "%s: Queued Tx packet at %p size %d " + printk(KERN_DEBUG PFX "%s: Queued Tx packet at %p size %d " "to slot %d.\n", net_dev->name, skb->data, (int)skb->len, entry); @@ -1617,12 +1618,12 @@ /* something strange happened !!! 
*/ if (status & HIBERR) { - printk(KERN_INFO "%s: Abnormal interrupt," + printk(KERN_INFO PFX "%s: Abnormal interrupt," "status %#8.8x.\n", net_dev->name, status); break; } if (--boguscnt < 0) { - printk(KERN_INFO "%s: Too much work at interrupt, " + printk(KERN_INFO PFX "%s: Too much work at interrupt, " "interrupt status = %#8.8x.\n", net_dev->name, status); break; @@ -1630,7 +1631,7 @@ } while (1); if (sis900_debug > 3) - printk(KERN_INFO "%s: exiting interrupt, " + printk(KERN_DEBUG PFX "%s: exiting interrupt, " "interrupt status = 0x%#8.8x.\n", net_dev->name, inl(ioaddr + isr)); @@ -1656,7 +1657,7 @@ u32 rx_status = sis_priv->rx_ring[entry].cmdsts; if (sis900_debug > 3) - printk(KERN_INFO "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d " + printk(KERN_DEBUG PFX "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d " "status:0x%8.8x\n", sis_priv->cur_rx, sis_priv->dirty_rx, rx_status); @@ -1668,7 +1669,7 @@ if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) { /* corrupted packet received */ if (sis900_debug > 3) - printk(KERN_INFO "%s: Corrupted packet " + printk(KERN_DEBUG PFX "%s: Corrupted packet " "received, buffer status = 0x%8.8x.\n", net_dev->name, rx_status); sis_priv->stats.rx_errors++; @@ -1689,7 +1690,7 @@ some unknow bugs, it is possible that we are working on NULL sk_buff :-( */ if (sis_priv->rx_skbuff[entry] == NULL) { - printk(KERN_INFO "%s: NULL pointer " + printk(KERN_INFO PFX "%s: NULL pointer " "encountered in Rx ring, skipping\n", net_dev->name); break; @@ -1718,7 +1719,7 @@ * "hole" on the buffer ring, it is not clear * how the hardware will react to this kind * of degenerated buffer */ - printk(KERN_INFO "%s: Memory squeeze," + printk(KERN_INFO PFX "%s: Memory squeeze," "deferring packet.\n", net_dev->name); sis_priv->rx_skbuff[entry] = NULL; @@ -1754,7 +1755,7 @@ * "hole" on the buffer ring, it is not clear * how the hardware will react to this kind * of degenerated buffer */ - printk(KERN_INFO "%s: Memory squeeze," + printk(KERN_INFO PFX 
"%s: Memory squeeze," "deferring packet.\n", net_dev->name); sis_priv->stats.rx_dropped++; @@ -1806,7 +1807,7 @@ if (tx_status & (ABORT | UNDERRUN | OWCOLL)) { /* packet unsuccessfully transmitted */ if (sis900_debug > 3) - printk(KERN_INFO "%s: Transmit " + printk(KERN_DEBUG PFX "%s: Transmit " "error, Tx status %8.8x.\n", net_dev->name, tx_status); sis_priv->stats.tx_errors++; @@ -2073,12 +2074,10 @@ case IF_PORT_AUI: /* AUI */ case IF_PORT_100BASEFX: /* 100BaseFx */ /* These Modes are not supported (are they?)*/ - printk(KERN_INFO "Not supported"); return -EOPNOTSUPP; break; default: - printk(KERN_INFO "Invalid"); return -EINVAL; } } --/aVve/J9H4Wl5yVO-- --fCcDWlUEdh43YKr8 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBvDpO2rmHZCWzV+0RAsB1AJ9tfdParizZ6dnsLNzF27RqC+emxACeMXP2 pvIgmbKSzQ0GtEqEtKUhkoI= =E3NF -----END PGP SIGNATURE----- --fCcDWlUEdh43YKr8-- From webvenza@libero.it Sun Dec 12 04:33:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 04:33:41 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCCXZZx030807 for ; Sun, 12 Dec 2004 04:33:36 -0800 Date: Sun, 12 Dec 2004 13:33:12 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 4/5] sis900 debugging and revision code Message-ID: <20041212123312.GE16325@gateway.milesteg.arr> Mail-Followup-To: NetDev , Jeff Garzik References: <20041212121852.GA16325@gateway.milesteg.arr> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="LYw3s/afESlflPpp" Content-Disposition: inline In-Reply-To: <20041212121852.GA16325@gateway.milesteg.arr> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. 
X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12677 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --LYw3s/afESlflPpp Content-Type: multipart/mixed; boundary="Bg2esWel0ueIH/G/" Content-Disposition: inline --Bg2esWel0ueIH/G/ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Add some init debugging printk Use netif_msg macros before printing debug messages Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --Bg2esWel0ueIH/G/ Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="patch_63.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 62) +++ b/drivers/net/sis900.c (revision 63) @@ -477,8 +477,13 @@ /* Get Mac address according to the chip revision */ pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); + + if(netif_msg_probe(sis_priv)) + printk(KERN_DEBUG PFX "%s: detected revision %2.2x, " + "trying to get MAC address...\n", + net_dev->name, revision); + ret = 0; - if (revision == SIS630E_900_REV) ret = sis630e_get_mac_addr(pci_dev, net_dev); else if ((revision > 0x81) && (revision <= 0x90) ) @@ -489,6 +494,7 @@ ret = sis900_get_mac_addr(pci_dev, net_dev); if (ret == 0) { + printk(KERN_WARNING PFX "%s: Cannot read MAC address.\n", net_dev->name); ret = -ENODEV; goto err_out_unregister; } @@ -499,6 +505,7 @@ /* probe for mii transceiver */ if (sis900_mii_probe(net_dev) == 0) { + printk(KERN_WARNING PFX "%s: Error probing MII device.\n", net_dev->name); ret = -ENODEV; goto 
err_out_unregister; } @@ -562,9 +569,13 @@ for(i = 0; i < 2; i++) mii_status = mdio_read(net_dev, phy_addr, MII_STATUS); - if (mii_status == 0xffff || mii_status == 0x0000) - /* the mii is not accessible, try next one */ + if (mii_status == 0xffff || mii_status == 0x0000) { + if (netif_msg_probe(sis_priv)) + printk(KERN_DEBUG PFX "%s: MII at address %d" + " not accessible\n", + net_dev->name, phy_addr); continue; + } if ((mii_phy = kmalloc(sizeof(struct mii_phy), GFP_KERNEL)) == NULL) { printk(KERN_WARNING PFX "Cannot allocate mem for struct mii_phy\n"); @@ -593,9 +604,11 @@ if (mii_chip_table[i].phy_types == MIX) mii_phy->phy_types = (mii_status & (MII_STAT_CAN_TX_FDX | MII_STAT_CAN_TX)) ? LAN : HOME; - printk(KERN_INFO PFX "%s: %s transceiver found at address %d.\n", - net_dev->name, mii_chip_table[i].name, - phy_addr); + printk(KERN_INFO PFX "%s: %s transceiver found " + "at address %d.\n", + net_dev->name, + mii_chip_table[i].name, + phy_addr); break; } @@ -1011,6 +1024,7 @@ static void sis900_init_rxfilter (struct net_device * net_dev) { + struct sis900_private *sis_priv = net_dev->priv; long ioaddr = net_dev->base_addr; u32 rfcrSave; u32 i; @@ -1028,7 +1042,7 @@ outl((i << RFADDR_shift), ioaddr + rfcr); outl(w, ioaddr + rfdr); - if (sis900_debug > 2) { + if (netif_msg_hw(sis_priv)) { printk(KERN_DEBUG PFX "%s: Receive Filter Addrss[%d]=%x\n", net_dev->name, i, inl(ioaddr + rfdr)); } @@ -1066,7 +1080,7 @@ /* load Transmit Descriptor Register */ outl(sis_priv->tx_ring_dma, ioaddr + txdp); - if (sis900_debug > 2) + if (netif_msg_hw(sis_priv)) printk(KERN_DEBUG PFX "%s: TX descriptor register loaded with: %8.8x\n", net_dev->name, inl(ioaddr + txdp)); } @@ -1120,7 +1134,7 @@ /* load Receive Descriptor Register */ outl(sis_priv->rx_ring_dma, ioaddr + rxdp); - if (sis900_debug > 2) + if (netif_msg_hw(sis_priv)) printk(KERN_DEBUG PFX "%s: RX descriptor register loaded with: %8.8x\n", net_dev->name, inl(ioaddr + rxdp)); } @@ -1268,7 +1282,8 @@ /* Link ON -> OFF */ if 
(!(status & MII_STAT_LINK)){ netif_carrier_off(net_dev); - printk(KERN_INFO PFX "%s: Media Link Off\n", net_dev->name); + if(netif_msg_link(sis_priv)) + printk(KERN_INFO PFX "%s: Media Link Off\n", net_dev->name); /* Change mode issue */ if ((mii_phy->phy_id0 == 0x001D) && @@ -1383,7 +1398,8 @@ status = mdio_read(net_dev, phy_addr, MII_STATUS); if (!(status & MII_STAT_LINK)){ - printk(KERN_INFO PFX "%s: Media Link Off\n", net_dev->name); + if(netif_msg_link(sis_priv)) + printk(KERN_INFO PFX "%s: Media Link Off\n", net_dev->name); sis_priv->autong_complete = 1; netif_carrier_off(net_dev); return; @@ -1445,7 +1461,8 @@ *speed = HW_SPEED_100_MBPS; } - printk(KERN_INFO PFX "%s: Media Link On %s %s-duplex \n", + if(netif_msg_link(sis_priv)) + printk(KERN_INFO PFX "%s: Media Link On %s %s-duplex \n", net_dev->name, *speed == HW_SPEED_100_MBPS ? "100mbps" : "10mbps", @@ -1468,8 +1485,9 @@ unsigned long flags; int i; - printk(KERN_INFO PFX "%s: Transmit timeout, status %8.8x %8.8x \n", - net_dev->name, inl(ioaddr + cr), inl(ioaddr + isr)); + if(netif_msg_tx_err(sis_priv)) + printk(KERN_INFO PFX "%s: Transmit timeout, status %8.8x %8.8x \n", + net_dev->name, inl(ioaddr + cr), inl(ioaddr + isr)); /* Disable interrupts by clearing the interrupt mask. */ outl(0x0000, ioaddr + imr); @@ -1570,7 +1588,7 @@ net_dev->trans_start = jiffies; - if (sis900_debug > 3) + if (netif_msg_tx_queued(sis_priv)) printk(KERN_DEBUG PFX "%s: Queued Tx packet at %p size %d " "to slot %d.\n", net_dev->name, skb->data, (int)skb->len, entry); @@ -1618,19 +1636,21 @@ /* something strange happened !!! 
*/ if (status & HIBERR) { - printk(KERN_INFO PFX "%s: Abnormal interrupt," - "status %#8.8x.\n", net_dev->name, status); + if(netif_msg_intr(sis_priv)) + printk(KERN_INFO PFX "%s: Abnormal interrupt," + "status %#8.8x.\n", net_dev->name, status); break; } if (--boguscnt < 0) { - printk(KERN_INFO PFX "%s: Too much work at interrupt, " - "interrupt status = %#8.8x.\n", - net_dev->name, status); + if(netif_msg_intr(sis_priv)) + printk(KERN_INFO PFX "%s: Too much work at interrupt, " + "interrupt status = %#8.8x.\n", + net_dev->name, status); break; } } while (1); - if (sis900_debug > 3) + if(netif_msg_intr(sis_priv)) printk(KERN_DEBUG PFX "%s: exiting interrupt, " "interrupt status = 0x%#8.8x.\n", net_dev->name, inl(ioaddr + isr)); @@ -1656,7 +1676,7 @@ unsigned int entry = sis_priv->cur_rx % NUM_RX_DESC; u32 rx_status = sis_priv->rx_ring[entry].cmdsts; - if (sis900_debug > 3) + if (netif_msg_rx_status(sis_priv)) printk(KERN_DEBUG PFX "sis900_rx, cur_rx:%4.4d, dirty_rx:%4.4d " "status:0x%8.8x\n", sis_priv->cur_rx, sis_priv->dirty_rx, rx_status); @@ -1668,7 +1688,7 @@ if (rx_status & (ABORT|OVERRUN|TOOLONG|RUNT|RXISERR|CRCERR|FAERR)) { /* corrupted packet received */ - if (sis900_debug > 3) + if (netif_msg_rx_err(sis_priv)) printk(KERN_DEBUG PFX "%s: Corrupted packet " "received, buffer status = 0x%8.8x.\n", net_dev->name, rx_status); @@ -1690,9 +1710,10 @@ some unknow bugs, it is possible that we are working on NULL sk_buff :-( */ if (sis_priv->rx_skbuff[entry] == NULL) { - printk(KERN_INFO PFX "%s: NULL pointer " - "encountered in Rx ring, skipping\n", - net_dev->name); + if (netif_msg_rx_err(sis_priv)) + printk(KERN_INFO PFX "%s: NULL pointer " + "encountered in Rx ring, skipping\n", + net_dev->name); break; } @@ -1719,9 +1740,10 @@ * "hole" on the buffer ring, it is not clear * how the hardware will react to this kind * of degenerated buffer */ - printk(KERN_INFO PFX "%s: Memory squeeze," - "deferring packet.\n", - net_dev->name); + if 
(netif_msg_rx_status(sis_priv)) + printk(KERN_INFO PFX "%s: Memory squeeze," + "deferring packet.\n", + net_dev->name); sis_priv->rx_skbuff[entry] = NULL; /* reset buffer descriptor state */ sis_priv->rx_ring[entry].cmdsts = 0; @@ -1755,9 +1777,10 @@ * "hole" on the buffer ring, it is not clear * how the hardware will react to this kind * of degenerated buffer */ - printk(KERN_INFO PFX "%s: Memory squeeze," - "deferring packet.\n", - net_dev->name); + if (netif_msg_rx_err(sis_priv)) + printk(KERN_INFO PFX "%s: Memory squeeze," + "deferring packet.\n", + net_dev->name); sis_priv->stats.rx_dropped++; break; } @@ -1806,7 +1829,7 @@ if (tx_status & (ABORT | UNDERRUN | OWCOLL)) { /* packet unsuccessfully transmitted */ - if (sis900_debug > 3) + if (netif_msg_tx_err(sis_priv)) printk(KERN_DEBUG PFX "%s: Transmit " "error, Tx status %8.8x.\n", net_dev->name, tx_status); --Bg2esWel0ueIH/G/-- --LYw3s/afESlflPpp Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBvDqI2rmHZCWzV+0RApd/AJsF9W/acatHchRiHAKPWHJBKYtWzgCfTT6/ Rc2GjJ//NocK3qfndTJNkic= =HuL+ -----END PGP SIGNATURE----- --LYw3s/afESlflPpp-- From webvenza@libero.it Sun Dec 12 04:34:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 04:34:36 -0800 (PST) Received: from gateway.milesteg.arr (venza@adsl-30-96.38-151.net24.it [151.38.96.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCCYSH0031249 for ; Sun, 12 Dec 2004 04:34:29 -0800 Date: Sun, 12 Dec 2004 13:34:05 +0100 From: Daniele Venzano To: NetDev , Jeff Garzik Subject: [PATCH 5/5] sis900 debugging and revision code Message-ID: <20041212123405.GF16325@gateway.milesteg.arr> Mail-Followup-To: NetDev , Jeff Garzik References: <20041212121852.GA16325@gateway.milesteg.arr> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="VJJoKLVEFXdmHQwR" 
Content-Disposition: inline In-Reply-To: <20041212121852.GA16325@gateway.milesteg.arr> X-Operating-System: Debian GNU/Linux on kernel Linux 2.4.27-grsec X-Copyright: Forwarding or publishing without permission is prohibited. X-Truth: La vita e' una questione di culo, o ce l'hai o te lo fanno. X-GPG-Fingerprint: 642A A345 1CEF B6E3 925C 23CE DAB9 8764 25B3 57ED User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12678 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: webvenza@libero.it Precedence: bulk X-list: netdev --VJJoKLVEFXdmHQwR Content-Type: multipart/mixed; boundary="GvznHscUikHnwW2p" Content-Disposition: inline --GvznHscUikHnwW2p Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Chip revision is now a member of sis_priv structure Kill all calls to pci_read_config_byte but one Change the code to use sis_priv->chipset_rev Signed-off-by: Daniele Venzano -- ----------------------------- Daniele Venzano Web: http://teg.homeunix.org --GvznHscUikHnwW2p Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="patch_64.diff" Index: sis900.c =================================================================== --- a/drivers/net/sis900.c (revision 63) +++ b/drivers/net/sis900.c (revision 64) @@ -182,6 +182,7 @@ unsigned int tx_full; /* The Tx queue is full. 
*/ u8 host_bridge_rev; + u8 chipset_rev; u32 pci_state[16]; }; @@ -395,7 +396,6 @@ void *ring_space; long ioaddr; int i, ret; - u8 revision; char *card_name = card_names[pci_id->driver_data]; /* when built into the kernel, we only print version if device is found */ @@ -476,19 +476,18 @@ goto err_unmap_rx; /* Get Mac address according to the chip revision */ - pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &revision); - + pci_read_config_byte(pci_dev, PCI_CLASS_REVISION, &(sis_priv->chipset_rev)); if(netif_msg_probe(sis_priv)) printk(KERN_DEBUG PFX "%s: detected revision %2.2x, " "trying to get MAC address...\n", - net_dev->name, revision); + net_dev->name, sis_priv->chipset_rev); ret = 0; - if (revision == SIS630E_900_REV) + if (sis_priv->chipset_rev == SIS630E_900_REV) ret = sis630e_get_mac_addr(pci_dev, net_dev); - else if ((revision > 0x81) && (revision <= 0x90) ) + else if ((sis_priv->chipset_rev > 0x81) && (sis_priv->chipset_rev <= 0x90) ) ret = sis635_get_mac_addr(pci_dev, net_dev); - else if (revision == SIS96x_900_REV) + else if (sis_priv->chipset_rev == SIS96x_900_REV) ret = sis96x_get_mac_addr(pci_dev, net_dev); else ret = sis900_get_mac_addr(pci_dev, net_dev); @@ -500,7 +499,7 @@ } /* 630ET : set the mii access mode as software-mode */ - if (revision == SIS630ET_900_REV) + if (sis_priv->chipset_rev == SIS630ET_900_REV) outl(ACCESSMODE | inl(ioaddr + cr), ioaddr + cr); /* probe for mii transceiver */ @@ -555,7 +554,6 @@ u16 poll_bit = MII_STAT_LINK, status = 0; unsigned long timeout = jiffies + 5 * HZ; int phy_addr; - u8 revision; sis_priv->mii = NULL; @@ -652,8 +650,7 @@ } } - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - if (revision == SIS630E_900_REV) { + if (sis_priv->chipset_rev == SIS630E_900_REV) { /* SiS 630E has some bugs on default value of PHY registers */ mdio_write(net_dev, sis_priv->cur_phy, MII_ANADV, 0x05e1); mdio_write(net_dev, sis_priv->cur_phy, MII_CONFIG1, 0x22); @@ -968,15 +965,13 @@ { struct 
sis900_private *sis_priv = net_dev->priv; long ioaddr = net_dev->base_addr; - u8 revision; int ret; /* Soft reset the chip. */ sis900_reset(net_dev); /* Equalizer workaround Rule */ - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - sis630_set_eq(net_dev, revision); + sis630_set_eq(net_dev, sis_priv->chipset_rev); ret = request_irq(net_dev->irq, &sis900_interrupt, SA_SHIRQ, net_dev->name, net_dev); @@ -1245,7 +1240,6 @@ struct mii_phy *mii_phy = sis_priv->mii; static int next_tick = 5*HZ; u16 status; - u8 revision; if (!sis_priv->autong_complete){ int speed, duplex = 0; @@ -1253,9 +1247,7 @@ sis900_read_mode(net_dev, &speed, &duplex); if (duplex){ sis900_set_mode(net_dev->base_addr, speed, duplex); - pci_read_config_byte(sis_priv->pci_dev, - PCI_CLASS_REVISION, &revision); - sis630_set_eq(net_dev, revision); + sis630_set_eq(net_dev, sis_priv->chipset_rev); netif_start_queue(net_dev); } @@ -1290,9 +1282,7 @@ ((mii_phy->phy_id1 & 0xFFF0) == 0x8000)) sis900_reset_phy(net_dev, sis_priv->cur_phy); - pci_read_config_byte(sis_priv->pci_dev, - PCI_CLASS_REVISION, &revision); - sis630_set_eq(net_dev, revision); + sis630_set_eq(net_dev, sis_priv->chipset_rev); goto LookForLink; } @@ -2146,11 +2136,10 @@ u16 mc_filter[16] = {0}; /* 256/128 bits multicast hash table */ int i, table_entries; u32 rx_mode; - u8 revision; /* 635 Hash Table entires = 256(2^16) */ - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - if((revision >= SIS635A_900_REV) || (revision == SIS900B_900_REV)) + if((sis_priv->chipset_rev >= SIS635A_900_REV) || + (sis_priv->chipset_rev == SIS900B_900_REV)) table_entries = 16; else table_entries = 8; @@ -2176,7 +2165,7 @@ mclist && i < net_dev->mc_count; i++, mclist = mclist->next) { unsigned int bit_nr = - sis900_mcast_bitnr(mclist->dmi_addr, revision); + sis900_mcast_bitnr(mclist->dmi_addr, sis_priv->chipset_rev); mc_filter[bit_nr >> 4] |= (1 << (bit_nr & 0xf)); } } @@ -2222,7 +2211,6 @@ long ioaddr = 
net_dev->base_addr; int i = 0; u32 status = TxRCMP | RxRCMP; - u8 revision; outl(0, ioaddr + ier); outl(0, ioaddr + imr); @@ -2235,8 +2223,8 @@ status ^= (inl(isr + ioaddr) & status); } - pci_read_config_byte(sis_priv->pci_dev, PCI_CLASS_REVISION, &revision); - if( (revision >= SIS635A_900_REV) || (revision == SIS900B_900_REV) ) + if( (sis_priv->chipset_rev >= SIS635A_900_REV) || + (sis_priv->chipset_rev == SIS900B_900_REV) ) outl(PESEL | RND_CNT, ioaddr + cfg); else outl(PESEL, ioaddr + cfg); --GvznHscUikHnwW2p-- --VJJoKLVEFXdmHQwR Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBvDq92rmHZCWzV+0RAnotAJ9OAAaFNb5GWIrpIN+dnpXiHJkCdwCcDYfH kdQddi/3gfzXYigc7sz4xNw= =rmcA -----END PGP SIGNATURE----- --VJJoKLVEFXdmHQwR-- From olh@suse.de Sun Dec 12 06:34:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 06:34:38 -0800 (PST) Received: from Cantor.suse.de (cantor.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCEYXKR016738 for ; Sun, 12 Dec 2004 06:34:34 -0800 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id 126E31204BAE; Sun, 12 Dec 2004 15:34:06 +0100 (CET) Date: Sun, 12 Dec 2004 15:34:01 +0100 From: Olaf Hering To: "David S. Miller" , netdev@oss.sgi.com Subject: optional hwchecksum in sungem Message-ID: <20041212143401.GA21262@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline X-DOS: I got your 640K Real Mode Right Here Buddy! X-Homeland-Security: You are not supposed to read this line! You are a terrorist! 
User-Agent: Mutt und vi sind doch schneller als Notes (und GroupWise) X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12679 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: olh@suse.de Precedence: bulk X-list: netdev David, the hard drive in my ibook was broken and they replaced it. Somehow they managed to break the trackpad and sungem. The trackpad works sometimes, sometimes not. The sungem now needs skb->ip_summed set to 0, and then networking works fine again. Today I wondered why none of my iptables rules matched anymore with 2.6.10 (worked with 2.6.8). It turned out that most of the traffic matched a rule: iptables -I INPUT -i ppp0 -m state --state INVALID -j LOG I can't imagine how they crippled the network and the trackpad. Do you know why ethtool cannot turn off rx checksumming? Is it just not implemented, or is it not tunable during runtime? ethtool -k eth0 doesn't work for sungem. Would you accept a patch to make ->ip_summed a module option? From tgraf@suug.ch Sun Dec 12 09:57:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 09:57:47 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCHveC5026644 for ; Sun, 12 Dec 2004 09:57:41 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4D660F; Sun, 12 Dec 2004 18:56:54 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 9A23E1C0EA; Sun, 12 Dec 2004 18:57:36 +0100 (CET) Date: Sun, 12 Dec 2004 18:57:36 +0100 From: Thomas Graf To: Patrick McHardy Cc: "David S. 
Miller" , Herbert Xu , netdev@oss.sgi.com Subject: Re: request_module while holding rtnl semaphore Message-ID: <20041212175736.GA8493@postel.suug.ch> References: <41899DCF.3050804@trash.net> <20041109161126.376f755c.davem@davemloft.net> <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41917330.6090002@trash.net> X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12680 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev > >>Assuming all error-paths do proper cleanup, returning -EAGAIN > >>should always result in the same configuration state as before. > >I agree but this assumption is wrong, at least for u32. > > > It will be true soon :) Anything else is a bug, and a nice > side-effect of this change is that all those dusty error-paths > actually get used. I started working on a patchset to clean up all the error paths, allow changing all parameters for those not supporting it yet and to add the action bits for tcindex, route, and rsvp. Finishing it and writing all the test cases to do proper testing will take some time but I hope to have it ready for early 2.6.11 inclusion. Just to let you know so we don't do redundant work. 
From kaber@trash.net Sun Dec 12 10:05:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 10:05:14 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCI59LY027269 for ; Sun, 12 Dec 2004 10:05:09 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CdY4n-0007FR-AY; Sun, 12 Dec 2004 19:04:09 +0100 Message-ID: <41BC8819.7040501@trash.net> Date: Sun, 12 Dec 2004 19:04:09 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S. Miller" , Herbert Xu , netdev@oss.sgi.com Subject: Re: request_module while holding rtnl semaphore References: <41899DCF.3050804@trash.net> <20041109161126.376f755c.davem@davemloft.net> <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> In-Reply-To: <20041212175736.GA8493@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12681 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: >I started working on a patchset to clean up all the error paths, >allow changing all parameters for those not supporting it yet >and to add the action bits for tcindex, route, and rsvp. Finishing >it and writing all the test cases to do proper testing will take some >time but I hope to have it ready for early 2.6.11 inclusion. > >Just to let you know so we don't do redundant work. 
> Thanks for the information. Next thing I want to do is clean up the init paths and add proper error propagation. Regards Patrick From bunk@stusta.de Sun Dec 12 11:49:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 11:49:49 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCJneY3030955 for ; Sun, 12 Dec 2004 11:49:40 -0800 Received: (qmail 29970 invoked from network); 12 Dec 2004 19:49:12 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 19:49:12 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 92270BB577; Sun, 12 Dec 2004 20:49:03 +0100 (CET) Date: Sun, 12 Dec 2004 20:49:03 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] remove unused net/sunrpc/svcauth_des.c Message-ID: <20041212194903.GG22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12683 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev I wasn't able to find any usage of this file. diffstat output: net/sunrpc/svcauth_des.c | 215 --------------------------------------- 1 files changed, 215 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/net/sunrpc/svcauth_des.c 2004-10-18 23:54:37.000000000 +0200 +++ /dev/null 2004-11-25 03:16:25.000000000 +0100 @@ -1,215 +0,0 @@ -/* - * linux/net/sunrpc/svcauth_des.c - * - * Server-side AUTH_DES handling. 
- * - * Copyright (C) 1996, 1997 Olaf Kirch - */ - -#include -#include -#include -#include -#include -#include - -#define RPCDBG_FACILITY RPCDBG_AUTH - -/* - * DES cedential cache. - * The cache is indexed by fullname/key to allow for multiple sessions - * by the same user from different hosts. - * It would be tempting to use the client's IP address rather than the - * conversation key as an index, but that could become problematic for - * multi-homed hosts that distribute traffic across their interfaces. - */ -struct des_cred { - struct des_cred * dc_next; - char * dc_fullname; - u32 dc_nickname; - des_cblock dc_key; /* conversation key */ - des_cblock dc_xkey; /* encrypted conv. key */ - des_key_schedule dc_keysched; -}; - -#define ADN_FULLNAME 0 -#define ADN_NICKNAME 1 - -/* - * The default slack allowed when checking for replayed credentials - * (in milliseconds). - */ -#define DES_REPLAY_SLACK 2000 - -/* - * Make sure we don't place more than one call to the key server at - * a time. - */ -static int in_keycall; - -#define FAIL(err) \ - { if (data) put_cred(data); \ - *authp = rpc_autherr_##err; \ - return; \ - } - -void -svcauth_des(struct svc_rqst *rqstp, u32 *statp, u32 *authp) -{ - struct svc_buf *argp = &rqstp->rq_argbuf; - struct svc_buf *resp = &rqstp->rq_resbuf; - struct svc_cred *cred = &rqstp->rq_cred; - struct des_cred *data = NULL; - u32 cryptkey[2]; - u32 cryptbuf[4]; - u32 *p = argp->buf; - int len = argp->len, slen, i; - - *authp = rpc_auth_ok; - - if ((argp->len -= 3) < 0) { - *statp = rpc_garbage_args; - return; - } - - p++; /* skip length field */ - namekind = ntohl(*p++); /* fullname/nickname */ - - /* Get the credentials */ - if (namekind == ADN_NICKNAME) { - /* If we can't find the cached session key, initiate a - * new session. 
*/ - if (!(data = get_cred_bynick(*p++))) - FAIL(rejectedcred); - } else if (namekind == ADN_FULLNAME) { - p = xdr_decode_string(p, &fullname, &len, RPC_MAXNETNAMELEN); - if (p == NULL) - FAIL(badcred); - cryptkey[0] = *p++; /* get the encrypted key */ - cryptkey[1] = *p++; - cryptbuf[2] = *p++; /* get the encrypted window */ - } else { - FAIL(badcred); - } - - /* If we're just updating the key, silently discard the request. */ - if (data && data->dc_locked) { - *authp = rpc_autherr_dropit; - _put_cred(data); /* release but don't unlock */ - return; - } - - /* Get the verifier flavor and length */ - if (ntohl(*p++) != RPC_AUTH_DES && ntohl(*p++) != 12) - FAIL(badverf); - - cryptbuf[0] = *p++; /* encrypted time stamp */ - cryptbuf[1] = *p++; - cryptbuf[3] = *p++; /* 0 or window - 1 */ - - if (namekind == ADN_NICKNAME) { - status = des_ecb_encrypt((des_block *) cryptbuf, - (des_block *) cryptbuf, - data->dc_keysched, DES_DECRYPT); - } else { - /* We first have to decrypt the new session key and - * fill in the UNIX creds. */ - if (!(data = get_cred_byname(rqstp, authp, fullname, cryptkey))) - return; - status = des_cbc_encrypt((des_cblock *) cryptbuf, - (des_cblock *) cryptbuf, 16, - data->dc_keysched, - (des_cblock *) &ivec, - DES_DECRYPT); - } - if (status) { - printk("svcauth_des: DES decryption failed (status %d)\n", - status); - FAIL(badverf); - } - - /* Now check the whole lot */ - if (namekind == ADN_FULLNAME) { - unsigned long winverf; - - data->dc_window = ntohl(cryptbuf[2]); - winverf = ntohl(cryptbuf[2]); - if (window != winverf - 1) { - printk("svcauth_des: bad window verifier!\n"); - FAIL(badverf); - } - } - - /* XDR the decrypted timestamp */ - cryptbuf[0] = ntohl(cryptbuf[0]); - cryptbuf[1] = ntohl(cryptbuf[1]); - if (cryptbuf[1] > 1000000) { - dprintk("svcauth_des: bad usec value %u\n", cryptbuf[1]); - if (namekind == ADN_NICKNAME) - FAIL(rejectedverf); - FAIL(badverf); - } - - /* - * Check for replayed credentials. 
We must allow for reordering - * of requests by the network, and the OS scheduler, hence we - * cannot expect timestamps to be increasing monotonically. - * This opens a small security hole, therefore the replay_slack - * value shouldn't be too large. - */ - if ((delta = cryptbuf[0] - data->dc_timestamp[0]) <= 0) { - switch (delta) { - case -1: - delta = -1000000; - case 0: - delta += cryptbuf[1] - data->dc_timestamp[1]; - break; - default: - delta = -1000000; - } - if (delta < DES_REPLAY_SLACK) - FAIL(rejectedverf); -#ifdef STRICT_REPLAY_CHECKS - /* TODO: compare time stamp to last five timestamps cached - * and reject (drop?) request if a match is found. */ -#endif - } - - now = xtime; - now.tv_secs -= data->dc_window; - if (now.tv_secs < cryptbuf[0] || - (now.tv_secs == cryptbuf[0] && now.tv_usec < cryptbuf[1])) - FAIL(rejectedverf); - - /* Okay, we're done. Update the lot */ - if (namekind == ADN_FULLNAME) - data->dc_valid = 1; - data->dc_timestamp[0] = cryptbuf[0]; - data->dc_timestamp[1] = cryptbuf[1]; - - put_cred(data); - return; -garbage: - *statp = rpc_garbage_args; - return; -} - -/* - * Call the keyserver to obtain the decrypted conversation key and - * UNIX creds. We use a Linux-specific keycall extension that does - * both things in one go. 
- */ -static struct des_cred * -get_cred_byname(struct svc_rqst *rqstp, u32 *authp, char *fullname, u32 *cryptkey) -{ - static int in_keycall; - struct des_cred *cred; - - if (in_keycall) { - *authp = rpc_autherr_dropit; - return NULL; - } - in_keycall = 1; - in_keycall = 0; - return cred; -} From bunk@stusta.de Sun Dec 12 11:48:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 11:48:35 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCJmRev030799 for ; Sun, 12 Dec 2004 11:48:28 -0800 Received: (qmail 29896 invoked from network); 12 Dec 2004 19:47:59 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 12 Dec 2004 19:47:59 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id A778BBB577; Sun, 12 Dec 2004 20:47:50 +0100 (CET) Date: Sun, 12 Dec 2004 20:47:50 +0100 From: Adrian Bunk To: Andy Adamson Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] remove unused net/sunrpc/auth_gss/gss_pseudoflavors.c Message-ID: <20041212194750.GF22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12682 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev I wasn't able to find any usage of this file. 
diffstat output: net/sunrpc/auth_gss/gss_pseudoflavors.c | 237 ------------------------ 1 files changed, 237 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_pseudoflavors.c 2004-10-18 23:54:08.000000000 +0200 +++ /dev/null 2004-11-25 03:16:25.000000000 +0100 @@ -1,237 +0,0 @@ -/* - * linux/net/sunrpc/gss_union.c - * - * Adapted from MIT Kerberos 5-1.2.1 lib/gssapi/generic code - * - * Copyright (c) 2001 The Regents of the University of Michigan. - * All rights reserved. - * - * Andy Adamson - * - */ - -/* - * Copyright 1993 by OpenVision Technologies, Inc. - * - * Permission to use, copy, modify, distribute, and sell this software - * and its documentation for any purpose is hereby granted without fee, - * provided that the above copyright notice appears in all copies and - * that both that copyright notice and this permission notice appear in - * supporting documentation, and that the name of OpenVision not be used - * in advertising or publicity pertaining to distribution of the software - * without specific, written prior permission. OpenVision makes no - * representations about the suitability of this software for any - * purpose. It is provided "as is" without express or implied warranty. - * - * OPENVISION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, - * INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO - * EVENT SHALL OPENVISION BE LIABLE FOR ANY SPECIAL, INDIRECT OR - * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF - * USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR - * OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR - * PERFORMANCE OF THIS SOFTWARE. 
- */ - -#include -#include -#include -#include -#include - -#ifdef RPC_DEBUG -# define RPCDBG_FACILITY RPCDBG_AUTH -#endif - -static LIST_HEAD(registered_triples); -static spinlock_t registered_triples_lock = SPIN_LOCK_UNLOCKED; - -/* The following must be called with spinlock held: */ -static struct sup_sec_triple * -do_lookup_triple_by_pseudoflavor(u32 pseudoflavor) -{ - struct sup_sec_triple *pos, *triple = NULL; - - list_for_each_entry(pos, &registered_triples, triples) { - if (pos->pseudoflavor == pseudoflavor) { - triple = pos; - break; - } - } - return triple; -} - -/* XXX Need to think about reference counting of triples and of mechs. - * Currently we do no reference counting of triples, and I think that's - * probably OK given the reference counting on mechs, but there's probably - * a better way to do all this. */ - -int -gss_register_triple(u32 pseudoflavor, struct gss_api_mech *mech, - u32 qop, u32 service) -{ - struct sup_sec_triple *triple; - - if (!(triple = kmalloc(sizeof(*triple), GFP_KERNEL))) { - printk("Alloc failed in gss_register_triple"); - goto err; - } - triple->pseudoflavor = pseudoflavor; - triple->mech = gss_mech_get_by_OID(&mech->gm_oid); - triple->qop = qop; - triple->service = service; - - spin_lock(&registered_triples_lock); - if (do_lookup_triple_by_pseudoflavor(pseudoflavor)) { - printk(KERN_WARNING "RPC: Registered pseudoflavor %d again\n", - pseudoflavor); - goto err_unlock; - } - list_add(&triple->triples, &registered_triples); - spin_unlock(&registered_triples_lock); - dprintk("RPC: registered pseudoflavor %d\n", pseudoflavor); - - return 0; - -err_unlock: - kfree(triple); - spin_unlock(&registered_triples_lock); -err: - return -1; -} - -int -gss_unregister_triple(u32 pseudoflavor) -{ - struct sup_sec_triple *triple; - - spin_lock(&registered_triples_lock); - if (!(triple = do_lookup_triple_by_pseudoflavor(pseudoflavor))) { - spin_unlock(&registered_triples_lock); - printk("Can't unregister unregistered pseudoflavor %d\n", - pseudoflavor); - return 
-1; - } - list_del(&triple->triples); - spin_unlock(&registered_triples_lock); - gss_mech_put(triple->mech); - kfree(triple); - return 0; - -} - -void -print_sec_triple(struct xdr_netobj *oid,u32 qop,u32 service) -{ - dprintk("RPC: print_sec_triple:\n"); - dprintk(" oid_len %d\n oid :\n",oid->len); - print_hexl((u32 *)oid->data,oid->len,0); - dprintk(" qop %d\n",qop); - dprintk(" service %d\n",service); -} - -/* Function: gss_get_cmp_triples - * - * Description: search sec_triples for a matching security triple - * return pseudoflavor if match, else 0 - * (Note that 0 is a valid pseudoflavor, but not for any gss pseudoflavor - * (0 means auth_null), so this shouldn't cause confusion.) - */ -u32 -gss_cmp_triples(u32 oid_len, char *oid_data, u32 qop, u32 service) -{ - struct sup_sec_triple *triple; - u32 pseudoflavor = 0; - struct xdr_netobj oid; - - oid.len = oid_len; - oid.data = oid_data; - - dprintk("RPC: gss_cmp_triples\n"); - print_sec_triple(&oid,qop,service); - - spin_lock(&registered_triples_lock); - list_for_each_entry(triple, &registered_triples, triples) { - if((g_OID_equal(&oid, &triple->mech->gm_oid)) - && (qop == triple->qop) - && (service == triple->service)) { - pseudoflavor = triple->pseudoflavor; - break; - } - } - spin_unlock(&registered_triples_lock); - dprintk("RPC: gss_cmp_triples return %d\n", pseudoflavor); - return pseudoflavor; -} - -u32 -gss_get_pseudoflavor(struct gss_ctx *ctx, u32 qop, u32 service) -{ - return gss_cmp_triples(ctx->mech_type->gm_oid.len, - ctx->mech_type->gm_oid.data, - qop, service); -} - -/* Returns nonzero iff the given pseudoflavor is in the supported list. - * (Note that without incrementing a reference count or anything, this - * doesn't give any guarantees.) */ -int -gss_pseudoflavor_supported(u32 pseudoflavor) -{ - struct sup_sec_triple *triple; - - spin_lock(&registered_triples_lock); - triple = do_lookup_triple_by_pseudoflavor(pseudoflavor); - spin_unlock(&registered_triples_lock); - return (triple ? 
1 : 0); -} - -u32 -gss_pseudoflavor_to_service(u32 pseudoflavor) -{ - struct sup_sec_triple *triple; - - spin_lock(&registered_triples_lock); - triple = do_lookup_triple_by_pseudoflavor(pseudoflavor); - spin_unlock(&registered_triples_lock); - if (!triple) { - dprintk("RPC: gss_pseudoflavor_to_service called with unsupported pseudoflavor %d\n", - pseudoflavor); - return 0; - } - return triple->service; -} - -struct gss_api_mech * -gss_pseudoflavor_to_mech(u32 pseudoflavor) { - struct sup_sec_triple *triple; - struct gss_api_mech *mech = NULL; - - spin_lock(&registered_triples_lock); - triple = do_lookup_triple_by_pseudoflavor(pseudoflavor); - spin_unlock(&registered_triples_lock); - if (triple) - mech = gss_mech_get(triple->mech); - else - dprintk("RPC: gss_pseudoflavor_to_mech called with unsupported pseudoflavor %d\n", - pseudoflavor); - return mech; -} - -int -gss_pseudoflavor_to_mechOID(u32 pseudoflavor, struct xdr_netobj * oid) -{ - struct gss_api_mech *mech; - - mech = gss_pseudoflavor_to_mech(pseudoflavor); - if (!mech) { - dprintk("RPC: gss_pseudoflavor_to_mechOID called with unsupported pseudoflavor %d\n", - pseudoflavor); - return -1; - } - oid->len = mech->gm_oid.len; - if (!(oid->data = kmalloc(oid->len, GFP_KERNEL))) - return -1; - memcpy(oid->data, mech->gm_oid.data, oid->len); - gss_mech_put(mech); - return 0; -} From bunk@stusta.de Sun Dec 12 11:51:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 11:51:29 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCJpN0p031567 for ; Sun, 12 Dec 2004 11:51:24 -0800 Received: (qmail 30083 invoked from network); 12 Dec 2004 19:50:56 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 19:50:56 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 5C3CDBB577; Sun, 12 Dec 2004 20:50:47 +0100 (CET) Date: Sun, 12 Dec 2004 20:50:47 +0100 From: 
Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] remove unused Message-ID: <20041212195047.GH22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12684 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev I wasn't able to find any usage of this file (it seems the EXPORT_SYMBOL's were moved away, but deleting the file was forgotten). diffstat output: net/sunrpc/auth_gss/sunrpcgss_syms.c | 37 --------------------------- 1 files changed, 37 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/sunrpcgss_syms.c 2004-10-18 23:54:39.000000000 +0200 +++ /dev/null 2004-11-25 03:16:25.000000000 +0100 @@ -1,37 +0,0 @@ -#include -#include - -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -/* svcauth_gss.c: */ -EXPORT_SYMBOL(svcauth_gss_register_pseudoflavor); - -/* registering gss mechanisms to the mech switching code: */ -EXPORT_SYMBOL(gss_mech_register); -EXPORT_SYMBOL(gss_mech_unregister); -EXPORT_SYMBOL(gss_mech_get); -EXPORT_SYMBOL(gss_mech_get_by_pseudoflavor); -EXPORT_SYMBOL(gss_mech_get_by_name); -EXPORT_SYMBOL(gss_mech_put); -EXPORT_SYMBOL(gss_pseudoflavor_to_service); -EXPORT_SYMBOL(gss_service_to_auth_domain_name); - -/* generic functionality in gss code: */ -EXPORT_SYMBOL(g_make_token_header); -EXPORT_SYMBOL(g_verify_token_header); -EXPORT_SYMBOL(g_token_size); -EXPORT_SYMBOL(make_checksum); -EXPORT_SYMBOL(krb5_encrypt); -EXPORT_SYMBOL(krb5_decrypt); - -/* debug */ -EXPORT_SYMBOL(print_hexl); From bunk@stusta.de Sun Dec 12 12:11:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 12:12:01 
-0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCKBquk000423 for ; Sun, 12 Dec 2004 12:11:52 -0800 Received: (qmail 31311 invoked from network); 12 Dec 2004 20:11:24 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 20:11:24 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id CDBA2BB577; Sun, 12 Dec 2004 21:11:15 +0100 (CET) Date: Sun, 12 Dec 2004 21:11:15 +0100 From: Adrian Bunk To: acme@conectiva.com.br Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/802/: some cleanups Message-ID: <20041212201115.GU22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12685 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following cleanups: - make some needlessly global code static - net/802/hippi.c: remove the unused global function hippi_net_init - net/8021q/vlan.c: remove the global variable vlan_default_dev_flags that was never changed - drivers/net/net_init.c: remove four unneeded #include's diffstat output: drivers/net/fc/iph5526_ip.h | 1 - drivers/net/net_init.c | 4 ---- include/linux/fcdevice.h | 4 ---- include/linux/fddidevice.h | 7 ------- include/linux/hippidevice.h | 21 --------------------- include/linux/trdevice.h | 4 ---- net/802/fc.c | 7 ++++--- net/802/fddi.c | 7 ++++--- net/802/hippi.c | 19 ++++--------------- net/802/tr.c | 7 ++++--- net/8021q/vlan.c | 9 +++------ net/8021q/vlan.h | 1 - net/8021q/vlanproc.c | 2 +- 13 files changed, 20 insertions(+), 73 deletions(-) Signed-off-by: Adrian Bunk 
--- linux-2.6.10-rc2-mm4-full/include/linux/fcdevice.h.old 2004-12-12 18:03:19.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/fcdevice.h 2004-12-12 18:06:57.000000000 +0100 @@ -27,10 +27,6 @@ #include #ifdef __KERNEL__ -extern int fc_header(struct sk_buff *skb, struct net_device *dev, - unsigned short type, void *daddr, - void *saddr, unsigned len); -extern int fc_rebuild_header(struct sk_buff *skb); extern unsigned short fc_type_trans(struct sk_buff *skb, struct net_device *dev); extern struct net_device *alloc_fcdev(int sizeof_priv); --- linux-2.6.10-rc2-mm4-full/net/802/fc.c.old 2004-12-12 18:03:56.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/802/fc.c 2004-12-12 18:04:53.000000000 +0100 @@ -35,8 +35,9 @@ * Put the headers on a Fibre Channel packet. */ -int fc_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, - void *daddr, void *saddr, unsigned len) +static int fc_header(struct sk_buff *skb, struct net_device *dev, + unsigned short type, + void *daddr, void *saddr, unsigned len) { struct fch_hdr *fch; int hdr_len; @@ -81,7 +82,7 @@ * can now send the packet. 
*/ -int fc_rebuild_header(struct sk_buff *skb) +static int fc_rebuild_header(struct sk_buff *skb) { struct fch_hdr *fch=(struct fch_hdr *)skb->data; struct fcllc *fcllc=(struct fcllc *)(skb->data+sizeof(struct fch_hdr)); --- linux-2.6.10-rc2-mm4-full/drivers/net/fc/iph5526_ip.h.old 2004-12-12 18:07:07.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/drivers/net/fc/iph5526_ip.h 2004-12-12 18:07:21.000000000 +0100 @@ -18,7 +18,6 @@ static void rx_net_packet(struct fc_info *fi, u_char *buff_addr, int payload_size); static void rx_net_mfs_packet(struct fc_info *fi, struct sk_buff *skb); -unsigned short fc_type_trans(struct sk_buff *skb, struct net_device *dev); static int tx_ip_packet(struct sk_buff *skb, unsigned long len, struct fc_info *fi); static int tx_arp_packet(char *data, unsigned long len, struct fc_info *fi); #endif --- linux-2.6.10-rc2-mm4-full/drivers/net/net_init.c.old 2004-12-12 18:07:46.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/drivers/net/net_init.c 2004-12-12 18:14:20.000000000 +0100 @@ -43,10 +43,6 @@ #include #include #include -#include -#include -#include -#include #include #include #include --- linux-2.6.10-rc2-mm4-full/include/linux/fddidevice.h.old 2004-12-12 18:08:23.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/fddidevice.h 2004-12-12 18:10:20.000000000 +0100 @@ -25,13 +25,6 @@ #include #ifdef __KERNEL__ -extern int fddi_header(struct sk_buff *skb, - struct net_device *dev, - unsigned short type, - void *daddr, - void *saddr, - unsigned len); -extern int fddi_rebuild_header(struct sk_buff *skb); extern unsigned short fddi_type_trans(struct sk_buff *skb, struct net_device *dev); extern struct net_device *alloc_fddidev(int sizeof_priv); --- linux-2.6.10-rc2-mm4-full/net/802/fddi.c.old 2004-12-12 18:08:41.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/802/fddi.c 2004-12-12 18:10:15.000000000 +0100 @@ -52,8 +52,9 @@ * daddr=NULL means leave destination address (eg unresolved arp) */ -int fddi_header(struct sk_buff *skb, struct 
net_device *dev, unsigned short type, - void *daddr, void *saddr, unsigned len) +static int fddi_header(struct sk_buff *skb, struct net_device *dev, + unsigned short type, + void *daddr, void *saddr, unsigned len) { int hl = FDDI_K_SNAP_HLEN; struct fddihdr *fddi; @@ -96,7 +97,7 @@ * this sk_buff. We now let ARP fill in the other fields. */ -int fddi_rebuild_header(struct sk_buff *skb) +static int fddi_rebuild_header(struct sk_buff *skb) { struct fddihdr *fddi = (struct fddihdr *)skb->data; --- linux-2.6.10-rc2-mm4-full/include/linux/hippidevice.h.old 2004-12-12 18:10:40.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/hippidevice.h 2004-12-12 18:11:56.000000000 +0100 @@ -26,30 +26,9 @@ #include #ifdef __KERNEL__ -extern int hippi_header(struct sk_buff *skb, - struct net_device *dev, - unsigned short type, - void *daddr, - void *saddr, - unsigned len); - -extern int hippi_rebuild_header(struct sk_buff *skb); - extern unsigned short hippi_type_trans(struct sk_buff *skb, struct net_device *dev); -extern void hippi_header_cache_bind(struct hh_cache ** hhp, - struct net_device *dev, - unsigned short htype, - __u32 daddr); - -extern void hippi_header_cache_update(struct hh_cache *hh, - struct net_device *dev, - unsigned char * haddr); -extern int hippi_header_parse(struct sk_buff *skb, unsigned char *haddr); - -extern void hippi_net_init(void); - extern struct net_device *alloc_hippi_dev(int sizeof_priv); #endif --- linux-2.6.10-rc2-mm4-full/net/802/hippi.c.old 2004-12-12 18:12:42.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/802/hippi.c 2004-12-12 18:13:32.000000000 +0100 @@ -40,26 +40,15 @@ #include /* - * hippi_net_init() - * - * Do nothing, this is just to pursuade the stupid linker to behave. 
- */ - -void hippi_net_init(void) -{ - return; -} - -/* * Create the HIPPI MAC header for an arbitrary protocol layer * * saddr=NULL means use device source address * daddr=NULL means leave destination address (eg unresolved arp) */ -int hippi_header(struct sk_buff *skb, struct net_device *dev, - unsigned short type, void *daddr, void *saddr, - unsigned len) +static int hippi_header(struct sk_buff *skb, struct net_device *dev, + unsigned short type, void *daddr, void *saddr, + unsigned len) { struct hippi_hdr *hip = (struct hippi_hdr *)skb_push(skb, HIPPI_HLEN); @@ -107,7 +96,7 @@ * completed on this sk_buff. We now let ARP fill in the other fields. */ -int hippi_rebuild_header(struct sk_buff *skb) +static int hippi_rebuild_header(struct sk_buff *skb) { struct hippi_hdr *hip = (struct hippi_hdr *)skb->data; --- linux-2.6.10-rc2-mm4-full/include/linux/trdevice.h.old 2004-12-12 18:14:26.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/trdevice.h 2004-12-12 18:16:00.000000000 +0100 @@ -28,10 +28,6 @@ #include #ifdef __KERNEL__ -extern int tr_header(struct sk_buff *skb, struct net_device *dev, - unsigned short type, void *daddr, - void *saddr, unsigned len); -extern int tr_rebuild_header(struct sk_buff *skb); extern unsigned short tr_type_trans(struct sk_buff *skb, struct net_device *dev); extern void tr_source_route(struct sk_buff *skb, struct trh_hdr *trh, struct net_device *dev); extern struct net_device *alloc_trdev(int sizeof_priv); --- linux-2.6.10-rc2-mm4-full/net/802/tr.c.old 2004-12-12 18:14:51.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/802/tr.c 2004-12-12 18:15:28.000000000 +0100 @@ -98,8 +98,9 @@ * makes this a little more exciting than on ethernet. 
*/ -int tr_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, - void *daddr, void *saddr, unsigned len) +static int tr_header(struct sk_buff *skb, struct net_device *dev, + unsigned short type, + void *daddr, void *saddr, unsigned len) { struct trh_hdr *trh; int hdr_len; @@ -153,7 +154,7 @@ * can now send the packet. */ -int tr_rebuild_header(struct sk_buff *skb) +static int tr_rebuild_header(struct sk_buff *skb) { struct trh_hdr *trh=(struct trh_hdr *)skb->data; struct trllc *trllc=(struct trllc *)(skb->data+sizeof(struct trh_hdr)); --- linux-2.6.10-rc2-mm4-full/net/8021q/vlan.h.old 2004-12-12 18:17:02.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/8021q/vlan.h 2004-12-12 18:17:06.000000000 +0100 @@ -33,7 +33,6 @@ #define VLAN_GRP_HASH_SHIFT 5 #define VLAN_GRP_HASH_SIZE (1 << VLAN_GRP_HASH_SHIFT) #define VLAN_GRP_HASH_MASK (VLAN_GRP_HASH_SIZE - 1) -extern struct hlist_head vlan_group_hash[VLAN_GRP_HASH_SIZE]; /* Find a VLAN device by the MAC address of its Ethernet device, and * it's VLAN ID. The default configuration is to have VLAN's scope --- linux-2.6.10-rc2-mm4-full/net/8021q/vlan.c.old 2004-12-12 18:16:08.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/8021q/vlan.c 2004-12-12 18:17:20.000000000 +0100 @@ -40,7 +40,7 @@ /* Global VLAN variables */ /* Our listing of VLAN group(s) */ -struct hlist_head vlan_group_hash[VLAN_GRP_HASH_SIZE]; +static struct hlist_head vlan_group_hash[VLAN_GRP_HASH_SIZE]; #define vlan_grp_hashfn(IDX) ((((IDX) >> VLAN_GRP_HASH_SHIFT) ^ (IDX)) & VLAN_GRP_HASH_MASK) static char vlan_fullname[] = "802.1Q VLAN Support"; @@ -52,7 +52,7 @@ static int vlan_ioctl_handler(void __user *); static int unregister_vlan_dev(struct net_device *, unsigned short ); -struct notifier_block vlan_notifier_block = { +static struct notifier_block vlan_notifier_block = { .notifier_call = vlan_device_event, }; @@ -61,9 +61,6 @@ /* Determines interface naming scheme. 
*/ unsigned short vlan_name_type = VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD; -/* DO reorder the header by default */ -unsigned short vlan_default_dev_flags = 1; - static struct packet_type vlan_packet_type = { .type = __constant_htons(ETH_P_8021Q), .func = vlan_skb_recv, /* VLAN receive method */ @@ -490,7 +487,7 @@ VLAN_DEV_INFO(new_dev)->vlan_id = VLAN_ID; /* 1 through VLAN_VID_MASK */ VLAN_DEV_INFO(new_dev)->real_dev = real_dev; VLAN_DEV_INFO(new_dev)->dent = NULL; - VLAN_DEV_INFO(new_dev)->flags = vlan_default_dev_flags; + VLAN_DEV_INFO(new_dev)->flags = 1; #ifdef VLAN_DEBUG printk(VLAN_DBG "About to go find the group for idx: %i\n", --- linux-2.6.10-rc2-mm4-full/net/8021q/vlanproc.c.old 2004-12-12 18:17:33.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/8021q/vlanproc.c 2004-12-12 18:17:42.000000000 +0100 @@ -239,7 +239,7 @@ */ /* starting at dev, find a VLAN device */ -struct net_device *vlan_skip(struct net_device *dev) +static struct net_device *vlan_skip(struct net_device *dev) { while (dev && !(dev->priv_flags & IFF_802_1Q_VLAN)) dev = dev->next; From bfields@fieldses.org Sun Dec 12 12:19:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 12:19:27 -0800 (PST) Received: from pickle.fieldses.org (dsl093-002-214.det1.dsl.speakeasy.net [66.93.2.214]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCKJNSP004330 for ; Sun, 12 Dec 2004 12:19:23 -0800 Received: from bfields by pickle.fieldses.org with local (Exim 4.34) id 1CdaBS-000773-Qd; Sun, 12 Dec 2004 15:19:10 -0500 Date: Sun, 12 Dec 2004 15:19:10 -0500 To: Adrian Bunk Cc: Andy Adamson , netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove unused net/sunrpc/auth_gss/gss_pseudoflavors.c Message-ID: <20041212201910.GD25788@fieldses.org> References: <20041212194750.GF22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041212194750.GF22324@stusta.de> User-Agent: Mutt/1.5.6+20040907i From: "J. 
Bruce Fields" X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12686 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bfields@fieldses.org Precedence: bulk X-list: netdev On Sun, Dec 12, 2004 at 08:47:50PM +0100, Adrian Bunk wrote: > I wasn't able to find any usage of this file. Oops, you're right, thanks. --Bruce Fields From bfields@fieldses.org Sun Dec 12 12:21:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 12:21:47 -0800 (PST) Received: from pickle.fieldses.org (dsl093-002-214.det1.dsl.speakeasy.net [66.93.2.214]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBCKLhLc004852 for ; Sun, 12 Dec 2004 12:21:43 -0800 Received: from bfields by pickle.fieldses.org with local (Exim 4.34) id 1CdaDn-00077c-94; Sun, 12 Dec 2004 15:21:35 -0500 Date: Sun, 12 Dec 2004 15:21:35 -0500 To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove unused Message-ID: <20041212202135.GE25788@fieldses.org> References: <20041212195047.GH22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041212195047.GH22324@stusta.de> User-Agent: Mutt/1.5.6+20040907i From: "J. Bruce Fields" X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12687 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bfields@fieldses.org Precedence: bulk X-list: netdev On Sun, Dec 12, 2004 at 08:50:47PM +0100, Adrian Bunk wrote: > I wasn't able to find any usage of this file (it seems the > EXPORT_SYMBOL's were moved away, but deleting the file was forgotten). Yep.--b. 
From bunk@stusta.de Sun Dec 12 13:12:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:12:20 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLC5QL009310 for ; Sun, 12 Dec 2004 13:12:06 -0800 Received: (qmail 2097 invoked from network); 12 Dec 2004 21:11:37 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 21:11:37 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id C08EABB577; Sun, 12 Dec 2004 22:11:28 +0100 (CET) Date: Sun, 12 Dec 2004 22:11:28 +0100 From: Adrian Bunk To: acme@conectiva.com.br Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/appletalk/: make some code static Message-ID: <20041212211128.GW22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12688 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. 
diffstat output: include/linux/atalk.h | 2 -- net/appletalk/aarp.c | 4 ++-- net/appletalk/atalk_proc.c | 6 +++--- net/appletalk/ddp.c | 6 +++--- 4 files changed, 8 insertions(+), 10 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/include/linux/atalk.h.old 2004-12-12 18:21:46.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/atalk.h 2004-12-12 18:21:53.000000000 +0100 @@ -188,8 +188,6 @@ extern int aarp_send_ddp(struct net_device *dev, struct sk_buff *skb, struct atalk_addr *sa, void *hwaddr); -extern void aarp_send_probe(struct net_device *dev, - struct atalk_addr *addr); extern void aarp_device_down(struct net_device *dev); extern void aarp_probe_network(struct atalk_iface *atif); extern int aarp_proxy_probe_network(struct atalk_iface *atif, --- linux-2.6.10-rc2-mm4-full/net/appletalk/aarp.c.old 2004-12-12 18:22:02.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/appletalk/aarp.c 2004-12-12 18:22:12.000000000 +0100 @@ -199,7 +199,7 @@ * aarp_proxy_probe_network. */ -void aarp_send_probe(struct net_device *dev, struct atalk_addr *us) +static void aarp_send_probe(struct net_device *dev, struct atalk_addr *us) { struct elapaarp *eah; int len = dev->hard_header_len + sizeof(*eah) + aarp_dl->header_length; @@ -429,7 +429,7 @@ * Probe a Phase 1 device or a device that requires its Net:Node to * be set via an ioctl. 
*/ -void aarp_send_probe_phase1(struct atalk_iface *iface) +static void aarp_send_probe_phase1(struct atalk_iface *iface) { struct ifreq atreq; struct sockaddr_at *sa = (struct sockaddr_at *)&atreq.ifr_addr; --- linux-2.6.10-rc2-mm4-full/net/appletalk/atalk_proc.c.old 2004-12-12 18:22:28.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/appletalk/atalk_proc.c 2004-12-12 18:23:02.000000000 +0100 @@ -205,21 +205,21 @@ return 0; } -struct seq_operations atalk_seq_interface_ops = { +static struct seq_operations atalk_seq_interface_ops = { .start = atalk_seq_interface_start, .next = atalk_seq_interface_next, .stop = atalk_seq_interface_stop, .show = atalk_seq_interface_show, }; -struct seq_operations atalk_seq_route_ops = { +static struct seq_operations atalk_seq_route_ops = { .start = atalk_seq_route_start, .next = atalk_seq_route_next, .stop = atalk_seq_route_stop, .show = atalk_seq_route_show, }; -struct seq_operations atalk_seq_socket_ops = { +static struct seq_operations atalk_seq_socket_ops = { .start = atalk_seq_socket_start, .next = atalk_seq_socket_next, .stop = atalk_seq_socket_stop, --- linux-2.6.10-rc2-mm4-full/net/appletalk/ddp.c.old 2004-12-12 18:23:35.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/appletalk/ddp.c 2004-12-12 18:24:11.000000000 +0100 @@ -612,7 +612,7 @@ * Called when a device is downed. Just throw away any routes * via it. 
*/ -void atrtr_device_down(struct net_device *dev) +static void atrtr_device_down(struct net_device *dev) { struct atalk_route **r = &atalk_routes; struct atalk_route *tmp; @@ -1854,12 +1854,12 @@ .notifier_call = ddp_device_event, }; -struct packet_type ltalk_packet_type = { +static struct packet_type ltalk_packet_type = { .type = __constant_htons(ETH_P_LOCALTALK), .func = ltalk_rcv, }; -struct packet_type ppptalk_packet_type = { +static struct packet_type ppptalk_packet_type = { .type = __constant_htons(ETH_P_PPPTALK), .func = atalk_rcv, }; From bunk@stusta.de Sun Dec 12 13:14:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:14:21 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLEFpc009679 for ; Sun, 12 Dec 2004 13:14:16 -0800 Received: (qmail 2292 invoked from network); 12 Dec 2004 21:13:48 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 21:13:48 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 5A2B1BB577; Sun, 12 Dec 2004 22:13:39 +0100 (CET) Date: Sun, 12 Dec 2004 22:13:39 +0100 From: Adrian Bunk To: ralf@linux-mips.org Cc: linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] /net/ax25/: some cleanups Message-ID: <20041212211339.GX22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12689 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following cleanups: - make two needlessly global functions static - net/ax25/ax25_addr.c: remove the unused 
global function ax25digicmp diffstat output: include/net/ax25.h | 3 --- net/ax25/af_ax25.c | 2 +- net/ax25/ax25_addr.c | 20 -------------------- net/ax25/ax25_ds_subr.c | 2 +- 4 files changed, 2 insertions(+), 25 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/include/net/ax25.h.old 2004-12-12 18:56:04.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/net/ax25.h 2004-12-12 19:00:07.000000000 +0100 @@ -231,7 +231,6 @@ extern void ax25_destroy_socket(ax25_cb *); extern ax25_cb *ax25_create_cb(void); extern void ax25_fillin_cb(ax25_cb *, ax25_dev *); -extern int ax25_create(struct socket *, int); extern struct sock *ax25_make_new(struct sock *, struct ax25_dev *); /* ax25_addr.c */ @@ -239,7 +238,6 @@ extern char *ax2asc(ax25_address *); extern ax25_address *asc2ax(char *); extern int ax25cmp(ax25_address *, ax25_address *); -extern int ax25digicmp(ax25_digi *, ax25_digi *); extern unsigned char *ax25_addr_parse(unsigned char *, int, ax25_address *, ax25_address *, ax25_digi *, int *, int *); extern int ax25_addr_build(unsigned char *, ax25_address *, ax25_address *, ax25_digi *, int, int); extern int ax25_addr_size(ax25_digi *); @@ -268,7 +266,6 @@ extern void ax25_ds_nr_error_recovery(ax25_cb *); extern void ax25_ds_enquiry_response(ax25_cb *); extern void ax25_ds_establish_data_link(ax25_cb *); -extern void ax25_dev_dama_on(ax25_dev *); extern void ax25_dev_dama_off(ax25_dev *); extern void ax25_dama_on(ax25_cb *); extern void ax25_dama_off(ax25_cb *); --- linux-2.6.10-rc2-mm4-full/net/ax25/af_ax25.c.old 2004-12-12 18:55:30.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/ax25/af_ax25.c 2004-12-12 18:55:41.000000000 +0100 @@ -755,7 +755,7 @@ return res; } -int ax25_create(struct socket *sock, int protocol) +static int ax25_create(struct socket *sock, int protocol) { struct sock *sk; ax25_cb *ax25; --- linux-2.6.10-rc2-mm4-full/net/ax25/ax25_addr.c.old 2004-12-12 18:56:23.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/ax25/ax25_addr.c 
2004-12-12 18:56:32.000000000 +0100 @@ -121,26 +121,6 @@ } /* - * Compare two AX.25 digipeater paths. - */ -int ax25digicmp(ax25_digi *digi1, ax25_digi *digi2) -{ - int i; - - if (digi1->ndigi != digi2->ndigi) - return 1; - - if (digi1->lastrepeat != digi2->lastrepeat) - return 1; - - for (i = 0; i < digi1->ndigi; i++) - if (ax25cmp(&digi1->calls[i], &digi2->calls[i]) != 0) - return 1; - - return 0; -} - -/* * Given an AX.25 address pull of to, from, digi list, command/response and the start of data * */ --- linux-2.6.10-rc2-mm4-full/net/ax25/ax25_ds_subr.c.old 2004-12-12 18:57:03.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/ax25/ax25_ds_subr.c 2004-12-12 18:57:45.000000000 +0100 @@ -174,7 +174,7 @@ return res; } -void ax25_dev_dama_on(ax25_dev *ax25_dev) +static void ax25_dev_dama_on(ax25_dev *ax25_dev) { if (ax25_dev == NULL) return; From bunk@stusta.de Sun Dec 12 13:15:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:15:48 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLFgNK010177 for ; Sun, 12 Dec 2004 13:15:43 -0800 Received: (qmail 2345 invoked from network); 12 Dec 2004 21:15:15 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 21:15:15 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 8974FBB577; Sun, 12 Dec 2004 22:15:06 +0100 (CET) Date: Sun, 12 Dec 2004 22:15:06 +0100 From: Adrian Bunk To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/socket.c: make a function static Message-ID: <20041212211506.GY22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12690 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes a needlessly global function static. diffstat output: include/linux/net.h | 4 ---- net/socket.c | 5 +++-- 2 files changed, 3 insertions(+), 6 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/include/linux/net.h.old 2004-12-12 19:03:55.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/net.h 2004-12-12 19:04:01.000000000 +0100 @@ -187,10 +187,6 @@ size_t len); extern int sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags); -extern int sock_readv_writev(int type, struct inode *inode, - struct file *file, - const struct iovec *iov, long count, - size_t size); extern int sock_map_fd(struct socket *sock); extern struct socket *sockfd_lookup(int fd, int *err); #define sockfd_put(sock) fput(sock->file) --- linux-2.6.10-rc2-mm4-full/net/socket.c.old 2004-12-12 19:04:08.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/socket.c 2004-12-12 19:04:46.000000000 +0100 @@ -736,8 +736,9 @@ return sock->ops->sendpage(sock, page, offset, size, flags); } -int sock_readv_writev(int type, struct inode * inode, struct file * file, - const struct iovec * iov, long count, size_t size) +static int sock_readv_writev(int type, struct inode * inode, + struct file * file, const struct iovec * iov, + long count, size_t size) { struct msghdr msg; struct socket *sock; From bunk@stusta.de Sun Dec 12 13:19:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:19:51 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLJiMn010869 for ; Sun, 12 Dec 2004 13:19:44 -0800 Received: (qmail 2529 invoked from network); 12 Dec 2004 21:19:16 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 12 Dec 2004 21:19:16 -0000 Received: by r063144.stusta.swh.mhn.de 
(Postfix, from userid 1000) id CC967BB577; Sun, 12 Dec 2004 22:19:07 +0100 (CET) Date: Sun, 12 Dec 2004 22:19:07 +0100 From: Adrian Bunk To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/sunrpc/: some cleanups Message-ID: <20041212211907.GZ22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12691 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following cleanups: - make some needlessly global code static - remove the following unused global functions: - net/sunrpc/auth_gss/gss_generic_token.c: g_get_mech_oid - net/sunrpc/cache.c: cache_find - net/sunrpc/cache.c: cache_drop - net/sunrpc/xdr.c: xdr_decode_netobj_fixed - net/sunrpc/xdr.c: xdr_shift_iovec - net/sunrpc/xdr.c: xdr_kmap - net/sunrpc/xdr.c: xdr_kunmap - remove the following unused global structs: - net/sunrpc/auth_gss/gss_krb5_mech.c: gss_mech_krb5_oid - net/sunrpc/auth_gss/gss_spkm3_mech.c: gss_mech_spkm3_oid - net/sunrpc/rpc_pipe.c: rpc_pipe_iops - remove the EXPORT_SYMBOL(cache_clean) Please review this patch. 
diffstat output: include/linux/sunrpc/auth.h | 2 include/linux/sunrpc/cache.h | 5 include/linux/sunrpc/gss_asn1.h | 2 include/linux/sunrpc/sched.h | 1 include/linux/sunrpc/xdr.h | 6 - include/linux/sunrpc/xprt.h | 3 net/sunrpc/auth.c | 2 net/sunrpc/auth_gss/auth_gss.c | 2 net/sunrpc/auth_gss/gss_generic_token.c | 35 ------ net/sunrpc/auth_gss/gss_krb5_crypto.c | 2 net/sunrpc/auth_gss/gss_krb5_mech.c | 3 net/sunrpc/auth_gss/gss_spkm3_mech.c | 9 - net/sunrpc/auth_gss/svcauth_gss.c | 4 net/sunrpc/cache.c | 41 +------ net/sunrpc/pmap_clnt.c | 4 net/sunrpc/rpc_pipe.c | 9 - net/sunrpc/sched.c | 5 net/sunrpc/sunrpc_syms.c | 1 net/sunrpc/svcauth.c | 3 net/sunrpc/svcauth_unix.c | 6 - net/sunrpc/xdr.c | 133 ------------------------ net/sunrpc/xprt.c | 8 - 22 files changed, 38 insertions(+), 248 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/auth.h.old 2004-12-12 19:05:25.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/auth.h 2004-12-12 19:05:34.000000000 +0100 @@ -114,8 +114,6 @@ extern struct rpc_authops authdes_ops; #endif -u32 pseudoflavor_to_flavor(rpc_authflavor_t); - int rpcauth_register(struct rpc_authops *); int rpcauth_unregister(struct rpc_authops *); struct rpc_auth * rpcauth_create(rpc_authflavor_t, struct rpc_clnt *); --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth.c.old 2004-12-12 19:05:41.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth.c 2004-12-12 19:05:55.000000000 +0100 @@ -25,7 +25,7 @@ NULL, /* others can be loadable modules */ }; -u32 +static u32 pseudoflavor_to_flavor(u32 flavor) { if (flavor >= RPC_AUTH_MAXFLAVOR) return RPC_AUTH_GSS; --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/auth_gss.c.old 2004-12-12 19:06:10.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/auth_gss.c 2004-12-12 19:06:17.000000000 +0100 @@ -532,7 +532,7 @@ spin_unlock(&gss_auth->lock); } -void +static void gss_pipe_destroy_msg(struct rpc_pipe_msg *msg) { struct gss_upcall_msg *gss_msg = 
container_of(msg, struct gss_upcall_msg, msg); --- linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/gss_asn1.h.old 2004-12-12 19:06:37.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/gss_asn1.h 2004-12-12 19:06:43.000000000 +0100 @@ -71,8 +71,6 @@ unsigned char **buf_in, int toksize); -u32 g_get_mech_oid(struct xdr_netobj *mech, struct xdr_netobj * in_buf); - int g_token_size( struct xdr_netobj *mech, unsigned int body_size); --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_generic_token.c.old 2004-12-12 19:06:50.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_generic_token.c 2004-12-12 19:06:58.000000000 +0100 @@ -233,38 +233,3 @@ EXPORT_SYMBOL(g_verify_token_header); -/* Given a buffer containing a token, returns a copy of the mech oid in - * the parameter mech. */ -u32 -g_get_mech_oid(struct xdr_netobj *mech, struct xdr_netobj * in_buf) -{ - unsigned char *buf = in_buf->data; - int len = in_buf->len; - int ret=0; - int seqsize; - - if ((len-=1) < 0) - return(G_BAD_TOK_HEADER); - if (*buf++ != 0x60) - return(G_BAD_TOK_HEADER); - - if ((seqsize = der_read_length(&buf, &len)) < 0) - return(G_BAD_TOK_HEADER); - - if ((len-=1) < 0) - return(G_BAD_TOK_HEADER); - if (*buf++ != 0x06) - return(G_BAD_TOK_HEADER); - - if ((len-=1) < 0) - return(G_BAD_TOK_HEADER); - mech->len = *buf++; - - if ((len-=mech->len) < 0) - return(G_BAD_TOK_HEADER); - if (!(mech->data = kmalloc(mech->len, GFP_KERNEL))) - return(G_BUFFER_ALLOC); - memcpy(mech->data, buf, mech->len); - - return ret; -} --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_krb5_crypto.c.old 2004-12-12 19:07:19.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_krb5_crypto.c 2004-12-12 19:07:44.000000000 +0100 @@ -132,7 +132,7 @@ EXPORT_SYMBOL(krb5_decrypt); -void +static void buf_to_sg(struct scatterlist *sg, char *ptr, int len) { sg->page = virt_to_page(ptr); sg->offset = offset_in_page(ptr); --- 
linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_krb5_mech.c.old 2004-12-12 19:07:57.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_krb5_mech.c 2004-12-12 19:08:05.000000000 +0100 @@ -48,9 +48,6 @@ # define RPCDBG_FACILITY RPCDBG_AUTH #endif -struct xdr_netobj gss_mech_krb5_oid = - {9, "\052\206\110\206\367\022\001\002\002"}; - static inline int get_bytes(char **ptr, const char *end, void *res, int len) { --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_spkm3_mech.c.old 2004-12-12 19:11:01.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/gss_spkm3_mech.c 2004-12-12 19:11:51.000000000 +0100 @@ -49,9 +49,6 @@ # define RPCDBG_FACILITY RPCDBG_AUTH #endif -struct xdr_netobj gss_mech_spkm3_oid = - {7, "\053\006\001\005\005\001\003"}; - static inline int get_bytes(char **ptr, const char *end, void *res, int len) { @@ -206,7 +203,7 @@ return GSS_S_FAILURE; } -void +static void gss_delete_sec_context_spkm3(void *internal_ctx) { struct spkm3_ctx *sctx = internal_ctx; @@ -221,7 +218,7 @@ kfree(sctx); } -u32 +static u32 gss_verify_mic_spkm3(struct gss_ctx *ctx, struct xdr_buf *signbuf, struct xdr_netobj *checksum, @@ -241,7 +238,7 @@ return maj_stat; } -u32 +static u32 gss_get_mic_spkm3(struct gss_ctx *ctx, u32 qop, struct xdr_buf *message_buffer, --- linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/svcauth_gss.c.old 2004-12-12 19:12:04.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/auth_gss/svcauth_gss.c 2004-12-12 19:12:24.000000000 +0100 @@ -447,7 +447,7 @@ static DefineSimpleCacheLookup(rsc, 0); -struct rsc * +static struct rsc * gss_svc_searchbyctx(struct xdr_netobj *handle) { struct rsc rsci; @@ -1044,7 +1044,7 @@ kfree(gd); } -struct auth_ops svcauthops_gss = { +static struct auth_ops svcauthops_gss = { .name = "rpcsec_gss", .owner = THIS_MODULE, .flavour = RPC_AUTH_GSS, --- linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/cache.h.old 2004-12-12 19:15:13.000000000 +0100 +++ 
linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/cache.h 2004-12-12 19:57:27.000000000 +0100 @@ -257,8 +257,6 @@ -extern void cache_defer_req(struct cache_req *req, struct cache_head *item); -extern void cache_revisit_request(struct cache_head *item); extern void cache_clean_deferred(void *owner); static inline struct cache_head *cache_get(struct cache_head *h) @@ -286,14 +284,11 @@ struct cache_head *head, time_t expiry); extern int cache_check(struct cache_detail *detail, struct cache_head *h, struct cache_req *rqstp); -extern int cache_clean(void); extern void cache_flush(void); extern void cache_purge(struct cache_detail *detail); #define NEVER (0x7FFFFFFF) extern void cache_register(struct cache_detail *cd); extern int cache_unregister(struct cache_detail *cd); -extern struct cache_detail *cache_find(char *name); -extern void cache_drop(struct cache_detail *detail); extern void qword_add(char **bpp, int *lp, char *str); extern void qword_addhex(char **bpp, int *lp, char *buf, int blen); --- linux-2.6.10-rc2-mm4-full/net/sunrpc/cache.c.old 2004-12-12 19:14:03.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/cache.c 2004-12-12 19:17:34.000000000 +0100 @@ -33,6 +33,9 @@ #define RPCDBG_FACILITY RPCDBG_CACHE +static void cache_defer_req(struct cache_req *req, struct cache_head *item); +static void cache_revisit_request(struct cache_head *item); + void cache_init(struct cache_head *h) { time_t now = get_seconds(); @@ -256,39 +259,13 @@ return 0; } -struct cache_detail *cache_find(char *name) -{ - struct list_head *l; - - spin_lock(&cache_list_lock); - list_for_each(l, &cache_list) { - struct cache_detail *cd = list_entry(l, struct cache_detail, others); - - if (strcmp(cd->name, name)==0) { - atomic_inc(&cd->inuse); - spin_unlock(&cache_list_lock); - return cd; - } - } - spin_unlock(&cache_list_lock); - return NULL; -} - -/* cache_drop must be called on any cache returned by - * cache_find, after it has been used - */ -void cache_drop(struct cache_detail 
*detail) -{ - atomic_dec(&detail->inuse); -} - /* clean cache tries to find something to clean * and cleans it. * It returns 1 if it cleaned something, * 0 if it didn't find anything this time * -1 if it fell off the end of the list. */ -int cache_clean(void) +static int cache_clean(void) { int rv = 0; struct list_head *next; @@ -428,12 +405,12 @@ #define DFR_MAX 300 /* ??? */ -spinlock_t cache_defer_lock = SPIN_LOCK_UNLOCKED; +static spinlock_t cache_defer_lock = SPIN_LOCK_UNLOCKED; static LIST_HEAD(cache_defer_list); static struct list_head cache_defer_hash[DFR_HASHSIZE]; static int cache_defer_cnt; -void cache_defer_req(struct cache_req *req, struct cache_head *item) +static void cache_defer_req(struct cache_req *req, struct cache_head *item) { struct cache_deferred_req *dreq; int hash = DFR_HASH(item); @@ -483,7 +460,7 @@ } } -void cache_revisit_request(struct cache_head *item) +static void cache_revisit_request(struct cache_head *item) { struct cache_deferred_req *dreq; struct list_head pending; @@ -902,7 +879,7 @@ *lp = len; } -void warn_no_listener(struct cache_detail *detail) +static void warn_no_listener(struct cache_detail *detail) { if (detail->last_warn != detail->last_close) { detail->last_warn = detail->last_close; @@ -1119,7 +1096,7 @@ return cd->cache_show(m, cd, cp); } -struct seq_operations cache_content_op = { +static struct seq_operations cache_content_op = { .start = c_start, .next = c_next, .stop = c_stop, --- linux-2.6.10-rc2-mm4-full/net/sunrpc/pmap_clnt.c.old 2004-12-12 19:17:51.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/pmap_clnt.c 2004-12-12 19:18:10.000000000 +0100 @@ -31,7 +31,7 @@ static struct rpc_procinfo pmap_procedures[]; static struct rpc_clnt * pmap_create(char *, struct sockaddr_in *, int); static void pmap_getport_done(struct rpc_task *); -extern struct rpc_program pmap_program; +static struct rpc_program pmap_program; static spinlock_t pmap_lock = SPIN_LOCK_UNLOCKED; /* @@ -292,7 +292,7 @@ static struct rpc_stat 
pmap_stats; -struct rpc_program pmap_program = { +static struct rpc_program pmap_program = { .name = "portmap", .number = RPC_PMAP_PROGRAM, .nrvers = ARRAY_SIZE(pmap_version), --- linux-2.6.10-rc2-mm4-full/net/sunrpc/rpc_pipe.c.old 2004-12-12 19:18:25.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/rpc_pipe.c 2004-12-12 19:19:08.000000000 +0100 @@ -276,12 +276,7 @@ } } -struct inode_operations rpc_pipe_iops = { - .lookup = simple_lookup, -}; - - -struct file_operations rpc_pipe_fops = { +static struct file_operations rpc_pipe_fops = { .owner = THIS_MODULE, .llseek = no_llseek, .read = rpc_pipe_read, @@ -595,7 +590,7 @@ return 0; } -struct dentry * +static struct dentry * rpc_lookup_negative(char *path, struct nameidata *nd) { struct dentry *dentry; --- linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/sched.h.old 2004-12-12 19:19:29.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/sched.h 2004-12-12 19:19:35.000000000 +0100 @@ -251,7 +251,6 @@ void rpc_wake_up_status(struct rpc_wait_queue *, int); void rpc_delay(struct rpc_task *, unsigned long); void * rpc_malloc(struct rpc_task *, size_t); -void rpc_free(struct rpc_task *); int rpciod_up(void); void rpciod_down(void); void rpciod_wake_up(void); --- linux-2.6.10-rc2-mm4-full/net/sunrpc/sched.c.old 2004-12-12 19:19:42.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/sched.c 2004-12-12 19:20:13.000000000 +0100 @@ -43,6 +43,9 @@ static void rpciod_killall(void); static void rpc_async_schedule(void *); +static void rpc_free(struct rpc_task *task); + + /* * RPC tasks that create another task (e.g. 
for contacting the portmapper) * will wait on this queue for their child's completion @@ -719,7 +722,7 @@ return task->tk_buffer; } -void +static void rpc_free(struct rpc_task *task) { if (task->tk_buffer) { --- linux-2.6.10-rc2-mm4-full/net/sunrpc/svcauth.c.old 2004-12-12 19:20:28.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/svcauth.c 2004-12-12 19:20:39.000000000 +0100 @@ -128,7 +128,8 @@ #define DN_HASHMASK (DN_HASHMAX-1) static struct cache_head *auth_domain_table[DN_HASHMAX]; -void auth_domain_drop(struct cache_head *item, struct cache_detail *cd) + +static void auth_domain_drop(struct cache_head *item, struct cache_detail *cd) { struct auth_domain *dom = container_of(item, struct auth_domain, h); if (cache_put(item,cd)) --- linux-2.6.10-rc2-mm4-full/net/sunrpc/svcauth_unix.c.old 2004-12-12 19:20:54.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/svcauth_unix.c 2004-12-12 19:21:30.000000000 +0100 @@ -97,7 +97,7 @@ }; static struct cache_head *ip_table[IP_HASHMAX]; -void ip_map_put(struct cache_head *item, struct cache_detail *cd) +static void ip_map_put(struct cache_head *item, struct cache_detail *cd) { struct ip_map *im = container_of(item, struct ip_map,h); if (cache_put(item, cd)) { @@ -432,7 +432,7 @@ }; -int +static int svcauth_unix_accept(struct svc_rqst *rqstp, u32 *authp) { struct kvec *argv = &rqstp->rq_arg.head[0]; @@ -487,7 +487,7 @@ return SVC_DENIED; } -int +static int svcauth_unix_release(struct svc_rqst *rqstp) { /* Verifier (such as it is) is already in place. 
--- linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/xdr.h.old 2004-12-12 19:22:17.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/xdr.h 2004-12-12 19:26:11.000000000 +0100 @@ -95,7 +95,6 @@ u32 * xdr_decode_string_inplace(u32 *p, char **sp, int *lenp, int maxlen); u32 * xdr_encode_netobj(u32 *p, const struct xdr_netobj *); u32 * xdr_decode_netobj(u32 *p, struct xdr_netobj *); -u32 * xdr_decode_netobj_fixed(u32 *p, void *obj, unsigned int len); void xdr_encode_pages(struct xdr_buf *, struct page **, unsigned int, unsigned int); @@ -135,8 +134,6 @@ return iov->iov_len = ((u8 *) p - (u8 *) iov->iov_base); } -void xdr_shift_iovec(struct kvec *, int, size_t); - /* * Maximum number of iov's we use. */ @@ -145,10 +142,7 @@ /* * XDR buffer helper functions */ -extern int xdr_kmap(struct kvec *, struct xdr_buf *, size_t); -extern void xdr_kunmap(struct xdr_buf *, size_t); extern void xdr_shift_buf(struct xdr_buf *, size_t); -extern void _copy_from_pages(char *, struct page **, size_t, size_t); extern void xdr_buf_from_iov(struct kvec *, struct xdr_buf *); extern int xdr_buf_subsegment(struct xdr_buf *, struct xdr_buf *, int, int); extern int xdr_buf_read_netobj(struct xdr_buf *, struct xdr_netobj *, int); --- linux-2.6.10-rc2-mm4-full/net/sunrpc/xdr.c.old 2004-12-12 19:21:55.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/xdr.c 2004-12-12 19:26:44.000000000 +0100 @@ -33,15 +33,6 @@ } u32 * -xdr_decode_netobj_fixed(u32 *p, void *obj, unsigned int len) -{ - if (ntohl(*p++) != len) - return NULL; - memcpy(obj, p, len); - return p + XDR_QUADLEN(len); -} - -u32 * xdr_decode_netobj(u32 *p, struct xdr_netobj *obj) { unsigned int len; @@ -185,124 +176,6 @@ xdr->buflen += len; } -/* - * Realign the kvec if the server missed out some reply elements - * (such as post-op attributes,...) - * Note: This is a simple implementation that assumes that - * len <= iov->iov_len !!! - * The RPC header (assumed to be the 1st element in the iov array) - * is not shifted. 
- */ -void xdr_shift_iovec(struct kvec *iov, int nr, size_t len) -{ - struct kvec *pvec; - - for (pvec = iov + nr - 1; nr > 1; nr--, pvec--) { - struct kvec *svec = pvec - 1; - - if (len > pvec->iov_len) { - printk(KERN_DEBUG "RPC: Urk! Large shift of short iovec.\n"); - return; - } - memmove((char *)pvec->iov_base + len, pvec->iov_base, - pvec->iov_len - len); - - if (len > svec->iov_len) { - printk(KERN_DEBUG "RPC: Urk! Large shift of short iovec.\n"); - return; - } - memcpy(pvec->iov_base, - (char *)svec->iov_base + svec->iov_len - len, len); - } -} - -/* - * Map a struct xdr_buf into an kvec array. - */ -int xdr_kmap(struct kvec *iov_base, struct xdr_buf *xdr, size_t base) -{ - struct kvec *iov = iov_base; - struct page **ppage = xdr->pages; - unsigned int len, pglen = xdr->page_len; - - len = xdr->head[0].iov_len; - if (base < len) { - iov->iov_len = len - base; - iov->iov_base = (char *)xdr->head[0].iov_base + base; - iov++; - base = 0; - } else - base -= len; - - if (pglen == 0) - goto map_tail; - if (base >= pglen) { - base -= pglen; - goto map_tail; - } - if (base || xdr->page_base) { - pglen -= base; - base += xdr->page_base; - ppage += base >> PAGE_CACHE_SHIFT; - base &= ~PAGE_CACHE_MASK; - } - do { - len = PAGE_CACHE_SIZE; - iov->iov_base = kmap(*ppage); - if (base) { - iov->iov_base += base; - len -= base; - base = 0; - } - if (pglen < len) - len = pglen; - iov->iov_len = len; - iov++; - ppage++; - } while ((pglen -= len) != 0); -map_tail: - if (xdr->tail[0].iov_len) { - iov->iov_len = xdr->tail[0].iov_len - base; - iov->iov_base = (char *)xdr->tail[0].iov_base + base; - iov++; - } - return (iov - iov_base); -} - -void xdr_kunmap(struct xdr_buf *xdr, size_t base) -{ - struct page **ppage = xdr->pages; - unsigned int pglen = xdr->page_len; - - if (!pglen) - return; - if (base > xdr->head[0].iov_len) - base -= xdr->head[0].iov_len; - else - base = 0; - - if (base >= pglen) - return; - if (base || xdr->page_base) { - pglen -= base; - base += 
xdr->page_base; - ppage += base >> PAGE_CACHE_SHIFT; - /* Note: The offset means that the length of the first - * page is really (PAGE_CACHE_SIZE - (base & ~PAGE_CACHE_MASK)). - * In order to avoid an extra test inside the loop, - * we bump pglen here, and just subtract PAGE_CACHE_SIZE... */ - pglen += base & ~PAGE_CACHE_MASK; - } - for (;;) { - flush_dcache_page(*ppage); - kunmap(*ppage); - if (pglen <= PAGE_CACHE_SIZE) - break; - pglen -= PAGE_CACHE_SIZE; - ppage++; - } -} - void xdr_partial_copy_from_skb(struct xdr_buf *xdr, unsigned int base, skb_reader_t *desc, @@ -572,7 +445,7 @@ * Copies data into an arbitrary memory location from an array of pages * The copy is assumed to be non-overlapping. */ -void +static void _copy_from_pages(char *p, struct page **pages, size_t pgbase, size_t len) { struct page **pgfrom; @@ -610,7 +483,7 @@ * 'len' bytes. The extra data is not lost, but is instead * moved into the inlined pages and/or the tail. */ -void +static void xdr_shrink_bufhead(struct xdr_buf *buf, size_t len) { struct kvec *head, *tail; @@ -683,7 +556,7 @@ * 'len' bytes. The extra data is not lost, but is instead * moved into the tail. 
*/ -void +static void xdr_shrink_pagelen(struct xdr_buf *buf, size_t len) { struct kvec *tail; --- linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/xprt.h.old 2004-12-12 19:27:24.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/linux/sunrpc/xprt.h 2004-12-12 19:29:18.000000000 +0100 @@ -201,8 +201,6 @@ struct rpc_xprt * xprt_create_proto(int proto, struct sockaddr_in *addr, struct rpc_timeout *toparms); int xprt_destroy(struct rpc_xprt *); -void xprt_shutdown(struct rpc_xprt *); -void xprt_default_timeout(struct rpc_timeout *, int); void xprt_set_timeout(struct rpc_timeout *, unsigned int, unsigned long); @@ -213,7 +211,6 @@ int xprt_adjust_timeout(struct rpc_rqst *req); void xprt_release(struct rpc_task *); void xprt_connect(struct rpc_task *); -int xprt_clear_backlog(struct rpc_xprt *); void xprt_sock_setbufsize(struct rpc_xprt *); #define XPRT_LOCKED 0 --- linux-2.6.10-rc2-mm4-full/net/sunrpc/xprt.c.old 2004-12-12 19:27:37.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/xprt.c 2004-12-12 19:29:23.000000000 +0100 @@ -90,6 +90,8 @@ static void xprt_bind_socket(struct rpc_xprt *, struct socket *); static int __xprt_get_cong(struct rpc_xprt *, struct rpc_task *); +static int xprt_clear_backlog(struct rpc_xprt *xprt); + #ifdef RPC_DEBUG_DATA /* * Print the buffer contents (first 128 bytes only--just enough for @@ -1397,7 +1399,7 @@ /* * Set default timeout parameters */ -void +static void xprt_default_timeout(struct rpc_timeout *to, int proto) { if (proto == IPPROTO_UDP) @@ -1633,7 +1635,7 @@ /* * Prepare for transport shutdown. 
*/ -void +static void xprt_shutdown(struct rpc_xprt *xprt) { xprt->shutdown = 1; @@ -1648,7 +1650,7 @@ /* * Clear the xprt backlog queue */ -int +static int xprt_clear_backlog(struct rpc_xprt *xprt) { rpc_wake_up_next(&xprt->backlog); wake_up(&xprt->cong_wait); --- linux-2.6.10-rc2-mm4-full/net/sunrpc/sunrpc_syms.c.old 2004-12-12 19:23:19.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/sunrpc/sunrpc_syms.c 2004-12-12 20:19:39.000000000 +0100 @@ -107,7 +107,6 @@ EXPORT_SYMBOL(auth_unix_forget_old); EXPORT_SYMBOL(auth_unix_lookup); EXPORT_SYMBOL(cache_check); -EXPORT_SYMBOL(cache_clean); EXPORT_SYMBOL(cache_flush); EXPORT_SYMBOL(cache_purge); EXPORT_SYMBOL(cache_fresh); From bunk@stusta.de Sun Dec 12 13:20:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:21:03 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLKw5m011206 for ; Sun, 12 Dec 2004 13:20:59 -0800 Received: (qmail 2570 invoked from network); 12 Dec 2004 21:20:31 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 12 Dec 2004 21:20:31 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 12802BB577; Sun, 12 Dec 2004 22:20:22 +0100 (CET) Date: Sun, 12 Dec 2004 22:20:22 +0100 From: Adrian Bunk To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/unix/: make some code static Message-ID: <20041212212022.GA22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12692 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. 
diffstat output:
 net/unix/af_unix.c         |    2 +-
 net/unix/sysctl_net_unix.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Signed-off-by: Adrian Bunk

--- linux-2.6.10-rc2-mm4-full/net/unix/af_unix.c.old	2004-12-12 19:36:21.000000000 +0100
+++ linux-2.6.10-rc2-mm4-full/net/unix/af_unix.c	2004-12-12 19:36:32.000000000 +0100
@@ -121,7 +121,7 @@
 int sysctl_unix_max_dgram_qlen = 10;
-kmem_cache_t *unix_sk_cachep;
+static kmem_cache_t *unix_sk_cachep;
 struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1];
 rwlock_t unix_table_lock = RW_LOCK_UNLOCKED;
--- linux-2.6.10-rc2-mm4-full/net/unix/sysctl_net_unix.c.old	2004-12-12 19:37:08.000000000 +0100
+++ linux-2.6.10-rc2-mm4-full/net/unix/sysctl_net_unix.c	2004-12-12 19:37:17.000000000 +0100
@@ -14,7 +14,7 @@
 extern int sysctl_unix_max_dgram_qlen;
-ctl_table unix_table[] = {
+static ctl_table unix_table[] = {
 	{
 		.ctl_name = NET_UNIX_MAX_DGRAM_QLEN,
 		.procname = "max_dgram_qlen",

From bunk@stusta.de Sun Dec 12 13:23:56 2004
Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:24:01 -0800 (PST)
Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLNsj2011929 for ; Sun, 12 Dec 2004 13:23:55 -0800
Received: (qmail 2786 invoked from network); 12 Dec 2004 21:23:27 -0000
Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 12 Dec 2004 21:23:27 -0000
Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 80FDBBB577; Sun, 12 Dec 2004 22:23:18 +0100 (CET)
Date: Sun, 12 Dec 2004 22:23:18 +0100
From: Adrian Bunk
To: eis@baty.hanse.de
Cc: linux-x25@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: [2.6 patch] net/x25/: some cleanups
Message-ID: <20041212212318.GB22324@stusta.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.6+20040907i
X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version
0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12693 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below includes the following cleanups: - make some needlessly global code static - remove the following unused global functions: - net/x25/x25_dev.c: x25_llc_receive_frame - net/x25/x25_link.c: x25_transmit_diagnostic Please review this patch. diffstat output: include/net/x25.h | 5 ----- net/x25/af_x25.c | 8 ++++---- net/x25/x25_dev.c | 23 ----------------------- net/x25/x25_link.c | 33 +++++---------------------------- net/x25/x25_proc.c | 4 ++-- 5 files changed, 11 insertions(+), 62 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/include/net/x25.h.old 2004-12-12 19:39:30.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/net/x25.h 2004-12-12 19:41:08.000000000 +0100 @@ -162,7 +162,6 @@ struct x25_address *); extern int x25_addr_aton(unsigned char *, struct x25_address *, struct x25_address *); -extern unsigned int x25_new_lci(struct x25_neigh *); extern struct sock *x25_find_socket(unsigned int, struct x25_neigh *); extern void x25_destroy_socket(struct sock *); extern int x25_rx_call_request(struct sk_buff *, struct x25_neigh *, unsigned int); @@ -171,7 +170,6 @@ /* x25_dev.c */ extern void x25_send_frame(struct sk_buff *, struct x25_neigh *); extern int x25_lapb_receive_frame(struct sk_buff *, struct net_device *, struct packet_type *); -extern int x25_llc_receive_frame(struct sk_buff *, struct net_device *, struct packet_type *); extern void x25_establish_link(struct x25_neigh *); extern void x25_terminate_link(struct x25_neigh *); @@ -191,9 +189,6 @@ extern void x25_link_device_down(struct net_device *); extern void x25_link_established(struct x25_neigh *); extern void x25_link_terminated(struct x25_neigh *); -extern void x25_transmit_restart_request(struct x25_neigh *); -extern void 
x25_transmit_restart_confirmation(struct x25_neigh *); -extern void x25_transmit_diagnostic(struct x25_neigh *, unsigned char); extern void x25_transmit_clear_request(struct x25_neigh *, unsigned int, unsigned char); extern void x25_transmit_link(struct sk_buff *, struct x25_neigh *); extern int x25_subscr_ioctl(unsigned int, void __user *); --- linux-2.6.10-rc2-mm4-full/net/x25/af_x25.c.old 2004-12-12 19:38:22.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/x25/af_x25.c 2004-12-12 19:39:17.000000000 +0100 @@ -261,7 +261,7 @@ /* * Find a connected X.25 socket given my LCI and neighbour. */ -struct sock *__x25_find_socket(unsigned int lci, struct x25_neigh *nb) +static struct sock *__x25_find_socket(unsigned int lci, struct x25_neigh *nb) { struct sock *s; struct hlist_node *node; @@ -289,7 +289,7 @@ /* * Find a unique LCI for a given device. */ -unsigned int x25_new_lci(struct x25_neigh *nb) +static unsigned int x25_new_lci(struct x25_neigh *nb) { unsigned int lci = 1; struct sock *sk; @@ -1336,7 +1336,7 @@ return rc; } -struct net_proto_family x25_family_ops = { +static struct net_proto_family x25_family_ops = { .family = AF_X25, .create = x25_create, .owner = THIS_MODULE, @@ -1371,7 +1371,7 @@ .func = x25_lapb_receive_frame, }; -struct notifier_block x25_dev_notifier = { +static struct notifier_block x25_dev_notifier = { .notifier_call = x25_device_event, }; --- linux-2.6.10-rc2-mm4-full/net/x25/x25_dev.c.old 2004-12-12 19:39:54.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/x25/x25_dev.c 2004-12-12 19:40:00.000000000 +0100 @@ -123,29 +123,6 @@ return 0; } -int x25_llc_receive_frame(struct sk_buff *skb, struct net_device *dev, - struct packet_type *ptype) -{ - struct x25_neigh *nb; - int rc = 0; - - skb->sk = NULL; - - /* - * Packet received from unrecognised device, throw it away. 
- */ - nb = x25_get_neigh(dev); - if (!nb) { - printk(KERN_DEBUG "X.25: unknown_neighbour - %s\n", dev->name); - kfree_skb(skb); - } else { - rc = x25_receive_data(skb, nb); - x25_neigh_put(nb); - } - - return rc; -} - void x25_establish_link(struct x25_neigh *nb) { struct sk_buff *skb; --- linux-2.6.10-rc2-mm4-full/net/x25/x25_link.c.old 2004-12-12 19:40:20.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/x25/x25_link.c 2004-12-12 19:41:23.000000000 +0100 @@ -35,6 +35,9 @@ static void x25_t20timer_expiry(unsigned long); +static void x25_transmit_restart_confirmation(struct x25_neigh *nb); +static void x25_transmit_restart_request(struct x25_neigh *nb); + /* * Linux set/reset timer routines */ @@ -106,7 +109,7 @@ /* * This routine is called when a Restart Request is needed */ -void x25_transmit_restart_request(struct x25_neigh *nb) +static void x25_transmit_restart_request(struct x25_neigh *nb) { unsigned char *dptr; int len = X25_MAX_L2_LEN + X25_STD_MIN_LEN + 2; @@ -133,7 +136,7 @@ /* * This routine is called when a Restart Confirmation is needed */ -void x25_transmit_restart_confirmation(struct x25_neigh *nb) +static void x25_transmit_restart_confirmation(struct x25_neigh *nb) { unsigned char *dptr; int len = X25_MAX_L2_LEN + X25_STD_MIN_LEN; @@ -156,32 +159,6 @@ } /* - * This routine is called when a Diagnostic is required. - */ -void x25_transmit_diagnostic(struct x25_neigh *nb, unsigned char diag) -{ - unsigned char *dptr; - int len = X25_MAX_L2_LEN + X25_STD_MIN_LEN + 1; - struct sk_buff *skb = alloc_skb(len, GFP_ATOMIC); - - if (!skb) - return; - - skb_reserve(skb, X25_MAX_L2_LEN); - - dptr = skb_put(skb, X25_STD_MIN_LEN + 1); - - *dptr++ = nb->extended ? X25_GFI_EXTSEQ : X25_GFI_STDSEQ; - *dptr++ = 0x00; - *dptr++ = X25_DIAGNOSTIC; - *dptr++ = diag; - - skb->sk = NULL; - - x25_send_frame(skb, nb); -} - -/* * This routine is called when a Clear Request is needed outside of the context * of a connected socket. 
*/ --- linux-2.6.10-rc2-mm4-full/net/x25/x25_proc.c.old 2004-12-12 19:41:36.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/x25/x25_proc.c 2004-12-12 19:41:52.000000000 +0100 @@ -166,14 +166,14 @@ return 0; } -struct seq_operations x25_seq_route_ops = { +static struct seq_operations x25_seq_route_ops = { .start = x25_seq_route_start, .next = x25_seq_route_next, .stop = x25_seq_route_stop, .show = x25_seq_route_show, }; -struct seq_operations x25_seq_socket_ops = { +static struct seq_operations x25_seq_socket_ops = { .start = x25_seq_socket_start, .next = x25_seq_socket_next, .stop = x25_seq_socket_stop, From bunk@stusta.de Sun Dec 12 13:26:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 13:26:04 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBCLPx22012423 for ; Sun, 12 Dec 2004 13:26:00 -0800 Received: (qmail 2927 invoked from network); 12 Dec 2004 21:25:32 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 12 Dec 2004 21:25:32 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 19698BB577; Sun, 12 Dec 2004 22:25:23 +0100 (CET) Date: Sun, 12 Dec 2004 22:25:23 +0100 From: Adrian Bunk To: James Morris Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/xfrm/: some cleanups Message-ID: <20041212212523.GC22324@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12694 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following changes: - make some needlessly global code static - remove the 
EXPORT_SYMBOL_GPL'ed but unused function xfrm_calg_get_byidx diffstat output: include/net/xfrm.h | 5 ----- net/xfrm/xfrm_algo.c | 8 -------- net/xfrm/xfrm_export.c | 1 - net/xfrm/xfrm_policy.c | 8 ++++---- net/xfrm/xfrm_state.c | 7 +++++-- net/xfrm/xfrm_user.c | 4 ++-- 6 files changed, 11 insertions(+), 22 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc2-mm4-full/include/net/xfrm.h.old 2004-12-12 19:42:57.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/include/net/xfrm.h 2004-12-12 19:45:37.000000000 +0100 @@ -843,7 +843,6 @@ } #endif -void xfrm_policy_init(void); struct xfrm_policy *xfrm_policy_alloc(int gfp); extern int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), void *); int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl); @@ -858,12 +857,9 @@ int create, unsigned short family); extern void xfrm_policy_flush(void); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); -extern struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struct flowi *fl); extern int xfrm_flush_bundles(void); extern wait_queue_head_t km_waitq; -extern void km_state_expired(struct xfrm_state *x, int hard); -extern int km_query(struct xfrm_state *x, struct xfrm_tmpl *, struct xfrm_policy *pol); extern int km_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr, u16 sport); extern void km_policy_expired(struct xfrm_policy *pol, int dir, int hard); @@ -875,7 +871,6 @@ extern int xfrm_count_enc_supported(void); extern struct xfrm_algo_desc *xfrm_aalg_get_byidx(unsigned int idx); extern struct xfrm_algo_desc *xfrm_ealg_get_byidx(unsigned int idx); -extern struct xfrm_algo_desc *xfrm_calg_get_byidx(unsigned int idx); extern struct xfrm_algo_desc *xfrm_aalg_get_byid(int alg_id); extern struct xfrm_algo_desc *xfrm_ealg_get_byid(int alg_id); extern struct xfrm_algo_desc *xfrm_calg_get_byid(int alg_id); --- linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_algo.c.old 2004-12-12 19:43:09.000000000 +0100 +++ 
linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_algo.c 2004-12-12 19:43:16.000000000 +0100 @@ -416,14 +416,6 @@ return &ealg_list[idx]; } -struct xfrm_algo_desc *xfrm_calg_get_byidx(unsigned int idx) -{ - if (idx >= calg_entries()) - return NULL; - - return &calg_list[idx]; -} - /* * Probe for the availability of crypto algorithms, and set the available * flag for any algorithms found on the system. This is typically called by --- linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_export.c.old 2004-12-12 19:43:23.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_export.c 2004-12-12 19:43:26.000000000 +0100 @@ -53,7 +53,6 @@ EXPORT_SYMBOL_GPL(xfrm_count_enc_supported); EXPORT_SYMBOL_GPL(xfrm_aalg_get_byidx); EXPORT_SYMBOL_GPL(xfrm_ealg_get_byidx); -EXPORT_SYMBOL_GPL(xfrm_calg_get_byidx); EXPORT_SYMBOL_GPL(xfrm_aalg_get_byid); EXPORT_SYMBOL_GPL(xfrm_ealg_get_byid); EXPORT_SYMBOL_GPL(xfrm_calg_get_byid); --- linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_policy.c.old 2004-12-12 19:43:41.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_policy.c 2004-12-12 19:44:43.000000000 +0100 @@ -33,7 +33,7 @@ static rwlock_t xfrm_policy_afinfo_lock = RW_LOCK_UNLOCKED; static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO]; -kmem_cache_t *xfrm_dst_cache; +static kmem_cache_t *xfrm_dst_cache; static struct work_struct xfrm_policy_gc_work; static struct list_head xfrm_policy_gc_list = @@ -498,7 +498,7 @@ *obj_refp = &pol->refcnt; } -struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struct flowi *fl) +static struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struct flowi *fl) { struct xfrm_policy *pol; @@ -1220,13 +1220,13 @@ return NOTIFY_DONE; } -struct notifier_block xfrm_dev_notifier = { +static struct notifier_block xfrm_dev_notifier = { xfrm_dev_event, NULL, 0 }; -void __init xfrm_policy_init(void) +static void __init xfrm_policy_init(void) { xfrm_dst_cache = kmem_cache_create("xfrm_dst_cache", sizeof(struct xfrm_dst), --- 
linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_state.c.old 2004-12-12 19:45:01.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_state.c 2004-12-12 19:46:03.000000000 +0100 @@ -51,6 +51,9 @@ static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned short family); static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo); +static int km_query(struct xfrm_state *x, struct xfrm_tmpl *t, struct xfrm_policy *pol); +static void km_state_expired(struct xfrm_state *x, int hard); + static void xfrm_state_gc_destroy(struct xfrm_state *x) { if (del_timer(&x->timer)) @@ -746,7 +749,7 @@ static struct list_head xfrm_km_list = LIST_HEAD_INIT(xfrm_km_list); static rwlock_t xfrm_km_lock = RW_LOCK_UNLOCKED; -void km_state_expired(struct xfrm_state *x, int hard) +static void km_state_expired(struct xfrm_state *x, int hard) { struct xfrm_mgr *km; @@ -764,7 +767,7 @@ wake_up(&km_waitq); } -int km_query(struct xfrm_state *x, struct xfrm_tmpl *t, struct xfrm_policy *pol) +static int km_query(struct xfrm_state *x, struct xfrm_tmpl *t, struct xfrm_policy *pol) { int err = -EINVAL; struct xfrm_mgr *km; --- linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_user.c.old 2004-12-12 19:46:18.000000000 +0100 +++ linux-2.6.10-rc2-mm4-full/net/xfrm/xfrm_user.c 2004-12-12 19:46:35.000000000 +0100 @@ -1128,8 +1128,8 @@ /* User gives us xfrm_user_policy_info followed by an array of 0 * or more templates. 
*/ -struct xfrm_policy *xfrm_compile_policy(u16 family, int opt, - u8 *data, int len, int *dir) +static struct xfrm_policy *xfrm_compile_policy(u16 family, int opt, + u8 *data, int len, int *dir) { struct xfrm_userpolicy_info *p = (struct xfrm_userpolicy_info *)data; struct xfrm_user_tmpl *ut = (struct xfrm_user_tmpl *) (p + 1); From davem@davemloft.net Sun Dec 12 17:06:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 17:06:56 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBD16P9k002619 for ; Sun, 12 Dec 2004 17:06:46 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CdeV9-0008VP-00; Sun, 12 Dec 2004 16:55:47 -0800 Date: Sun, 12 Dec 2004 16:55:46 -0800 From: "David S. Miller" To: Robin Holt Cc: holt@sgi.com, akpm@osdl.org, yoshfuji@linux-ipv6.org, hirofumi@parknet.co.jp, torvalds@osdl.org, dipankar@ibm.com, laforge@gnumonks.org, bunk@stusta.de, herbert@apana.org.au, paulmck@ibm.com, netdev@oss.sgi.com, linux-kernel@vger.kernel.org, gnb@sgi.com Subject: Re: [RFC] Limit the size of the IPV4 route hash. 
Message-Id: <20041212165546.1808536e.davem@davemloft.net>
In-Reply-To: <20041210234037.GB25582@lnx-holt.americas.sgi.com>
References: <20041210190025.GA21116@lnx-holt.americas.sgi.com>
 <20041210114829.034e02eb.davem@davemloft.net>
 <20041210210006.GB23222@lnx-holt.americas.sgi.com>
 <20041210130947.1d945422.akpm@osdl.org>
 <20041210232722.GC24468@lnx-holt.americas.sgi.com>
 <20041210153848.5acacd0a.akpm@osdl.org>
 <20041210233700.GA25582@lnx-holt.americas.sgi.com>
 <20041210234037.GB25582@lnx-holt.americas.sgi.com>
X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu)
X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12695
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: davem@davemloft.net
Precedence: bulk
X-list: netdev

On Fri, 10 Dec 2004 17:40:37 -0600
Robin Holt wrote:

> Sorry, I was asleep at the wheel. I failed to even grok your second
> paragraph. I will fall back to agreeing with the printk to let the admin
> know that something is amiss.
>
> Should we possibly modify the output of /proc/net/rt_cache (or whatever
> its name is) to include the hash bucket so people can watch to see how
> many bucket collisions their system has?

I think there are rt stats for this already added by
Robert Olsson.

One thing not mentioned, besides the physically contiguous
issue, is the fact that the locking would need to be changed
quite a bit in order to allow runtime reallocation of the
hash table. The current code is certainly not ready for it.

It might just work to run the hash freeing via RCU, but I'm
not quite sure.
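
The idea of freeing the old hash table only after every reader is done with it can be sketched in user space as follows. This is an illustration under stated assumptions, not kernel code and not the route cache implementation: `hash_table`, `table_read_begin`, `table_read_end` and `table_replace` are hypothetical names, and a crude global reader counter stands in for a real RCU grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical table type; the real route hash looks different. */
struct hash_table {
	size_t nbuckets;
	void **buckets;
};

/* Pointer to the table readers should use, swapped atomically by the writer. */
static _Atomic(struct hash_table *) live_table;
/* Number of readers currently inside a read-side section. */
static atomic_int active_readers;

/* Reader: announce ourselves first, then pick up the current table. */
const struct hash_table *table_read_begin(void)
{
	atomic_fetch_add(&active_readers, 1);
	return atomic_load(&live_table);
}

/* Reader: done with the pointer returned by table_read_begin(). */
void table_read_end(void)
{
	int prev = atomic_fetch_sub(&active_readers, 1);

	assert(prev > 0);
	(void)prev;
}

/*
 * Writer: publish a new table with nbuckets buckets, wait until no reader
 * can still hold the old one, then free the old one.
 * Returns 0 on success, -1 on allocation failure.
 */
int table_replace(size_t nbuckets)
{
	struct hash_table *fresh, *old;

	fresh = malloc(sizeof(*fresh));
	if (!fresh)
		return -1;
	fresh->buckets = calloc(nbuckets, sizeof(*fresh->buckets));
	if (!fresh->buckets) {
		free(fresh);
		return -1;
	}
	fresh->nbuckets = nbuckets;
	/* A real resize would rehash the old entries into 'fresh' here. */

	old = atomic_exchange(&live_table, fresh);

	/* Stand-in for a grace period: spin until the reader count hits zero. */
	while (atomic_load(&active_readers) != 0)
		;

	if (old) {
		free(old->buckets);
		free(old);
	}
	return 0;
}
```

A reader brackets each lookup with `table_read_begin()` and `table_read_end()`; the writer calls `table_replace()`. The busy-wait can starve under a steady stream of readers, which real RCU implementations avoid by tracking grace periods rather than a shared counter.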

From carola.mennicke@lintec.de Sun Dec 12 23:18:36 2004
Received: with ECARTIS (v1.0.0; list netdev); Sun, 12 Dec 2004 23:18:43 -0800 (PST)
Received: from mail1.arcor-ip.de (mail1.arcor-ip.de [145.253.2.10]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBD7IF5s008584 for ; Sun, 12 Dec 2004 23:18:36 -0800
Received: from mx.lintec.de (unknown [145.253.228.3]) by mail1.arcor-ip.de (Arcor-IP) with ESMTP id D9687D32 for ; Mon, 13 Dec 2004 08:17:46 +0100 (MET)
Received: by MX with Internet Mail Service (5.5.2653.19) id ; Mon, 13 Dec 2004 08:18:46 +0100
Message-ID:
From: "Mennicke, Carola"
To: "'AVK'"
Subject: WARNING! You have sent a mail infected with a virus.
Date: Mon, 13 Dec 2004 08:18:45 +0100
X-MS-TNEF-Correlator:
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: multipart/mixed; boundary="----_=_NextPart_000_01C4E0E3.FB62945A"
X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12696
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: carola.mennicke@lintec.de
Precedence: bulk
X-list: netdev

This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.

------_=_NextPart_000_01C4E0E3.FB62945A
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

G DATA AntiVirenKit has detected a virus in the following mail:

Sender: netdev@oss.sgi.com
Recipient: carola.mennicke@lintec.de
Cc:
Bcc:
Date: 10.12.2004 23:46
Subject: Mail Delivery (failure carola.mennicke@lintec.de)
Virus: Win32.Netsky.P@mm

------_=_NextPart_000_01C4E0E3.FB62945A--

From Bosnjak@iskratel.si Mon Dec 13 07:29:56 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 07:30:13 -0800 (PST)
Received: from ittmr.iskratel.si (ittmr.iskratel.si [193.77.24.3]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDFTWOm020451 for ; Mon, 13 Dec 2004 07:29:55 -0800
Received: from ntmailkr.iskratel.si (ntmailkr.iskratel.si [10.1.2.90]) by ittmr.iskratel.si (8.12.10/8.12.10) with ESMTP id iBDFb6Ol028173; Mon, 13 Dec 2004 16:37:07 +0100 (MET)
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
Content-Class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----_=_NextPart_001_01C4E128.290A6CAA"
Subject: new ioctl for UDP socket - need help
Date: Mon, 13 Dec 2004 16:26:47 +0100
Message-ID: <57D0409C9DC47A4B936406AC1788FA0003A382A2@ntmailkr.iskratel.si>
X-MS-Has-Attach: yes
X-MS-TNEF-Correlator:
Thread-Topic: new ioctl for UDP socket - need help
Thread-Index: AcThKCkF2bEWHOeWQAW/RMBvJ4lHlw==
From: "Bosnjak Zoran ITWEP"
To: , , , , , , , ,
X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12697
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: Bosnjak@iskratel.si Precedence: bulk X-list: netdev This is a multi-part message in MIME format. ------_=_NextPart_001_01C4E128.290A6CAA Content-Type: text/plain; charset="iso-8859-2" Content-Transfer-Encoding: quoted-printable

Hello all,

I have changed the UDP packet handler so that it can count packets as fast as possible. The best approach I found was to intercept each packet just before it is delivered to user space. As I am not interested in the actual packets, I just read the counters from the socket periodically. The most important one is the packet loss counter, which is calculated from the sequence number (the first byte of each packet's payload).

You can see this add-on as the receive-side counterpart of PKTGEN, which is already part of the kernel.

But here is the problem: I have to introduce new ioctl numbers for the socket, to:
- be able to put the socket in 'counting' mode
- get the counters from the socket to the application
... and I don't know how to do it.

At the moment (to see that it works) I have just put some dummy ioctl numbers into the code. The patch (see attachment) also contains an example and short documentation.

Can someone please help me introduce new ioctl numbers for UDP sockets correctly?
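The counting rule described above can be sketched in user-space C. This is a minimal sketch and not the attached kernel patch: the names `rx_counters`, `rx_count_packet` and `rx_fetch_and_reset` are hypothetical, and the empty-payload check is an added assumption. It only shows how a one-byte sequence number in the first payload byte can drive a total counter and a sequence-error counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model of the per-socket counters. */
struct rx_counters {
    uint8_t       next_seq; /* next expected sequence number */
    unsigned long total;    /* every packet seen */
    unsigned long err;      /* packets that arrived with an unexpected sequence number */
};

/* Account for one received payload; the sequence number is payload[0]. */
static void rx_count_packet(struct rx_counters *c,
                            const uint8_t *payload, size_t len)
{
    assert(c != NULL);
    if (len == 0)
        return; /* assumption: a payload without a sequence byte is ignored */

    c->total++;
    if (payload[0] != c->next_seq)
        c->err++; /* a gap or reordering was observed */

    /* Resynchronise on the received value; uint8_t wraps 255 -> 0. */
    c->next_seq = (uint8_t)(payload[0] + 1);
}

/* Return a snapshot and clear the counters, like a "get and reset" call. */
static struct rx_counters rx_fetch_and_reset(struct rx_counters *c)
{
    struct rx_counters snap = *c;

    c->total = 0;
    c->err = 0;
    return snap;
}
```

Note that the error counter counts discontinuities, not lost packets: a burst of several consecutive drops shows up as a single error, and with a one-byte sequence number a loss of exactly 256 packets goes unnoticed.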
best regards, Zoran Bosnjak <> <>=20 ------_=_NextPart_001_01C4E128.290A6CAA Content-Type: application/octet-stream; name="udp_patch-linux-2.4.27" Content-Transfer-Encoding: base64 Content-Description: udp_patch-linux-2.4.27 Content-Disposition: attachment; filename="udp_patch-linux-2.4.27" ZGlmZiAtdXJOIC1YIC9ob21lL3pvcmFuL2RvbnRkaWZmIGxpbnV4LTIuNC4yNy12YW5pbGxhL0Rv Y3VtZW50YXRpb24vQ29uZmlndXJlLmhlbHAgL2hvbWUvem9yYW4vbGludXgtMi40LjI3L0RvY3Vt ZW50YXRpb24vQ29uZmlndXJlLmhlbHAKLS0tIGxpbnV4LTIuNC4yNy12YW5pbGxhL0RvY3VtZW50 YXRpb24vQ29uZmlndXJlLmhlbHAJMjAwNC0wOC0wOCAwMToyNjowNC4wMDAwMDAwMDAgKzAyMDAK KysrIC9ob21lL3pvcmFuL2xpbnV4LTIuNC4yNy9Eb2N1bWVudGF0aW9uL0NvbmZpZ3VyZS5oZWxw CTIwMDQtMTEtMTQgMTE6NTk6NDEuMDAwMDAwMDAwICswMTAwCkBAIC0xMTEwMyw2ICsxMTEwMywy MiBAQAogICB3aGVuZXZlciB5b3Ugd2FudCkuICBJZiB5b3Ugd2FudCB0byBjb21waWxlIGl0IGFz IGEgbW9kdWxlLCBzYXkgTQogICBoZXJlIGFuZCByZWFkIDxmaWxlOkRvY3VtZW50YXRpb24vbW9k dWxlcy50eHQ+LgogCitVZHAgUlggcGFja2V0IGNvdW50ZXIKK0NPTkZJR19VRFBfUlhfQ09VTlRF UlMKKyAgVGhpcyBvcHRpb24gd2lsbCBhZGQgMiBuZXcgaW9jdGwgb3B0aW9ucyB0byBldmVyeSBV RFAgc29ja2V0LiAKKyAgSXQgd2lsbCBhbHNvIGV4dGVuZCBzb2NrZXQgc3RydWN0dXJlIHdpdGgg YSBmZXcgY291bnRlcnMuCisgIAorICBPbmNlIHRoZSBVRFAgc29ja2V0IGlzIGNyZWF0ZWQgLSBz b2NrZXQoQUZfSU5FVCwgU09DS19ER1JBTSksIAorICB0aGUgYXBwbGljYXRpb24gaGFzIGEgcG9z aWJpbGl0eSB0byBwdXQgdGhpcyBzb2NrZXQgaW50byAKKyAgImNvdW50aW5nIG1vZGUiIC0gZmly c3QgaW9jdGwgY2FsbC4gSWYgdGhlIHNvY2tldCBpcyBpbiB0aGlzIG1vZGUsCisgIHRoZSB0cmFm ZmljIGlzIG5vdCBkZWxpdmVyZWQgdG8gdGhlIGFwcGxpY2F0aW9uLCBidXQgdGhlIGNvdW50ZXJz CisgIGFyZSB1cGRhdGVkIGluc2lkZSBrZXJuZWwuIFRoZSBjb3VudGVycyBjYW4gYmUgcmV0aXJ2 ZWQgKGFuZAorICBjbGVhcmVkKSB3aXRoIHNlY29uZCBpb2N0bCBjYWxsIG9uIHRoaXMgc29ja2V0 LgorICBTZWUgPGZpbGU6RG9jdW1lbnRhdGlvbi9uZXR3b3JraW5nL3VkcF9yeF9jb3VudGVycy50 eHQ+LgorICAKKyAgVGhpcyBvcHRpb24gaXMgZm9yIHRlc3RpbmcgcHVycG9zZXMuIElmIHlvdSBk b24ndCBuZWVkIGZhc3QgVURQCisgIHBhY2tldCBjb3VudGVycywgc2F5IE4uCisKIFdhbiBpbnRl cmZhY2VzIHN1cHBvcnQKIENPTkZJR19XQU4KICAgV2lkZSBBcmVhIE5ldHdvcmtzIChXQU5zKSwg 
c3VjaCBhcyBYLjI1LCBmcmFtZSByZWxheSBhbmQgbGVhc2VkCmRpZmYgLXVyTiAtWCAvaG9tZS96 b3Jhbi9kb250ZGlmZiBsaW51eC0yLjQuMjctdmFuaWxsYS9Eb2N1bWVudGF0aW9uL25ldHdvcmtp bmcvdWRwX3J4X2NvdW50ZXJzLnR4dCAvaG9tZS96b3Jhbi9saW51eC0yLjQuMjcvRG9jdW1lbnRh dGlvbi9uZXR3b3JraW5nL3VkcF9yeF9jb3VudGVycy50eHQKLS0tIGxpbnV4LTIuNC4yNy12YW5p bGxhL0RvY3VtZW50YXRpb24vbmV0d29ya2luZy91ZHBfcnhfY291bnRlcnMudHh0CTE5NzAtMDEt MDEgMDE6MDA6MDAuMDAwMDAwMDAwICswMTAwCisrKyAvaG9tZS96b3Jhbi9saW51eC0yLjQuMjcv RG9jdW1lbnRhdGlvbi9uZXR3b3JraW5nL3VkcF9yeF9jb3VudGVycy50eHQJMjAwNC0xMS0xNCAx MzowODoyNi4wMDAwMDAwMDAgKzAxMDAKQEAgLTAsMCArMSw5MiBAQAorCisgICAgVURQIHBhY2tl dCBjb3VudGVyIHNob3J0IEhPV1RPCisgICAgLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t CisgICAgCisgICAgVURQIHBhY2tldCBjb3VudGVyIGlzIGtlcm5lbCBiYXNlZCB0b29sIHRoYXQg ZW5hYmxlcyB5b3UgdG8KKyAgICBkZXRlY3QgcGFja2V0IGxvc3Mgb24gc2VsZWN0ZWQgVURQIHRy YWZmaWMgZmxvdy4gSXQgaXMgdXNlZCBpbgorICAgIGNvbWJpbmF0aW9uIHdpdGggVURQIHRyYWZm aWMgZ2VuZXJhdG9yIChub3QgcGFydCBvZiB0aGlzIGFkZG9uKS4KKyAgICBUaGUgZ2VuZXJhdG9y IG11c3Qgc2VuZCBzZXF1ZW5jZSBudW1iZXIgYXMgdGhlIGZpcnN0IGJ5dGUKKyAgICBpbnNpZGUg ZWFjaCBVRFAgcGF5bG9hZCAocmlnaHQgYWZ0ZXIgVURQIGhlYWRlcikuCisgICAgVGhlIFVEUCBw YWNrZXQgY291bnRlciB0aGVuIGNoZWNrcyB0aGlzIHNlcXVlbmNlIG51bWJlciB3aXRoCisgICAg ZXhwZWN0ZWQgc2VxdWVuY2UgbnVtYmVyIGZvciBlYWNoIHJlY2VpdmVkIHBhY2tldC4KKyAgICBU aGUgc2VxdWVuY2UgbnVtYmVyIGlzIG9uZSBieXRlIG51bWJlciBhbmQgd3JhcHMgYXJvdW5kIGF0 IDI1NS4KKworICAgIEl0IGFkZHMgMiAodW5zaWduZWQgbG9uZykgY291bnRlcnMgdG8gdGhlIHNv Y2tldDoKKyAgICAtIElmIHRoZSByZWNlaXZlZCBzZXF1ZW5jZSBpcyBvbmUgbW9yZSB0aGFuIHBy ZXZpb3VzLAorICAgICAgdGhlICJva19jb3VudGVyIiBpcyBpbmNyZW1lbnRlZC4KKyAgICAtIGVs c2UKKyAgICAgIHRoZSAiZXJyb3JfY291bnRlciIgaXMgaW5jcmVtZW50ZWQgYW5kIHRoZSByZWNl aXZlZCBzZXF1ZW5jZQorICAgICAgbnVtYmVyIGJlY29tZXMgdGhlIG5ldyB2YWxpZCBzZXF1ZW5j ZSBudW1iZXIKKworICAgIFRoZSBjb3VudGVycyBjYW4gYmUgcmV0cmVpdmVkIHBlcmlvZGljbHku CisKKyAgICBUbyB1c2UgVURQIHBhY2tldCBjb3VudGVyOgorCisgICAgLSBtYWtlIHN1cmUgeW91 
IGhhdmUgc2VsZWN0ZWQgIkNPTkZJR19VRFBfUlhfQ09VTlRFUlMiIGluIGtlcm5lbAorICAgICAg bmV0d29yay90ZXN0aW5nIGNvbmZpZ3VyYXRpb24uCisKKyAgICAtIGluc2lkZSB5b3VyIHRlc3Rp bmcgYXBwbGljYXRpb246CisgICAgICAgIC0gY3JlYXRlIG5vcm1hbCBVRFAgc29ja2V0CisgICAg ICAgIC0gY2FsbCBpb2N0bCBvbiB0aGlzIHNvY2tldCB0byBwdXQgaXQgaW50byBjb3VudGluZyBt b2RlCisgICAgICAgIC0gYmluZCB0aGUgc29ja2V0IHRvIDxpcCwgcG9ydD4gY29tYmluYXRpb24K KyAgICAgICAgLSBwZXJpb2RpY2x5IGNhbGwgaW9jdGwgb24gdGhpcyBzb2NrZXQgdG8gZ2V0IHRo ZSBjb3VudGVycworICAgICAgICAgIChpbnRlcm5hbCBjb3VudGVycyBhcmUgYXV0b21hdGljbHkg cmVzZXQgdG8gemVybyBieSB0aGlzIGNhbGwpCisgICAgICAgIC0gZG9uJ3QgdHJ5IHRvIHJlYWQg ZGF0YSBmcm9tIHRoaXMgc29ja2V0LCBiZWNhdXNlCisgICAgICAgICAgeW91IHdvbid0IGdldCBh bnkgKGFzIGxvbmcgYXMgc29ja2V0IGlzIGluIGNvdW50aW5nIG1vZGUpLgorICAgICAgICAtIGNs b3NlIHRoZSBzb2NrZXQgd2hlbiBkb25lCisKKyAgICBFeGFtcGxlOgorICAgIChzb21lIGNoZWNr aW5nIGlzIG1pc3Npbmcgb24gcHVycG9zZSkKKworLS0tLS0tLS0tLS0tLS0gc3RhcnQgb2YgZXhh bXBsZSAtLS0tLS0tLS0tLS0tLS0KKyNpbmNsdWRlIDxzdGRpby5oPgorI2luY2x1ZGUgPHN5cy9p b2N0bC5oPgorI2luY2x1ZGUgPG5ldGluZXQvaW4uaD4KKyNpbmNsdWRlIDxzeXMvc29ja2V0Lmg+ CisKKyNkZWZpbmUgSU9DVExfQ09VTlRJTkdfTU9ERSAgICAgMTUKKyNkZWZpbmUgSU9DVExfTk9S TUFMX01PREUgICAgICAgMTYKKyNkZWZpbmUgSU9DVExfR0VUX0NPVU5URVJTICAgICAgMTcKKwor bWFpbihpbnQgYXJnYywgY2hhciAqYXJndltdKSB7CisKKyAgICBpbnQgc29ja2ZkOworICAgIHN0 cnVjdCBzb2NrYWRkcl9pbiBteV9hZGRyOworICAgIHN0cnVjdCBzb2NrYWRkcl9pbiB0aGVpcl9h ZGRyOworICAgIGludCBhZGRyX2xlbjsKKworICAgIHN0cnVjdCB1ZHBfY291bnRlcnMgeworICAg ICAgICB1bnNpZ25lZCBsb25nIG9rX2NvdW50ZXI7CisgICAgICAgIHVuc2lnbmVkIGxvbmcgZXJy X2NvdW50ZXI7CisgICAgfSBjb3VudGVyczsKKworICAgIC8qIGNyZWF0ZSBzb2NrZXQgKi8KKyAg ICBzb2NrZmQgPSBzb2NrZXQoQUZfSU5FVCwgU09DS19ER1JBTSwgMCk7CisgICAgCisgICAgLyog cHV0IHNvY2tldCBpbnRvIGNvdW50aW5nIG1vZGUgKi8KKyAgICBpb2N0bChzb2NrZmQsIElPQ1RM X0NPVU5USU5HX01PREUsIDApOworICAgIAorICAgIC8qIGJpbmQgKi8KKyAgICBtZW1zZXQoJm15 X2FkZHIsIDAsIHNpemVvZihteV9hZGRyKSk7CisgICAgbXlfYWRkci5zaW5fZmFtaWx5ID0gQUZf 
SU5FVDsKKyAgICBteV9hZGRyLnNpbl9wb3J0ID0gaHRvbnMoMTAwMDApOworICAgIG15X2FkZHIu c2luX2FkZHIuc19hZGRyID0gaW5ldF9hZGRyKCIxMjcuMC4wLjEiKTsKKworICAgIGJpbmQoc29j a2ZkLCAoc3RydWN0IHNvY2thZGRyICopJm15X2FkZHIsIHNpemVvZihzdHJ1Y3Qgc29ja2FkZHIp KTsKKworICAgIGFkZHJfbGVuID0gc2l6ZW9mKHN0cnVjdCBzb2NrYWRkcik7CisKKyAgICAvKiBt YWluIGxvb3AqLworICAgIHdoaWxlKDEpIHsKKyAgICAgICAgc2xlZXAoMSk7CisKKyAgICAgICAg LyogZ2V0IGFuZCBjbGVhciB0aGUgc29ja2V0IGNvdW50ZXJzICovCisgICAgICAgIGlvY3RsKHNv Y2tmZCwgSU9DVExfR0VUX0NPVU5URVJTLCAodW5zaWduZWQgbG9uZykmY291bnRlcnMpOworCisg ICAgICAgIC8qIHByaW50IG91dCAqLworICAgICAgICBwcmludGYoIiV1ICV1XG4iLCBjb3VudGVy cy5va19jb3VudGVyLCBjb3VudGVycy5lcnJfY291bnRlcik7CisgICAgICAgIGZmbHVzaChzdGRv dXQpOworICAgIH0KK30KKy0tLS0tLS0tLS0tLS0tICAgZW5kIG9mIGV4YW1wbGUgLS0tLS0tLS0t LS0tLS0tCisKZGlmZiAtdXJOIC1YIC9ob21lL3pvcmFuL2RvbnRkaWZmIGxpbnV4LTIuNC4yNy12 YW5pbGxhL2luY2x1ZGUvbmV0L3NvY2suaCAvaG9tZS96b3Jhbi9saW51eC0yLjQuMjcvaW5jbHVk ZS9uZXQvc29jay5oCi0tLSBsaW51eC0yLjQuMjctdmFuaWxsYS9pbmNsdWRlL25ldC9zb2NrLmgJ MjAwNC0wOC0wOCAwMToyNjowNi4wMDAwMDAwMDAgKzAyMDAKKysrIC9ob21lL3pvcmFuL2xpbnV4 LTIuNC4yNy9pbmNsdWRlL25ldC9zb2NrLmgJMjAwNC0xMS0xNCAxMjowODo0OC4wMDAwMDAwMDAg KzAxMDAKQEAgLTc0Miw2ICs3NDIsMTQgQEAKICAgCWludAkJCSgqYmFja2xvZ19yY3YpIChzdHJ1 Y3Qgc29jayAqc2ssCiAJCQkJCQlzdHJ1Y3Qgc2tfYnVmZiAqc2tiKTsgIAogCXZvaWQgICAgICAg ICAgICAgICAgICAgICgqZGVzdHJ1Y3QpKHN0cnVjdCBzb2NrICpzayk7CisKKyNpZiBkZWZpbmVk KENPTkZJR19VRFBfUlhfQ09VTlRFUlMpCisJc3BpbmxvY2tfdAkJdWRwX3J4X2NvdW50ZXJzX2xv Y2s7CisJdW5zaWduZWQgY2hhcgl1ZHBfcnhfY291bnRlcnNfbW9kZTsJCS8qIGRvIHRoZSBjb3Vu dGluZyBvciBub3QgKi8KKwl1bnNpZ25lZCBjaGFyCXVkcF9yeF9jb3VudGVyc19zZXE7CQkvKiBu ZXh0IGV4cGVjdGVkIHNlcXVlbmNlIG51bWJlciAqLworCXVuc2lnbmVkIGxvbmcJdWRwX3J4X2Nv dW50ZXJzX3RvdGFsOwkJLyogdG90YWwgcGFja2V0IGNvdW50ZXIgKi8KKwl1bnNpZ25lZCBsb25n CXVkcF9yeF9jb3VudGVyc19lcnI7CQkvKiBzZXF1ZW5jZSBlcnJvcnMgY291bnRlciAqLworI2Vu ZGlmCiB9OwogCiAvKiBUaGUgcGVyLXNvY2tldCBzcGlubG9jayBtdXN0IGJlIGhlbGQgaGVyZS4g 
Ki8KZGlmZiAtdXJOIC1YIC9ob21lL3pvcmFuL2RvbnRkaWZmIGxpbnV4LTIuNC4yNy12YW5pbGxh L25ldC9Db25maWcuaW4gL2hvbWUvem9yYW4vbGludXgtMi40LjI3L25ldC9Db25maWcuaW4KLS0t IGxpbnV4LTIuNC4yNy12YW5pbGxhL25ldC9Db25maWcuaW4JMjAwNC0wOC0wOCAwMToyNjowNi4w MDAwMDAwMDAgKzAyMDAKKysrIC9ob21lL3pvcmFuL2xpbnV4LTIuNC4yNy9uZXQvQ29uZmlnLmlu CTIwMDQtMTEtMTQgMTE6NTk6NDEuMDAwMDAwMDAwICswMTAwCkBAIC0xMDAsNiArMTAwLDcgQEAK IG1haW5tZW51X29wdGlvbiBuZXh0X2NvbW1lbnQKIGNvbW1lbnQgJ05ldHdvcmsgdGVzdGluZycK IGRlcF90cmlzdGF0ZSAnUGFja2V0IEdlbmVyYXRvciAoVVNFIFdJVEggQ0FVVElPTiknIENPTkZJ R19ORVRfUEtUR0VOICRDT05GSUdfUFJPQ19GUworYm9vbCAnVURQIFJ4IGNvdW50ZXJzJyBDT05G SUdfVURQX1JYX0NPVU5URVJTCiBlbmRtZW51CiAKIGVuZG1lbnUKZGlmZiAtdXJOIC1YIC9ob21l L3pvcmFuL2RvbnRkaWZmIGxpbnV4LTIuNC4yNy12YW5pbGxhL25ldC9pcHY0L2FmX2luZXQuYyAv aG9tZS96b3Jhbi9saW51eC0yLjQuMjcvbmV0L2lwdjQvYWZfaW5ldC5jCi0tLSBsaW51eC0yLjQu MjctdmFuaWxsYS9uZXQvaXB2NC9hZl9pbmV0LmMJMjAwNC0wOC0wOCAwMToyNjowNi4wMDAwMDAw MDAgKzAyMDAKKysrIC9ob21lL3pvcmFuL2xpbnV4LTIuNC4yNy9uZXQvaXB2NC9hZl9pbmV0LmMJ MjAwNC0xMS0xNCAxMjoxNTowMS4wMDAwMDAwMDAgKzAxMDAKQEAgLTQxNiw2ICs0MTYsMTUgQEAK IAkJCXJldHVybiBlcnI7CiAJCX0KIAl9CisKKyNpZiBkZWZpbmVkKENPTkZJR19VRFBfUlhfQ09V TlRFUlMpCisJc2stPnVkcF9yeF9jb3VudGVyc19sb2NrID0gU1BJTl9MT0NLX1VOTE9DS0VEOwor CXNrLT51ZHBfcnhfY291bnRlcnNfbW9kZSA9IDA7CS8qIDAtZG9uJ3QgY291bnQsIDEtY291bnQg Ki8KKwlzay0+dWRwX3J4X2NvdW50ZXJzX3NlcSA9IDA7CisJc2stPnVkcF9yeF9jb3VudGVyc190 b3RhbCA9IDA7CisJc2stPnVkcF9yeF9jb3VudGVyc19lcnIgPSAwOworI2VuZGlmCisKIAlyZXR1 cm4gMDsKIAogZnJlZV9hbmRfYmFkdHlwZToKZGlmZiAtdXJOIC1YIC9ob21lL3pvcmFuL2RvbnRk aWZmIGxpbnV4LTIuNC4yNy12YW5pbGxhL25ldC9pcHY0L3VkcC5jIC9ob21lL3pvcmFuL2xpbnV4 LTIuNC4yNy9uZXQvaXB2NC91ZHAuYwotLS0gbGludXgtMi40LjI3LXZhbmlsbGEvbmV0L2lwdjQv dWRwLmMJMjAwNC0wOC0wOCAwMToyNjowNy4wMDAwMDAwMDAgKzAyMDAKKysrIC9ob21lL3pvcmFu L2xpbnV4LTIuNC4yNy9uZXQvaXB2NC91ZHAuYwkyMDA0LTExLTE0IDExOjU5OjQxLjAwMDAwMDAw MCArMDEwMApAQCAtNjE5LDYgKzYxOSwzMyBAQAogCQkJcmV0dXJuIHB1dF91c2VyKGFtb3VudCwg 
KGludCAqKWFyZyk7CiAJCX0KIAorI2lmIGRlZmluZWQoQ09ORklHX1VEUF9SWF9DT1VOVEVSUykK KwkJY2FzZSAxNToJLyogc2V0IGNvdW50aW5nIE9OICovCisJCQlzay0+dWRwX3J4X2NvdW50ZXJz X21vZGUgPSAxOworCQkJcmV0dXJuIDA7CisKKwkJY2FzZSAxNjoJLyogc2V0IGNvdW50aW5nIE9G RiAqLworCQkJc2stPnVkcF9yeF9jb3VudGVyc19tb2RlID0gMDsKKwkJCXJldHVybiAwOworCisg ICAgICAgIGNhc2UgMTc6CS8qIGdldCBhbmQgcmVzZXQgY291bnRlcnMgKi8KKyAgICAgICAgewor CQkJc3RydWN0IGNvdW50ZXJzIHsKKwkJCQl1bnNpZ25lZCBsb25nIGMxOworCQkJCXVuc2lnbmVk IGxvbmcgYzI7CisJCQl9IGM7CisgICAgICAgICAgICAKKwkJCXNwaW5fbG9ja19pcnEoJnNrLT51 ZHBfcnhfY291bnRlcnNfbG9jayk7CisJCQljLmMxID0gc2stPnVkcF9yeF9jb3VudGVyc190b3Rh bDsKKwkJCWMuYzIgPSBzay0+dWRwX3J4X2NvdW50ZXJzX2VycjsKKwkJCXNrLT51ZHBfcnhfY291 bnRlcnNfdG90YWwgPSAwOworCQkJc2stPnVkcF9yeF9jb3VudGVyc19lcnIgPSAwOworCQkJc3Bp bl91bmxvY2tfaXJxKCZzay0+dWRwX3J4X2NvdW50ZXJzX2xvY2spOworCisJCQlyZXR1cm4gY29w eV90b191c2VyKChzdHJ1Y3QgY291bnRlcnMgKilhcmcsIChzdHJ1Y3QgY291bnRlcnMgKikmYywg c2l6ZW9mKGMpKTsKKwkJfQorI2VuZGlmCisKIAkJZGVmYXVsdDoKIAkJCXJldHVybiAtRU5PSU9D VExDTUQ7CiAJfQpAQCAtOTQxLDYgKzk2OCwyOCBAQAogCXNrID0gdWRwX3Y0X2xvb2t1cChzYWRk ciwgdWgtPnNvdXJjZSwgZGFkZHIsIHVoLT5kZXN0LCBza2ItPmRldi0+aWZpbmRleCk7CiAKIAlp ZiAoc2sgIT0gTlVMTCkgeworI2lmIGRlZmluZWQoQ09ORklHX1VEUF9SWF9DT1VOVEVSUykKKwkJ aWYgKHNrLT51ZHBfcnhfY291bnRlcnNfbW9kZSA9PSAxKSB7CisJCQl1bnNpZ25lZCBsb25nIGZs YWdzOworCQkJdW5zaWduZWQgY2hhciAqc2VxX3A7CisKKwkJCS8qIHNlcXVlbmNlIG51bWJlciBp cyBmaXJzdCBkYXRhIGJ5dGUgKHJpZ2h0IGFmdGVyIHVkcGhkcikgKi8KKwkJCXNlcV9wID0gKHVu c2lnbmVkIGNoYXIgKil1aCArIHNpemVvZihzdHJ1Y3QgdWRwaGRyKTsKKyAgICAgICAgICAgIAor CQkJc3Bpbl9sb2NrX2lycXNhdmUoJnNrLT51ZHBfcnhfY291bnRlcnNfbG9jaywgZmxhZ3MpOwor CQkJc2stPnVkcF9yeF9jb3VudGVyc190b3RhbCsrOworCQkJLyogY2hlY2sgc2VxdWVuY2UgKi8K KwkJCWlmICgqc2VxX3AgIT0gc2stPnVkcF9yeF9jb3VudGVyc19zZXEpIHsKKwkJCQlzay0+dWRw X3J4X2NvdW50ZXJzX2VycisrOyAgICAgICAgICAvKiBpbmNyZW1lbnQgY291bnRlciAqLworCQkJ fQorCQkJc2stPnVkcF9yeF9jb3VudGVyc19zZXEgPSAqc2VxX3AgKyAxOyAgIC8qIHJlc2V0IHNl 
cXVlbmNlIHRvIG5ldyB2YWx1ZSAqLworCQkJc3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmc2stPnVk cF9yeF9jb3VudGVyc19sb2NrLCBmbGFncyk7CisKKwkJCS8qIGRyb3AgcGFja2V0ICovCisJCQlr ZnJlZV9za2Ioc2tiKTsKKwkJCXJldHVybiAwOworCQl9CisjZW5kaWYKIAkJdWRwX3F1ZXVlX3Jj dl9za2Ioc2ssIHNrYik7CiAJCXNvY2tfcHV0KHNrKTsKIAkJcmV0dXJuIDA7Cg== ------_=_NextPart_001_01C4E128.290A6CAA Content-Type: application/octet-stream; name="udp_patch-linux-2.6.9" Content-Transfer-Encoding: base64 Content-Description: udp_patch-linux-2.6.9 Content-Disposition: attachment; filename="udp_patch-linux-2.6.9" ZGlmZiAtdXByTiAtWCB6YWMvZG9udGRpZmYgL3Vzci9zcmMvbGludXgtMi42LjktdmFuaWxsYS9E b2N1bWVudGF0aW9uL25ldHdvcmtpbmcvdWRwX3J4X2NvdW50ZXJzLnR4dCBsaW51eC0yLjYuOS9E b2N1bWVudGF0aW9uL25ldHdvcmtpbmcvdWRwX3J4X2NvdW50ZXJzLnR4dAotLS0gL3Vzci9zcmMv bGludXgtMi42LjktdmFuaWxsYS9Eb2N1bWVudGF0aW9uL25ldHdvcmtpbmcvdWRwX3J4X2NvdW50 ZXJzLnR4dAkxOTcwLTAxLTAxIDAxOjAwOjAwLjAwMDAwMDAwMCArMDEwMAorKysgbGludXgtMi42 LjkvRG9jdW1lbnRhdGlvbi9uZXR3b3JraW5nL3VkcF9yeF9jb3VudGVycy50eHQJMjAwNC0xMi0w NiAxNTo0MjoxNC4wMDAwMDAwMDAgKzAxMDAKQEAgLTAsMCArMSw5MiBAQAorCisgICAgVURQIHBh Y2tldCBjb3VudGVyIHNob3J0IEhPV1RPCisgICAgLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tCisgICAgCisgICAgVURQIHBhY2tldCBjb3VudGVyIGlzIGtlcm5lbCBiYXNlZCB0b29sIHRo YXQgZW5hYmxlcyB5b3UgdG8KKyAgICBkZXRlY3QgcGFja2V0IGxvc3Mgb24gc2VsZWN0ZWQgVURQ IHRyYWZmaWMgZmxvdy4gSXQgaXMgdXNlZCBpbgorICAgIGNvbWJpbmF0aW9uIHdpdGggVURQIHRy YWZmaWMgZ2VuZXJhdG9yIChub3QgcGFydCBvZiB0aGlzIGFkZG9uKS4KKyAgICBUaGUgZ2VuZXJh dG9yIG11c3Qgc2VuZCBzZXF1ZW5jZSBudW1iZXIgYXMgdGhlIGZpcnN0IGJ5dGUKKyAgICBpbnNp ZGUgZWFjaCBVRFAgcGF5bG9hZCAocmlnaHQgYWZ0ZXIgVURQIGhlYWRlcikuCisgICAgVGhlIFVE UCBwYWNrZXQgY291bnRlciB0aGVuIGNoZWNrcyB0aGlzIHNlcXVlbmNlIG51bWJlciB3aXRoCisg ICAgZXhwZWN0ZWQgc2VxdWVuY2UgbnVtYmVyIGZvciBlYWNoIHJlY2VpdmVkIHBhY2tldC4KKyAg ICBUaGUgc2VxdWVuY2UgbnVtYmVyIGlzIG9uZSBieXRlIG51bWJlciBhbmQgd3JhcHMgYXJvdW5k IGF0IDI1NS4KKworICAgIEl0IGFkZHMgMiAodW5zaWduZWQgbG9uZykgY291bnRlcnMgdG8gdGhl 
IHNvY2tldDoKKyAgICAtIElmIHRoZSByZWNlaXZlZCBzZXF1ZW5jZSBpcyBvbmUgbW9yZSB0aGFu IHByZXZpb3VzLAorICAgICAgdGhlICJva19jb3VudGVyIiBpcyBpbmNyZW1lbnRlZC4KKyAgICAt IGVsc2UKKyAgICAgIHRoZSAiZXJyb3JfY291bnRlciIgaXMgaW5jcmVtZW50ZWQgYW5kIHRoZSBy ZWNlaXZlZCBzZXF1ZW5jZQorICAgICAgbnVtYmVyIGJlY29tZXMgdGhlIG5ldyB2YWxpZCBzZXF1 ZW5jZSBudW1iZXIKKworICAgIFRoZSBjb3VudGVycyBjYW4gYmUgcmV0cmVpdmVkIHBlcmlvZGlj bHkuCisKKyAgICBUbyB1c2UgVURQIHBhY2tldCBjb3VudGVyOgorCisgICAgLSBtYWtlIHN1cmUg eW91IGhhdmUgc2VsZWN0ZWQgIkNPTkZJR19VRFBfUlhfQ09VTlRFUlMiIGluIGtlcm5lbAorICAg ICAgbmV0d29yay90ZXN0aW5nIGNvbmZpZ3VyYXRpb24uCisKKyAgICAtIGluc2lkZSB5b3VyIHRl c3RpbmcgYXBwbGljYXRpb246CisgICAgICAgIC0gY3JlYXRlIG5vcm1hbCBVRFAgc29ja2V0Cisg ICAgICAgIC0gY2FsbCBpb2N0bCBvbiB0aGlzIHNvY2tldCB0byBwdXQgaXQgaW50byBjb3VudGlu ZyBtb2RlCisgICAgICAgIC0gYmluZCB0aGUgc29ja2V0IHRvIDxpcCwgcG9ydD4gY29tYmluYXRp b24KKyAgICAgICAgLSBwZXJpb2RpY2x5IGNhbGwgaW9jdGwgb24gdGhpcyBzb2NrZXQgdG8gZ2V0 IHRoZSBjb3VudGVycworICAgICAgICAgIChpbnRlcm5hbCBjb3VudGVycyBhcmUgYXV0b21hdGlj bHkgcmVzZXQgdG8gemVybyBieSB0aGlzIGNhbGwpCisgICAgICAgIC0gZG9uJ3QgdHJ5IHRvIHJl YWQgZGF0YSBmcm9tIHRoaXMgc29ja2V0LCBiZWNhdXNlCisgICAgICAgICAgeW91IHdvbid0IGdl dCBhbnkgKGFzIGxvbmcgYXMgc29ja2V0IGlzIGluIGNvdW50aW5nIG1vZGUpLgorICAgICAgICAt IGNsb3NlIHRoZSBzb2NrZXQgd2hlbiBkb25lCisKKyAgICBFeGFtcGxlOgorICAgIChzb21lIGNo ZWNraW5nIGlzIG1pc3Npbmcgb24gcHVycG9zZSkKKworLS0tLS0tLS0tLS0tLS0gc3RhcnQgb2Yg ZXhhbXBsZSAtLS0tLS0tLS0tLS0tLS0KKyNpbmNsdWRlIDxzdGRpby5oPgorI2luY2x1ZGUgPHN5 cy9pb2N0bC5oPgorI2luY2x1ZGUgPG5ldGluZXQvaW4uaD4KKyNpbmNsdWRlIDxzeXMvc29ja2V0 Lmg+CisKKyNkZWZpbmUgSU9DVExfQ09VTlRJTkdfTU9ERSAgICAgMTUKKyNkZWZpbmUgSU9DVExf Tk9STUFMX01PREUgICAgICAgMTYKKyNkZWZpbmUgSU9DVExfR0VUX0NPVU5URVJTICAgICAgMTcK KworbWFpbihpbnQgYXJnYywgY2hhciAqYXJndltdKSB7CisKKyAgICBpbnQgc29ja2ZkOworICAg IHN0cnVjdCBzb2NrYWRkcl9pbiBteV9hZGRyOworICAgIHN0cnVjdCBzb2NrYWRkcl9pbiB0aGVp cl9hZGRyOworICAgIGludCBhZGRyX2xlbjsKKworICAgIHN0cnVjdCB1ZHBfY291bnRlcnMgewor 
ICAgICAgICB1bnNpZ25lZCBsb25nIG9rX2NvdW50ZXI7CisgICAgICAgIHVuc2lnbmVkIGxvbmcg ZXJyX2NvdW50ZXI7CisgICAgfSBjb3VudGVyczsKKworICAgIC8qIGNyZWF0ZSBzb2NrZXQgKi8K KyAgICBzb2NrZmQgPSBzb2NrZXQoQUZfSU5FVCwgU09DS19ER1JBTSwgMCk7CisgICAgCisgICAg LyogcHV0IHNvY2tldCBpbnRvIGNvdW50aW5nIG1vZGUgKi8KKyAgICBpb2N0bChzb2NrZmQsIElP Q1RMX0NPVU5USU5HX01PREUsIDApOworICAgIAorICAgIC8qIGJpbmQgKi8KKyAgICBtZW1zZXQo Jm15X2FkZHIsIDAsIHNpemVvZihteV9hZGRyKSk7CisgICAgbXlfYWRkci5zaW5fZmFtaWx5ID0g QUZfSU5FVDsKKyAgICBteV9hZGRyLnNpbl9wb3J0ID0gaHRvbnMoMTAwMDApOworICAgIG15X2Fk ZHIuc2luX2FkZHIuc19hZGRyID0gaW5ldF9hZGRyKCIxMjcuMC4wLjEiKTsKKworICAgIGJpbmQo c29ja2ZkLCAoc3RydWN0IHNvY2thZGRyICopJm15X2FkZHIsIHNpemVvZihzdHJ1Y3Qgc29ja2Fk ZHIpKTsKKworICAgIGFkZHJfbGVuID0gc2l6ZW9mKHN0cnVjdCBzb2NrYWRkcik7CisKKyAgICAv KiBtYWluIGxvb3AqLworICAgIHdoaWxlKDEpIHsKKyAgICAgICAgc2xlZXAoMSk7CisKKyAgICAg ICAgLyogZ2V0IGFuZCBjbGVhciB0aGUgc29ja2V0IGNvdW50ZXJzICovCisgICAgICAgIGlvY3Rs KHNvY2tmZCwgSU9DVExfR0VUX0NPVU5URVJTLCAodW5zaWduZWQgbG9uZykmY291bnRlcnMpOwor CisgICAgICAgIC8qIHByaW50IG91dCAqLworICAgICAgICBwcmludGYoIiV1ICV1XG4iLCBjb3Vu dGVycy5va19jb3VudGVyLCBjb3VudGVycy5lcnJfY291bnRlcik7CisgICAgICAgIGZmbHVzaChz dGRvdXQpOworICAgIH0KK30KKy0tLS0tLS0tLS0tLS0tICAgZW5kIG9mIGV4YW1wbGUgLS0tLS0t LS0tLS0tLS0tCisKZGlmZiAtdXByTiAtWCB6YWMvZG9udGRpZmYgL3Vzci9zcmMvbGludXgtMi42 LjktdmFuaWxsYS9pbmNsdWRlL25ldC9zb2NrLmggbGludXgtMi42LjkvaW5jbHVkZS9uZXQvc29j ay5oCi0tLSAvdXNyL3NyYy9saW51eC0yLjYuOS12YW5pbGxhL2luY2x1ZGUvbmV0L3NvY2suaAky MDA0LTEwLTE4IDIzOjU0OjQwLjAwMDAwMDAwMCArMDIwMAorKysgbGludXgtMi42LjkvaW5jbHVk ZS9uZXQvc29jay5oCTIwMDQtMTItMDYgMTU6NTI6MzcuMDAwMDAwMDAwICswMTAwCkBAIC0yNjQs NiArMjY0LDE1IEBAIHN0cnVjdCBzb2NrIHsKICAgCWludAkJCSgqc2tfYmFja2xvZ19yY3YpKHN0 cnVjdCBzb2NrICpzaywKIAkJCQkJCSAgc3RydWN0IHNrX2J1ZmYgKnNrYik7ICAKIAl2b2lkICAg ICAgICAgICAgICAgICAgICAoKnNrX2Rlc3RydWN0KShzdHJ1Y3Qgc29jayAqc2spOworCisjaWYg ZGVmaW5lZChDT05GSUdfVURQX1JYX0NPVU5URVJTKQorCXNwaW5sb2NrX3QJCXVkcF9yeF9jb3Vu 
dGVyc19sb2NrOworCXVuc2lnbmVkIGNoYXIJdWRwX3J4X2NvdW50ZXJzX21vZGU7CQkvKiBkbyB0 aGUgY291bnRpbmcgb3Igbm90ICovCisJdW5zaWduZWQgY2hhcgl1ZHBfcnhfY291bnRlcnNfc2Vx OwkJLyogbmV4dCBleHBlY3RlZCBzZXF1ZW5jZSBudW1iZXIgKi8KKwl1bnNpZ25lZCBsb25nCXVk cF9yeF9jb3VudGVyc190b3RhbDsJCS8qIHRvdGFsIHBhY2tldCBjb3VudGVyICovCisJdW5zaWdu ZWQgbG9uZwl1ZHBfcnhfY291bnRlcnNfZXJyOwkJLyogc2VxdWVuY2UgZXJyb3JzIGNvdW50ZXIg Ki8KKyNlbmRpZgorCiB9OwogCiAvKgpkaWZmIC11cHJOIC1YIHphYy9kb250ZGlmZiAvdXNyL3Ny Yy9saW51eC0yLjYuOS12YW5pbGxhL25ldC9pcHY0L2FmX2luZXQuYyBsaW51eC0yLjYuOS9uZXQv aXB2NC9hZl9pbmV0LmMKLS0tIC91c3Ivc3JjL2xpbnV4LTIuNi45LXZhbmlsbGEvbmV0L2lwdjQv YWZfaW5ldC5jCTIwMDQtMTAtMTggMjM6NTM6MjEuMDAwMDAwMDAwICswMjAwCisrKyBsaW51eC0y LjYuOS9uZXQvaXB2NC9hZl9pbmV0LmMJMjAwNC0xMi0wNiAxNTo1NTo0Ny4wMDAwMDAwMDAgKzAx MDAKQEAgLTM0Miw2ICszNDIsMTUgQEAgc3RhdGljIGludCBpbmV0X2NyZWF0ZShzdHJ1Y3Qgc29j a2V0ICpzbwogCQlpZiAoZXJyKQogCQkJc2tfY29tbW9uX3JlbGVhc2Uoc2spOwogCX0KKworI2lm IGRlZmluZWQoQ09ORklHX1VEUF9SWF9DT1VOVEVSUykKKwlzay0+dWRwX3J4X2NvdW50ZXJzX2xv Y2sgPSBTUElOX0xPQ0tfVU5MT0NLRUQ7CisJc2stPnVkcF9yeF9jb3VudGVyc19tb2RlID0gMDsJ LyogMC1kb24ndCBjb3VudCwgMS1jb3VudCAqLworCXNrLT51ZHBfcnhfY291bnRlcnNfc2VxID0g MDsKKwlzay0+dWRwX3J4X2NvdW50ZXJzX3RvdGFsID0gMDsKKwlzay0+dWRwX3J4X2NvdW50ZXJz X2VyciA9IDA7CisjZW5kaWYKKwogb3V0OgogCXJldHVybiBlcnI7CiBvdXRfcmN1X3VubG9jazoK ZGlmZiAtdXByTiAtWCB6YWMvZG9udGRpZmYgL3Vzci9zcmMvbGludXgtMi42LjktdmFuaWxsYS9u ZXQvaXB2NC91ZHAuYyBsaW51eC0yLjYuOS9uZXQvaXB2NC91ZHAuYwotLS0gL3Vzci9zcmMvbGlu dXgtMi42LjktdmFuaWxsYS9uZXQvaXB2NC91ZHAuYwkyMDA0LTEwLTE4IDIzOjUzOjIyLjAwMDAw MDAwMCArMDIwMAorKysgbGludXgtMi42LjkvbmV0L2lwdjQvdWRwLmMJMjAwNC0xMi0wNiAxNTo1 OTozOC4wMDAwMDAwMDAgKzAxMDAKQEAgLTc0OCw2ICs3NDgsMzMgQEAgaW50IHVkcF9pb2N0bChz dHJ1Y3Qgc29jayAqc2ssIGludCBjbWQsIAogCQkJcmV0dXJuIHB1dF91c2VyKGFtb3VudCwgKGlu dCBfX3VzZXIgKilhcmcpOwogCQl9CiAKKyNpZiBkZWZpbmVkKENPTkZJR19VRFBfUlhfQ09VTlRF UlMpCisJCWNhc2UgMTU6CS8qIHNldCBjb3VudGluZyBPTiAqLworCQkJc2stPnVkcF9yeF9jb3Vu 
dGVyc19tb2RlID0gMTsKKwkJCXJldHVybiAwOworCisJCWNhc2UgMTY6CS8qIHNldCBjb3VudGlu ZyBPRkYgKi8KKwkJCXNrLT51ZHBfcnhfY291bnRlcnNfbW9kZSA9IDA7CisJCQlyZXR1cm4gMDsK KworICAgICAgICBjYXNlIDE3OgkvKiBnZXQgYW5kIHJlc2V0IGNvdW50ZXJzICovCisgICAgICAg IHsKKwkJCXN0cnVjdCBjb3VudGVycyB7CisJCQkJdW5zaWduZWQgbG9uZyBjMTsKKwkJCQl1bnNp Z25lZCBsb25nIGMyOworCQkJfSBjOworICAgICAgICAgICAgCisJCQlzcGluX2xvY2tfaXJxKCZz ay0+dWRwX3J4X2NvdW50ZXJzX2xvY2spOworCQkJYy5jMSA9IHNrLT51ZHBfcnhfY291bnRlcnNf dG90YWw7CisJCQljLmMyID0gc2stPnVkcF9yeF9jb3VudGVyc19lcnI7CisJCQlzay0+dWRwX3J4 X2NvdW50ZXJzX3RvdGFsID0gMDsKKwkJCXNrLT51ZHBfcnhfY291bnRlcnNfZXJyID0gMDsKKwkJ CXNwaW5fdW5sb2NrX2lycSgmc2stPnVkcF9yeF9jb3VudGVyc19sb2NrKTsKKworCQkJcmV0dXJu IGNvcHlfdG9fdXNlcigoc3RydWN0IGNvdW50ZXJzICopYXJnLCAoc3RydWN0IGNvdW50ZXJzICop JmMsIHNpemVvZihjKSk7CisJCX0KKyNlbmRpZgorCiAJCWRlZmF1bHQ6CiAJCQlyZXR1cm4gLUVO T0lPQ1RMQ01EOwogCX0KQEAgLTExNDYsNyArMTE3MywzMiBAQCBpbnQgdWRwX3JjdihzdHJ1Y3Qg c2tfYnVmZiAqc2tiKQogCXNrID0gdWRwX3Y0X2xvb2t1cChzYWRkciwgdWgtPnNvdXJjZSwgZGFk ZHIsIHVoLT5kZXN0LCBza2ItPmRldi0+aWZpbmRleCk7CiAKIAlpZiAoc2sgIT0gTlVMTCkgewot CQlpbnQgcmV0ID0gdWRwX3F1ZXVlX3Jjdl9za2Ioc2ssIHNrYik7CisgICAgICAgIGludCByZXQ7 CisKKyNpZiBkZWZpbmVkKENPTkZJR19VRFBfUlhfQ09VTlRFUlMpCisJCWlmIChzay0+dWRwX3J4 X2NvdW50ZXJzX21vZGUgPT0gMSkgeworCQkJdW5zaWduZWQgbG9uZyBmbGFnczsKKwkJCXVuc2ln bmVkIGNoYXIgKnNlcV9wOworCisJCQkvKiBzZXF1ZW5jZSBudW1iZXIgaXMgZmlyc3QgZGF0YSBi eXRlIChyaWdodCBhZnRlciB1ZHBoZHIpICovCisJCQlzZXFfcCA9ICh1bnNpZ25lZCBjaGFyICop dWggKyBzaXplb2Yoc3RydWN0IHVkcGhkcik7CisgICAgICAgICAgICAKKwkJCXNwaW5fbG9ja19p cnFzYXZlKCZzay0+dWRwX3J4X2NvdW50ZXJzX2xvY2ssIGZsYWdzKTsKKwkJCXNrLT51ZHBfcnhf Y291bnRlcnNfdG90YWwrKzsKKwkJCS8qIGNoZWNrIHNlcXVlbmNlICovCisJCQlpZiAoKnNlcV9w ICE9IHNrLT51ZHBfcnhfY291bnRlcnNfc2VxKSB7CisJCQkJc2stPnVkcF9yeF9jb3VudGVyc19l cnIrKzsgICAgICAgICAgLyogaW5jcmVtZW50IGNvdW50ZXIgKi8KKwkJCX0KKwkJCXNrLT51ZHBf cnhfY291bnRlcnNfc2VxID0gKnNlcV9wICsgMTsgICAvKiByZXNldCBzZXF1ZW5jZSB0byBuZXcg 
dmFsdWUgKi8KKwkJCXNwaW5fdW5sb2NrX2lycXJlc3RvcmUoJnNrLT51ZHBfcnhfY291bnRlcnNf bG9jaywgZmxhZ3MpOworCisJCQkvKiBkcm9wIHBhY2tldCAqLworCQkJa2ZyZWVfc2tiKHNrYik7 CisJCQlyZXR1cm4gMDsKKwkJfQorI2VuZGlmCisgICAgICAgIAorCQlyZXQgPSB1ZHBfcXVldWVf cmN2X3NrYihzaywgc2tiKTsKIAkJc29ja19wdXQoc2spOwogCiAJCS8qIGEgcmV0dXJuIHZhbHVl ID4gMCBtZWFucyB0byByZXN1Ym1pdCB0aGUgaW5wdXQsIGJ1dApkaWZmIC11cHJOIC1YIHphYy9k b250ZGlmZiAvdXNyL3NyYy9saW51eC0yLjYuOS12YW5pbGxhL25ldC9LY29uZmlnIGxpbnV4LTIu Ni45L25ldC9LY29uZmlnCi0tLSAvdXNyL3NyYy9saW51eC0yLjYuOS12YW5pbGxhL25ldC9LY29u ZmlnCTIwMDQtMTAtMTggMjM6NTM6MTAuMDAwMDAwMDAwICswMjAwCisrKyBsaW51eC0yLjYuOS9u ZXQvS2NvbmZpZwkyMDA0LTEyLTA2IDE1OjQxOjQzLjAwMDAwMDAwMCArMDEwMApAQCAtNjQxLDYg KzY0MSwyNCBAQCBjb25maWcgTkVUX1BLVEdFTgogCSAgVG8gY29tcGlsZSB0aGlzIGNvZGUgYXMg YSBtb2R1bGUsIGNob29zZSBNIGhlcmU6IHRoZQogCSAgbW9kdWxlIHdpbGwgYmUgY2FsbGVkIHBr dGdlbi4KIAorY29uZmlnIFVEUF9SWF9DT1VOVEVSUworICAgIGJvb2wgIlVkcCBSWCBwYWNrZXQg Y291bnRlciIKKyAgICBkZWZhdWx0IG4KKwktLS1oZWxwLS0tCisgICAgVGhpcyBvcHRpb24gd2ls bCBhZGQgMiBuZXcgaW9jdGwgb3B0aW9ucyB0byBldmVyeSBVRFAgc29ja2V0LiAKKyAgICBJdCB3 aWxsIGFsc28gZXh0ZW5kIHNvY2tldCBzdHJ1Y3R1cmUgd2l0aCBhIGZldyBjb3VudGVycy4KKyAg CisgICAgT25jZSB0aGUgVURQIHNvY2tldCBpcyBjcmVhdGVkIC0gc29ja2V0KEFGX0lORVQsIFNP Q0tfREdSQU0pLCAKKyAgICB0aGUgYXBwbGljYXRpb24gaGFzIGEgcG9zaWJpbGl0eSB0byBwdXQg dGhpcyBzb2NrZXQgaW50byAKKyAgICAiY291bnRpbmcgbW9kZSIgLSBmaXJzdCBpb2N0bCBjYWxs LiBJZiB0aGUgc29ja2V0IGlzIGluIHRoaXMgbW9kZSwKKyAgICB0aGUgdHJhZmZpYyBpcyBub3Qg ZGVsaXZlcmVkIHRvIHRoZSBhcHBsaWNhdGlvbiwgYnV0IHRoZSBjb3VudGVycworICAgIGFyZSB1 cGRhdGVkIGluc2lkZSBrZXJuZWwuIFRoZSBjb3VudGVycyBjYW4gYmUgcmV0aXJ2ZWQgKGFuZAor ICAgIGNsZWFyZWQpIHdpdGggc2Vjb25kIGlvY3RsIGNhbGwgb24gdGhpcyBzb2NrZXQuCisgICAg U2VlIDxmaWxlOkRvY3VtZW50YXRpb24vbmV0d29ya2luZy91ZHBfcnhfY291bnRlcnMudHh0Pi4K KyAgCisgICAgVGhpcyBvcHRpb24gaXMgZm9yIHRlc3RpbmcgcHVycG9zZXMuIElmIHlvdSBkb24n dCBuZWVkIGZhc3QgVURQCisgICAgcGFja2V0IGNvdW50ZXJzLCBzYXkgTi4KKwogZW5kbWVudQog CiBlbmRtZW51Cg== ------_=_NextPart_001_01C4E128.290A6CAA-- From 
tgraf@suug.ch Mon Dec 13 08:53:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 08:53:33 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDGr4fe031003 for ; Mon, 13 Dec 2004 08:53:25 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id DBE8BF; Mon, 13 Dec 2004 17:52:18 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 24BEF1C0EA; Mon, 13 Dec 2004 17:53:02 +0100 (CET) Date: Mon, 13 Dec 2004 17:53:02 +0100 From: Thomas Graf To: Patrick McHardy Cc: "David S. Miller" , Herbert Xu , netdev@oss.sgi.com Subject: [RFC] tcf_bind_filter failure handling Message-ID: <20041213165302.GE8493@postel.suug.ch> References: <20041109161126.376f755c.davem@davemloft.net> <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> <41BC8819.7040501@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BC8819.7040501@trash.net> X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12698 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

The handling of a failure in tcf_bind_filter is inconsistent.

  u32:     ignore
  fw:      ignore
  route:   ignore
  rsvp:    ignore
  tcindex: error

It might be a good idea to make this consistent.
So in order to validate the classid before making any changes, we could simply lock it via get (see patch below), return an error if that fails, and put it back either when an error occurs further down the path or after binding the filter.

Binding not only locks the class against removal while a filter is pointing to it; it also speeds up classifying by saving a lookup for every tc_classify call. It's not really a problem if the class is not locked: the qdisc will look it up and fall back to a default class if it doesn't exist, so it's rather a cosmetic/policy thing.

Note: The patch below is not intended for inclusion yet and will not apply since I made local changes to pkt_cls.h.

--- linux-2.6.10-rc3-bk6.orig/include/net/pkt_cls.h 2004-12-12 21:54:03.000000000 +0100
+++ linux-2.6.10-rc3-bk6/include/net/pkt_cls.h 2004-12-13 16:40:06.000000000 +0100
@@ -62,6 +62,18 @@
 	tp->q->ops->cl_ops->unbind_tcf(tp->q, cl);
 }
 
+static inline unsigned long
+tcf_class_get(struct tcf_proto *tp, u32 classid)
+{
+	return tp->q->ops->cl_ops->get(tp->q, classid);
+}
+
+static inline void
+tcf_class_put(struct tcf_proto *tp, unsigned long cl)
+{
+	tp->q->ops->cl_ops->put(tp->q, cl);
+}
+
 #ifdef CONFIG_NET_CLS_ACT
 static inline int
 tcf_validate_act_police(struct rtattr *act_police_tlv, struct rtattr *rate_tlv,

From roland@topspin.com Mon Dec 13 10:10:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:10:39 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDIA9wO008651 for ; Mon, 13 Dec 2004 10:10:29 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:09:46 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:09:46 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1Cdudl-0005Ph-QZ; Mon, 13 Dec 2004
10:09:46 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <20041213109.m8TyDSPRhM2X6Nst@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 13 Dec 2004 10:09:45 -0800 Message-Id: <20041213109.5NKezuGE9PMejMSM@topspin.com> Mime-Version: 1.0 To: linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v3][15/21] IPoIB IPv4 multicast Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 13 Dec 2004 18:09:46.0474 (UTC) FILETIME=[EDA964A0:01C4E13E] X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12699 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev

Add ip_ib_mc_map() to convert IPv4 multicast addresses to IPoIB hardware addresses. Also add <linux/if_infiniband.h> so INFINIBAND_ALEN has a home.

The mapping for multicast addresses is described in
  http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-07.txt

Signed-off-by: Roland Dreier

--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ linux-bk/include/linux/if_infiniband.h 2004-12-13 09:44:47.613711020 -0800
@@ -0,0 +1,29 @@
+/*
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available at
+ * , or the OpenIB.org BSD
+ * license, available in the LICENSE.TXT file accompanying this
+ * software. These details are also available at
+ * .
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * $Id$ + */ + +#ifndef _LINUX_IF_INFINIBAND_H +#define _LINUX_IF_INFINIBAND_H + +#define INFINIBAND_ALEN 20 /* Octets in IPoIB HW addr */ + +#endif /* _LINUX_IF_INFINIBAND_H */ --- linux-bk.orig/include/net/ip.h 2004-12-11 15:16:19.000000000 -0800 +++ linux-bk/include/net/ip.h 2004-12-13 09:44:47.641706896 -0800 @@ -229,6 +229,39 @@ buf[3]=addr&0x7F; } +/* + * Map a multicast IP onto multicast MAC for type IP-over-InfiniBand. + * Leave P_Key as 0 to be filled in by driver. 
+ */ + +static inline void ip_ib_mc_map(u32 addr, char *buf) +{ + buf[0] = 0; /* Reserved */ + buf[1] = 0xff; /* Multicast QPN */ + buf[2] = 0xff; + buf[3] = 0xff; + addr = ntohl(addr); + buf[4] = 0xff; + buf[5] = 0x12; /* link local scope */ + buf[6] = 0x40; /* IPv4 signature */ + buf[7] = 0x1b; + buf[8] = 0; /* P_Key */ + buf[9] = 0; + buf[10] = 0; + buf[11] = 0; + buf[12] = 0; + buf[13] = 0; + buf[14] = 0; + buf[15] = 0; + buf[19] = addr & 0xff; + addr >>= 8; + buf[18] = addr & 0xff; + addr >>= 8; + buf[17] = addr & 0xff; + addr >>= 8; + buf[16] = addr & 0x0f; +} + #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) #include #endif --- linux-bk.orig/net/ipv4/arp.c 2004-12-11 15:16:28.000000000 -0800 +++ linux-bk/net/ipv4/arp.c 2004-12-13 09:44:47.650705570 -0800 @@ -213,6 +213,9 @@ case ARPHRD_IEEE802_TR: ip_tr_mc_map(addr, haddr); return 0; + case ARPHRD_INFINIBAND: + ip_ib_mc_map(addr, haddr); + return 0; default: if (dir) { memcpy(haddr, dev->broadcast, dev->addr_len); From roland@topspin.com Mon Dec 13 10:10:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:10:40 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDIA9wQ008651 for ; Mon, 13 Dec 2004 10:10:30 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:09:52 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:09:52 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1Cdudm-0005Pr-EI; Mon, 13 Dec 2004 10:09:51 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <20041213109.5NKezuGE9PMejMSM@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 13 Dec 2004 10:09:46 -0800 Message-Id: <20041213109.iziHvQZqtmP83gmx@topspin.com> Mime-Version: 1.0 To: 
linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v3][16/21] IPoIB IPv6 support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 13 Dec 2004 18:09:52.0037 (UTC) FILETIME=[F0FA3D50:01C4E13E] X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12700 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add ipv6_ib_mc_map() to convert IPv6 multicast addresses to IPoIB hardware addresses, and add support for autoconfiguration for devices with type ARPHRD_INFINIBAND. The mapping for multicast addresses is described in http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-07.txt Signed-off-by: Nitin Hande Signed-off-by: Roland Dreier --- linux-bk.orig/include/net/if_inet6.h 2004-12-11 15:16:37.000000000 -0800 +++ linux-bk/include/net/if_inet6.h 2004-12-13 09:44:48.801536031 -0800 @@ -266,5 +266,20 @@ { buf[0] = 0x00; } + +static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf) +{ + buf[0] = 0; /* Reserved */ + buf[1] = 0xff; /* Multicast QPN */ + buf[2] = 0xff; + buf[3] = 0xff; + buf[4] = 0xff; + buf[5] = 0x12; /* link local scope */ + buf[6] = 0x60; /* IPv6 signature */ + buf[7] = 0x1b; + buf[8] = 0; /* P_Key */ + buf[9] = 0; + memcpy(buf + 10, addr->s6_addr + 6, 10); +} #endif #endif --- linux-bk.orig/net/ipv6/addrconf.c 2004-12-11 15:16:33.000000000 -0800 +++ linux-bk/net/ipv6/addrconf.c 2004-12-13 09:44:48.840530286 -0800 @@ -48,6 +48,7 @@ #include #include #include +#include #include #include #include @@ -1095,6 +1096,12 @@ memset(eui, 0, 7); eui[7] = *(u8*)dev->dev_addr; return 0; + case 
ARPHRD_INFINIBAND: + if (dev->addr_len != INFINIBAND_ALEN) + return -1; + memcpy(eui, dev->dev_addr + 12, 8); + eui[0] |= 2; + return 0; } return -1; } @@ -1794,6 +1801,7 @@ if ((dev->type != ARPHRD_ETHER) && (dev->type != ARPHRD_FDDI) && (dev->type != ARPHRD_IEEE802_TR) && + (dev->type != ARPHRD_INFINIBAND) && (dev->type != ARPHRD_ARCNET)) { /* Alas, we support only Ethernet autoconfiguration. */ return; --- linux-bk.orig/net/ipv6/ndisc.c 2004-12-11 15:16:13.000000000 -0800 +++ linux-bk/net/ipv6/ndisc.c 2004-12-13 09:44:48.890522921 -0800 @@ -260,6 +260,9 @@ case ARPHRD_ARCNET: ipv6_arcnet_mc_map(addr, buf); return 0; + case ARPHRD_INFINIBAND: + ipv6_ib_mc_map(addr, buf); + return 0; default: if (dir) { memcpy(buf, dev->broadcast, dev->addr_len); From roland@topspin.com Mon Dec 13 10:10:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:10:47 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDIA9wS008651 for ; Mon, 13 Dec 2004 10:10:30 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:09:58 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:09:57 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1Cdudr-0005Q4-Uv; Mon, 13 Dec 2004 10:09:57 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <20041213109.iziHvQZqtmP83gmx@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 13 Dec 2004 10:09:51 -0800 Message-Id: <20041213109.JT1ejUdkRIUXbWOm@topspin.com> Mime-Version: 1.0 To: linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand) driver Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 
8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 13 Dec 2004 18:09:57.0958 (UTC) FILETIME=[F481B660:01C4E13E] X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12701 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add a driver that implements the IP-over-InfiniBand (IPoIB) protocol. This is a network device driver of type ARPHRD_INFINIBAND (and addr_len INFINIBAND_ALEN bytes). The ARP/ND implementation for this driver is not completely straightforward, because InfiniBand requires an additional path lookup be performed (through an IB-specific mechanism) after a remote hardware address has been resolved. We are very open to suggestions for a better way to handle this than the current implementation. Although IB has a special multicast group join mode intended to support IP multicast routing (non-member join), no means to identify different multicast styles has yet been determined, so all joins by the driver are currently full member joins. We are looking for guidance on how to solve this. The IPoIB protocol/encapsulation is described in the Internet-Drafts http://www.ietf.org/internet-drafts/draft-ietf-ipoib-architecture-04.txt http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-07.txt Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/infiniband/Kconfig 2004-12-13 09:44:43.936252779 -0800 +++ linux-bk/drivers/infiniband/Kconfig 2004-12-13 09:44:49.385450009 -0800 @@ -2,7 +2,6 @@ config INFINIBAND tristate "InfiniBand support" - default n ---help--- Core support for InfiniBand (IB).
Make sure to also select any protocols you wish to use as well as drivers for your @@ -10,4 +9,6 @@ source "drivers/infiniband/hw/mthca/Kconfig" +source "drivers/infiniband/ulp/ipoib/Kconfig" + endmenu --- linux-bk.orig/drivers/infiniband/Makefile 2004-12-13 09:44:43.909256756 -0800 +++ linux-bk/drivers/infiniband/Makefile 2004-12-13 09:44:49.342456343 -0800 @@ -1,2 +1,3 @@ obj-$(CONFIG_INFINIBAND) += core/ obj-$(CONFIG_INFINIBAND_MTHCA) += hw/mthca/ +obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/Kconfig 2004-12-13 09:44:49.470437489 -0800 @@ -0,0 +1,33 @@ +config INFINIBAND_IPOIB + tristate "IP-over-InfiniBand" + depends on INFINIBAND && NETDEVICES && INET + ---help--- + Support for the IP-over-InfiniBand protocol (IPoIB). This + transports IP packets over InfiniBand so you can use your IB + device as a fancy NIC. + + The IPoIB protocol is defined by the IETF ipoib working + group: . + +config INFINIBAND_IPOIB_DEBUG + bool "IP-over-InfiniBand debugging" + depends on INFINIBAND_IPOIB + ---help--- + This option causes debugging code to be compiled into the + IPoIB driver. The output can be turned on via the + debug_level and mcast_debug_level module parameters (which + can also be set after the driver is loaded through sysfs). + + This option also creates an "ipoib_debugfs," which can be + mounted to expose debugging information about IB multicast + groups used by the IPoIB driver. + +config INFINIBAND_IPOIB_DEBUG_DATA + bool "IP-over-InfiniBand data path debugging" + depends on INFINIBAND_IPOIB_DEBUG + ---help--- + This option compiles debugging code into the data path + of the IPoIB driver. The output can be turned on via the + data_debug_level module parameter; however, even with output + turned off, this debugging code will have some performance + impact.
--- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/Makefile 2004-12-13 09:44:49.426443970 -0800 @@ -0,0 +1,11 @@ +EXTRA_CFLAGS += -Idrivers/infiniband/include + +obj-$(CONFIG_INFINIBAND_IPOIB) += ib_ipoib.o + +ib_ipoib-y := ipoib_main.o \ + ipoib_ib.o \ + ipoib_multicast.o \ + ipoib_verbs.o \ + ipoib_vlan.o +ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_DEBUG) += ipoib_fs.o + --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib.h 2004-12-13 09:44:49.497433512 -0800 @@ -0,0 +1,340 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. 
+ * + * $Id: ipoib.h 1323 2004-12-11 02:36:04Z roland $ + */ + +#ifndef _IPOIB_H +#define _IPOIB_H + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include +#include + +#include +#include +#include + +/* constants */ + +enum { + IPOIB_PACKET_SIZE = 2048, + IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, + + IPOIB_ENCAP_LEN = 4, + + IPOIB_RX_RING_SIZE = 128, + IPOIB_TX_RING_SIZE = 64, + + IPOIB_NUM_WC = 4, + + IPOIB_MAX_PATH_REC_QUEUE = 3, + IPOIB_MAX_MCAST_QUEUE = 3, + + IPOIB_FLAG_TX_FULL = 0, + IPOIB_FLAG_OPER_UP = 1, + IPOIB_FLAG_ADMIN_UP = 2, + IPOIB_PKEY_ASSIGNED = 3, + IPOIB_PKEY_STOP = 4, + IPOIB_FLAG_SUBINTERFACE = 5, + IPOIB_MCAST_RUN = 6, + IPOIB_STOP_REAPER = 7, + + IPOIB_MAX_BACKOFF_SECONDS = 16, + + IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */ + IPOIB_MCAST_FLAG_SENDONLY = 1, + IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */ + IPOIB_MCAST_FLAG_ATTACHED = 3, +}; + +/* structs */ + +struct ipoib_header { + u16 proto; + u16 reserved; +}; + +struct ipoib_pseudoheader { + u8 hwaddr[INFINIBAND_ALEN]; +}; + +struct ipoib_mcast; + +struct ipoib_buf { + struct sk_buff *skb; + DECLARE_PCI_UNMAP_ADDR(mapping) +}; + +/* + * Device private locking: tx_lock protects members used in TX fast + * path (and we use LLTX so upper layers don't do extra locking). + * lock protects everything else. lock nests inside of tx_lock (ie + * tx_lock must be acquired first if needed). 
+ */ +struct ipoib_dev_priv { + spinlock_t lock; + + struct net_device *dev; + + unsigned long flags; + + struct semaphore mcast_mutex; + struct semaphore vlan_mutex; + + struct rb_root path_tree; + struct list_head path_list; + + struct ipoib_mcast *broadcast; + struct list_head multicast_list; + struct rb_root multicast_tree; + + struct work_struct pkey_task; + struct work_struct mcast_task; + struct work_struct flush_task; + struct work_struct restart_task; + struct work_struct ah_reap_task; + + struct ib_device *ca; + u8 port; + u16 pkey; + struct ib_pd *pd; + struct ib_mr *mr; + struct ib_cq *cq; + struct ib_qp *qp; + u32 qkey; + + union ib_gid local_gid; + u16 local_lid; + + unsigned int admin_mtu; + unsigned int mcast_mtu; + + struct ipoib_buf *rx_ring; + + spinlock_t tx_lock; + struct ipoib_buf *tx_ring; + unsigned tx_head; + unsigned tx_tail; + + struct ib_wc ibwc[IPOIB_NUM_WC]; + + struct list_head dead_ahs; + + struct ib_event_handler event_handler; + + struct net_device_stats stats; + + struct net_device *parent; + struct list_head child_intfs; + struct list_head list; + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG + struct list_head fs_list; + struct dentry *mcg_dentry; +#endif +}; + +struct ipoib_ah { + struct net_device *dev; + struct ib_ah *ah; + struct list_head list; + struct kref ref; + unsigned last_send; +}; + +struct ipoib_path { + struct net_device *dev; + struct ib_sa_path_rec pathrec; + struct ipoib_ah *ah; + struct sk_buff_head queue; + + struct list_head neigh_list; + + int query_id; + struct ib_sa_query *query; + struct completion done; + + struct rb_node rb_node; + struct list_head list; +}; + +struct ipoib_neigh { + struct ipoib_ah *ah; + struct sk_buff_head queue; + + struct neighbour *neighbour; + + struct list_head list; +}; + +static inline struct ipoib_neigh **to_ipoib_neigh(struct neighbour *neigh) +{ + return (struct ipoib_neigh **) (neigh->ha + 24 - + (offsetof(struct neighbour, ha) & 4)); +} + +extern struct workqueue_struct 
*ipoib_workqueue; + +/* functions */ + +void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr); + +struct ipoib_ah *ipoib_create_ah(struct net_device *dev, + struct ib_pd *pd, struct ib_ah_attr *attr); +void ipoib_free_ah(struct kref *kref); +static inline void ipoib_put_ah(struct ipoib_ah *ah) +{ + kref_put(&ah->ref, ipoib_free_ah); +} + +int ipoib_add_pkey_attr(struct net_device *dev); + +void ipoib_send(struct net_device *dev, struct sk_buff *skb, + struct ipoib_ah *address, u32 qpn); +void ipoib_reap_ah(void *dev_ptr); + +void ipoib_flush_paths(struct net_device *dev); +struct ipoib_dev_priv *ipoib_intf_alloc(const char *format); + +int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port); +void ipoib_ib_dev_flush(void *dev); +void ipoib_ib_dev_cleanup(struct net_device *dev); + +int ipoib_ib_dev_open(struct net_device *dev); +int ipoib_ib_dev_up(struct net_device *dev); +int ipoib_ib_dev_down(struct net_device *dev); +int ipoib_ib_dev_stop(struct net_device *dev); + +int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port); +void ipoib_dev_cleanup(struct net_device *dev); + +void ipoib_mcast_join_task(void *dev_ptr); +void ipoib_mcast_send(struct net_device *dev, union ib_gid *mgid, + struct sk_buff *skb); + +void ipoib_mcast_restart_task(void *dev_ptr); +int ipoib_mcast_start_thread(struct net_device *dev); +int ipoib_mcast_stop_thread(struct net_device *dev); + +void ipoib_mcast_dev_down(struct net_device *dev); +void ipoib_mcast_dev_flush(struct net_device *dev); + +struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev); +void ipoib_mcast_iter_free(struct ipoib_mcast_iter *iter); +int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter); +void ipoib_mcast_iter_read(struct ipoib_mcast_iter *iter, + union ib_gid *gid, + unsigned long *created, + unsigned int *queuelen, + unsigned int *complete, + unsigned int *send_only); + +int ipoib_mcast_attach(struct net_device *dev, u16 mlid, + union ib_gid 
*mgid); +int ipoib_mcast_detach(struct net_device *dev, u16 mlid, + union ib_gid *mgid); + +int ipoib_qp_create(struct net_device *dev); +int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca); +void ipoib_transport_dev_cleanup(struct net_device *dev); + +void ipoib_event(struct ib_event_handler *handler, + struct ib_event *record); + +int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey); +int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey); + +void ipoib_pkey_poll(void *dev); +int ipoib_pkey_dev_delay_open(struct net_device *dev); + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int ipoib_create_debug_file(struct net_device *dev); +void ipoib_delete_debug_file(struct net_device *dev); +int ipoib_register_debugfs(void); +void ipoib_unregister_debugfs(void); +#else +static inline int ipoib_create_debug_file(struct net_device *dev) { return 0; } +static inline void ipoib_delete_debug_file(struct net_device *dev) { } +static inline int ipoib_register_debugfs(void) { return 0; } +static inline void ipoib_unregister_debugfs(void) { } +#endif + + +#define ipoib_printk(level, priv, format, arg...) \ + printk(level "%s: " format, ((struct ipoib_dev_priv *) priv)->dev->name , ## arg) +#define ipoib_warn(priv, format, arg...) \ + ipoib_printk(KERN_WARNING, priv, format , ## arg) + + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +extern int debug_level; + +#define ipoib_dbg(priv, format, arg...) \ + do { \ + if (debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#define ipoib_dbg_mcast(priv, format, arg...) \ + do { \ + if (mcast_debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#else /* CONFIG_INFINIBAND_IPOIB_DEBUG */ +#define ipoib_dbg(priv, format, arg...) \ + do { (void) (priv); } while (0) +#define ipoib_dbg_mcast(priv, format, arg...) 
\ + do { (void) (priv); } while (0) +#endif /* CONFIG_INFINIBAND_IPOIB_DEBUG */ + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA +#define ipoib_dbg_data(priv, format, arg...) \ + do { \ + if (data_debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#else /* CONFIG_INFINIBAND_IPOIB_DEBUG_DATA */ +#define ipoib_dbg_data(priv, format, arg...) \ + do { (void) (priv); } while (0) +#endif /* CONFIG_INFINIBAND_IPOIB_DEBUG_DATA */ + + +#define IPOIB_GID_FMT "%x:%x:%x:%x:%x:%x:%x:%x" + +#define IPOIB_GID_ARG(gid) be16_to_cpup((__be16 *) ((gid).raw + 0)), \ + be16_to_cpup((__be16 *) ((gid).raw + 2)), \ + be16_to_cpup((__be16 *) ((gid).raw + 4)), \ + be16_to_cpup((__be16 *) ((gid).raw + 6)), \ + be16_to_cpup((__be16 *) ((gid).raw + 8)), \ + be16_to_cpup((__be16 *) ((gid).raw + 10)), \ + be16_to_cpup((__be16 *) ((gid).raw + 12)), \ + be16_to_cpup((__be16 *) ((gid).raw + 14)) + +#endif /* _IPOIB_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_fs.c 2004-12-13 09:44:49.522429829 -0800 @@ -0,0 +1,276 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. 
All rights reserved. + * + * $Id$ + */ + +#include +#include + +#include "ipoib.h" + +enum { + IPOIB_MAGIC = 0x49504942 /* "IPIB" */ +}; + +static DECLARE_MUTEX(ipoib_fs_mutex); +static struct dentry *ipoib_root; +static struct super_block *ipoib_sb; +static LIST_HEAD(ipoib_device_list); + +static void *ipoib_mcg_seq_start(struct seq_file *file, loff_t *pos) +{ + struct ipoib_mcast_iter *iter; + loff_t n = *pos; + + iter = ipoib_mcast_iter_init(file->private); + if (!iter) + return NULL; + + while (n--) { + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + } + + return iter; +} + +static void *ipoib_mcg_seq_next(struct seq_file *file, void *iter_ptr, + loff_t *pos) +{ + struct ipoib_mcast_iter *iter = iter_ptr; + + (*pos)++; + + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + + return iter; +} + +static void ipoib_mcg_seq_stop(struct seq_file *file, void *iter_ptr) +{ + /* nothing for now */ +} + +static int ipoib_mcg_seq_show(struct seq_file *file, void *iter_ptr) +{ + struct ipoib_mcast_iter *iter = iter_ptr; + char gid_buf[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"]; + union ib_gid mgid; + int i, n; + unsigned long created; + unsigned int queuelen, complete, send_only; + + if (iter) { + ipoib_mcast_iter_read(iter, &mgid, &created, &queuelen, + &complete, &send_only); + + for (n = 0, i = 0; i < sizeof mgid / 2; ++i) { + n += sprintf(gid_buf + n, "%x", + be16_to_cpu(((u16 *)mgid.raw)[i])); + if (i < sizeof mgid / 2 - 1) + gid_buf[n++] = ':'; + } + } + + seq_printf(file, "GID: %*s", -(1 + (int) sizeof gid_buf), gid_buf); + + seq_printf(file, + " created: %10ld queuelen: %4d complete: %d send_only: %d\n", + created, queuelen, complete, send_only); + + return 0; +} + +static struct seq_operations ipoib_seq_ops = { + .start = ipoib_mcg_seq_start, + .next = ipoib_mcg_seq_next, + .stop = ipoib_mcg_seq_stop, + .show = ipoib_mcg_seq_show, +}; + +static int ipoib_mcg_open(struct inode *inode, 
struct file *file) +{ + struct seq_file *seq; + int ret; + + ret = seq_open(file, &ipoib_seq_ops); + if (ret) + return ret; + + seq = file->private_data; + seq->private = inode->u.generic_ip; + + return 0; +} + +static struct file_operations ipoib_fops = { + .owner = THIS_MODULE, + .open = ipoib_mcg_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release +}; + +static struct inode *ipoib_get_inode(void) +{ + struct inode *inode = new_inode(ipoib_sb); + + if (inode) { + inode->i_mode = S_IFREG | S_IRUGO; + inode->i_uid = 0; + inode->i_gid = 0; + inode->i_blksize = PAGE_CACHE_SIZE; + inode->i_blocks = 0; + inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + inode->i_fop = &ipoib_fops; + } + + return inode; +} + +static int __ipoib_create_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct dentry *dentry; + struct inode *inode; + char name[IFNAMSIZ + sizeof "_mcg"]; + + snprintf(name, sizeof name, "%s_mcg", dev->name); + + dentry = d_alloc_name(ipoib_root, name); + if (!dentry) + return -ENOMEM; + + inode = ipoib_get_inode(); + if (!inode) { + dput(dentry); + return -ENOMEM; + } + + inode->u.generic_ip = dev; + priv->mcg_dentry = dentry; + + d_add(dentry, inode); + + return 0; +} + +int ipoib_create_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + down(&ipoib_fs_mutex); + + list_add_tail(&priv->fs_list, &ipoib_device_list); + + if (!ipoib_sb) { + up(&ipoib_fs_mutex); + return 0; + } + + up(&ipoib_fs_mutex); + + return __ipoib_create_debug_file(dev); +} + +void ipoib_delete_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + down(&ipoib_fs_mutex); + list_del(&priv->fs_list); + if (!ipoib_sb) { + up(&ipoib_fs_mutex); + return; + } + up(&ipoib_fs_mutex); + + if (priv->mcg_dentry) { + d_drop(priv->mcg_dentry); + simple_unlink(ipoib_root->d_inode, priv->mcg_dentry); + } +} + +static int ipoib_fill_super(struct 
super_block *sb, void *data, int silent) +{ + static struct tree_descr ipoib_files[] = { + { "" } + }; + struct ipoib_dev_priv *priv; + int ret; + + ret = simple_fill_super(sb, IPOIB_MAGIC, ipoib_files); + if (ret) + return ret; + + ipoib_root = sb->s_root; + + down(&ipoib_fs_mutex); + + ipoib_sb = sb; + + list_for_each_entry(priv, &ipoib_device_list, fs_list) { + ret = __ipoib_create_debug_file(priv->dev); + if (ret) + break; + } + + up(&ipoib_fs_mutex); + + return ret; +} + +static struct super_block *ipoib_get_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *data) +{ + return get_sb_single(fs_type, flags, data, ipoib_fill_super); +} + +static void ipoib_kill_sb(struct super_block *sb) +{ + down(&ipoib_fs_mutex); + ipoib_sb = NULL; + up(&ipoib_fs_mutex); + + kill_litter_super(sb); +} + +static struct file_system_type ipoib_fs_type = { + .owner = THIS_MODULE, + .name = "ipoib_debugfs", + .get_sb = ipoib_get_sb, + .kill_sb = ipoib_kill_sb, +}; + +int ipoib_register_debugfs(void) +{ + return register_filesystem(&ipoib_fs_type); +} + +void ipoib_unregister_debugfs(void) +{ + unregister_filesystem(&ipoib_fs_type); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2004-12-13 09:44:49.547426147 -0800 @@ -0,0 +1,627 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * $Id: ipoib_ib.c 1323 2004-12-11 02:36:04Z roland $ + */ + +#include +#include + +#include + +#include "ipoib.h" + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA +int data_debug_level; + +module_param(data_debug_level, int, 0644); +MODULE_PARM_DESC(data_debug_level, + "Enable data path debug tracing if > 0"); +#endif + +#define IPOIB_OP_RECV (1ul << 31) + +static DECLARE_MUTEX(pkey_sem); + +struct ipoib_ah *ipoib_create_ah(struct net_device *dev, + struct ib_pd *pd, struct ib_ah_attr *attr) +{ + struct ipoib_ah *ah; + + ah = kmalloc(sizeof *ah, GFP_KERNEL); + if (!ah) + return NULL; + + ah->dev = dev; + ah->last_send = 0; + kref_init(&ah->ref); + + ah->ah = ib_create_ah(pd, attr); + if (IS_ERR(ah->ah)) { + kfree(ah); + ah = NULL; + } else + ipoib_dbg(netdev_priv(dev), "Created ah %p\n", ah->ah); + + return ah; +} + +void ipoib_free_ah(struct kref *kref) +{ + struct ipoib_ah *ah = container_of(kref, struct ipoib_ah, ref); + struct ipoib_dev_priv *priv = netdev_priv(ah->dev); + + unsigned long flags; + + if (ah->last_send <= priv->tx_tail) { + ipoib_dbg(priv, "Freeing ah %p\n", ah->ah); + ib_destroy_ah(ah->ah); + kfree(ah); + } else { + spin_lock_irqsave(&priv->lock, flags); + list_add_tail(&ah->list, &priv->dead_ahs); + spin_unlock_irqrestore(&priv->lock, flags); + } +} + +static inline int ipoib_ib_receive(struct ipoib_dev_priv *priv, + unsigned int wr_id, + dma_addr_t addr) +{ + struct ib_sge list = { + .addr = addr, + .length = IPOIB_BUF_SIZE, + .lkey = priv->mr->lkey, + }; + struct ib_recv_wr param = { + .wr_id = wr_id | IPOIB_OP_RECV, + .sg_list = &list, + .num_sge = 1, + .recv_flags = IB_RECV_SIGNALED + }; + struct 
ib_recv_wr *bad_wr; + + return ib_post_recv(priv->qp, &param, &bad_wr); +} + +static int ipoib_ib_post_receive(struct net_device *dev, int id) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct sk_buff *skb; + dma_addr_t addr; + int ret; + + skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); + if (!skb) { + ipoib_warn(priv, "failed to allocate receive buffer\n"); + + priv->rx_ring[id].skb = NULL; + return -ENOMEM; + } + skb_reserve(skb, 4); /* 16 byte align IP header */ + priv->rx_ring[id].skb = skb; + addr = dma_map_single(priv->ca->dma_device, + skb->data, IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + pci_unmap_addr_set(&priv->rx_ring[id], mapping, addr); + + ret = ipoib_ib_receive(priv, id, addr); + if (ret) { + ipoib_warn(priv, "ipoib_ib_receive failed for buf %d (%d)\n", + id, ret); + priv->rx_ring[id].skb = NULL; + } + + return ret; +} + +static int ipoib_ib_post_receives(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) { + if (ipoib_ib_post_receive(dev, i)) { + ipoib_warn(priv, "ipoib_ib_post_receive failed for buf %d\n", i); + return -EIO; + } + } + + return 0; +} + +static void ipoib_ib_handle_wc(struct net_device *dev, + struct ib_wc *wc) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + unsigned int wr_id = wc->wr_id; + + ipoib_dbg_data(priv, "called: id %d, op %d, status: %d\n", + wr_id, wc->opcode, wc->status); + + if (wr_id & IPOIB_OP_RECV) { + wr_id &= ~IPOIB_OP_RECV; + + if (wr_id < IPOIB_RX_RING_SIZE) { + struct sk_buff *skb = priv->rx_ring[wr_id].skb; + + priv->rx_ring[wr_id].skb = NULL; + + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(&priv->rx_ring[wr_id], + mapping), + IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + + if (wc->status != IB_WC_SUCCESS) { + if (wc->status != IB_WC_WR_FLUSH_ERR) + ipoib_warn(priv, "failed recv event " + "(status=%d, wrid=%d vend_err %x)\n", + wc->status, wr_id, wc->vendor_err); + dev_kfree_skb_any(skb); + return; + } + +
ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", + wc->byte_len, wc->slid); + + skb_put(skb, wc->byte_len); + skb_pull(skb, IB_GRH_BYTES); + + if (wc->slid != priv->local_lid || + wc->src_qp != priv->qp->qp_num) { + skb->protocol = ((struct ipoib_header *) skb->data)->proto; + + skb_pull(skb, IPOIB_ENCAP_LEN); + + dev->last_rx = jiffies; + ++priv->stats.rx_packets; + priv->stats.rx_bytes += skb->len; + + skb->dev = dev; + /* XXX get correct PACKET_ type here */ + skb->pkt_type = PACKET_HOST; + netif_rx_ni(skb); + } else { + ipoib_dbg_data(priv, "dropping loopback packet\n"); + dev_kfree_skb_any(skb); + } + + /* repost receive */ + if (ipoib_ib_post_receive(dev, wr_id)) + ipoib_warn(priv, "ipoib_ib_post_receive failed " + "for buf %d\n", wr_id); + } else + ipoib_warn(priv, "completion event with wrid %d\n", + wr_id); + + } else { + struct ipoib_buf *tx_req; + unsigned long flags; + + if (wr_id >= IPOIB_TX_RING_SIZE) { + ipoib_warn(priv, "completion event with wrid %d (> %d)\n", + wr_id, IPOIB_TX_RING_SIZE); + return; + } + + ipoib_dbg_data(priv, "send complete, wrid %d\n", wr_id); + + tx_req = &priv->tx_ring[wr_id]; + + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(tx_req, mapping), + tx_req->skb->len, + DMA_TO_DEVICE); + + ++priv->stats.tx_packets; + priv->stats.tx_bytes += tx_req->skb->len; + + dev_kfree_skb_any(tx_req->skb); + + spin_lock_irqsave(&priv->tx_lock, flags); + ++priv->tx_tail; + if (priv->tx_head - priv->tx_tail <= IPOIB_TX_RING_SIZE / 2) + netif_wake_queue(dev); + spin_unlock_irqrestore(&priv->tx_lock, flags); + + if (wc->status != IB_WC_SUCCESS && + wc->status != IB_WC_WR_FLUSH_ERR) + ipoib_warn(priv, "failed send event " + "(status=%d, wrid=%d vend_err %x)\n", + wc->status, wr_id, wc->vendor_err); + } +} + +void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) +{ + struct net_device *dev = (struct net_device *) dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + int n, i; + + ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); + 
do { + n = ib_poll_cq(cq, IPOIB_NUM_WC, priv->ibwc); + for (i = 0; i < n; ++i) + ipoib_ib_handle_wc(dev, priv->ibwc + i); + } while (n == IPOIB_NUM_WC); +} + +static inline int post_send(struct ipoib_dev_priv *priv, + unsigned int wr_id, + struct ib_ah *address, u32 qpn, + dma_addr_t addr, int len) +{ + struct ib_sge list = { + .addr = addr, + .length = len, + .lkey = priv->mr->lkey, + }; + struct ib_send_wr param = { + .wr_id = wr_id, + .opcode = IB_WR_SEND, + .sg_list = &list, + .num_sge = 1, + .wr = { + .ud = { + .remote_qpn = qpn, + .remote_qkey = priv->qkey, + .ah = address + }, + }, + .send_flags = IB_SEND_SIGNALED, + }; + struct ib_send_wr *bad_wr; + + return ib_post_send(priv->qp, &param, &bad_wr); +} + +void ipoib_send(struct net_device *dev, struct sk_buff *skb, + struct ipoib_ah *address, u32 qpn) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_buf *tx_req; + dma_addr_t addr; + + if (skb->len > dev->mtu + INFINIBAND_ALEN) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, dev->mtu + INFINIBAND_ALEN); + ++priv->stats.tx_dropped; + ++priv->stats.tx_errors; + dev_kfree_skb_any(skb); + return; + } + + if (!(skb = skb_unshare(skb, GFP_ATOMIC))) { + ipoib_warn(priv, "failed to unshare sk_buff. Dropping\n"); + ++priv->stats.tx_dropped; + ++priv->stats.tx_errors; + return; + } + + ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", + skb->len, address, qpn); + + /* + * We put the skb into the tx_ring _before_ we call post_send() + * because it's entirely possible that the completion handler will + * run before we execute anything after the post_send(). That + * means we have to make sure everything is properly recorded and + * our state is consistent before we call post_send(). 
+ */ + tx_req = &priv->tx_ring[priv->tx_head & (IPOIB_TX_RING_SIZE - 1)]; + tx_req->skb = skb; + addr = dma_map_single(priv->ca->dma_device, + skb->data, skb->len, + DMA_TO_DEVICE); + pci_unmap_addr_set(tx_req, mapping, addr); + + if (post_send(priv, priv->tx_head & (IPOIB_TX_RING_SIZE - 1), + address->ah, qpn, addr, skb->len)) { + ipoib_warn(priv, "post_send failed\n"); + ++priv->stats.tx_errors; + dev_kfree_skb_any(skb); + } else { + dev->trans_start = jiffies; + + address->last_send = priv->tx_head; + ++priv->tx_head; + + if (priv->tx_head - priv->tx_tail == IPOIB_TX_RING_SIZE) { + ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n"); + netif_stop_queue(dev); + } + } +} + +void __ipoib_reap_ah(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_ah *ah, *tah; + LIST_HEAD(remove_list); + + spin_lock_irq(&priv->lock); + list_for_each_entry_safe(ah, tah, &priv->dead_ahs, list) + if (ah->last_send <= priv->tx_tail) { + list_del(&ah->list); + list_add_tail(&ah->list, &remove_list); + } + spin_unlock_irq(&priv->lock); + + list_for_each_entry_safe(ah, tah, &remove_list, list) { + ipoib_dbg(priv, "Reaping ah %p\n", ah->ah); + ib_destroy_ah(ah->ah); + kfree(ah); + } +} + +void ipoib_reap_ah(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + __ipoib_reap_ah(dev); + + if (!test_bit(IPOIB_STOP_REAPER, &priv->flags)) + queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ); +} + +int ipoib_ib_dev_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + ret = ipoib_qp_create(dev); + if (ret) { + ipoib_warn(priv, "ipoib_qp_create returned %d\n", ret); + return -1; + } + + ret = ipoib_ib_post_receives(dev); + if (ret) { + ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret); + return -1; + } + + clear_bit(IPOIB_STOP_REAPER, &priv->flags); + queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ); + + return 0; +} + 
+int ipoib_ib_dev_up(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + set_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + + return ipoib_mcast_start_thread(dev); +} + +int ipoib_ib_dev_down(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "downing ib_dev\n"); + + clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + netif_carrier_off(dev); + + /* Shutdown the P_Key thread if still active */ + if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) { + down(&pkey_sem); + set_bit(IPOIB_PKEY_STOP, &priv->flags); + cancel_delayed_work(&priv->pkey_task); + up(&pkey_sem); + flush_workqueue(ipoib_workqueue); + } + + ipoib_mcast_stop_thread(dev); + + /* + * Flush the multicast groups first so we stop any multicast joins. The + * completion thread may have already died and we may deadlock waiting + * for the completion thread to finish some multicast joins. + */ + ipoib_mcast_dev_flush(dev); + + /* Delete broadcast and local addresses since they will be recreated */ + ipoib_mcast_dev_down(dev); + + ipoib_flush_paths(dev); + + return 0; +} + +static int recvs_pending(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) + if (priv->rx_ring[i].skb) + return 1; + + return 0; +} + +int ipoib_ib_dev_stop(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_attr qp_attr; + int attr_mask; + int i; + + /* Kill the existing QP and allocate a new one */ + qp_attr.qp_state = IB_QPS_ERR; + attr_mask = IB_QP_STATE; + if (ib_modify_qp(priv->qp, &qp_attr, attr_mask)) + ipoib_warn(priv, "Failed to modify QP to ERROR state\n"); + + /* Wait for all sends and receives to complete */ + while (priv->tx_head != priv->tx_tail || recvs_pending(dev)) + yield(); + + ipoib_dbg(priv, "All sends and receives done.\n"); + + qp_attr.qp_state = IB_QPS_RESET; + attr_mask = IB_QP_STATE; + if (ib_modify_qp(priv->qp, &qp_attr, 
attr_mask)) + ipoib_warn(priv, "Failed to modify QP to RESET state\n"); + + /* Wait for all AHs to be reaped */ + set_bit(IPOIB_STOP_REAPER, &priv->flags); + cancel_delayed_work(&priv->ah_reap_task); + flush_workqueue(ipoib_workqueue); + while (!list_empty(&priv->dead_ahs)) { + __ipoib_reap_ah(dev); + yield(); + } + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) + if (priv->rx_ring[i].skb) + ipoib_warn(priv, "Recv skb still around @ %d\n", i); + + return 0; +} + +int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + priv->ca = ca; + priv->port = port; + priv->qp = NULL; + + if (ipoib_transport_dev_init(dev, ca)) { + printk(KERN_WARNING "%s: ipoib_transport_dev_init failed\n", ca->name); + return -ENODEV; + } + + if (dev->flags & IFF_UP) { + if (ipoib_ib_dev_open(dev)) { + ipoib_transport_dev_cleanup(dev); + return -ENODEV; + } + } + + return 0; +} + +void ipoib_ib_dev_flush(void *_dev) +{ + struct net_device *dev = (struct net_device *)_dev; + struct ipoib_dev_priv *priv = netdev_priv(dev), *cpriv; + + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + return; + + ipoib_dbg(priv, "flushing\n"); + + ipoib_ib_dev_down(dev); + + /* + * The device could have been brought down between the start and when + * we get here, don't bring it back up if it's not configured up + */ + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + ipoib_ib_dev_up(dev); + + /* Flush any child interfaces too */ + list_for_each_entry(cpriv, &priv->child_intfs, list) + ipoib_ib_dev_flush(cpriv->dev); +} + +void ipoib_ib_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "cleaning up ib_dev\n"); + + ipoib_mcast_stop_thread(dev); + + /* Delete the broadcast address and the local address */ + ipoib_mcast_dev_down(dev); + + ipoib_transport_dev_cleanup(dev); +} + +/* + * Delayed P_Key Assignment Interim Support + * + * The following is initial implementation of 
delayed P_Key assignment + * mechanism. It is using the same approach implemented for the multicast + * group join. The single goal of this implementation is to quickly address + * Bug #2507. This implementation will probably be removed when the P_Key + * change async notification is available. + */ +int ipoib_open(struct net_device *dev); + +static void ipoib_pkey_dev_check_presence(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + u16 pkey_index = 0; + + if (ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index)) + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + else + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); +} + +void ipoib_pkey_poll(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_pkey_dev_check_presence(dev); + + if (test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) + ipoib_open(dev); + else { + down(&pkey_sem); + if (!test_bit(IPOIB_PKEY_STOP, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->pkey_task, + HZ); + up(&pkey_sem); + } +} + +int ipoib_pkey_dev_delay_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + /* Look for the interface pkey value in the IB Port P_Key table and */ + /* set the interface pkey assignment flag */ + ipoib_pkey_dev_check_presence(dev); + + /* P_Key value not assigned yet - start polling */ + if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) { + down(&pkey_sem); + clear_bit(IPOIB_PKEY_STOP, &priv->flags); + queue_delayed_work(ipoib_workqueue, + &priv->pkey_task, + HZ); + up(&pkey_sem); + return 1; + } + + return 0; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_main.c 2004-12-13 09:44:49.573422317 -0800 @@ -0,0 +1,1023 @@ +/* + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * $Id: ipoib_main.c 1323 2004-12-11 02:36:04Z roland $ + */ + +#include "ipoib.h" + +#include +#include + +#include +#include +#include + +#include /* For ARPHRD_xxx */ + +#include +#include + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("IP-over-InfiniBand net driver"); +MODULE_LICENSE("Dual BSD/GPL"); + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int debug_level; + +module_param(debug_level, int, 0644); +MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0"); +#endif + +static const u8 ipv4_bcast_addr[] = { + 0x00, 0xff, 0xff, 0xff, + 0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff +}; + +struct workqueue_struct *ipoib_workqueue; + +static void ipoib_add_one(struct ib_device *device); +static void ipoib_remove_one(struct ib_device *device); + +static struct ib_client ipoib_client = { + .name = "ipoib", + .add = ipoib_add_one, + .remove = ipoib_remove_one +}; + +int ipoib_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "bringing up interface\n"); + + set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); + + if (ipoib_pkey_dev_delay_open(dev)) 
+ return 0; + + if (ipoib_ib_dev_open(dev)) + return -EINVAL; + + if (ipoib_ib_dev_up(dev)) + return -EINVAL; + + if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { + struct ipoib_dev_priv *cpriv; + + /* Bring up any child interfaces too */ + down(&priv->vlan_mutex); + list_for_each_entry(cpriv, &priv->child_intfs, list) { + int flags; + + flags = cpriv->dev->flags; + if (flags & IFF_UP) + continue; + + dev_change_flags(cpriv->dev, flags | IFF_UP); + } + up(&priv->vlan_mutex); + } + + netif_start_queue(dev); + + return 0; +} + +static int ipoib_stop(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "stopping interface\n"); + + clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); + + netif_stop_queue(dev); + + ipoib_ib_dev_down(dev); + ipoib_ib_dev_stop(dev); + + if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { + struct ipoib_dev_priv *cpriv; + + /* Bring down any child interfaces too */ + down(&priv->vlan_mutex); + list_for_each_entry(cpriv, &priv->child_intfs, list) { + int flags; + + flags = cpriv->dev->flags; + if (!(flags & IFF_UP)) + continue; + + dev_change_flags(cpriv->dev, flags & ~IFF_UP); + } + up(&priv->vlan_mutex); + } + + return 0; +} + +static int ipoib_change_mtu(struct net_device *dev, int new_mtu) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + return -EINVAL; + + priv->admin_mtu = new_mtu; + + dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); + + return 0; +} + +static struct ipoib_path *__path_find(struct net_device *dev, + union ib_gid *gid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node *n = priv->path_tree.rb_node; + struct ipoib_path *path; + int ret; + + while (n) { + path = rb_entry(n, struct ipoib_path, rb_node); + + ret = memcmp(path->pathrec.dgid.raw, gid->raw, + sizeof (union ib_gid)); + + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return path; + } + + return NULL; 
+} + +static int __path_add(struct net_device *dev, struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node **n = &priv->path_tree.rb_node; + struct rb_node *pn = NULL; + struct ipoib_path *tpath; + int ret; + + while (*n) { + pn = *n; + tpath = rb_entry(pn, struct ipoib_path, rb_node); + + ret = memcmp(path->pathrec.dgid.raw, tpath->pathrec.dgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = &pn->rb_left; + else if (ret > 0) + n = &pn->rb_right; + else + return -EEXIST; + } + + rb_link_node(&path->rb_node, pn, n); + rb_insert_color(&path->rb_node, &priv->path_tree); + + list_add_tail(&path->list, &priv->path_list); + + return 0; +} + +static void __path_free(struct net_device *dev, struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh, *tn; + struct sk_buff *skb; + + while ((skb = __skb_dequeue(&path->queue))) + dev_kfree_skb_irq(skb); + + list_for_each_entry_safe(neigh, tn, &path->neigh_list, list) { + if (neigh->ah) + ipoib_put_ah(neigh->ah); + *to_ipoib_neigh(neigh->neighbour) = NULL; + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + } + + if (path->ah) + ipoib_put_ah(path->ah); + + rb_erase(&path->rb_node, &priv->path_tree); + list_del(&path->list); + kfree(path); +} + +void ipoib_flush_paths(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path, *tp; + unsigned long flags; + + spin_lock_irqsave(&priv->lock, flags); + + list_for_each_entry_safe(path, tp, &priv->path_list, list) + __path_free(dev, path); + + spin_unlock_irqrestore(&priv->lock, flags); +} + +static void path_rec_completion(int status, + struct ib_sa_path_rec *pathrec, + void *path_ptr) +{ + struct ipoib_path *path = path_ptr; + struct ipoib_dev_priv *priv = netdev_priv(path->dev); + struct ipoib_ah *ah = NULL; + struct ipoib_neigh *neigh; + struct sk_buff_head skqueue; + struct sk_buff *skb; + unsigned long flags; + + ipoib_dbg(priv, "status 
%d, LID 0x%04x for GID " IPOIB_GID_FMT "\n", + status, be16_to_cpu(pathrec->dlid), IPOIB_GID_ARG(pathrec->dgid)); + + if (status == IB_WC_SUCCESS) { + struct ib_ah_attr av = { + .dlid = be16_to_cpu(pathrec->dlid), + .sl = pathrec->sl, + .src_path_bits = 0, + .static_rate = 0, + .ah_flags = 0, + .port_num = priv->port + }; + + ah = ipoib_create_ah(path->dev, priv->pd, &av); + } + + spin_lock_irqsave(&priv->lock, flags); + + path->ah = ah; + + if (ah) { + path->pathrec = *pathrec; + + ipoib_dbg(priv, "created address handle %p for LID 0x%04x, SL %d\n", + ah, be16_to_cpu(pathrec->dlid), pathrec->sl); + + skb_queue_head_init(&skqueue); + + while ((skb = __skb_dequeue(&path->queue))) + __skb_queue_tail(&skqueue, skb); + + list_for_each_entry(neigh, &path->neigh_list, list) { + neigh->ah = path->ah; + kref_get(&path->ah->ref); + + while ((skb = __skb_dequeue(&neigh->queue))) + __skb_queue_tail(&skqueue, skb); + } + } else + path->query = NULL; + + + complete(&path->done); + + spin_unlock_irqrestore(&priv->lock, flags); + + while ((skb = __skb_dequeue(&skqueue))) { + skb->dev = path->dev; + if (dev_queue_xmit(skb)) + ipoib_warn(priv, "dev_queue_xmit failed " + "to requeue packet\n"); + } +} + +static struct ipoib_path *path_rec_create(struct net_device *dev, + union ib_gid *gid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + + path = kmalloc(sizeof *path, GFP_ATOMIC); + if (!path) + return NULL; + + path->dev = dev; + path->pathrec.dlid = 0; + + skb_queue_head_init(&path->queue); + + INIT_LIST_HEAD(&path->neigh_list); + path->query = NULL; + init_completion(&path->done); + + memcpy(path->pathrec.dgid.raw, gid->raw, sizeof (union ib_gid)); + path->pathrec.sgid = priv->local_gid; + path->pathrec.pkey = cpu_to_be16(priv->pkey); + path->pathrec.numb_path = 1; + + __path_add(dev, path); + + return path; +} + +static int path_rec_start(struct net_device *dev, + struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + 
path->query_id = + ib_sa_path_rec_get(priv->ca, priv->port, + &path->pathrec, + IB_SA_PATH_REC_DGID | + IB_SA_PATH_REC_SGID | + IB_SA_PATH_REC_NUMB_PATH | + IB_SA_PATH_REC_PKEY, + 1000, GFP_ATOMIC, + path_rec_completion, + path, &path->query); + if (path->query_id < 0) { + ipoib_warn(priv, "ib_sa_path_rec_get failed\n"); + path->query = NULL; + return path->query_id; + } + + return 0; +} + +static void neigh_add_path(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + struct ipoib_neigh *neigh; + + neigh = kmalloc(sizeof *neigh, GFP_ATOMIC); + if (!neigh) { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + return; + } + + skb_queue_head_init(&neigh->queue); + neigh->neighbour = skb->dst->neighbour; + *to_ipoib_neigh(skb->dst->neighbour) = neigh; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. + */ + spin_lock(&priv->lock); + + path = __path_find(dev, (union ib_gid *) (skb->dst->neighbour->ha + 4)); + if (!path) { + path = path_rec_create(dev, + (union ib_gid *) (skb->dst->neighbour->ha + 4)); + if (!path) + goto err; + } + + list_add_tail(&neigh->list, &path->neigh_list); + + if (path->pathrec.dlid) { + neigh->ah = path->ah; + kref_get(&path->ah->ref); + + ipoib_send(dev, skb, path->ah, + be32_to_cpup((__be32 *) skb->dst->neighbour->ha)); + } else if (!path->query) { + neigh->ah = NULL; + __skb_queue_tail(&neigh->queue, skb); + if (path_rec_start(dev, path)) + goto err; + } + + spin_unlock(&priv->lock); + return; + +err: + *to_ipoib_neigh(skb->dst->neighbour) = NULL; + list_del(&neigh->list); + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + + spin_unlock(&priv->lock); +} + +static void path_lookup(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(skb->dev); + + /* Look up path record for unicasts */ + if 
(skb->dst->neighbour->ha[4] != 0xff) { + neigh_add_path(skb, dev); + return; + } + + /* Add in the P_Key for multicasts */ + skb->dst->neighbour->ha[8] = (priv->pkey >> 8) & 0xff; + skb->dst->neighbour->ha[9] = priv->pkey & 0xff; + ipoib_mcast_send(dev, (union ib_gid *) (skb->dst->neighbour->ha + 4), skb); +} + +static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev, + struct ipoib_pseudoheader *phdr) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. + */ + spin_lock(&priv->lock); + + path = __path_find(dev, (union ib_gid *) (phdr->hwaddr + 4)); + if (!path) { + path = path_rec_create(dev, + (union ib_gid *) (phdr->hwaddr + 4)); + if (path) { + __skb_queue_tail(&path->queue, skb); + + if (path_rec_start(dev, path)) + __path_free(dev, path); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + spin_unlock(&priv->lock); + return; + } + + ipoib_dbg(priv, "Send unicast ARP to %04x\n", be16_to_cpu(path->pathrec.dlid)); + + ipoib_send(dev, skb, path->ah, + be32_to_cpup((__be32 *) phdr->hwaddr)); + + spin_unlock(&priv->lock); +} + +static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh; + unsigned long flags; + + local_irq_save(flags); + if (!spin_trylock(&priv->tx_lock)) { + local_irq_restore(flags); + return NETDEV_TX_LOCKED; + } + + if (skb->dst && skb->dst->neighbour) { + if (unlikely(!*to_ipoib_neigh(skb->dst->neighbour))) { + path_lookup(skb, dev); + goto out; + } + + neigh = *to_ipoib_neigh(skb->dst->neighbour); + + if (likely(neigh->ah)) { + ipoib_send(dev, skb, neigh->ah, + be32_to_cpup((__be32 *) skb->dst->neighbour->ha)); + goto out; + } + + if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) + __skb_queue_tail(&neigh->queue, skb); + else + goto err; + } else { + struct 
ipoib_pseudoheader *phdr = + (struct ipoib_pseudoheader *) skb->data; + skb_pull(skb, sizeof *phdr); + + if (phdr->hwaddr[4] == 0xff) { + /* Add in the P_Key for multicast*/ + phdr->hwaddr[8] = (priv->pkey >> 8) & 0xff; + phdr->hwaddr[9] = priv->pkey & 0xff; + + ipoib_mcast_send(dev, (union ib_gid *) (phdr->hwaddr + 4), skb); + } else { + /* unicast GID -- should be ARP reply */ + + if (be16_to_cpup((u16 *) skb->data) != ETH_P_ARP) { + ipoib_warn(priv, "Unicast, no %s: type %04x, QPN %06x " + IPOIB_GID_FMT "\n", + skb->dst ? "neigh" : "dst", + be16_to_cpup((u16 *) skb->data), + be32_to_cpup((u32 *) phdr->hwaddr), + IPOIB_GID_ARG(*(union ib_gid *) (phdr->hwaddr + 4))); + dev_kfree_skb_any(skb); + ++priv->stats.tx_dropped; + goto out; + } + + /* put the pseudoheader back on and try to send */ + unicast_arp_send(skb, dev, phdr); + } + } + + goto out; + +err: + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + +out: + spin_unlock_irqrestore(&priv->tx_lock, flags); + + return NETDEV_TX_OK; +} + +struct net_device_stats *ipoib_get_stats(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + return &priv->stats; +} + +static void ipoib_timeout(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_warn(priv, "transmit timeout: latency %ld\n", + jiffies - dev->trans_start); + /* XXX reset QP, etc. */ +} + +static int ipoib_hard_header(struct sk_buff *skb, + struct net_device *dev, + unsigned short type, + void *daddr, void *saddr, unsigned len) +{ + struct ipoib_header *header; + + header = (struct ipoib_header *) skb_push(skb, sizeof *header); + + header->proto = htons(type); + header->reserved = 0; + + /* + * If we don't have a neighbour structure, stuff the + * destination address onto the front of the skb so we can + * figure out where to send the packet later. 
+ */ + if (!skb->dst || !skb->dst->neighbour) { + struct ipoib_pseudoheader *phdr = + (struct ipoib_pseudoheader *) skb_push(skb, sizeof *phdr); + memcpy(phdr->hwaddr, daddr, INFINIBAND_ALEN); + } + + return 0; +} + +static void ipoib_set_mcast_list(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + schedule_work(&priv->restart_task); +} + +static void ipoib_neigh_destructor(struct neighbour *n) +{ + struct ipoib_neigh *neigh = *to_ipoib_neigh(n); + struct ipoib_dev_priv *priv = netdev_priv(n->dev); + unsigned long flags; + + ipoib_dbg(priv, + "neigh_destructor for %06x " IPOIB_GID_FMT "\n", + be32_to_cpup((__be32 *) n->ha), + IPOIB_GID_ARG(*((union ib_gid *) (n->ha + 4)))); + + spin_lock_irqsave(&priv->lock, flags); + + if (neigh) { + if (neigh->ah) + ipoib_put_ah(neigh->ah); + list_del(&neigh->list); + *to_ipoib_neigh(n) = NULL; + kfree(neigh); + } + + spin_unlock_irqrestore(&priv->lock, flags); +} + +static int ipoib_neigh_setup(struct neighbour *neigh) +{ + /* + * Is this kosher? I can't find anybody in the kernel that + * sets neigh->destructor, so we should be able to set it here + * without trouble. 
+ */ + neigh->ops->destructor = ipoib_neigh_destructor; + + return 0; +} + +static int ipoib_neigh_setup_dev(struct net_device *dev, struct neigh_parms *parms) +{ + parms->neigh_setup = ipoib_neigh_setup; + + return 0; +} + +int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + /* Allocate RX/TX "rings" to hold queued skbs */ + + priv->rx_ring = kmalloc(IPOIB_RX_RING_SIZE * sizeof (struct ipoib_buf), + GFP_KERNEL); + if (!priv->rx_ring) { + printk(KERN_WARNING "%s: failed to allocate RX ring (%d entries)\n", + ca->name, IPOIB_RX_RING_SIZE); + goto out; + } + memset(priv->rx_ring, 0, + IPOIB_RX_RING_SIZE * sizeof (struct ipoib_buf)); + + priv->tx_ring = kmalloc(IPOIB_TX_RING_SIZE * sizeof (struct ipoib_buf), + GFP_KERNEL); + if (!priv->tx_ring) { + printk(KERN_WARNING "%s: failed to allocate TX ring (%d entries)\n", + ca->name, IPOIB_TX_RING_SIZE); + goto out_rx_ring_cleanup; + } + memset(priv->tx_ring, 0, + IPOIB_TX_RING_SIZE * sizeof (struct ipoib_buf)); + + /* priv->tx_head & tx_tail are already 0 */ + + if (ipoib_ib_dev_init(dev, ca, port)) + goto out_tx_ring_cleanup; + + return 0; + +out_tx_ring_cleanup: + kfree(priv->tx_ring); + +out_rx_ring_cleanup: + kfree(priv->rx_ring); + +out: + return -ENOMEM; +} + +void ipoib_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev), *cpriv, *tcpriv; + + ipoib_delete_debug_file(dev); + + /* Delete any child interfaces first */ + list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list) { + unregister_netdev(cpriv->dev); + ipoib_dev_cleanup(cpriv->dev); + free_netdev(cpriv->dev); + } + + ipoib_ib_dev_cleanup(dev); + + if (priv->rx_ring) { + kfree(priv->rx_ring); + priv->rx_ring = NULL; + } + + if (priv->tx_ring) { + kfree(priv->tx_ring); + priv->tx_ring = NULL; + } +} + +static void ipoib_setup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + dev->open = ipoib_open; + 
dev->stop = ipoib_stop; + dev->change_mtu = ipoib_change_mtu; + dev->hard_start_xmit = ipoib_start_xmit; + dev->get_stats = ipoib_get_stats; + dev->tx_timeout = ipoib_timeout; + dev->hard_header = ipoib_hard_header; + dev->set_multicast_list = ipoib_set_mcast_list; + dev->neigh_setup = ipoib_neigh_setup_dev; + + dev->watchdog_timeo = HZ; + + dev->rebuild_header = NULL; + dev->set_mac_address = NULL; + dev->header_cache_update = NULL; + + dev->flags |= IFF_BROADCAST | IFF_MULTICAST; + + /* + * We add in INFINIBAND_ALEN to allow for the destination + * address "pseudoheader" for skbs without neighbour struct. + */ + dev->hard_header_len = IPOIB_ENCAP_LEN + INFINIBAND_ALEN; + dev->addr_len = INFINIBAND_ALEN; + dev->type = ARPHRD_INFINIBAND; + dev->tx_queue_len = IPOIB_TX_RING_SIZE * 2; + dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; + + /* MTU will be reset when mcast join happens */ + dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; + priv->mcast_mtu = priv->admin_mtu = dev->mtu; + + memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); + + netif_carrier_off(dev); + + SET_MODULE_OWNER(dev); + + priv->dev = dev; + + spin_lock_init(&priv->lock); + spin_lock_init(&priv->tx_lock); + + init_MUTEX(&priv->mcast_mutex); + init_MUTEX(&priv->vlan_mutex); + + INIT_LIST_HEAD(&priv->path_list); + INIT_LIST_HEAD(&priv->child_intfs); + INIT_LIST_HEAD(&priv->dead_ahs); + INIT_LIST_HEAD(&priv->multicast_list); + + INIT_WORK(&priv->pkey_task, ipoib_pkey_poll, priv->dev); + INIT_WORK(&priv->mcast_task, ipoib_mcast_join_task, priv->dev); + INIT_WORK(&priv->flush_task, ipoib_ib_dev_flush, priv->dev); + INIT_WORK(&priv->restart_task, ipoib_mcast_restart_task, priv->dev); + INIT_WORK(&priv->ah_reap_task, ipoib_reap_ah, priv->dev); +} + +struct ipoib_dev_priv *ipoib_intf_alloc(const char *name) +{ + struct net_device *dev; + + dev = alloc_netdev((int) sizeof (struct ipoib_dev_priv), name, + ipoib_setup); + if (!dev) + return NULL; + + return netdev_priv(dev); +} + +static 
ssize_t show_pkey(struct class_device *cdev, char *buf) +{ + struct ipoib_dev_priv *priv = + netdev_priv(container_of(cdev, struct net_device, class_dev)); + + return sprintf(buf, "0x%04x\n", priv->pkey); +} +static CLASS_DEVICE_ATTR(pkey, S_IRUGO, show_pkey, NULL); + +static ssize_t create_child(struct class_device *cdev, + const char *buf, size_t count) +{ + int pkey; + int ret; + + if (sscanf(buf, "%i", &pkey) != 1) + return -EINVAL; + + if (pkey < 0 || pkey > 0xffff) + return -EINVAL; + + ret = ipoib_vlan_add(container_of(cdev, struct net_device, class_dev), + pkey); + + return ret ? ret : count; +} +static CLASS_DEVICE_ATTR(create_child, S_IWUGO, NULL, create_child); + +static ssize_t delete_child(struct class_device *cdev, + const char *buf, size_t count) +{ + int pkey; + int ret; + + if (sscanf(buf, "%i", &pkey) != 1) + return -EINVAL; + + if (pkey < 0 || pkey > 0xffff) + return -EINVAL; + + ret = ipoib_vlan_delete(container_of(cdev, struct net_device, class_dev), + pkey); + + return ret ? 
ret : count; + +} +static CLASS_DEVICE_ATTR(delete_child, S_IWUGO, NULL, delete_child); + +int ipoib_add_pkey_attr(struct net_device *dev) +{ + return class_device_create_file(&dev->class_dev, + &class_device_attr_pkey); +} + +static struct net_device *ipoib_add_port(const char *format, + struct ib_device *hca, u8 port) +{ + struct ipoib_dev_priv *priv; + int result = -ENOMEM; + + priv = ipoib_intf_alloc(format); + if (!priv) + goto alloc_mem_failed; + + SET_NETDEV_DEV(priv->dev, hca->dma_device); + + result = ib_query_pkey(hca, port, 0, &priv->pkey); + if (result) { + printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", + hca->name, port, result); + goto alloc_mem_failed; + } + + priv->dev->broadcast[8] = priv->pkey >> 8; + priv->dev->broadcast[9] = priv->pkey & 0xff; + + result = ib_query_gid(hca, port, 0, &priv->local_gid); + if (result) { + printk(KERN_WARNING "%s: ib_query_gid port %d failed (ret = %d)\n", + hca->name, port, result); + goto alloc_mem_failed; + } else + memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + + + result = ipoib_dev_init(priv->dev, hca, port); + if (result < 0) { + printk(KERN_WARNING "%s: failed to initialize port %d (ret = %d)\n", + hca->name, port, result); + goto device_init_failed; + } + + INIT_IB_EVENT_HANDLER(&priv->event_handler, + priv->ca, ipoib_event); + result = ib_register_event_handler(&priv->event_handler); + if (result < 0) { + printk(KERN_WARNING "%s: ib_register_event_handler failed for " + "port %d (ret = %d)\n", + hca->name, port, result); + goto event_failed; + } + + result = register_netdev(priv->dev); + if (result) { + printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", + hca->name, port, result); + goto register_failed; + } + + if (ipoib_create_debug_file(priv->dev)) + goto debug_failed; + + if (ipoib_add_pkey_attr(priv->dev)) + goto sysfs_failed; + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_create_child)) + goto 
sysfs_failed; + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_delete_child)) + goto sysfs_failed; + + return priv->dev; + +sysfs_failed: + ipoib_delete_debug_file(priv->dev); + +debug_failed: + unregister_netdev(priv->dev); + +register_failed: + ib_unregister_event_handler(&priv->event_handler); + +event_failed: + ipoib_dev_cleanup(priv->dev); + +device_init_failed: + free_netdev(priv->dev); + +alloc_mem_failed: + return ERR_PTR(result); +} + +static void ipoib_add_one(struct ib_device *device) +{ + struct list_head *dev_list; + struct net_device *dev; + struct ipoib_dev_priv *priv; + int s, e, p; + + dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL); + if (!dev_list) + return; + + INIT_LIST_HEAD(dev_list); + + if (device->node_type == IB_NODE_SWITCH) { + s = 0; + e = 0; + } else { + s = 1; + e = device->phys_port_cnt; + } + + for (p = s; p <= e; ++p) { + dev = ipoib_add_port("ib%d", device, p); + if (!IS_ERR(dev)) { + priv = netdev_priv(dev); + list_add_tail(&priv->list, dev_list); + } + } + + ib_set_client_data(device, &ipoib_client, dev_list); +} + +static void ipoib_remove_one(struct ib_device *device) +{ + struct ipoib_dev_priv *priv, *tmp; + struct list_head *dev_list; + + dev_list = ib_get_client_data(device, &ipoib_client); + + list_for_each_entry_safe(priv, tmp, dev_list, list) { + ib_unregister_event_handler(&priv->event_handler); + + unregister_netdev(priv->dev); + ipoib_dev_cleanup(priv->dev); + free_netdev(priv->dev); + } +} + +static int __init ipoib_init_module(void) +{ + int ret; + + ret = ipoib_register_debugfs(); + if (ret) + return ret; + + /* + * We create our own workqueue mainly because we want to be + * able to flush it when devices are being removed. We can't + * use schedule_work()/flush_scheduled_work() because both + * unregister_netdev() and linkwatch_event take the rtnl lock, + * so flush_scheduled_work() can deadlock during device + * removal. 
+ */ + ipoib_workqueue = create_singlethread_workqueue("ipoib"); + if (!ipoib_workqueue) { + ret = -ENOMEM; + goto err_fs; + } + + ret = ib_register_client(&ipoib_client); + if (ret) + goto err_wq; + + return 0; + +err_wq: + destroy_workqueue(ipoib_workqueue); + +err_fs: + ipoib_unregister_debugfs(); + + return ret; +} + +static void __exit ipoib_cleanup_module(void) +{ + ipoib_unregister_debugfs(); + ib_unregister_client(&ipoib_client); + destroy_workqueue(ipoib_workqueue); +} + +module_init(ipoib_init_module); +module_exit(ipoib_cleanup_module); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2004-12-13 09:44:49.603417898 -0800 @@ -0,0 +1,954 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. 
+ * + * $Id: ipoib_multicast.c 1323 2004-12-11 02:36:04Z roland $ + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ipoib.h" + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int mcast_debug_level; + +module_param(mcast_debug_level, int, 0644); +MODULE_PARM_DESC(mcast_debug_level, + "Enable multicast debug tracing if > 0"); +#endif + +static DECLARE_MUTEX(mcast_mutex); + +/* Used for all multicast joins (broadcast, IPv4 mcast and IPv6 mcast) */ +struct ipoib_mcast { + struct ib_sa_mcmember_rec mcmember; + struct ipoib_ah *ah; + + struct rb_node rb_node; + struct list_head list; + struct completion done; + + int query_id; + struct ib_sa_query *query; + + unsigned long created; + unsigned long backoff; + + unsigned long flags; + unsigned char logcount; + + struct list_head neigh_list; + + struct sk_buff_head pkt_queue; + + struct net_device *dev; +}; + +struct ipoib_mcast_iter { + struct net_device *dev; + union ib_gid mgid; + unsigned long created; + unsigned int queuelen; + unsigned int complete; + unsigned int send_only; +}; + +static void ipoib_mcast_free(struct ipoib_mcast *mcast) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh, *tmp; + unsigned long flags; + + ipoib_dbg_mcast(netdev_priv(dev), + "deleting multicast group " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + spin_lock_irqsave(&priv->lock, flags); + + list_for_each_entry_safe(neigh, tmp, &mcast->neigh_list, list) { + ipoib_put_ah(neigh->ah); + *to_ipoib_neigh(neigh->neighbour) = NULL; + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + } + + spin_unlock_irqrestore(&priv->lock, flags); + + if (mcast->ah) + ipoib_put_ah(mcast->ah); + + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + dev_kfree_skb_any(skb); + } + + kfree(mcast); +} + +static struct ipoib_mcast *ipoib_mcast_alloc(struct 
net_device *dev, + int can_sleep) +{ + struct ipoib_mcast *mcast; + + mcast = kmalloc(sizeof (*mcast), can_sleep ? GFP_KERNEL : GFP_ATOMIC); + if (!mcast) + return NULL; + + memset(mcast, 0, sizeof (*mcast)); + + init_completion(&mcast->done); + + mcast->dev = dev; + mcast->created = jiffies; + mcast->backoff = HZ; + mcast->logcount = 0; + + INIT_LIST_HEAD(&mcast->list); + INIT_LIST_HEAD(&mcast->neigh_list); + skb_queue_head_init(&mcast->pkt_queue); + + mcast->ah = NULL; + mcast->query = NULL; + + return mcast; +} + +static struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node *n = priv->multicast_tree.rb_node; + + while (n) { + struct ipoib_mcast *mcast; + int ret; + + mcast = rb_entry(n, struct ipoib_mcast, rb_node); + + ret = memcmp(mgid->raw, mcast->mcmember.mgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return mcast; + } + + return NULL; +} + +static int __ipoib_mcast_add(struct net_device *dev, struct ipoib_mcast *mcast) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node **n = &priv->multicast_tree.rb_node, *pn = NULL; + + while (*n) { + struct ipoib_mcast *tmcast; + int ret; + + pn = *n; + tmcast = rb_entry(pn, struct ipoib_mcast, rb_node); + + ret = memcmp(mcast->mcmember.mgid.raw, tmcast->mcmember.mgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = &pn->rb_left; + else if (ret > 0) + n = &pn->rb_right; + else + return -EEXIST; + } + + rb_link_node(&mcast->rb_node, pn, n); + rb_insert_color(&mcast->rb_node, &priv->multicast_tree); + + return 0; +} + +static int ipoib_mcast_join_finish(struct ipoib_mcast *mcast, + struct ib_sa_mcmember_rec *mcmember) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + mcast->mcmember = *mcmember; + + /* Set the cached Q_Key before we attach if it's the broadcast group */ + if 
(!memcmp(mcast->mcmember.mgid.raw, priv->dev->broadcast + 4, + sizeof (union ib_gid))) + priv->qkey = be32_to_cpu(priv->broadcast->mcmember.qkey); + + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + if (test_and_set_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) { + ipoib_warn(priv, "multicast group " IPOIB_GID_FMT + " already attached\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + return 0; + } + + ret = ipoib_mcast_attach(dev, be16_to_cpu(mcast->mcmember.mlid), + &mcast->mcmember.mgid); + if (ret < 0) { + ipoib_warn(priv, "couldn't attach QP to multicast group " + IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags); + return ret; + } + } + + { + struct ib_ah_attr av = { + .dlid = be16_to_cpu(mcast->mcmember.mlid), + .port_num = priv->port, + .sl = mcast->mcmember.sl, + .src_path_bits = 0, + .static_rate = 0, + .ah_flags = IB_AH_GRH, + .grh = { + .flow_label = be32_to_cpu(mcast->mcmember.flow_label), + .hop_limit = mcast->mcmember.hop_limit, + .sgid_index = 0, + .traffic_class = mcast->mcmember.traffic_class + } + }; + + av.grh.dgid = mcast->mcmember.mgid; + + mcast->ah = ipoib_create_ah(dev, priv->pd, &av); + if (!mcast->ah) { + ipoib_warn(priv, "ib_address_create failed\n"); + } else { + ipoib_dbg_mcast(priv, "MGID " IPOIB_GID_FMT + " AV %p, LID 0x%04x, SL %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + mcast->ah->ah, + be16_to_cpu(mcast->mcmember.mlid), + mcast->mcmember.sl); + } + } + + /* actually send any queued packets */ + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + + if (dev_queue_xmit(skb)) + ipoib_warn(priv, "dev_queue_xmit failed to requeue packet\n"); + } + + return 0; +} + +static void +ipoib_mcast_sendonly_join_complete(int status, + struct ib_sa_mcmember_rec *mcmember, + void *mcast_ptr) +{ + struct ipoib_mcast *mcast = mcast_ptr; + struct net_device *dev = mcast->dev; + + if (!status) + 
ipoib_mcast_join_finish(mcast, mcmember); + else { + if (mcast->logcount++ < 20) + ipoib_dbg_mcast(netdev_priv(dev), "multicast join failed for " + IPOIB_GID_FMT ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), status); + + /* Flush out any queued packets */ + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + + dev_kfree_skb_any(skb); + } + + /* Clear the busy flag so we try again */ + clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags); + } + + complete(&mcast->done); +} + +static int ipoib_mcast_sendonly_join(struct ipoib_mcast *mcast) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { +#if 0 /* Some SMs don't support send-only yet */ + .join_state = 4 +#else + .join_state = 1 +#endif + }; + int ret = 0; + + if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) { + ipoib_dbg_mcast(priv, "device shutting down, no multicast joins\n"); + return -ENODEV; + } + + if (test_and_set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) { + ipoib_dbg_mcast(priv, "multicast entry busy, skipping\n"); + return -EBUSY; + } + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + ret = ib_sa_mcmember_rec_set(priv->ca, priv->port, &rec, + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE, + 1000, GFP_ATOMIC, + ipoib_mcast_sendonly_join_complete, + mcast, &mcast->query); + if (ret < 0) { + ipoib_warn(priv, "ib_sa_mcmember_rec_set failed (ret = %d)\n", + ret); + } else { + ipoib_dbg_mcast(priv, "no multicast record for " IPOIB_GID_FMT + ", starting join\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + mcast->query_id = ret; + } + + return ret; +} + +static void ipoib_mcast_join_complete(int status, + struct ib_sa_mcmember_rec *mcmember, + void *mcast_ptr) +{ + struct ipoib_mcast *mcast = mcast_ptr; + struct net_device *dev 
= mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg_mcast(priv, "join completion for " IPOIB_GID_FMT + " (status %d)\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), status); + + if (!status && !ipoib_mcast_join_finish(mcast, mcmember)) { + mcast->backoff = HZ; + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_work(ipoib_workqueue, &priv->mcast_task); + up(&mcast_mutex); + complete(&mcast->done); + return; + } + + if (status == -EINTR) { + complete(&mcast->done); + return; + } + + if (status && mcast->logcount++ < 20) { + if (status == -ETIMEDOUT || status == -EINTR) { + ipoib_dbg_mcast(priv, "multicast join failed for " IPOIB_GID_FMT + ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + status); + } else { + ipoib_warn(priv, "multicast join failed for " + IPOIB_GID_FMT ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + status); + } + } + + mcast->backoff *= 2; + if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS) + mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS; + + mcast->query = NULL; + + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) { + if (status == -ETIMEDOUT) + queue_work(ipoib_workqueue, &priv->mcast_task); + else + queue_delayed_work(ipoib_workqueue, &priv->mcast_task, + mcast->backoff * HZ); + } else + complete(&mcast->done); + up(&mcast_mutex); + + return; +} + +static void ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast, + int create) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { + .join_state = 1 + }; + ib_sa_comp_mask comp_mask; + int ret = 0; + + ipoib_dbg_mcast(priv, "joining MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + comp_mask = + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE; + + if (create) { + comp_mask |= + 
IB_SA_MCMEMBER_REC_QKEY | + IB_SA_MCMEMBER_REC_SL | + IB_SA_MCMEMBER_REC_FLOW_LABEL | + IB_SA_MCMEMBER_REC_TRAFFIC_CLASS; + + rec.qkey = priv->broadcast->mcmember.qkey; + rec.sl = priv->broadcast->mcmember.sl; + rec.flow_label = priv->broadcast->mcmember.flow_label; + rec.traffic_class = priv->broadcast->mcmember.traffic_class; + } + + ret = ib_sa_mcmember_rec_set(priv->ca, priv->port, &rec, comp_mask, + mcast->backoff * 1000, GFP_ATOMIC, + ipoib_mcast_join_complete, + mcast, &mcast->query); + + if (ret < 0) { + ipoib_warn(priv, "ib_sa_mcmember_rec_set failed, status %d\n", ret); + + mcast->backoff *= 2; + if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS) + mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS; + + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->mcast_task, + mcast->backoff); + up(&mcast_mutex); + } else + mcast->query_id = ret; +} + +void ipoib_mcast_join_task(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (!test_bit(IPOIB_MCAST_RUN, &priv->flags)) + return; + + if (ib_query_gid(priv->ca, priv->port, 0, &priv->local_gid)) + ipoib_warn(priv, "ib_gid_entry_get() failed\n"); + else + memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + + if (!priv->broadcast) { + priv->broadcast = ipoib_mcast_alloc(dev, 1); + if (!priv->broadcast) { + ipoib_warn(priv, "failed to allocate broadcast group\n"); + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->mcast_task, HZ); + up(&mcast_mutex); + return; + } + + memcpy(priv->broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4, + sizeof (union ib_gid)); + + spin_lock_irq(&priv->lock); + __ipoib_mcast_add(dev, priv->broadcast); + spin_unlock_irq(&priv->lock); + } + + if (!test_bit(IPOIB_MCAST_FLAG_ATTACHED, &priv->broadcast->flags)) { + ipoib_mcast_join(dev, priv->broadcast, 0); + return; + } + + while (1) { + 
struct ipoib_mcast *mcast = NULL; + + spin_lock_irq(&priv->lock); + list_for_each_entry(mcast, &priv->multicast_list, list) { + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) + && !test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags) + && !test_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) { + /* Found the next unjoined group */ + break; + } + } + spin_unlock_irq(&priv->lock); + + if (&mcast->list == &priv->multicast_list) { + /* All done */ + break; + } + + ipoib_mcast_join(dev, mcast, 1); + return; + } + + { + struct ib_port_attr attr; + + if (!ib_query_port(priv->ca, priv->port, &attr)) + priv->local_lid = attr.lid; + else + ipoib_warn(priv, "ib_query_port failed\n"); + } + + priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - + IPOIB_ENCAP_LEN; + dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); + + ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); + + clear_bit(IPOIB_MCAST_RUN, &priv->flags); + netif_carrier_on(dev); +} + +int ipoib_mcast_start_thread(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg_mcast(priv, "starting multicast thread\n"); + + down(&mcast_mutex); + if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_work(ipoib_workqueue, &priv->mcast_task); + up(&mcast_mutex); + + return 0; +} + +int ipoib_mcast_stop_thread(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_mcast *mcast; + + ipoib_dbg_mcast(priv, "stopping multicast thread\n"); + + down(&mcast_mutex); + clear_bit(IPOIB_MCAST_RUN, &priv->flags); + cancel_delayed_work(&priv->mcast_task); + up(&mcast_mutex); + + flush_workqueue(ipoib_workqueue); + + if (priv->broadcast && priv->broadcast->query) { + ib_sa_cancel_query(priv->broadcast->query_id, priv->broadcast->query); + priv->broadcast->query = NULL; + ipoib_dbg_mcast(priv, "waiting for bcast\n"); + wait_for_completion(&priv->broadcast->done); + } + + list_for_each_entry(mcast, &priv->multicast_list, list) { + 
if (mcast->query) { + ib_sa_cancel_query(mcast->query_id, mcast->query); + mcast->query = NULL; + ipoib_dbg_mcast(priv, "waiting for MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + wait_for_completion(&mcast->done); + } + } + + return 0; +} + +int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { + .join_state = 1 + }; + int ret = 0; + + if (!test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) + return 0; + + ipoib_dbg_mcast(priv, "leaving MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + /* Remove ourselves from the multicast group */ + ret = ipoib_mcast_detach(dev, be16_to_cpu(mcast->mcmember.mlid), + &mcast->mcmember.mgid); + if (ret) + ipoib_warn(priv, "ipoib_mcast_detach failed (result = %d)\n", ret); + + /* + * Just make one shot at leaving and don't wait for a reply; + * if we fail, too bad. 
+ */ + ret = ib_sa_mcmember_rec_delete(priv->ca, priv->port, &rec, + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE, + 0, GFP_ATOMIC, NULL, + mcast, &mcast->query); + if (ret < 0) + ipoib_warn(priv, "ib_sa_mcmember_rec_delete failed " + "for leave (result = %d)\n", ret); + + return 0; +} + +void ipoib_mcast_send(struct net_device *dev, union ib_gid *mgid, + struct sk_buff *skb) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_mcast *mcast; + unsigned long flags; + + spin_lock_irqsave(&priv->lock, flags); + mcast = __ipoib_mcast_find(dev, mgid); + if (!mcast) { + /* Let's create a new send only group now */ + ipoib_dbg_mcast(priv, "setting up send only multicast group for " + IPOIB_GID_FMT "\n", IPOIB_GID_ARG(*mgid)); + + mcast = ipoib_mcast_alloc(dev, 0); + if (!mcast) { + ipoib_warn(priv, "unable to allocate memory for " + "multicast structure\n"); + dev_kfree_skb_any(skb); + goto out; + } + + set_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags); + mcast->mcmember.mgid = *mgid; + __ipoib_mcast_add(dev, mcast); + list_add_tail(&mcast->list, &priv->multicast_list); + } + + if (!mcast->ah) { + if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE) + skb_queue_tail(&mcast->pkt_queue, skb); + else + dev_kfree_skb_any(skb); + + if (mcast->query) + ipoib_dbg_mcast(priv, "no address vector, " + "but multicast join already started\n"); + else if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) + ipoib_mcast_sendonly_join(mcast); + + /* + * If lookup completes between here and out:, don't + * want to send packet twice. 
+ */ + mcast = NULL; + } + +out: + if (mcast && mcast->ah) { + if (skb->dst && + skb->dst->neighbour && + !*to_ipoib_neigh(skb->dst->neighbour)) { + struct ipoib_neigh *neigh = kmalloc(sizeof *neigh, GFP_ATOMIC); + + if (neigh) { + kref_get(&mcast->ah->ref); + neigh->ah = mcast->ah; + neigh->neighbour = skb->dst->neighbour; + *to_ipoib_neigh(skb->dst->neighbour) = neigh; + list_add_tail(&neigh->list, &mcast->neigh_list); + } + } + + ipoib_send(dev, skb, mcast->ah, IB_MULTICAST_QPN); + } + + spin_unlock_irqrestore(&priv->lock, flags); +} + +void ipoib_mcast_dev_flush(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + LIST_HEAD(remove_list); + struct ipoib_mcast *mcast, *tmcast, *nmcast; + unsigned long flags; + + ipoib_dbg_mcast(priv, "flushing multicast list\n"); + + spin_lock_irqsave(&priv->lock, flags); + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + nmcast = ipoib_mcast_alloc(dev, 0); + if (nmcast) { + nmcast->flags = + mcast->flags & (1 << IPOIB_MCAST_FLAG_SENDONLY); + + nmcast->mcmember.mgid = mcast->mcmember.mgid; + + /* Add the new group in before the to-be-destroyed group */ + list_add_tail(&nmcast->list, &mcast->list); + list_del_init(&mcast->list); + + rb_replace_node(&mcast->rb_node, &nmcast->rb_node, + &priv->multicast_tree); + + list_add_tail(&mcast->list, &remove_list); + } else { + ipoib_warn(priv, "could not reallocate multicast group " + IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + } + } + + if (priv->broadcast) { + nmcast = ipoib_mcast_alloc(dev, 0); + if (nmcast) { + nmcast->mcmember.mgid = priv->broadcast->mcmember.mgid; + + rb_replace_node(&priv->broadcast->rb_node, + &nmcast->rb_node, + &priv->multicast_tree); + + list_add_tail(&priv->broadcast->list, &remove_list); + } + + priv->broadcast = nmcast; + } + + spin_unlock_irqrestore(&priv->lock, flags); + + list_for_each_entry(mcast, &remove_list, list) { + ipoib_mcast_leave(dev, mcast); + ipoib_mcast_free(mcast); + } +} + 
+void ipoib_mcast_dev_down(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + unsigned long flags; + + /* Delete broadcast since it will be recreated */ + if (priv->broadcast) { + ipoib_dbg_mcast(priv, "deleting broadcast group\n"); + + spin_lock_irqsave(&priv->lock, flags); + rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree); + spin_unlock_irqrestore(&priv->lock, flags); + ipoib_mcast_leave(dev, priv->broadcast); + ipoib_mcast_free(priv->broadcast); + priv->broadcast = NULL; + } +} + +void ipoib_mcast_restart_task(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct dev_mc_list *mclist; + struct ipoib_mcast *mcast, *tmcast; + LIST_HEAD(remove_list); + unsigned long flags; + + ipoib_dbg_mcast(priv, "restarting multicast task\n"); + + ipoib_mcast_stop_thread(dev); + + spin_lock_irqsave(&priv->lock, flags); + + /* + * Unfortunately, the networking core only gives us a list of all of + * the multicast hardware addresses. 
We need to figure out which ones + * are new and which ones have been removed. + */ + + /* Clear out the found flag */ + list_for_each_entry(mcast, &priv->multicast_list, list) + clear_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); + + /* Mark all of the entries that are found or don't exist */ + for (mclist = dev->mc_list; mclist; mclist = mclist->next) { + union ib_gid mgid; + + memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid); + + /* Add in the P_Key */ + mgid.raw[4] = (priv->pkey >> 8) & 0xff; + mgid.raw[5] = priv->pkey & 0xff; + + mcast = __ipoib_mcast_find(dev, &mgid); + if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + struct ipoib_mcast *nmcast; + + /* Not found or send-only group, let's add a new entry */ + ipoib_dbg_mcast(priv, "adding multicast entry for mgid " + IPOIB_GID_FMT "\n", IPOIB_GID_ARG(mgid)); + + nmcast = ipoib_mcast_alloc(dev, 0); + if (!nmcast) { + ipoib_warn(priv, "unable to allocate memory for multicast structure\n"); + continue; + } + + set_bit(IPOIB_MCAST_FLAG_FOUND, &nmcast->flags); + + nmcast->mcmember.mgid = mgid; + + if (mcast) { + /* Destroy the send only entry */ + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); + + rb_replace_node(&mcast->rb_node, + &nmcast->rb_node, + &priv->multicast_tree); + } else + __ipoib_mcast_add(dev, nmcast); + + list_add_tail(&nmcast->list, &priv->multicast_list); + } + + if (mcast) + set_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); + } + + /* Remove all of the entries that don't exist anymore */ + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + if (!test_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags) && + !test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + ipoib_dbg_mcast(priv, "deleting multicast group " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rb_erase(&mcast->rb_node, &priv->multicast_tree); + + /* Move to the remove list */ + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); + } + } + 
spin_unlock_irqrestore(&priv->lock, flags); + + /* We have to cancel outside of the spinlock */ + list_for_each_entry(mcast, &remove_list, list) { + ipoib_mcast_leave(mcast->dev, mcast); + ipoib_mcast_free(mcast); + } + + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + ipoib_mcast_start_thread(dev); +} + +struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev) +{ + struct ipoib_mcast_iter *iter; + + iter = kmalloc(sizeof *iter, GFP_KERNEL); + if (!iter) + return NULL; + + iter->dev = dev; + memset(iter->mgid.raw, 0, sizeof iter->mgid); + + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + + return iter; +} + +void ipoib_mcast_iter_free(struct ipoib_mcast_iter *iter) +{ + kfree(iter); +} + +int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter) +{ + struct ipoib_dev_priv *priv = netdev_priv(iter->dev); + struct rb_node *n; + struct ipoib_mcast *mcast; + int ret = 1; + + spin_lock_irq(&priv->lock); + + n = rb_first(&priv->multicast_tree); + + while (n) { + mcast = rb_entry(n, struct ipoib_mcast, rb_node); + + if (memcmp(iter->mgid.raw, mcast->mcmember.mgid.raw, + sizeof (union ib_gid)) < 0) { + iter->mgid = mcast->mcmember.mgid; + iter->created = mcast->created; + iter->queuelen = skb_queue_len(&mcast->pkt_queue); + iter->complete = !!mcast->ah; + iter->send_only = !!(mcast->flags & (1 << IPOIB_MCAST_FLAG_SENDONLY)); + + ret = 0; + + break; + } + + n = rb_next(n); + } + + spin_unlock_irq(&priv->lock); + + return ret; +} + +void ipoib_mcast_iter_read(struct ipoib_mcast_iter *iter, + union ib_gid *mgid, + unsigned long *created, + unsigned int *queuelen, + unsigned int *complete, + unsigned int *send_only) +{ + *mgid = iter->mgid; + *created = iter->created; + *queuelen = iter->queuelen; + *complete = iter->complete; + *send_only = iter->send_only; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2004-12-13 09:44:49.634413332 -0800 @@ -0,0 +1,243 @@ +/* 
+ * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * $Id: ipoib_verbs.c 1308 2004-12-03 01:31:40Z roland $ + */ + +#include + +#include "ipoib.h" + +int ipoib_mcast_attach(struct net_device *dev, u16 mlid, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_attr *qp_attr; + int attr_mask; + int ret; + u16 pkey_index; + + ret = -ENOMEM; + qp_attr = kmalloc(sizeof *qp_attr, GFP_KERNEL); + if (!qp_attr) + goto out; + + if (ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index)) { + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + ret = -ENXIO; + goto out; + } + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + + /* set correct QKey for QP */ + qp_attr->qkey = priv->qkey; + attr_mask = IB_QP_QKEY; + ret = ib_modify_qp(priv->qp, qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP, ret = %d\n", ret); + goto out; + } + + /* attach QP to multicast group */ + down(&priv->mcast_mutex); + ret = ib_attach_mcast(priv->qp, mgid, mlid); + up(&priv->mcast_mutex); + if (ret) + ipoib_warn(priv, "failed to attach to multicast group, ret = %d\n", ret); + +out: + 
kfree(qp_attr); + return ret; +} + +int ipoib_mcast_detach(struct net_device *dev, u16 mlid, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + down(&priv->mcast_mutex); + ret = ib_detach_mcast(priv->qp, mgid, mlid); + up(&priv->mcast_mutex); + if (ret) + ipoib_warn(priv, "ib_detach_mcast failed (result = %d)\n", ret); + + return ret; +} + +int ipoib_qp_create(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + u16 pkey_index; + struct ib_qp_attr qp_attr; + int attr_mask; + + /* + * Search through the port P_Key table for the requested pkey value. + * The port has to be assigned to the respective IB partition in + * advance. + */ + ret = ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index); + if (ret) { + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + return ret; + } + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + + qp_attr.qp_state = IB_QPS_INIT; + qp_attr.qkey = 0; + qp_attr.port_num = priv->port; + qp_attr.pkey_index = pkey_index; + attr_mask = + IB_QP_QKEY | + IB_QP_PORT | + IB_QP_PKEY_INDEX | + IB_QP_STATE; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to init, ret = %d\n", ret); + goto out_fail; + } + + qp_attr.qp_state = IB_QPS_RTR; + /* Can't set this in a INIT->RTR transition */ + attr_mask &= ~IB_QP_PORT; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to RTR, ret = %d\n", ret); + goto out_fail; + } + + qp_attr.qp_state = IB_QPS_RTS; + qp_attr.sq_psn = 0; + attr_mask |= IB_QP_SQ_PSN; + attr_mask &= ~IB_QP_PKEY_INDEX; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to RTS, ret = %d\n", ret); + goto out_fail; + } + + return 0; + +out_fail: + ib_destroy_qp(priv->qp); + priv->qp = NULL; + + return -EINVAL; +} + +int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) +{ + struct 
ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_init_attr init_attr = { + .cap = { + .max_send_wr = IPOIB_TX_RING_SIZE, + .max_recv_wr = IPOIB_RX_RING_SIZE, + .max_send_sge = 1, + .max_recv_sge = 1 + }, + .sq_sig_type = IB_SIGNAL_ALL_WR, + .rq_sig_type = IB_SIGNAL_ALL_WR, + .qp_type = IB_QPT_UD + }; + + priv->pd = ib_alloc_pd(priv->ca); + if (IS_ERR(priv->pd)) { + printk(KERN_WARNING "%s: failed to allocate PD\n", ca->name); + return -ENODEV; + } + + priv->cq = ib_create_cq(priv->ca, ipoib_ib_completion, NULL, dev, + IPOIB_TX_RING_SIZE + IPOIB_RX_RING_SIZE + 1); + if (IS_ERR(priv->cq)) { + printk(KERN_WARNING "%s: failed to create CQ\n", ca->name); + goto out_free_pd; + } + + if (ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP)) + goto out_free_cq; + + priv->mr = ib_get_dma_mr(priv->pd, IB_ACCESS_LOCAL_WRITE); + if (IS_ERR(priv->mr)) { + printk(KERN_WARNING "%s: ib_get_dma_mr failed\n", ca->name); + goto out_free_cq; + } + + init_attr.send_cq = priv->cq; + init_attr.recv_cq = priv->cq; + + priv->qp = ib_create_qp(priv->pd, &init_attr); + if (IS_ERR(priv->qp)) { + printk(KERN_WARNING "%s: failed to create QP\n", ca->name); + goto out_free_mr; + } + + priv->dev->dev_addr[1] = (priv->qp->qp_num >> 16) & 0xff; + priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; + priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; + + return 0; + +out_free_mr: + ib_dereg_mr(priv->mr); + +out_free_cq: + ib_destroy_cq(priv->cq); + +out_free_pd: + ib_dealloc_pd(priv->pd); + return -ENODEV; +} + +void ipoib_transport_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (priv->qp) { + if (ib_destroy_qp(priv->qp)) + ipoib_warn(priv, "ib_destroy_qp failed\n"); + + priv->qp = NULL; + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + } + + if (ib_dereg_mr(priv->mr)) + ipoib_warn(priv, "ib_dereg_mr failed\n"); + + if (ib_destroy_cq(priv->cq)) + ipoib_warn(priv, "ib_destroy_cq failed\n"); + + if (ib_dealloc_pd(priv->pd)) + ipoib_warn(priv, 
"ib_dealloc_pd failed\n"); +} + +void ipoib_event(struct ib_event_handler *handler, + struct ib_event *record) +{ + struct ipoib_dev_priv *priv = + container_of(handler, struct ipoib_dev_priv, event_handler); + + if (record->event == IB_EVENT_PORT_ACTIVE || + record->event == IB_EVENT_LID_CHANGE || + record->event == IB_EVENT_SM_CHANGE) { + ipoib_dbg(priv, "Port active event\n"); + schedule_work(&priv->flush_task); + } +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_vlan.c 2004-12-13 09:44:49.660409502 -0800 @@ -0,0 +1,166 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. 
+ * + * $Id: ipoib_vlan.c 1271 2004-11-18 22:11:29Z roland $ + */ + +#include +#include + +#include +#include +#include + +#include + +#include "ipoib.h" + +static ssize_t show_parent(struct class_device *class_dev, char *buf) +{ + struct net_device *dev = + container_of(class_dev, struct net_device, class_dev); + struct ipoib_dev_priv *priv = netdev_priv(dev); + + return sprintf(buf, "%s\n", priv->parent->name); +} +static CLASS_DEVICE_ATTR(parent, S_IRUGO, show_parent, NULL); + +int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey) +{ + struct ipoib_dev_priv *ppriv, *priv; + char intf_name[IFNAMSIZ]; + int result; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + ppriv = netdev_priv(pdev); + + down(&ppriv->vlan_mutex); + + /* + * First ensure this isn't a duplicate. We check the parent device and + * then all of the child interfaces to make sure the Pkey doesn't match. + */ + if (ppriv->pkey == pkey) { + result = -ENOTUNIQ; + goto err; + } + + list_for_each_entry(priv, &ppriv->child_intfs, list) { + if (priv->pkey == pkey) { + result = -ENOTUNIQ; + goto err; + } + } + + snprintf(intf_name, sizeof intf_name, "%s.%04x", + ppriv->dev->name, pkey); + priv = ipoib_intf_alloc(intf_name); + if (!priv) { + result = -ENOMEM; + goto err; + } + + set_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags); + + priv->pkey = pkey; + + memcpy(priv->dev->dev_addr, ppriv->dev->dev_addr, INFINIBAND_ALEN); + priv->dev->broadcast[8] = pkey >> 8; + priv->dev->broadcast[9] = pkey & 0xff; + + result = ipoib_dev_init(priv->dev, ppriv->ca, ppriv->port); + if (result < 0) { + ipoib_warn(ppriv, "failed to initialize subinterface: " + "device %s, port %d", + ppriv->ca->name, ppriv->port); + goto device_init_failed; + } + + result = register_netdev(priv->dev); + if (result) { + ipoib_warn(priv, "failed to initialize; error %i", result); + goto register_failed; + } + + priv->parent = ppriv->dev; + + if (ipoib_create_debug_file(priv->dev)) + goto debug_failed; + + if 
(ipoib_add_pkey_attr(priv->dev)) + goto sysfs_failed; + + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_parent)) + goto sysfs_failed; + + list_add_tail(&priv->list, &ppriv->child_intfs); + + up(&ppriv->vlan_mutex); + + return 0; + +sysfs_failed: + ipoib_delete_debug_file(priv->dev); + +debug_failed: + unregister_netdev(priv->dev); + +register_failed: + ipoib_dev_cleanup(priv->dev); + +device_init_failed: + free_netdev(priv->dev); + +err: + up(&ppriv->vlan_mutex); + return result; +} + +int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey) +{ + struct ipoib_dev_priv *ppriv, *priv, *tpriv; + int ret = -ENOENT; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + ppriv = netdev_priv(pdev); + + down(&ppriv->vlan_mutex); + list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) { + if (priv->pkey == pkey) { + unregister_netdev(priv->dev); + ipoib_dev_cleanup(priv->dev); + + list_del(&priv->list); + + kfree(priv); + + ret = 0; + break; + } + } + up(&ppriv->vlan_mutex); + + return ret; +} From kaber@trash.net Mon Dec 13 10:13:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:13:11 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDICfSw009615 for ; Mon, 13 Dec 2004 10:13:02 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cdufa-0002t3-Ee; Mon, 13 Dec 2004 19:11:38 +0100 Message-ID: <41BDDB5A.9000907@trash.net> Date: Mon, 13 Dec 2004 19:11:38 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S. 
Miller" , Herbert Xu , netdev@oss.sgi.com Subject: Re: [RFC] tcf_bind_filter failure handling References: <20041109161126.376f755c.davem@davemloft.net> <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> <41BC8819.7040501@trash.net> <20041213165302.GE8493@postel.suug.ch> In-Reply-To: <20041213165302.GE8493@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12702 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: >The handling of a failure in tcf_bind_filter is inconsistent. > >u32: ignore >fw: ignore >route: ignore >rsvp: ignore >tcindex: error > >It might be a good idea to make this consistent. So in order to validate >the classid before making any changes we could simply lock it via get >(see patch below), return an error if it fails and put it back in case >of an error further in the path or after binding the filter. > >Bindings not only locks the class from removal while a filter is >pointing to it. It speeds up classyfing by saving a lookup for every >tc_classify call. It's not really a problem if the class is not locked, >the qdisc will look it up and falls back to a default class if it >doesn't exists so it's rather a cosmetic/policy thing. > You should just fix tcindex not to care about errors in tcf_bind_filter. bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter to get, but others (HTB, HFSC, CBQ) use a seperate counter because it is legal to end up with a refcnt > 0 after delete. 
When a class with filters pointing to it is tried to destroy they return -EBUSY, which can't be done by looking at the refcnt. Regards Patrick From tduffy@sun.com Mon Dec 13 10:45:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:45:20 -0800 (PST) Received: from brmea-mail-4.sun.com (brmea-mail-4.Sun.COM [192.18.98.36]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDIiq88011768 for ; Mon, 13 Dec 2004 10:45:13 -0800 Received: from duffman.sfbay.sun.com ([129.146.95.108]) by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id iBDIiTdt011422; Mon, 13 Dec 2004 11:44:29 -0700 (MST) Received: from duffman.sfbay.sun.com (localhost.localdomain [127.0.0.1]) by duffman.sfbay.sun.com (8.13.1/8.12.11) with ESMTP id iBDIiSuB008206; Mon, 13 Dec 2004 10:44:28 -0800 Received: (from tduffy@localhost) by duffman.sfbay.sun.com (8.13.1/8.12.11/Submit) id iBDIiOhZ008202; Mon, 13 Dec 2004 10:44:24 -0800 X-Authentication-Warning: duffman.sfbay.sun.com: tduffy set sender to tduffy@sun.com using -f Subject: Re: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand) driver From: Tom Duffy To: Roland Dreier Cc: Linux Kernel Mailing List , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <20041213109.JT1ejUdkRIUXbWOm@topspin.com> References: <20041213109.JT1ejUdkRIUXbWOm@topspin.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-+Gg/yNtd8JYVNcpHnKZq" Date: Mon, 13 Dec 2004 10:44:24 -0800 Message-Id: <1102963464.9258.11.camel@duffman> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12703 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tduffy@sun.com Precedence: bulk X-list: netdev --=-+Gg/yNtd8JYVNcpHnKZq Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On 
Mon, 2004-12-13 at 10:09 -0800, Roland Dreier wrote: > --- linux-bk.orig/drivers/infiniband/Kconfig 2004-12-13 09:44:43.93625277= 9 -0800 > +++ linux-bk/drivers/infiniband/Kconfig 2004-12-13 09:44:49.385450009 -08= 00 > @@ -2,7 +2,6 @@ > =20 > config INFINIBAND > tristate "InfiniBand support" > - default n > ---help--- > Core support for InfiniBand (IB). Make sure to also select > any protocols you wish to use as well as drivers for your Is there a reason why you put this in in an earlier patch and then take it out later? -tduffy --=-+Gg/yNtd8JYVNcpHnKZq Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQBBveMIdY502zjzwbwRAuywAKCbVtRdF/PW+JhPLjveoWFGzCKrhACgjoqK 64lZj1fH0R7rZbF+pAmN4MI= =VJFe -----END PGP SIGNATURE----- --=-+Gg/yNtd8JYVNcpHnKZq-- From shemminger@osdl.org Mon Dec 13 10:49:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:49:08 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDImgYn012324 for ; Mon, 13 Dec 2004 10:49:02 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iBDIm8906036; Mon, 13 Dec 2004 10:48:08 -0800 Date: Mon, 13 Dec 2004 10:48:08 -0800 From: Stephen Hemminger To: Brande Cc: Brande , netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Prism / Hostap Bridge Problems... 
Message-Id: <20041213104808.175afc52@dxpl.pdx.osdl.net> In-Reply-To: <41B9C4E0.40407@novolab.de> References: <41B9B6E9.7010600@novolab.de> <41B9C4E0.40407@novolab.de> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBDImgYn012324 X-archive-position: 12704 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev Read the FAQ http://bridge.sourceforge.net/faq.html ------------- It doesn't work with my Wireless card! This is a known problem, and it is not caused by the bridge code. Many wireless cards don't allow spoofing of the source address. It is a firmware restriction with some chipsets. You might find some information in the bridge mailing list archives to help. Has anyone found a way to get around Wavelan not allowing anything but its own MAC address? (answer by Michael Renzmann (mrenzmann at compulan.de)) Well, for 99% of computer users there will never be a way to get rid of this. For this function a special firmware is needed. This firmware can be loaded into the RAM of any WaveLAN card, so it could do its job with bridging. But there is no documentation on the interface available to the public. The only way to achieve this is to have a full version of the hcf library which controls every function of the card and also allows accessing the card´s RAM. To get this full version Lucent wants to know that it will be a financial win for them, also you have to sign an NDA. 
So you will most probably not get access to this piece of software unless Lucent changes its mind on this (which I doubt will ever happen).

If you urgently need a wireless LAN card that is able to bridge, you should use one of those with the Prism chipset onboard (manufactured by Harris Intersil). There are drivers for those cards available at www.linux-wlan.com (which is the website from Absoval), and I found a mail that says the necessary firmware and an upload tool for Linux are available to the public. If you need additional features of an access point, you should also talk to Absoval.

I still don't understand!!

(answer by Mark S. Mathews (mark at absoval.com))

Bridging Ethernet (v2 or 802.3) is predicated on the ability of a station to transmit frames with a source address (SA) other than its own. This is possible because Ethernet uses a 'transmit and forget'/stateless transmission model. This isn't possible with 'normal' 802.11 station cards and software because 802.11 station mode doesn't allow the transmission of frames with someone else's source address.

The primary reason is that 802.11 is an acknowledged protocol. If you transmit a frame with someone else's source address, the ACK will never come back to you. The ACK will be sent to the station whose source address you used.

There are ways to make it work (that's how I earn a living ;-), but it is not always straightforward and you probably won't get it right without a pretty solid understanding of 802.11, its modes, and the frame header format.
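
For what it's worth, the ACK argument in the FAQ answer can be played through in a few lines of userspace C. This is only a toy model of the reasoning quoted above, not driver or firmware code: struct station, deliver_ack() and transmit() are made-up names, and real 802.11 addressing has more fields than the single source address modelled here.

```c
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical model of a wireless station: its own MAC and an ACK counter. */
struct station {
	unsigned char mac[MAC_LEN];
	int acks_received;
};

/*
 * Per the FAQ, the ACK goes to whichever station owns the frame's source
 * address. Returns that station, or NULL if no wireless station owns it
 * (e.g. the address belongs to a wired host behind the bridge).
 */
struct station *deliver_ack(struct station *stations, int n,
			    const unsigned char *sa)
{
	for (int i = 0; i < n; i++) {
		if (memcmp(stations[i].mac, sa, MAC_LEN) == 0) {
			stations[i].acks_received++;
			return &stations[i];
		}
	}
	return NULL;
}

/*
 * "Transmit" a frame with source address sa from sender. The transmission
 * only counts as acknowledged if the ACK comes back to the sender itself.
 */
bool transmit(struct station *stations, int n, struct station *sender,
	      const unsigned char *sa)
{
	return deliver_ack(stations, n, sa) == sender;
}
```

A station sending with its own address sees the ACK; a station bridging a frame on behalf of some other address never does, which is the failure the FAQ describes.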
From roland@topspin.com Mon Dec 13 10:50:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:50:27 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDInxkK012782 for ; Mon, 13 Dec 2004 10:50:20 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:49:37 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 10:49:37 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CdvGK-0005eH-Lk; Mon, 13 Dec 2004 10:49:37 -0800 To: Tom Duffy Cc: Linux Kernel Mailing List , netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <20041213109.JT1ejUdkRIUXbWOm@topspin.com> <1102963464.9258.11.camel@duffman> From: Roland Dreier Date: Mon, 13 Dec 2004 10:49:36 -0800 In-Reply-To: <1102963464.9258.11.camel@duffman> (Tom Duffy's message of "Mon, 13 Dec 2004 10:44:24 -0800") Message-ID: <52mzwi58zj.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand) driver Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 13 Dec 2004 18:49:37.0130 (UTC) FILETIME=[7E9A8CA0:01C4E144] X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12705 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Tom> Is there a reason why you put this in in an 
earlier patch and Tom> then take it out later? I guess the reasons are stupidity and bad patch scripts... Doesn't hurt for now, will be fixed in future versions. - R. From tgraf@suug.ch Mon Dec 13 10:52:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:52:37 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDIq7EL013315 for ; Mon, 13 Dec 2004 10:52:27 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4926AF; Mon, 13 Dec 2004 19:51:20 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 902F11C0EA; Mon, 13 Dec 2004 19:52:03 +0100 (CET) Date: Mon, 13 Dec 2004 19:52:03 +0100 From: Thomas Graf To: Patrick McHardy Cc: "David S. Miller" , Herbert Xu , netdev@oss.sgi.com Subject: Re: [RFC] tcf_bind_filter failure handling Message-ID: <20041213185203.GF8493@postel.suug.ch> References: <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> <41BC8819.7040501@trash.net> <20041213165302.GE8493@postel.suug.ch> <41BDDB5A.9000907@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BDDB5A.9000907@trash.net> X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12706 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41BDDB5A.9000907@trash.net> 2004-12-13 19:11 > Thomas Graf wrote: > > >The handling of a failure in 
tcf_bind_filter is inconsistent.
> >
> >u32: ignore
> >fw: ignore
> >route: ignore
> >rsvp: ignore
> >tcindex: error
> >
> >It might be a good idea to make this consistent. So in order to validate
> >the classid before making any changes we could simply lock it via get
> >(see patch below), return an error if it fails and put it back in case
> >of an error further in the path or after binding the filter.
> >
> >Bindings not only locks the class from removal while a filter is
> >pointing to it. It speeds up classyfing by saving a lookup for every
> >tc_classify call. It's not really a problem if the class is not locked,
> >the qdisc will look it up and falls back to a default class if it
> >doesn't exists so it's rather a cosmetic/policy thing.
> >
> You should just fix tcindex not to care about errors in tcf_bind_filter.
> bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter
> to get, but others (HTB, HFSC, CBQ) use a seperate counter because it is
> legal to end up with a refcnt > 0 after delete. When a class with filters
> pointing to it is tried to destroy they return -EBUSY, which can't be done
> by looking at the refcnt.

Little misunderstanding here. I'm not aiming at replacing tcf_bind_filter
with get. My question is rather whether to regard tcf_bind_filter not setting
tcf_result->class as an error or to ignore it.

I'm all for ignoring it in tcindex. It requires some changes, because
tcindex checks the tcf_result.class field to see whether a hash bucket is
non-empty when a perfect hash is used, but that is not a problem at all.

The tcf_class_get/put would be required to ensure proper locking during
validation of parameters in cases where validating the classid last, just
before changing things, doesn't make sense because expensive operations
needed before binding would have to be undone.

I will fix tcindex, since you also agree on simply ignoring it and regarding
the binding as an optional locking and performance improvement offered to
userspace.
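
To make the "ignore it" policy concrete, here is a rough userspace sketch of the behaviour described in this thread: binding is best effort, and classification falls back to a lookup and then to a default class. It is not the kernel code; the toy_* types and functions are invented for illustration and only mimic the idea of tcf_bind_filter leaving the class unset.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a class, a qdisc and a filter. */
struct toy_class {
	unsigned int classid;
	int bound_filters;	/* separate bind counter, as HTB/HFSC/CBQ keep */
};

struct toy_qdisc {
	struct toy_class *classes;
	int nclasses;
	struct toy_class *default_class;
};

struct toy_filter {
	unsigned int classid;		/* what userspace asked for */
	struct toy_class *cached;	/* set only if the bind succeeded */
};

struct toy_class *toy_lookup(struct toy_qdisc *q, unsigned int classid)
{
	for (int i = 0; i < q->nclasses; i++)
		if (q->classes[i].classid == classid)
			return &q->classes[i];
	return NULL;
}

/* Best-effort bind: a missing class is not an error, the cache just stays empty. */
void toy_bind_filter(struct toy_qdisc *q, struct toy_filter *f,
		     unsigned int classid)
{
	f->classid = classid;
	f->cached = toy_lookup(q, classid);
	if (f->cached)
		f->cached->bound_filters++;
}

/* Use the cached class if bound, else look it up, else use the default class. */
struct toy_class *toy_classify(struct toy_qdisc *q, const struct toy_filter *f)
{
	struct toy_class *cl = f->cached;

	if (!cl)
		cl = toy_lookup(q, f->classid);
	if (!cl)
		cl = q->default_class;
	return cl;
}
```

With this shape a filter can point at a classid that does not exist yet; packets go to the default class until the class is added, and the only cost of the failed bind is the extra lookup per classification.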
From tduffy@sun.com Mon Dec 13 10:56:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:56:21 -0800 (PST) Received: from nwkea-mail-1.sun.com (nwkea-mail-1.sun.com [192.18.42.13]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDItsSm014003 for ; Mon, 13 Dec 2004 10:56:14 -0800 Received: from duffman.sfbay.sun.com ([129.146.95.108]) by nwkea-mail-1.sun.com (8.12.10/8.12.9) with ESMTP id iBDItUiI020217; Mon, 13 Dec 2004 10:55:31 -0800 (PST) Received: from duffman.sfbay.sun.com (localhost.localdomain [127.0.0.1]) by duffman.sfbay.sun.com (8.13.1/8.12.11) with ESMTP id iBDIsTQq008763; Mon, 13 Dec 2004 10:54:29 -0800 Received: (from tduffy@localhost) by duffman.sfbay.sun.com (8.13.1/8.12.11/Submit) id iBDIsTKI008762; Mon, 13 Dec 2004 10:54:29 -0800 X-Authentication-Warning: duffman.sfbay.sun.com: tduffy set sender to tduffy@sun.com using -f Subject: Re: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand) driver From: Tom Duffy To: Roland Dreier Cc: Linux Kernel Mailing List , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <52mzwi58zj.fsf@topspin.com> References: <20041213109.JT1ejUdkRIUXbWOm@topspin.com> <1102963464.9258.11.camel@duffman> <52mzwi58zj.fsf@topspin.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-Yk6Oa+xZH+Icmf/OE6Jd" Date: Mon, 13 Dec 2004 10:54:29 -0800 Message-Id: <1102964069.9258.20.camel@duffman> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12707 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tduffy@sun.com Precedence: bulk X-list: netdev --=-Yk6Oa+xZH+Icmf/OE6Jd Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Mon, 2004-12-13 at 10:49 -0800, Roland Dreier wrote: > Tom> Is there a reason why you put this 
in in an earlier patch and > Tom> then take it out later? >=20 > I guess the reasons are stupidity and bad patch scripts... >=20 > Doesn't hurt for now, will be fixed in future versions. Speaking of nits, there are also some formatting issues with the Makefiles that changes in the later patches... But, the end result is you get the "correct" formatting if you apply all the patches. -tduffy --=-Yk6Oa+xZH+Icmf/OE6Jd Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQBBveVkdY502zjzwbwRAkGbAJ9Y7nYIijNZSAB28LyYyB6UeuopUgCgokCn V7J0vqX7XrxA2MFpGbBVJ2g= =xZFZ -----END PGP SIGNATURE----- --=-Yk6Oa+xZH+Icmf/OE6Jd-- From mmadore@blarg.net Mon Dec 13 10:58:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 10:58:15 -0800 (PST) Received: from beaker.blarg.net (beaker.blarg.net [206.124.128.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDIvlNj014508 for ; Mon, 13 Dec 2004 10:58:07 -0800 Received: from www.blargmail.com (localhost [127.0.0.1]) by beaker.blarg.net (Postfix) with ESMTP id 2E1433BECC for ; Mon, 13 Dec 2004 10:57:20 -0800 (PST) Received: from phpmailer ([69.225.172.73]) by www.blargmail.com with HTTPS (PHPMailer); Mon, 13 Dec 2004 10:57:20 -0800 Date: Mon, 13 Dec 2004 10:57:20 -0800 To: netdev@oss.sgi.com From: mmadore Subject: 2.4.28 neighbour/arp problem Message-ID: <9b6c67c48842b0cd62ac6aeda45088d5@www.blargmail.com> X-Priority: 3 X-Mailer: Blarg's Communications Control Center v4.4.0 X-Mailer-Info: http://www.blarg.net/ X-CCCUser: mmadore MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="b1_9b6c67c48842b0cd62ac6aeda45088d5" X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12708 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: mmadore@blarg.net Precedence: bulk X-list: netdev --b1_9b6c67c48842b0cd62ac6aeda45088d5 Content-Type: text/plain; charset = "iso-8859-1" Content-Transfer-Encoding: 8bit Hi, I am seeing some strange behavior with the changes to the neighbour/arp code that went into the 2.4.28 kernel. When I first boot the system with 2.4.28, I cannot connect to other machines on my network. If I print out the arp table, I will see something like this: [root@node1 root]# arp Address HWtype HWaddress Flags Mask Ifaceiserver.mynet.com (incomplete) eth0 dns.mynet.com ether 00:90:27:34:5F:6A C eth0 iserver.mynet.com ether 00:E0:81:04:DC:F0 C eth0 For some reason, the system I am trying to connect to shows up twice. If I bring down eth0 and then start it again, I can then connect to the system. Also, If I wait for a number of minutes, the (incomplete) entry will disappear from the arp table and I can connect to the system. If I apply the attached path to backout the arp/neighbour changes, then everthing seems to work fine. The problem only occurs with 2.4.28. Both 2.4.27 and 2.6.9 work correctly. Also, I have reproduced this issue on RedHat 7.3, 8.0, 9 and Fedora Core 1 (using 2.4.28) with both Intel e1000 and Marvell gigabit (sk98lin). Let me know if I can provide additional information. Mike --b1_9b6c67c48842b0cd62ac6aeda45088d5 Content-Type: text/html; charset = "iso-8859-1" Content-Transfer-Encoding: 8bit Hi,

I am seeing some strange behavior with the changes to the neighbour/arp code that went into the 2.4.28 kernel.  When I first boot the system with 2.4.28, I cannot connect to other machines on my network.  If I print out the arp table, I will see something like this:

[root@node1 root]# arp
Address                  HWtype  HWaddress           Flags Mask            Iface
iserver.mynet.com                (incomplete)                              eth0
dns.mynet.com            ether   00:90:27:34:5F:6A   C                     eth0
iserver.mynet.com        ether   00:E0:81:04:DC:F0   C                     eth0

For some reason, the system I am trying to connect to shows up twice.  If I bring down eth0 and then start it again, I can then connect to the system.  Also, if I wait for a number of minutes, the (incomplete) entry will disappear from the arp table and I can connect to the system.

If I apply the attached patch to back out the arp/neighbour changes, then everything seems to work fine.

The problem only occurs with 2.4.28.  Both 2.4.27 and 2.6.9 work correctly.  Also, I have reproduced this issue on RedHat 7.3, 8.0, 9 and Fedora Core 1 (using 2.4.28) with both Intel e1000 and Marvell gigabit (sk98lin).

Let me know if I can provide additional information.

Mike
--b1_9b6c67c48842b0cd62ac6aeda45088d5-- From roland@topspin.com Mon Dec 13 11:01:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 11:01:38 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDJ1B47015463 for ; Mon, 13 Dec 2004 11:01:31 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 11:00:49 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 11:00:48 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CdvRA-0005hD-9q; Mon, 13 Dec 2004 11:00:48 -0800 To: Tom Duffy Cc: Linux Kernel Mailing List , netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <20041213109.JT1ejUdkRIUXbWOm@topspin.com> <1102963464.9258.11.camel@duffman> <52mzwi58zj.fsf@topspin.com> <1102964069.9258.20.camel@duffman> From: Roland Dreier Date: Mon, 13 Dec 2004 11:00:48 -0800 In-Reply-To: <1102964069.9258.20.camel@duffman> (Tom Duffy's message of "Mon, 13 Dec 2004 10:54:29 -0800") Message-ID: <52ekhu58gv.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand) driver Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 13 Dec 2004 19:00:48.0849 (UTC) FILETIME=[0EFABC10:01C4E146] X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12709 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
roland@topspin.com Precedence: bulk X-list: netdev Tom> Speaking of nits, there are also some formatting issues with Tom> the Makefiles that changes in the later patches... Thanks, I've fixed those intermediate versions as well. - R. From shemminger@osdl.org Mon Dec 13 11:09:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 11:09:34 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDJ924m016189 for ; Mon, 13 Dec 2004 11:09:27 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iBDJ8S910611; Mon, 13 Dec 2004 11:08:28 -0800 Date: Mon, 13 Dec 2004 11:08:28 -0800 From: Stephen Hemminger To: Patrick McHardy Cc: "David S. Miller" , netem@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] netem: restart device after inserting packets Message-Id: <20041213110828.2af5d0e1@dxpl.pdx.osdl.net> In-Reply-To: <41B91901.3070304@trash.net> References: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> <20041208210031.63f0963f.davem@davemloft.net> <41B91901.3070304@trash.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12710 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 04:33:21 +0100 Patrick McHardy wrote: > David S. Miller wrote: > > >On Wed, 8 Dec 2004 12:31:03 -0800 > >Stephen Hemminger wrote: > > > > > >>The version of netem in 2.6.10 moves packets from the delayed queue > >>to the qdisc in a timer interrupt. But it forgot to force the device to > >>pick them up. 
> >> > >>Signed-off-by: Stephen Hemminger > >> > > > >Good spotting. Applied, thanks Stephen. > > > The patch is incomplete, netem may dequeue multiple packets from > the delayed queue at once and feed them to the inner queue, but > qdisc_restart will only dequeue one packet from the inner queue. > This patch moves qdisc_run back to include/net/pkt_sched.h and > replaces qdisc_restart by qdisc_run in netem_watchdog. Yes, I wasn't running big enough delays to notice. From kaber@trash.net Mon Dec 13 11:13:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 11:14:07 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDJDcAK016815 for ; Mon, 13 Dec 2004 11:13:59 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cdvce-0002zk-Nh; Mon, 13 Dec 2004 20:12:40 +0100 Message-ID: <41BDE9A8.9080505@trash.net> Date: Mon, 13 Dec 2004 20:12:40 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S. 
Miller" , Herbert Xu , netdev@oss.sgi.com Subject: Re: [RFC] tcf_bind_filter failure handling References: <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> <41BC8819.7040501@trash.net> <20041213165302.GE8493@postel.suug.ch> <41BDDB5A.9000907@trash.net> <20041213185203.GF8493@postel.suug.ch> In-Reply-To: <20041213185203.GF8493@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12712 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: >* Patrick McHardy <41BDDB5A.9000907@trash.net> 2004-12-13 19:11 > > >>You should just fix tcindex not to care about errors in tcf_bind_filter. >>bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter >>to get, but others (HTB, HFSC, CBQ) use a separate counter because it is >>legal to end up with a refcnt > 0 after delete. When an attempt is made to >>destroy a class with filters pointing to it, they return -EBUSY, which can't >>be done by looking at the refcnt. >> >> > >Little misunderstanding here. I'm not aiming at replacing tcf_bind_filter >with get. My question is rather whether to regard tcf_bind_filter not setting >tcf_result->class as an error or ignore it. > >I'm all for ignoring it in tcindex, it requires some changes because >it checks tcf_result.class field to see if hash bucket is non-empty if >perfect hash is used but is not a problem at all. 
> >The tcf_class_get/put would be required to ensure proper locking during >validation of parameters if validating the classid being last before >changing things doesn't make sense due to the need to undo expensive >operations required before binding. > >I will fix tcindex, since you also agree on simply ignoring it and regard >the binding as an optional locking and performance increase possibility >given to userspace. > Yes, it should be ignored; otherwise you can't point a filter to a class you will add later. Regards Patrick From mmadore@blarg.net Mon Dec 13 11:13:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 11:14:00 -0800 (PST) Received: from beaker.blarg.net (beaker.blarg.net [206.124.128.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDJDVto016805 for ; Mon, 13 Dec 2004 11:13:51 -0800 Received: from www.blargmail.com (localhost [127.0.0.1]) by beaker.blarg.net (Postfix) with ESMTP id A976F3BECC for ; Mon, 13 Dec 2004 11:13:03 -0800 (PST) Received: from phpmailer ([69.225.172.73]) by www.blargmail.com with HTTPS (PHPMailer); Mon, 13 Dec 2004 11:13:03 -0800 Date: Mon, 13 Dec 2004 11:13:03 -0800 To: netdev@oss.sgi.com From: mmadore Subject: Re: 2.4.28 neighbour/arp problem Message-ID: X-Priority: 3 X-Mailer: Blarg's Communications Control Center v4.4.0 X-Mailer-Info: http://www.blarg.net/ X-CCCUser: mmadore MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="b1_da999e80abd36631c9c60f8e897c4cde" X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12711 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mmadore@blarg.net Precedence: bulk X-list: netdev --b1_da999e80abd36631c9c60f8e897c4cde Content-Type: text/plain; charset = "iso-8859-1" Content-Transfer-Encoding: 8bit Sorry for responding to my own post, but the attachment didn't get attached. 
Mike On Mon, Dec 13, 2004 at 10:57am mmadore wrote: > Hi, > > I am seeing some strange behavior with the changes to the neighbour/arp code that went into the 2.4.28 kernel. When I first boot the system with 2.4.28, I cannot connect to other machines on my network. If I print out the arp table, I will see something like this: > > [root@node1 root]# arp > Address HWtype HWaddress Flags Mask Iface > iserver.mynet.com (incomplete) eth0 > dns.mynet.com ether 00:90:27:34:5F:6A C eth0 > iserver.mynet.com ether 00:E0:81:04:DC:F0 C eth0 > > For some reason, the system I am trying to connect to shows up twice. If I bring down eth0 and then start it again, I can then connect to the system. Also, if I wait for a number of minutes, the (incomplete) entry will disappear from the arp table and I can connect to the system. > > If I apply the attached patch to back out the arp/neighbour changes, then everything seems to work fine. > > The problem only occurs with 2.4.28. Both 2.4.27 and 2.6.9 work correctly. Also, I have reproduced this issue on RedHat 7.3, 8.0, 9 and Fedora Core 1 (using 2.4.28) with both Intel e1000 and Marvell gigabit (sk98lin). > > Let me know if I can provide additional information. 
> > Mike > > --b1_da999e80abd36631c9c60f8e897c4cde Content-Type: text/x-patch; name="arp-hash-backout.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="arp-hash-backout.patch" ZGlmZiAtTnVyIGxpbnV4LTIuNC4yOC9pbmNsdWRlL25ldC9uZWlnaGJvdXIuaCBsaW51eC0yLjQu MjgubmV3L2luY2x1ZGUvbmV0L25laWdoYm91ci5oCi0tLSBsaW51eC0yLjQuMjgvaW5jbHVkZS9u ZXQvbmVpZ2hib3VyLmgJMjAwNC0xMS0xNyAxMjoxNTozMS4wMDAwMDAwMDAgLTA1MDAKKysrIGxp bnV4LTIuNC4yOC5uZXcvaW5jbHVkZS9uZXQvbmVpZ2hib3VyLmgJMjAwNC0xMi0wOSAwOTo1NDoy Ny4wMDAwMDAwMDAgLTA1MDAKQEAgLTUsMTMgKzUsOCBAQAogICoJR2VuZXJpYyBuZWlnaGJvdXIg bWFuaXB1bGF0aW9uCiAgKgogICoJQXV0aG9yczoKLSAqCVBlZHJvIFJvcXVlCQk8cGVkcm9fbUB5 YWhvby5jb20+CisgKglQZWRybyBSb3F1ZQkJPHJvcXVlQGRpLmZjLnVsLnB0PgogICoJQWxleGV5 IEt1em5ldHNvdgk8a3V6bmV0QG1zMi5pbnIuYWMucnU+Ci0gKgotICogCUNoYW5nZXM6Ci0gKgot ICoJSGFyYWxkIFdlbHRlOgkJPGxhZm9yZ2VAZ251bW9ua3Mub3JnPgotICoJCS0gQWRkIG5laWdo Ym91ciBjYWNoZSBzdGF0aXN0aWNzIGxpa2UgcnRzdGF0CiAgKi8KIAogLyogVGhlIGZvbGxvd2lu ZyBmbGFncyAmIHN0YXRlcyBhcmUgZXhwb3J0ZWQgdG8gdXNlciBzcGFjZSwKQEAgLTUwLDcgKzQ1 LDYgQEAKIAogI2luY2x1ZGUgPGFzbS9hdG9taWMuaD4KICNpbmNsdWRlIDxsaW51eC9za2J1ZmYu aD4KLSNpbmNsdWRlIDxsaW51eC9zZXFfZmlsZS5oPgogCiAjZGVmaW5lIE5VRF9JTl9USU1FUgko TlVEX0lOQ09NUExFVEV8TlVEX0RFTEFZfE5VRF9QUk9CRSkKICNkZWZpbmUgTlVEX1ZBTElECShO VURfUEVSTUFORU5UfE5VRF9OT0FSUHxOVURfUkVBQ0hBQkxFfE5VRF9QUk9CRXxOVURfU1RBTEV8 TlVEX0RFTEFZKQpAQCAtODQsMjUgKzc4LDEyIEBACiAKIHN0cnVjdCBuZWlnaF9zdGF0aXN0aWNz CiB7Ci0JdW5zaWduZWQgbG9uZyBhbGxvY3M7CQkvKiBudW1iZXIgb2YgYWxsb2NhdGVkIG5laWdo cyAqLwotCXVuc2lnbmVkIGxvbmcgZGVzdHJveXM7CQkvKiBudW1iZXIgb2YgZGVzdHJveWVkIG5l aWdocyAqLwotCXVuc2lnbmVkIGxvbmcgaGFzaF9ncm93czsJLyogbnVtYmVyIG9mIGhhc2ggcmVz aXplcyAqLwotCi0JdW5zaWduZWQgbG9uZyByZXNfZmFpbGVkOwkvKiBub21iZXIgb2YgZmFpbGVk IHJlc29sdXRpb25zICovCi0KLQl1bnNpZ25lZCBsb25nIGxvb2t1cHM7CQkvKiBudW1iZXIgb2Yg bG9va3VwcyAqLwotCXVuc2lnbmVkIGxvbmcgaGl0czsJCS8qIG51bWJlciBvZiBoaXRzIChhbW9u ZyBsb29rdXBzKSAqLwotCi0JdW5zaWduZWQgbG9uZyByY3ZfcHJvYmVzX21jYXN0OwkvKiBudW1i 
ZXIgb2YgcmVjZWl2ZWQgbWNhc3QgaXB2NiAqLwotCXVuc2lnbmVkIGxvbmcgcmN2X3Byb2Jlc191 Y2FzdDsgLyogbnVtYmVyIG9mIHJlY2VpdmVkIHVjYXN0IGlwdjYgKi8KLQotCXVuc2lnbmVkIGxv bmcgcGVyaW9kaWNfZ2NfcnVuczsJLyogbnVtYmVyIG9mIHBlcmlvZGljIEdDIHJ1bnMgKi8KLQl1 bnNpZ25lZCBsb25nIGZvcmNlZF9nY19ydW5zOwkvKiBudW1iZXIgb2YgZm9yY2VkIEdDIHJ1bnMg Ki8KKwl1bnNpZ25lZCBsb25nIGFsbG9jczsKKwl1bnNpZ25lZCBsb25nIHJlc19mYWlsZWQ7CisJ dW5zaWduZWQgbG9uZyByY3ZfcHJvYmVzX21jYXN0OworCXVuc2lnbmVkIGxvbmcgcmN2X3Byb2Jl c191Y2FzdDsKIH07CiAKLSNkZWZpbmUgTkVJR0hfQ0FDSEVfU1RBVF9JTkModGJsLCBmaWVsZCkJ CQkJXAotCQkoKHRibCktPnN0YXRzW3NtcF9wcm9jZXNzb3JfaWQoKV0uZmllbGQrKykKLQogc3Ry dWN0IG5laWdoYm91cgogewogCXN0cnVjdCBuZWlnaGJvdXIJKm5leHQ7CkBAIC0xNDcsNiArMTI4 LDkgQEAKIAl1OAkJCWtleVswXTsKIH07CiAKKyNkZWZpbmUgTkVJR0hfSEFTSE1BU0sJCTB4MUYK KyNkZWZpbmUgUE5FSUdIX0hBU0hNQVNLCQkweEYKKwogLyoKICAqCW5laWdoYm91ciB0YWJsZSBt YW5pcHVsYXRpb24KICAqLwpAQCAtMTc0LDIxICsxNTgsMTUgQEAKIAlzdHJ1Y3QgdGltZXJfbGlz dCAJZ2NfdGltZXI7CiAJc3RydWN0IHRpbWVyX2xpc3QgCXByb3h5X3RpbWVyOwogCXN0cnVjdCBz a19idWZmX2hlYWQJcHJveHlfcXVldWU7Ci0JYXRvbWljX3QJCWVudHJpZXM7CisJaW50CQkJZW50 cmllczsKIAlyd2xvY2tfdAkJbG9jazsKIAl1bnNpZ25lZCBsb25nCQlsYXN0X3JhbmQ7CiAJc3Ry dWN0IG5laWdoX3Bhcm1zCSpwYXJtc19saXN0OwogCWttZW1fY2FjaGVfdAkJKmttZW1fY2FjaGVw OwogCXN0cnVjdCB0YXNrbGV0X3N0cnVjdAlnY190YXNrOwotCXN0cnVjdCBuZWlnaF9zdGF0aXN0 aWNzCXN0YXRzW05SX0NQVVNdOwotCXN0cnVjdCBuZWlnaGJvdXIJKipoYXNoX2J1Y2tldHM7Ci0J dW5zaWduZWQgaW50CQloYXNoX21hc2s7Ci0JX191MzIJCQloYXNoX3JuZDsKLQl1bnNpZ25lZCBp bnQJCWhhc2hfY2hhaW5fZ2M7Ci0Jc3RydWN0IHBuZWlnaF9lbnRyeQkqKnBoYXNoX2J1Y2tldHM7 Ci0jaWZkZWYgQ09ORklHX1BST0NfRlMKLQlzdHJ1Y3QgcHJvY19kaXJfZW50cnkJKnBkZTsKLSNl bmRpZgorCXN0cnVjdCBuZWlnaF9zdGF0aXN0aWNzCXN0YXRzOworCXN0cnVjdCBuZWlnaGJvdXIJ Kmhhc2hfYnVja2V0c1tORUlHSF9IQVNITUFTSysxXTsKKwlzdHJ1Y3QgcG5laWdoX2VudHJ5CSpw aGFzaF9idWNrZXRzW1BORUlHSF9IQVNITUFTSysxXTsKIH07CiAKIGV4dGVybiB2b2lkCQkJbmVp Z2hfdGFibGVfaW5pdChzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCk7CkBAIC0xOTYsOCArMTc0LDYg 
QEAKIGV4dGVybiBzdHJ1Y3QgbmVpZ2hib3VyICoJbmVpZ2hfbG9va3VwKHN0cnVjdCBuZWlnaF90 YWJsZSAqdGJsLAogCQkJCQkgICAgIGNvbnN0IHZvaWQgKnBrZXksCiAJCQkJCSAgICAgc3RydWN0 IG5ldF9kZXZpY2UgKmRldik7Ci1leHRlcm4gc3RydWN0IG5laWdoYm91ciAqCW5laWdoX2xvb2t1 cF9ub2RldihzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCwKLQkJCQkJCSAgIGNvbnN0IHZvaWQgKnBr ZXkpOwogZXh0ZXJuIHN0cnVjdCBuZWlnaGJvdXIgKgluZWlnaF9jcmVhdGUoc3RydWN0IG5laWdo X3RhYmxlICp0YmwsCiAJCQkJCSAgICAgY29uc3Qgdm9pZCAqcGtleSwKIAkJCQkJICAgICBzdHJ1 Y3QgbmV0X2RldmljZSAqZGV2KTsKQEAgLTIyOSwyNCArMjA1LDYgQEAKIGV4dGVybiBpbnQgbmVp Z2hfZGVsZXRlKHN0cnVjdCBza19idWZmICpza2IsIHN0cnVjdCBubG1zZ2hkciAqbmxoLCB2b2lk ICphcmcpOwogZXh0ZXJuIHZvaWQgbmVpZ2hfYXBwX25zKHN0cnVjdCBuZWlnaGJvdXIgKm4pOwog Ci1leHRlcm4gdm9pZCBuZWlnaF9mb3JfZWFjaChzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCwgdm9p ZCAoKmNiKShzdHJ1Y3QgbmVpZ2hib3VyICosIHZvaWQgKiksIHZvaWQgKmNvb2tpZSk7Ci1leHRl cm4gdm9pZCBfX25laWdoX2Zvcl9lYWNoX3JlbGVhc2Uoc3RydWN0IG5laWdoX3RhYmxlICp0Ymws IGludCAoKmNiKShzdHJ1Y3QgbmVpZ2hib3VyICopKTsKLWV4dGVybiB2b2lkIHBuZWlnaF9mb3Jf ZWFjaChzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCwgdm9pZCAoKmNiKShzdHJ1Y3QgcG5laWdoX2Vu dHJ5ICopKTsKLQotc3RydWN0IG5laWdoX3NlcV9zdGF0ZSB7Ci0Jc3RydWN0IG5laWdoX3RhYmxl ICp0Ymw7Ci0Jdm9pZCAqKCpuZWlnaF9zdWJfaXRlcikoc3RydWN0IG5laWdoX3NlcV9zdGF0ZSAq c3RhdGUsCi0JCQkJc3RydWN0IG5laWdoYm91ciAqbiwgbG9mZl90ICpwb3MpOwotCXVuc2lnbmVk IGludCBidWNrZXQ7Ci0JdW5zaWduZWQgaW50IGZsYWdzOwotI2RlZmluZSBORUlHSF9TRVFfTkVJ R0hfT05MWQkweDAwMDAwMDAxCi0jZGVmaW5lIE5FSUdIX1NFUV9JU19QTkVJR0gJMHgwMDAwMDAw MgotI2RlZmluZSBORUlHSF9TRVFfU0tJUF9OT0FSUAkweDAwMDAwMDA0Ci19OwotZXh0ZXJuIHZv aWQgKm5laWdoX3NlcV9zdGFydChzdHJ1Y3Qgc2VxX2ZpbGUgKiwgbG9mZl90ICosIHN0cnVjdCBu ZWlnaF90YWJsZSAqLCB1bnNpZ25lZCBpbnQpOwotZXh0ZXJuIHZvaWQgKm5laWdoX3NlcV9uZXh0 KHN0cnVjdCBzZXFfZmlsZSAqLCB2b2lkICosIGxvZmZfdCAqKTsKLWV4dGVybiB2b2lkIG5laWdo X3NlcV9zdG9wKHN0cnVjdCBzZXFfZmlsZSAqLCB2b2lkICopOwotCiBleHRlcm4gaW50CQkJbmVp Z2hfc3lzY3RsX3JlZ2lzdGVyKHN0cnVjdCBuZXRfZGV2aWNlICpkZXYsIHN0cnVjdCBuZWlnaF9w 
YXJtcyAqcCwKIAkJCQkJCSAgICAgIGludCBwX2lkLCBpbnQgcGRldl9pZCwgY2hhciAqcF9uYW1l KTsKIGV4dGVybiB2b2lkCQkJbmVpZ2hfc3lzY3RsX3VucmVnaXN0ZXIoc3RydWN0IG5laWdoX3Bh cm1zICpwKTsKZGlmZiAtTnVyIGxpbnV4LTIuNC4yOC9uZXQvY29yZS9uZWlnaGJvdXIuYyBsaW51 eC0yLjQuMjgubmV3L25ldC9jb3JlL25laWdoYm91ci5jCi0tLSBsaW51eC0yLjQuMjgvbmV0L2Nv cmUvbmVpZ2hib3VyLmMJMjAwNC0xMS0xNyAxMjoxNTozMi4wMDAwMDAwMDAgLTA1MDAKKysrIGxp bnV4LTIuNC4yOC5uZXcvbmV0L2NvcmUvbmVpZ2hib3VyLmMJMjAwNC0xMi0wOSAwOTo1NDoyNy4w MDAwMDAwMDAgLTA1MDAKQEAgLTIsNyArMiw3IEBACiAgKglHZW5lcmljIGFkZHJlc3MgcmVzb2x1 dGlvbiBlbnRpdHkKICAqCiAgKglBdXRob3JzOgotICoJUGVkcm8gUm9xdWUJCTxwZWRyb19tQHlh aG9vLmNvbT4KKyAqCVBlZHJvIFJvcXVlCQk8cm9xdWVAZGkuZmMudWwucHQ+CiAgKglBbGV4ZXkg S3V6bmV0c292CTxrdXpuZXRAbXMyLmluci5hYy5ydT4KICAqCiAgKglUaGlzIHByb2dyYW0gaXMg ZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yCkBAIC0xMiwxOCAr MTIsMTQgQEAKICAqCiAgKglGaXhlczoKICAqCVZpdGFseSBFLiBMYXZyb3YJcmVsZWFzaW5nIE5V TEwgbmVpZ2hib3IgaW4gbmVpZ2hfYWRkLgotICoJSGFyYWxkIFdlbHRlCQlBZGQgbmVpZ2hib3Vy IGNhY2hlIHN0YXRpc3RpY3MgbGlrZSBydHN0YXQKLSAqCUhhcmFsZCBXZWx0ZQkJcG9ydCBuZWln aGJvdXIgY2FjaGUgcmV3b3JrIGZyb20gMi42LjktcmNYCiAgKi8KIAogI2luY2x1ZGUgPGxpbnV4 L2NvbmZpZy5oPgogI2luY2x1ZGUgPGxpbnV4L3R5cGVzLmg+CiAjaW5jbHVkZSA8bGludXgva2Vy bmVsLmg+Ci0jaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+CiAjaW5jbHVkZSA8bGludXgvc29ja2V0 Lmg+CiAjaW5jbHVkZSA8bGludXgvc2NoZWQuaD4KICNpbmNsdWRlIDxsaW51eC9uZXRkZXZpY2Uu aD4KLSNpbmNsdWRlIDxsaW51eC9wcm9jX2ZzLmg+CiAjaWZkZWYgQ09ORklHX1NZU0NUTAogI2lu Y2x1ZGUgPGxpbnV4L3N5c2N0bC5oPgogI2VuZGlmCkBAIC0zMSw4ICsyNyw2IEBACiAjaW5jbHVk ZSA8bmV0L2RzdC5oPgogI2luY2x1ZGUgPG5ldC9zb2NrLmg+CiAjaW5jbHVkZSA8bGludXgvcnRu ZXRsaW5rLmg+Ci0jaW5jbHVkZSA8bGludXgvcmFuZG9tLmg+Ci0jaW5jbHVkZSA8bGludXgvbW9k dWxlLmg+CiAKICNkZWZpbmUgTkVJR0hfREVCVUcgMQogCkBAIC01MSw4ICs0NSw2IEBACiAjZGVm aW5lIE5FSUdIX1BSSU5USzIgTkVJR0hfUFJJTlRLCiAjZW5kaWYKIAotI2RlZmluZSBQTkVJR0hf SEFTSE1BU0sJCTB4RgotCiBzdGF0aWMgdm9pZCBuZWlnaF90aW1lcl9oYW5kbGVyKHVuc2lnbmVk 
IGxvbmcgYXJnKTsKICNpZmRlZiBDT05GSUdfQVJQRAogc3RhdGljIHZvaWQgbmVpZ2hfYXBwX25v dGlmeShzdHJ1Y3QgbmVpZ2hib3VyICpuKTsKQEAgLTYyLDcgKzU0LDYgQEAKIAogc3RhdGljIGlu dCBuZWlnaF9nbGJsX2FsbG9jczsKIHN0YXRpYyBzdHJ1Y3QgbmVpZ2hfdGFibGUgKm5laWdoX3Rh YmxlczsKLXN0YXRpYyBzdHJ1Y3QgZmlsZV9vcGVyYXRpb25zIG5laWdoX3N0YXRfc2VxX2ZvcHM7 CiAKIC8qCiAgICBOZWlnaGJvdXIgaGFzaCB0YWJsZSBidWNrZXRzIGFyZSBwcm90ZWN0ZWQgd2l0 aCByd2xvY2sgdGJsLT5sb2NrLgpAQCAtMTIwLDIxICsxMTEsMjcgQEAKIAlpbnQgc2hydW5rID0g MDsKIAlpbnQgaTsKIAotCU5FSUdIX0NBQ0hFX1NUQVRfSU5DKHRibCwgZm9yY2VkX2djX3J1bnMp OwotCi0Jd3JpdGVfbG9ja19iaCgmdGJsLT5sb2NrKTsKLQlmb3IgKGkgPSAwOyBpIDw9IHRibC0+ aGFzaF9tYXNrOyBpKyspIHsKKwlmb3IgKGk9MDsgaTw9TkVJR0hfSEFTSE1BU0s7IGkrKykgewog CQlzdHJ1Y3QgbmVpZ2hib3VyICpuLCAqKm5wOwogCiAJCW5wID0gJnRibC0+aGFzaF9idWNrZXRz W2ldOworCQl3cml0ZV9sb2NrX2JoKCZ0YmwtPmxvY2spOwogCQl3aGlsZSAoKG4gPSAqbnApICE9 IE5VTEwpIHsKIAkJCS8qIE5laWdoYm91ciByZWNvcmQgbWF5IGJlIGRpc2NhcmRlZCBpZjoKLQkJ CSAqIC0gbm9ib2R5IHJlZmVycyB0byBpdC4KLQkJCSAqIC0gaXQgaXMgbm90IHBlcm1hbmVudAor CQkJICAgLSBub2JvZHkgcmVmZXJzIHRvIGl0LgorCQkJICAgLSBpdCBpcyBub3QgcGVybWFuZW50 CisJCQkgICAtIChORVcgYW5kIHByb2JhYmx5IHdyb25nKQorCQkJICAgICBJTkNPTVBMRVRFIGVu dHJpZXMgYXJlIGtlcHQgYXQgbGVhc3QgZm9yCisJCQkgICAgIG4tPnBhcm1zLT5yZXRyYW5zX3Rp bWUsIG90aGVyd2lzZSB3ZSBjb3VsZAorCQkJICAgICBmbG9vZCBuZXR3b3JrIHdpdGggcmVzb2x1 dGlvbiByZXF1ZXN0cy4KKwkJCSAgICAgSXQgaXMgbm90IGNsZWFyLCB3aGF0IGlzIGJldHRlciB0 YWJsZSBvdmVyZmxvdworCQkJICAgICBvciBmbG9vZGluZy4KIAkJCSAqLwogCQkJd3JpdGVfbG9j aygmbi0+bG9jayk7CiAJCQlpZiAoYXRvbWljX3JlYWQoJm4tPnJlZmNudCkgPT0gMSAmJgotCQkJ ICAgICEobi0+bnVkX3N0YXRlJk5VRF9QRVJNQU5FTlQpKSB7CisJCQkgICAgIShuLT5udWRfc3Rh dGUmTlVEX1BFUk1BTkVOVCkgJiYKKwkJCSAgICAobi0+bnVkX3N0YXRlICE9IE5VRF9JTkNPTVBM RVRFIHx8CisJCQkgICAgIGppZmZpZXMgLSBuLT51c2VkID4gbi0+cGFybXMtPnJldHJhbnNfdGlt ZSkpIHsKIAkJCQkqbnAgPSBuLT5uZXh0OwogCQkJCW4tPmRlYWQgPSAxOwogCQkJCXNocnVuayA9 IDE7CkBAIC0xNDUsMTIgKzE0MiwxMCBAQAogCQkJd3JpdGVfdW5sb2NrKCZuLT5sb2NrKTsKIAkJ 
CW5wID0gJm4tPm5leHQ7CiAJCX0KKwkJd3JpdGVfdW5sb2NrX2JoKCZ0YmwtPmxvY2spOwogCX0K IAkKIAl0YmwtPmxhc3RfZmx1c2ggPSBqaWZmaWVzOwotCi0Jd3JpdGVfdW5sb2NrX2JoKCZ0Ymwt PmxvY2spOwotCiAJcmV0dXJuIHNocnVuazsKIH0KIApAQCAtMTgxLDcgKzE3Niw3IEBACiAKIAl3 cml0ZV9sb2NrX2JoKCZ0YmwtPmxvY2spOwogCi0JZm9yIChpPTA7IGkgPD0gdGJsLT5oYXNoX21h c2s7IGkrKykgeworCWZvciAoaT0wOyBpIDw9IE5FSUdIX0hBU0hNQVNLOyBpKyspIHsKIAkJc3Ry dWN0IG5laWdoYm91ciAqbiwgKipucDsKIAogCQlucCA9ICZ0YmwtPmhhc2hfYnVja2V0c1tpXTsK QEAgLTIwOCw3ICsyMDMsNyBAQAogCiAJd3JpdGVfbG9ja19iaCgmdGJsLT5sb2NrKTsKIAotCWZv ciAoaSA9IDA7IGkgPD0gdGJsLT5oYXNoX21hc2s7IGkrKykgeworCWZvciAoaT0wOyBpPD1ORUlH SF9IQVNITUFTSzsgaSsrKSB7CiAJCXN0cnVjdCBuZWlnaGJvdXIgKm4sICoqbnA7CiAKIAkJbnAg PSAmdGJsLT5oYXNoX2J1Y2tldHNbaV07CkBAIC0yNTksMTEgKzI1NCwxMSBAQAogCXN0cnVjdCBu ZWlnaGJvdXIgKm47CiAJdW5zaWduZWQgbG9uZyBub3cgPSBqaWZmaWVzOwogCi0JaWYgKGF0b21p Y19yZWFkKCZ0YmwtPmVudHJpZXMpID4gdGJsLT5nY190aHJlc2gzIHx8Ci0JICAgIChhdG9taWNf cmVhZCgmdGJsLT5lbnRyaWVzKSA+IHRibC0+Z2NfdGhyZXNoMiAmJgorCWlmICh0YmwtPmVudHJp ZXMgPiB0YmwtPmdjX3RocmVzaDMgfHwKKwkgICAgKHRibC0+ZW50cmllcyA+IHRibC0+Z2NfdGhy ZXNoMiAmJgogCSAgICAgbm93IC0gdGJsLT5sYXN0X2ZsdXNoID4gNSpIWikpIHsKIAkJaWYgKG5l aWdoX2ZvcmNlZF9nYyh0YmwpID09IDAgJiYKLQkJICAgIGF0b21pY19yZWFkKCZ0YmwtPmVudHJp ZXMpID4gdGJsLT5nY190aHJlc2gzKQorCQkgICAgdGJsLT5lbnRyaWVzID4gdGJsLT5nY190aHJl c2gzKQogCQkJcmV0dXJuIE5VTEw7CiAJfQogCkBAIC0yODIsMTEzICsyNzcsMjkgQEAKIAlpbml0 X3RpbWVyKCZuLT50aW1lcik7CiAJbi0+dGltZXIuZnVuY3Rpb24gPSBuZWlnaF90aW1lcl9oYW5k bGVyOwogCW4tPnRpbWVyLmRhdGEgPSAodW5zaWduZWQgbG9uZyluOwotCU5FSUdIX0NBQ0hFX1NU QVRfSU5DKHRibCwgYWxsb2NzKTsKKwl0YmwtPnN0YXRzLmFsbG9jcysrOwogCW5laWdoX2dsYmxf YWxsb2NzKys7Ci0JYXRvbWljX2luYygmdGJsLT5lbnRyaWVzKTsKKwl0YmwtPmVudHJpZXMrKzsK IAluLT50YmwgPSB0Ymw7CiAJYXRvbWljX3NldCgmbi0+cmVmY250LCAxKTsKIAluLT5kZWFkID0g MTsKIAlyZXR1cm4gbjsKIH0KIAotc3RhdGljIHN0cnVjdCBuZWlnaGJvdXIgKipuZWlnaF9oYXNo X2FsbG9jKHVuc2lnbmVkIGludCBlbnRyaWVzKQotewotCXVuc2lnbmVkIGxvbmcgc2l6ZSA9IGVu 
dHJpZXMgKiBzaXplb2Yoc3RydWN0IG5laWdoYm91ciAqKTsKLQlzdHJ1Y3QgbmVpZ2hib3VyICoq cmV0OwotCi0JaWYgKHNpemUgPD0gUEFHRV9TSVpFKSB7Ci0JCXJldCA9IGttYWxsb2Moc2l6ZSwg R0ZQX0FUT01JQyk7Ci0JfSBlbHNlIHsKLQkJcmV0ID0gKHN0cnVjdCBuZWlnaGJvdXIgKiopCi0J CQlfX2dldF9mcmVlX3BhZ2VzKEdGUF9BVE9NSUMsIGdldF9vcmRlcihzaXplKSk7Ci0JfQotCWlm IChyZXQpCi0JCW1lbXNldChyZXQsIDAsIHNpemUpOwotCi0JcmV0dXJuIHJldDsKLX0KLQotc3Rh dGljIHZvaWQgbmVpZ2hfaGFzaF9mcmVlKHN0cnVjdCBuZWlnaGJvdXIgKipoYXNoLCB1bnNpZ25l ZCBpbnQgZW50cmllcykKLXsKLQl1bnNpZ25lZCBsb25nIHNpemUgPSBlbnRyaWVzICogc2l6ZW9m KHN0cnVjdCBuZWlnaGJvdXIgKik7Ci0KLQlpZiAoc2l6ZSA8PSBQQUdFX1NJWkUpCi0JCWtmcmVl KGhhc2gpOwotCWVsc2UKLQkJZnJlZV9wYWdlcygodW5zaWduZWQgbG9uZyloYXNoLCBnZXRfb3Jk ZXIoc2l6ZSkpOwotfQotCi1zdGF0aWMgdm9pZCBuZWlnaF9oYXNoX2dyb3coc3RydWN0IG5laWdo X3RhYmxlICp0YmwsIHVuc2lnbmVkIGxvbmcgbmV3X2VudHJpZXMpCi17Ci0Jc3RydWN0IG5laWdo Ym91ciAqKm5ld19oYXNoLCAqKm9sZF9oYXNoOwotCXVuc2lnbmVkIGludCBpLCBuZXdfaGFzaF9t YXNrLCBvbGRfZW50cmllczsKLQotCU5FSUdIX0NBQ0hFX1NUQVRfSU5DKHRibCwgaGFzaF9ncm93 cyk7Ci0KLQlCVUdfT04obmV3X2VudHJpZXMgJiAobmV3X2VudHJpZXMgLSAxKSk7Ci0JbmV3X2hh c2ggPSBuZWlnaF9oYXNoX2FsbG9jKG5ld19lbnRyaWVzKTsKLQlpZiAoIW5ld19oYXNoKQotCQly ZXR1cm47Ci0KLQlvbGRfZW50cmllcyA9IHRibC0+aGFzaF9tYXNrICsgMTsKLQluZXdfaGFzaF9t YXNrID0gbmV3X2VudHJpZXMgLSAxOwotCW9sZF9oYXNoID0gdGJsLT5oYXNoX2J1Y2tldHM7Ci0K LQlnZXRfcmFuZG9tX2J5dGVzKCZ0YmwtPmhhc2hfcm5kLCBzaXplb2YodGJsLT5oYXNoX3JuZCkp OwotCWZvciAoaSA9IDA7IGkgPCBvbGRfZW50cmllczsgaSsrKSB7Ci0JCXN0cnVjdCBuZWlnaGJv dXIgKm4sICpuZXh0OwotCi0JCWZvciAobiA9IG9sZF9oYXNoW2ldOyBuOyBuID0gbmV4dCkgewot CQkJdW5zaWduZWQgaW50IGhhc2hfdmFsID0gdGJsLT5oYXNoKG4tPnByaW1hcnlfa2V5LCBuLT5k ZXYpOwotCi0JCQloYXNoX3ZhbCAmPSBuZXdfaGFzaF9tYXNrOwotCQkJbmV4dCA9IG4tPm5leHQ7 Ci0KLQkJCW4tPm5leHQgPSBuZXdfaGFzaFtoYXNoX3ZhbF07Ci0JCQluZXdfaGFzaFtoYXNoX3Zh bF0gPSBuOwotCQl9Ci0JfQotCXRibC0+aGFzaF9idWNrZXRzID0gbmV3X2hhc2g7Ci0JdGJsLT5o YXNoX21hc2sgPSBuZXdfaGFzaF9tYXNrOwotCi0JbmVpZ2hfaGFzaF9mcmVlKG9sZF9oYXNoLCBv 
bGRfZW50cmllcyk7Ci19Ci0KIHN0cnVjdCBuZWlnaGJvdXIgKm5laWdoX2xvb2t1cChzdHJ1Y3Qg bmVpZ2hfdGFibGUgKnRibCwgY29uc3Qgdm9pZCAqcGtleSwKIAkJCSAgICAgICBzdHJ1Y3QgbmV0 X2RldmljZSAqZGV2KQogewogCXN0cnVjdCBuZWlnaGJvdXIgKm47CisJdTMyIGhhc2hfdmFsOwog CWludCBrZXlfbGVuID0gdGJsLT5rZXlfbGVuOwotCXUzMiBoYXNoX3ZhbCA9IHRibC0+aGFzaChw a2V5LCBkZXYpICYgdGJsLT5oYXNoX21hc2s7CiAKLQlORUlHSF9DQUNIRV9TVEFUX0lOQyh0Ymws IGxvb2t1cHMpOworCWhhc2hfdmFsID0gdGJsLT5oYXNoKHBrZXksIGRldik7CiAKIAlyZWFkX2xv Y2tfYmgoJnRibC0+bG9jayk7CiAJZm9yIChuID0gdGJsLT5oYXNoX2J1Y2tldHNbaGFzaF92YWxd OyBuOyBuID0gbi0+bmV4dCkgewogCQlpZiAoZGV2ID09IG4tPmRldiAmJgogCQkgICAgbWVtY21w KG4tPnByaW1hcnlfa2V5LCBwa2V5LCBrZXlfbGVuKSA9PSAwKSB7CiAJCQluZWlnaF9ob2xkKG4p OwotCQkJTkVJR0hfQ0FDSEVfU1RBVF9JTkModGJsLCBoaXRzKTsKLQkJCWJyZWFrOwotCQl9Ci0J fQotCXJlYWRfdW5sb2NrX2JoKCZ0YmwtPmxvY2spOwotCXJldHVybiBuOwotfQotCi1zdHJ1Y3Qg bmVpZ2hib3VyICpuZWlnaF9sb29rdXBfbm9kZXYoc3RydWN0IG5laWdoX3RhYmxlICp0YmwsIGNv bnN0IHZvaWQgKnBrZXkpCi17Ci0Jc3RydWN0IG5laWdoYm91ciAqbjsKLQlpbnQga2V5X2xlbiA9 IHRibC0+a2V5X2xlbjsKLQl1MzIgaGFzaF92YWwgPSB0YmwtPmhhc2gocGtleSwgTlVMTCkgJiB0 YmwtPmhhc2hfbWFzazsKLQotCU5FSUdIX0NBQ0hFX1NUQVRfSU5DKHRibCwgbG9va3Vwcyk7Ci0K LQlyZWFkX2xvY2tfYmgoJnRibC0+bG9jayk7Ci0JZm9yIChuID0gdGJsLT5oYXNoX2J1Y2tldHNb aGFzaF92YWxdOyBuOyBuID0gbi0+bmV4dCkgewotCQlpZiAoIW1lbWNtcChuLT5wcmltYXJ5X2tl eSwgcGtleSwga2V5X2xlbikpIHsKLQkJCW5laWdoX2hvbGQobik7Ci0JCQlORUlHSF9DQUNIRV9T VEFUX0lOQyh0YmwsIGhpdHMpOwogCQkJYnJlYWs7CiAJCX0KIAl9CkBAIC00MjcsMTEgKzMzOCw5 IEBACiAKIAluLT5jb25maXJtZWQgPSBqaWZmaWVzIC0gKG4tPnBhcm1zLT5iYXNlX3JlYWNoYWJs ZV90aW1lPDwxKTsKIAotCWhhc2hfdmFsID0gdGJsLT5oYXNoKHBrZXksIGRldikgJiB0YmwtPmhh c2hfbWFzazsKKwloYXNoX3ZhbCA9IHRibC0+aGFzaChwa2V5LCBkZXYpOwogCiAJd3JpdGVfbG9j a19iaCgmdGJsLT5sb2NrKTsKLQlpZiAoYXRvbWljX3JlYWQoJnRibC0+ZW50cmllcykgPiAodGJs LT5oYXNoX21hc2sgKyAxKSkKLQkJbmVpZ2hfaGFzaF9ncm93KHRibCwgKHRibC0+aGFzaF9tYXNr ICsgMSkgPDwgMSk7CiAJZm9yIChuMSA9IHRibC0+aGFzaF9idWNrZXRzW2hhc2hfdmFsXTsgbjE7 
IG4xID0gbjEtPm5leHQpIHsKIAkJaWYgKGRldiA9PSBuMS0+ZGV2ICYmCiAJCSAgICBtZW1jbXAo bjEtPnByaW1hcnlfa2V5LCBwa2V5LCBrZXlfbGVuKSA9PSAwKSB7CkBAIC01MDksOSArNDE4LDkg QEAKIAloYXNoX3ZhbCBePSBoYXNoX3ZhbD4+NDsKIAloYXNoX3ZhbCAmPSBQTkVJR0hfSEFTSE1B U0s7CiAKLQl3cml0ZV9sb2NrX2JoKCZ0YmwtPmxvY2spOwogCWZvciAobnAgPSAmdGJsLT5waGFz aF9idWNrZXRzW2hhc2hfdmFsXTsgKG49Km5wKSAhPSBOVUxMOyBucCA9ICZuLT5uZXh0KSB7CiAJ CWlmIChtZW1jbXAobi0+a2V5LCBwa2V5LCBrZXlfbGVuKSA9PSAwICYmIG4tPmRldiA9PSBkZXYp IHsKKwkJCXdyaXRlX2xvY2tfYmgoJnRibC0+bG9jayk7CiAJCQkqbnAgPSBuLT5uZXh0OwogCQkJ d3JpdGVfdW5sb2NrX2JoKCZ0YmwtPmxvY2spOwogCQkJaWYgKHRibC0+cGRlc3RydWN0b3IpCkBA IC01MjAsNyArNDI5LDYgQEAKIAkJCXJldHVybiAwOwogCQl9CiAJfQotCXdyaXRlX3VubG9ja19i aCgmdGJsLT5sb2NrKTsKIAlyZXR1cm4gLUVOT0VOVDsKIH0KIApAQCAtNTU0LDggKzQ2Miw2IEBA CiB7CQogCXN0cnVjdCBoaF9jYWNoZSAqaGg7CiAKLQlORUlHSF9DQUNIRV9TVEFUX0lOQyhuZWln aC0+dGJsLCBkZXN0cm95cyk7Ci0KIAlpZiAoIW5laWdoLT5kZWFkKSB7CiAJCXByaW50aygiRGVz dHJveWluZyBhbGl2ZSBuZWlnaGJvdXIgJXBcbiIsIG5laWdoKTsKIAkJZHVtcF9zdGFjaygpOwpA QCAtNTg1LDcgKzQ5MSw3IEBACiAJTkVJR0hfUFJJTlRLMigibmVpZ2ggJXAgaXMgZGVzdHJveWVk LlxuIiwgbmVpZ2gpOwogCiAJbmVpZ2hfZ2xibF9hbGxvY3MtLTsKLQlhdG9taWNfZGVjKCZuZWln aC0+dGJsLT5lbnRyaWVzKTsKKwluZWlnaC0+dGJsLT5lbnRyaWVzLS07CiAJa21lbV9jYWNoZV9m cmVlKG5laWdoLT50YmwtPmttZW1fY2FjaGVwLCBuZWlnaCk7CiB9CiAKQEAgLTY2MCwxMCArNTY2 LDkgQEAKIHN0YXRpYyB2b2lkIFNNUF9USU1FUl9OQU1FKG5laWdoX3BlcmlvZGljX3RpbWVyKSh1 bnNpZ25lZCBsb25nIGFyZykKIHsKIAlzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCA9IChzdHJ1Y3Qg bmVpZ2hfdGFibGUqKWFyZzsKLQlzdHJ1Y3QgbmVpZ2hib3VyICpuLCAqKm5wOwotCXVuc2lnbmVk IGxvbmcgZXhwaXJlLCBub3cgPSBqaWZmaWVzOworCXVuc2lnbmVkIGxvbmcgbm93ID0gamlmZmll czsKKwlpbnQgaTsKIAotCU5FSUdIX0NBQ0hFX1NUQVRfSU5DKHRibCwgcGVyaW9kaWNfZ2NfcnVu cyk7CiAKIAl3cml0ZV9sb2NrKCZ0YmwtPmxvY2spOwogCkBAIC02NzgsNDkgKzU4Myw0NiBAQAog CQkJcC0+cmVhY2hhYmxlX3RpbWUgPSBuZWlnaF9yYW5kX3JlYWNoX3RpbWUocC0+YmFzZV9yZWFj aGFibGVfdGltZSk7CiAJfQogCi0JbnAgPSAmdGJsLT5oYXNoX2J1Y2tldHNbdGJsLT5oYXNoX2No 
YWluX2djXTsKLQl0YmwtPmhhc2hfY2hhaW5fZ2MgPSAoKHRibC0+aGFzaF9jaGFpbl9nYyArIDEp ICYgdGJsLT5oYXNoX21hc2spOworCWZvciAoaT0wOyBpIDw9IE5FSUdIX0hBU0hNQVNLOyBpKysp IHsKKwkJc3RydWN0IG5laWdoYm91ciAqbiwgKipucDsKIAotCXdoaWxlICgobiA9ICpucCkgIT0g TlVMTCkgewotCQl1bnNpZ25lZCBpbnQgc3RhdGU7CisJCW5wID0gJnRibC0+aGFzaF9idWNrZXRz W2ldOworCQl3aGlsZSAoKG4gPSAqbnApICE9IE5VTEwpIHsKKwkJCXVuc2lnbmVkIHN0YXRlOwog Ci0JCXdyaXRlX2xvY2soJm4tPmxvY2spOwotICAKLQkJc3RhdGUgPSBuLT5udWRfc3RhdGU7Ci0J CWlmIChzdGF0ZSAmIChOVURfUEVSTUFORU5UIHwgTlVEX0lOX1RJTUVSKSkgewotCQkJd3JpdGVf dW5sb2NrKCZuLT5sb2NrKTsKLQkJCWdvdG8gbmV4dF9lbHQ7Ci0JCX0KKwkJCXdyaXRlX2xvY2so Jm4tPmxvY2spOworCisJCQlzdGF0ZSA9IG4tPm51ZF9zdGF0ZTsKKwkJCWlmIChzdGF0ZSYoTlVE X1BFUk1BTkVOVHxOVURfSU5fVElNRVIpKSB7CisJCQkJd3JpdGVfdW5sb2NrKCZuLT5sb2NrKTsK KwkJCQlnb3RvIG5leHRfZWx0OworCQkJfQogCi0JCWlmICh0aW1lX2JlZm9yZShuLT51c2VkLCBu LT5jb25maXJtZWQpKQotCQkJbi0+dXNlZCA9IG4tPmNvbmZpcm1lZDsKKwkJCWlmICgobG9uZyko bi0+dXNlZCAtIG4tPmNvbmZpcm1lZCkgPCAwKQorCQkJCW4tPnVzZWQgPSBuLT5jb25maXJtZWQ7 CiAKLQkJaWYgKGF0b21pY19yZWFkKCZuLT5yZWZjbnQpID09IDEgJiYKLQkJICAgIChzdGF0ZSA9 PSBOVURfRkFJTEVEIHx8Ci0JCSAgICAgdGltZV9hZnRlcihub3csIG4tPnVzZWQgKyBuLT5wYXJt cy0+Z2Nfc3RhbGV0aW1lKSkpIHsKLQkJCSpucCA9IG4tPm5leHQ7Ci0JCQluLT5kZWFkID0gMTsK KwkJCWlmIChhdG9taWNfcmVhZCgmbi0+cmVmY250KSA9PSAxICYmCisJCQkgICAgKHN0YXRlID09 IE5VRF9GQUlMRUQgfHwgbm93IC0gbi0+dXNlZCA+IG4tPnBhcm1zLT5nY19zdGFsZXRpbWUpKSB7 CisJCQkJKm5wID0gbi0+bmV4dDsKKwkJCQluLT5kZWFkID0gMTsKKwkJCQl3cml0ZV91bmxvY2so Jm4tPmxvY2spOworCQkJCW5laWdoX3JlbGVhc2Uobik7CisJCQkJY29udGludWU7CisJCQl9CisK KwkJCWlmIChuLT5udWRfc3RhdGUmTlVEX1JFQUNIQUJMRSAmJgorCQkJICAgIG5vdyAtIG4tPmNv bmZpcm1lZCA+IG4tPnBhcm1zLT5yZWFjaGFibGVfdGltZSkgeworCQkJCW4tPm51ZF9zdGF0ZSA9 IE5VRF9TVEFMRTsKKwkJCQluZWlnaF9zdXNwZWN0KG4pOworCQkJfQogCQkJd3JpdGVfdW5sb2Nr KCZuLT5sb2NrKTsKLQkJCW5laWdoX3JlbGVhc2Uobik7Ci0JCQljb250aW51ZTsKLQkJfQotCQl3 cml0ZV91bmxvY2soJm4tPmxvY2spOwogCiBuZXh0X2VsdDoKLQkJbnAgPSAmbi0+bmV4dDsKKwkJ 
CW5wID0gJm4tPm5leHQ7CisJCX0KIAl9Ci0gIAotIAkvKiBDeWNsZSB0aHJvdWdoIGFsbCBoYXNo IGJ1Y2tldHMgZXZlcnkgYmFzZV9yZWFjaGFibGVfdGltZS8yIHRpY2tzLgotIAkgKiBBUlAgZW50 cnkgdGltZW91dHMgcmFuZ2UgZnJvbSAxLzIgYmFzZV9yZWFjaGFibGVfdGltZSB0byAzLzIKLSAJ ICogYmFzZV9yZWFjaGFibGVfdGltZS4KLQkgKi8KLQlleHBpcmUgPSB0YmwtPnBhcm1zLmJhc2Vf cmVhY2hhYmxlX3RpbWUgPj4gMTsKLQlleHBpcmUgLz0gKHRibC0+aGFzaF9tYXNrICsgMSk7Ci0J aWYgKCFleHBpcmUpCi0JCWV4cGlyZSA9IDE7Ci0KLSAJbW9kX3RpbWVyKCZ0YmwtPmdjX3RpbWVy LCBub3cgKyBleHBpcmUpOwogCisJbW9kX3RpbWVyKCZ0YmwtPmdjX3RpbWVyLCBub3cgKyB0Ymwt PmdjX2ludGVydmFsKTsKIAl3cml0ZV91bmxvY2soJnRibC0+bG9jayk7CiB9CiAKQEAgLTc0Niw3 ICs2NDgsNiBAQAogewogCXVuc2lnbmVkIGxvbmcgbm93ID0gamlmZmllczsKIAlzdHJ1Y3QgbmVp Z2hib3VyICpuZWlnaCA9IChzdHJ1Y3QgbmVpZ2hib3VyKilhcmc7Ci0Jc3RydWN0IHNrX2J1ZmYg KnNrYjsKIAl1bnNpZ25lZCBzdGF0ZTsKIAlpbnQgbm90aWZ5ID0gMDsKIApAQCAtNzc5LDcgKzY4 MCw3IEBACiAKIAkJbmVpZ2gtPm51ZF9zdGF0ZSA9IE5VRF9GQUlMRUQ7CiAJCW5vdGlmeSA9IDE7 Ci0JCU5FSUdIX0NBQ0hFX1NUQVRfSU5DKG5laWdoLT50YmwsIHJlc19mYWlsZWQpOworCQluZWln aC0+dGJsLT5zdGF0cy5yZXNfZmFpbGVkKys7CiAJCU5FSUdIX1BSSU5USzIoIm5laWdoICVwIGlz IGZhaWxlZC5cbiIsIG5laWdoKTsKIAogCQkvKiBJdCBpcyB2ZXJ5IHRoaW4gcGxhY2UuIHJlcG9y dF91bnJlYWNoYWJsZSBpcyB2ZXJ5IGNvbXBsaWNhdGVkCkBAIC03OTgsMjAgKzY5OSwxMCBAQAog CiAJbmVpZ2gtPnRpbWVyLmV4cGlyZXMgPSBub3cgKyBuZWlnaC0+cGFybXMtPnJldHJhbnNfdGlt ZTsKIAlhZGRfdGltZXIoJm5laWdoLT50aW1lcik7Ci0KLQkvKiBrZWVwIHNrYiBhbGl2ZSBldmVu IGlmIGFycF9xdWV1ZSBvdmVyZmxvd3MgKi8KLQlza2IgPSBza2JfcGVlaygmbmVpZ2gtPmFycF9x dWV1ZSk7Ci0JaWYgKHNrYikKLQkJc2tiX2dldChza2IpOwotCiAJd3JpdGVfdW5sb2NrKCZuZWln aC0+bG9jayk7CiAKLQluZWlnaC0+b3BzLT5zb2xpY2l0KG5laWdoLCBza2IpOworCW5laWdoLT5v cHMtPnNvbGljaXQobmVpZ2gsIHNrYl9wZWVrKCZuZWlnaC0+YXJwX3F1ZXVlKSk7CiAJYXRvbWlj X2luYygmbmVpZ2gtPnByb2Jlcyk7Ci0KLQlpZiAoc2tiKQotCQlrZnJlZV9za2Ioc2tiKTsKLQog CXJldHVybjsKIAogb3V0OgpAQCAtMTI0MSw3ICsxMTMyLDYgQEAKIHZvaWQgbmVpZ2hfdGFibGVf aW5pdChzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCkKIHsKIAl1bnNpZ25lZCBsb25nIG5vdyA9IGpp 
ZmZpZXM7Ci0JdW5zaWduZWQgbG9uZyBwaHNpemU7CiAKIAl0YmwtPnBhcm1zLnJlYWNoYWJsZV90 aW1lID0gbmVpZ2hfcmFuZF9yZWFjaF90aW1lKHRibC0+cGFybXMuYmFzZV9yZWFjaGFibGVfdGlt ZSk7CiAKQEAgLTEyNTEsMzAgKzExNDEsNiBAQAogCQkJCQkJICAgICAwLCBTTEFCX0hXQ0FDSEVf QUxJR04sCiAJCQkJCQkgICAgIE5VTEwsIE5VTEwpOwogCi0JaWYgKCF0YmwtPmttZW1fY2FjaGVw KQotCQlwYW5pYygiY2Fubm90IGNyZWF0ZSBuZWlnaGJvdXIgY2FjaGUiKTsKLQotI2lmZGVmIENP TkZJR19QUk9DX0ZTCi0JdGJsLT5wZGUgPSBjcmVhdGVfcHJvY19lbnRyeSh0YmwtPmlkLCAwLCBw cm9jX25ldF9zdGF0KTsKLQlpZiAoIXRibC0+cGRlKSAKLQkJcGFuaWMoImNhbm5vdCBjcmVhdGUg bmVpZ2hib3VyIHByb2MgZGlyIGVudHJ5Iik7Ci0JdGJsLT5wZGUtPnByb2NfZm9wcyA9ICZuZWln aF9zdGF0X3NlcV9mb3BzOwotCXRibC0+cGRlLT5kYXRhID0gdGJsOwotI2VuZGlmCi0KLQl0Ymwt Pmhhc2hfbWFzayA9IDE7Ci0JdGJsLT5oYXNoX2J1Y2tldHMgPSBuZWlnaF9oYXNoX2FsbG9jKHRi bC0+aGFzaF9tYXNrICsgMSk7Ci0KLQlwaHNpemUgPSAoUE5FSUdIX0hBU0hNQVNLICsgMSkgKiBz aXplb2Yoc3RydWN0IHBuZWlnaF9lbnRyeSAqKTsKLQl0YmwtPnBoYXNoX2J1Y2tldHMgPSBrbWFs bG9jKHBoc2l6ZSwgR0ZQX0tFUk5FTCk7Ci0KLQlpZiAoIXRibC0+aGFzaF9idWNrZXRzIHx8ICF0 YmwtPnBoYXNoX2J1Y2tldHMpCi0JCXBhbmljKCJjYW5ub3QgYWxsb2NhdGUgbmVpZ2hib3VyIGNh Y2hlIGhhc2hlcyIpOwotCi0JbWVtc2V0KHRibC0+cGhhc2hfYnVja2V0cywgMCwgcGhzaXplKTsK LQotCWdldF9yYW5kb21fYnl0ZXMoJnRibC0+aGFzaF9ybmQsIHNpemVvZih0YmwtPmhhc2hfcm5k KSk7Ci0KICNpZmRlZiBDT05GSUdfU01QCiAJdGFza2xldF9pbml0KCZ0YmwtPmdjX3Rhc2ssIFNN UF9USU1FUl9OQU1FKG5laWdoX3BlcmlvZGljX3RpbWVyKSwgKHVuc2lnbmVkIGxvbmcpdGJsKTsK ICNlbmRpZgpAQCAtMTI4Miw3ICsxMTQ4LDcgQEAKIAl0YmwtPmxvY2sgPSBSV19MT0NLX1VOTE9D S0VEOwogCXRibC0+Z2NfdGltZXIuZGF0YSA9ICh1bnNpZ25lZCBsb25nKXRibDsKIAl0YmwtPmdj X3RpbWVyLmZ1bmN0aW9uID0gbmVpZ2hfcGVyaW9kaWNfdGltZXI7Ci0JdGJsLT5nY190aW1lci5l eHBpcmVzID0gbm93ICsgMTsKKwl0YmwtPmdjX3RpbWVyLmV4cGlyZXMgPSBub3cgKyB0YmwtPmdj X2ludGVydmFsICsgdGJsLT5wYXJtcy5yZWFjaGFibGVfdGltZTsKIAlhZGRfdGltZXIoJnRibC0+ Z2NfdGltZXIpOwogCiAJaW5pdF90aW1lcigmdGJsLT5wcm94eV90aW1lcik7CkBAIC0xMzA4LDcg KzExNzQsNyBAQAogCWRlbF90aW1lcl9zeW5jKCZ0YmwtPnByb3h5X3RpbWVyKTsKIAlwbmVpZ2hf 
cXVldWVfcHVyZ2UoJnRibC0+cHJveHlfcXVldWUpOwogCW5laWdoX2lmZG93bih0YmwsIE5VTEwp OwotCWlmIChhdG9taWNfcmVhZCgmdGJsLT5lbnRyaWVzKSkKKwlpZiAodGJsLT5lbnRyaWVzKQog CQlwcmludGsoS0VSTl9DUklUICJuZWlnaGJvdXIgbGVha2FnZVxuIik7CiAJd3JpdGVfbG9jaygm bmVpZ2hfdGJsX2xvY2spOwogCWZvciAodHAgPSAmbmVpZ2hfdGFibGVzOyAqdHA7IHRwID0gJigq dHApLT5uZXh0KSB7CkBAIC0xMzE4LDEzICsxMTg0LDYgQEAKIAkJfQogCX0KIAl3cml0ZV91bmxv Y2soJm5laWdoX3RibF9sb2NrKTsKLQotCW5laWdoX2hhc2hfZnJlZSh0YmwtPmhhc2hfYnVja2V0 cywgdGJsLT5oYXNoX21hc2sgKyAxKTsKLQl0YmwtPmhhc2hfYnVja2V0cyA9IE5VTEw7Ci0KLQlr ZnJlZSh0YmwtPnBoYXNoX2J1Y2tldHMpOwotCXRibC0+cGhhc2hfYnVja2V0cyA9IE5VTEw7Ci0K ICNpZmRlZiBDT05GSUdfU1lTQ1RMCiAJbmVpZ2hfc3lzY3RsX3VucmVnaXN0ZXIoJnRibC0+cGFy bXMpOwogI2VuZGlmCkBAIC0xNTA1LDcgKzEzNjQsNyBAQAogCiAJc19oID0gY2ItPmFyZ3NbMV07 CiAJc19pZHggPSBpZHggPSBjYi0+YXJnc1syXTsKLQlmb3IgKGg9MDsgaCA8PSB0YmwtPmhhc2hf bWFzazsgaCsrKSB7CisJZm9yIChoPTA7IGggPD0gTkVJR0hfSEFTSE1BU0s7IGgrKykgewogCQlp ZiAoaCA8IHNfaCkgY29udGludWU7CiAJCWlmIChoID4gc19oKQogCQkJc19pZHggPSAwOwpAQCAt MTU1NiwzNjEgKzE0MTUsNiBAQAogCXJldHVybiBza2ItPmxlbjsKIH0KIAotdm9pZCBuZWlnaF9m b3JfZWFjaChzdHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCwgdm9pZCAoKmNiKShzdHJ1Y3QgbmVpZ2hi b3VyICosIHZvaWQgKiksIHZvaWQgKmNvb2tpZSkKLXsKLQlpbnQgY2hhaW47Ci0KLQlyZWFkX2xv Y2tfYmgoJnRibC0+bG9jayk7Ci0JZm9yIChjaGFpbiA9IDA7IGNoYWluIDw9IHRibC0+aGFzaF9t YXNrOyBjaGFpbisrKSB7Ci0JCXN0cnVjdCBuZWlnaGJvdXIgKm47Ci0KLQkJZm9yIChuID0gdGJs LT5oYXNoX2J1Y2tldHNbY2hhaW5dOyBuOyBuID0gbi0+bmV4dCkKLQkJCWNiKG4sIGNvb2tpZSk7 Ci0JfQotCXJlYWRfdW5sb2NrX2JoKCZ0YmwtPmxvY2spOwotfQotRVhQT1JUX1NZTUJPTChuZWln aF9mb3JfZWFjaCk7Ci0KLS8qIFRoZSB0YmwtPmxvY2sgbXVzdCBiZSBoZWxkIGFzIGEgd3JpdGVy IGFuZCBCSCBkaXNhYmxlZC4gKi8KLXZvaWQgX19uZWlnaF9mb3JfZWFjaF9yZWxlYXNlKHN0cnVj dCBuZWlnaF90YWJsZSAqdGJsLAotCQkJICAgICAgaW50ICgqY2IpKHN0cnVjdCBuZWlnaGJvdXIg KikpCi17Ci0JaW50IGNoYWluOwotCi0JZm9yIChjaGFpbiA9IDA7IGNoYWluIDw9IHRibC0+aGFz aF9tYXNrOyBjaGFpbisrKSB7Ci0JCXN0cnVjdCBuZWlnaGJvdXIgKm4sICoqbnA7Ci0KLQkJbnAg 
PSAmdGJsLT5oYXNoX2J1Y2tldHNbY2hhaW5dOwotCQl3aGlsZSAoKG4gPSAqbnApICE9IE5VTEwp IHsKLQkJCWludCByZWxlYXNlOwotCi0JCQl3cml0ZV9sb2NrKCZuLT5sb2NrKTsKLQkJCXJlbGVh c2UgPSBjYihuKTsKLQkJCWlmIChyZWxlYXNlKSB7Ci0JCQkJKm5wID0gbi0+bmV4dDsKLQkJCQlu LT5kZWFkID0gMTsKLQkJCX0gZWxzZQotCQkJCW5wID0gJm4tPm5leHQ7Ci0JCQl3cml0ZV91bmxv Y2soJm4tPmxvY2spOwotCQkJaWYgKHJlbGVhc2UpCi0JCQkJbmVpZ2hfcmVsZWFzZShuKTsKLQkJ fQotCX0KLX0KLUVYUE9SVF9TWU1CT0woX19uZWlnaF9mb3JfZWFjaF9yZWxlYXNlKTsKLQotI2lm ZGVmIENPTkZJR19QUk9DX0ZTCi0KLXN0YXRpYyBzdHJ1Y3QgbmVpZ2hib3VyICpuZWlnaF9nZXRf Zmlyc3Qoc3RydWN0IHNlcV9maWxlICpzZXEpCi17Ci0Jc3RydWN0IG5laWdoX3NlcV9zdGF0ZSAq c3RhdGUgPSBzZXEtPnByaXZhdGU7Ci0Jc3RydWN0IG5laWdoX3RhYmxlICp0YmwgPSBzdGF0ZS0+ dGJsOwotCXN0cnVjdCBuZWlnaGJvdXIgKm4gPSBOVUxMOwotCWludCBidWNrZXQgPSBzdGF0ZS0+ YnVja2V0OwotCi0Jc3RhdGUtPmZsYWdzICY9IH5ORUlHSF9TRVFfSVNfUE5FSUdIOwotCWZvciAo YnVja2V0ID0gMDsgYnVja2V0IDw9IHRibC0+aGFzaF9tYXNrOyBidWNrZXQrKykgewotCQluID0g dGJsLT5oYXNoX2J1Y2tldHNbYnVja2V0XTsKLQotCQl3aGlsZSAobikgewotCQkJaWYgKHN0YXRl LT5uZWlnaF9zdWJfaXRlcikgewotCQkJCWxvZmZfdCBmYWtlcCA9IDA7Ci0JCQkJdm9pZCAqdjsK LQotCQkJCXYgPSBzdGF0ZS0+bmVpZ2hfc3ViX2l0ZXIoc3RhdGUsIG4sICZmYWtlcCk7Ci0JCQkJ aWYgKCF2KQotCQkJCQlnb3RvIG5leHQ7Ci0JCQl9Ci0JCQlpZiAoIShzdGF0ZS0+ZmxhZ3MgJiBO RUlHSF9TRVFfU0tJUF9OT0FSUCkpCi0JCQkJYnJlYWs7Ci0JCQlpZiAobi0+bnVkX3N0YXRlICYg fk5VRF9OT0FSUCkKLQkJCQlicmVhazsKLQkJbmV4dDoKLQkJCW4gPSBuLT5uZXh0OwotCQl9Ci0K LQkJaWYgKG4pCi0JCQlicmVhazsKLQl9Ci0Jc3RhdGUtPmJ1Y2tldCA9IGJ1Y2tldDsKLQotCXJl dHVybiBuOwotfQotCi1zdGF0aWMgc3RydWN0IG5laWdoYm91ciAqbmVpZ2hfZ2V0X25leHQoc3Ry dWN0IHNlcV9maWxlICpzZXEsCi0JCQkJCXN0cnVjdCBuZWlnaGJvdXIgKm4sCi0JCQkJCWxvZmZf dCAqcG9zKQotewotCXN0cnVjdCBuZWlnaF9zZXFfc3RhdGUgKnN0YXRlID0gc2VxLT5wcml2YXRl OwotCXN0cnVjdCBuZWlnaF90YWJsZSAqdGJsID0gc3RhdGUtPnRibDsKLQotCWlmIChzdGF0ZS0+ bmVpZ2hfc3ViX2l0ZXIpIHsKLQkJdm9pZCAqdiA9IHN0YXRlLT5uZWlnaF9zdWJfaXRlcihzdGF0 ZSwgbiwgcG9zKTsKLQkJaWYgKHYpCi0JCQlyZXR1cm4gbjsKLQl9Ci0JbiA9IG4tPm5leHQ7Ci0K 
LQl3aGlsZSAoMSkgewotCQl3aGlsZSAobikgewotCQkJaWYgKHN0YXRlLT5uZWlnaF9zdWJfaXRl cikgewotCQkJCXZvaWQgKnYgPSBzdGF0ZS0+bmVpZ2hfc3ViX2l0ZXIoc3RhdGUsIG4sIHBvcyk7 Ci0JCQkJaWYgKHYpCi0JCQkJCXJldHVybiBuOwotCQkJCWdvdG8gbmV4dDsKLQkJCX0KLQkJCWlm ICghKHN0YXRlLT5mbGFncyAmIE5FSUdIX1NFUV9TS0lQX05PQVJQKSkKLQkJCQlicmVhazsKLQot CQkJaWYgKG4tPm51ZF9zdGF0ZSAmIH5OVURfTk9BUlApCi0JCQkJYnJlYWs7Ci0JCW5leHQ6Ci0J CQluID0gbi0+bmV4dDsKLQkJfQotCi0JCWlmIChuKQotCQkJYnJlYWs7Ci0KLQkJaWYgKCsrc3Rh dGUtPmJ1Y2tldCA+IHRibC0+aGFzaF9tYXNrKQotCQkJYnJlYWs7Ci0KLQkJbiA9IHRibC0+aGFz aF9idWNrZXRzW3N0YXRlLT5idWNrZXRdOwotCX0KLQotCWlmIChuICYmIHBvcykKLQkJLS0oKnBv cyk7Ci0JcmV0dXJuIG47Ci19Ci0KLXN0YXRpYyBzdHJ1Y3QgbmVpZ2hib3VyICpuZWlnaF9nZXRf aWR4KHN0cnVjdCBzZXFfZmlsZSAqc2VxLCBsb2ZmX3QgKnBvcykKLXsKLQlzdHJ1Y3QgbmVpZ2hi b3VyICpuID0gbmVpZ2hfZ2V0X2ZpcnN0KHNlcSk7Ci0KLQlpZiAobikgewotCQl3aGlsZSAoKnBv cykgewotCQkJbiA9IG5laWdoX2dldF9uZXh0KHNlcSwgbiwgcG9zKTsKLQkJCWlmICghbikKLQkJ CQlicmVhazsKLQkJfQotCX0KLQlyZXR1cm4gKnBvcyA/IE5VTEwgOiBuOwotfQotCi1zdGF0aWMg c3RydWN0IHBuZWlnaF9lbnRyeSAqcG5laWdoX2dldF9maXJzdChzdHJ1Y3Qgc2VxX2ZpbGUgKnNl cSkKLXsKLQlzdHJ1Y3QgbmVpZ2hfc2VxX3N0YXRlICpzdGF0ZSA9IHNlcS0+cHJpdmF0ZTsKLQlz dHJ1Y3QgbmVpZ2hfdGFibGUgKnRibCA9IHN0YXRlLT50Ymw7Ci0Jc3RydWN0IHBuZWlnaF9lbnRy eSAqcG4gPSBOVUxMOwotCWludCBidWNrZXQgPSBzdGF0ZS0+YnVja2V0OwotCi0Jc3RhdGUtPmZs YWdzIHw9IE5FSUdIX1NFUV9JU19QTkVJR0g7Ci0JZm9yIChidWNrZXQgPSAwOyBidWNrZXQgPD0g UE5FSUdIX0hBU0hNQVNLOyBidWNrZXQrKykgewotCQlwbiA9IHRibC0+cGhhc2hfYnVja2V0c1ti dWNrZXRdOwotCQlpZiAocG4pCi0JCQlicmVhazsKLQl9Ci0Jc3RhdGUtPmJ1Y2tldCA9IGJ1Y2tl dDsKLQotCXJldHVybiBwbjsKLX0KLQotc3RhdGljIHN0cnVjdCBwbmVpZ2hfZW50cnkgKnBuZWln aF9nZXRfbmV4dChzdHJ1Y3Qgc2VxX2ZpbGUgKnNlcSwKLQkJCQkJICAgIHN0cnVjdCBwbmVpZ2hf ZW50cnkgKnBuLAotCQkJCQkgICAgbG9mZl90ICpwb3MpCi17Ci0Jc3RydWN0IG5laWdoX3NlcV9z dGF0ZSAqc3RhdGUgPSBzZXEtPnByaXZhdGU7Ci0Jc3RydWN0IG5laWdoX3RhYmxlICp0YmwgPSBz dGF0ZS0+dGJsOwotCi0JcG4gPSBwbi0+bmV4dDsKLQl3aGlsZSAoIXBuKSB7Ci0JCWlmICgrK3N0 
YXRlLT5idWNrZXQgPiBQTkVJR0hfSEFTSE1BU0spCi0JCQlicmVhazsKLQkJcG4gPSB0YmwtPnBo YXNoX2J1Y2tldHNbc3RhdGUtPmJ1Y2tldF07Ci0JCWlmIChwbikKLQkJCWJyZWFrOwotCX0KLQot CWlmIChwbiAmJiBwb3MpCi0JCS0tKCpwb3MpOwotCi0JcmV0dXJuIHBuOwotfQotCi1zdGF0aWMg c3RydWN0IHBuZWlnaF9lbnRyeSAqcG5laWdoX2dldF9pZHgoc3RydWN0IHNlcV9maWxlICpzZXEs IGxvZmZfdCAqcG9zKQotewotCXN0cnVjdCBwbmVpZ2hfZW50cnkgKnBuID0gcG5laWdoX2dldF9m aXJzdChzZXEpOwotCi0JaWYgKHBuKSB7Ci0JCXdoaWxlICgqcG9zKSB7Ci0JCQlwbiA9IHBuZWln aF9nZXRfbmV4dChzZXEsIHBuLCBwb3MpOwotCQkJaWYgKCFwbikKLQkJCQlicmVhazsKLQkJfQot CX0KLQlyZXR1cm4gKnBvcyA/IE5VTEwgOiBwbjsKLX0KLQotc3RhdGljIHZvaWQgKm5laWdoX2dl dF9pZHhfYW55KHN0cnVjdCBzZXFfZmlsZSAqc2VxLCBsb2ZmX3QgKnBvcykKLXsKLQlzdHJ1Y3Qg bmVpZ2hfc2VxX3N0YXRlICpzdGF0ZSA9IHNlcS0+cHJpdmF0ZTsKLQl2b2lkICpyYzsKLQotCXJj ID0gbmVpZ2hfZ2V0X2lkeChzZXEsIHBvcyk7Ci0JaWYgKCFyYyAmJiAhKHN0YXRlLT5mbGFncyAm IE5FSUdIX1NFUV9ORUlHSF9PTkxZKSkKLQkJcmMgPSBwbmVpZ2hfZ2V0X2lkeChzZXEsIHBvcyk7 Ci0KLQlyZXR1cm4gcmM7Ci19Ci0KLXZvaWQgKm5laWdoX3NlcV9zdGFydChzdHJ1Y3Qgc2VxX2Zp bGUgKnNlcSwgbG9mZl90ICpwb3MsIHN0cnVjdCBuZWlnaF90YWJsZSAqdGJsLCB1bnNpZ25lZCBp bnQgbmVpZ2hfc2VxX2ZsYWdzKQotewotCXN0cnVjdCBuZWlnaF9zZXFfc3RhdGUgKnN0YXRlID0g c2VxLT5wcml2YXRlOwotCWxvZmZfdCBwb3NfbWludXNfb25lOwotCi0Jc3RhdGUtPnRibCA9IHRi bDsKLQlzdGF0ZS0+YnVja2V0ID0gMDsKLQlzdGF0ZS0+ZmxhZ3MgPSAobmVpZ2hfc2VxX2ZsYWdz ICYgfk5FSUdIX1NFUV9JU19QTkVJR0gpOwotCi0JcmVhZF9sb2NrX2JoKCZ0YmwtPmxvY2spOwot Ci0JcG9zX21pbnVzX29uZSA9ICpwb3MgLSAxOwotCXJldHVybiAqcG9zID8gbmVpZ2hfZ2V0X2lk eF9hbnkoc2VxLCAmcG9zX21pbnVzX29uZSkgOiBTRVFfU1RBUlRfVE9LRU47Ci19Ci0KLXZvaWQg Km5laWdoX3NlcV9uZXh0KHN0cnVjdCBzZXFfZmlsZSAqc2VxLCB2b2lkICp2LCBsb2ZmX3QgKnBv cykKLXsKLQlzdHJ1Y3QgbmVpZ2hfc2VxX3N0YXRlICpzdGF0ZTsKLQl2b2lkICpyYzsKLQotCWlm ICh2ID09IFNFUV9TVEFSVF9UT0tFTikgewotCQlyYyA9IG5laWdoX2dldF9pZHgoc2VxLCBwb3Mp OwotCQlnb3RvIG91dDsKLQl9Ci0KLQlzdGF0ZSA9IHNlcS0+cHJpdmF0ZTsKLQlpZiAoIShzdGF0 ZS0+ZmxhZ3MgJiBORUlHSF9TRVFfSVNfUE5FSUdIKSkgewotCQlyYyA9IG5laWdoX2dldF9uZXh0 
KHNlcSwgdiwgTlVMTCk7Ci0JCWlmIChyYykKLQkJCWdvdG8gb3V0OwotCQlpZiAoIShzdGF0ZS0+ ZmxhZ3MgJiBORUlHSF9TRVFfTkVJR0hfT05MWSkpCi0JCQlyYyA9IHBuZWlnaF9nZXRfZmlyc3Qo c2VxKTsKLQl9IGVsc2UgewotCQlCVUdfT04oc3RhdGUtPmZsYWdzICYgTkVJR0hfU0VRX05FSUdI X09OTFkpOwotCQlyYyA9IHBuZWlnaF9nZXRfbmV4dChzZXEsIHYsIE5VTEwpOwotCX0KLW91dDoK LQkrKygqcG9zKTsKLQlyZXR1cm4gcmM7Ci19Ci0KLXZvaWQgbmVpZ2hfc2VxX3N0b3Aoc3RydWN0 IHNlcV9maWxlICpzZXEsIHZvaWQgKnYpCi17Ci0Jc3RydWN0IG5laWdoX3NlcV9zdGF0ZSAqc3Rh dGUgPSBzZXEtPnByaXZhdGU7Ci0Jc3RydWN0IG5laWdoX3RhYmxlICp0YmwgPSBzdGF0ZS0+dGJs OwotCi0JcmVhZF91bmxvY2tfYmgoJnRibC0+bG9jayk7Ci19Ci0KLS8qIHN0YXRpc3RpY3Mgdmlh IHNlcV9maWxlICovCi0KLXN0YXRpYyB2b2lkICpuZWlnaF9zdGF0X3NlcV9zdGFydChzdHJ1Y3Qg c2VxX2ZpbGUgKnNlcSwgbG9mZl90ICpwb3MpCi17Ci0Jc3RydWN0IHByb2NfZGlyX2VudHJ5ICpw ZGUgPSBzZXEtPnByaXZhdGU7Ci0Jc3RydWN0IG5laWdoX3RhYmxlICp0YmwgPSBwZGUtPmRhdGE7 Ci0JaW50IGxjcHU7Ci0KLQlpZiAoKnBvcyA9PSAwKQotCQlyZXR1cm4gU0VRX1NUQVJUX1RPS0VO OwotCQotCWZvciAobGNwdSA9ICpwb3MtMTsgbGNwdSA8IHNtcF9udW1fY3B1czsgKytsY3B1KSB7 Ci0JCWludCBpID0gY3B1X2xvZ2ljYWxfbWFwKGxjcHUpOwotCQkqcG9zID0gbGNwdSsxOwotCQly ZXR1cm4gJnRibC0+c3RhdHNbaV07Ci0JfQotCXJldHVybiBOVUxMOwotfQotCi1zdGF0aWMgdm9p ZCAqbmVpZ2hfc3RhdF9zZXFfbmV4dChzdHJ1Y3Qgc2VxX2ZpbGUgKnNlcSwgdm9pZCAqdiwgbG9m Zl90ICpwb3MpCi17Ci0Jc3RydWN0IHByb2NfZGlyX2VudHJ5ICpwZGUgPSBzZXEtPnByaXZhdGU7 Ci0Jc3RydWN0IG5laWdoX3RhYmxlICp0YmwgPSBwZGUtPmRhdGE7Ci0JaW50IGxjcHU7Ci0KLQlm b3IgKGxjcHUgPSAqcG9zOyBsY3B1IDwgc21wX251bV9jcHVzOyArK2xjcHUpIHsKLQkJaW50IGkg PSBjcHVfbG9naWNhbF9tYXAobGNwdSk7Ci0JCSpwb3MgPSBsY3B1KzE7Ci0JCXJldHVybiAmdGJs LT5zdGF0c1tpXTsKLQl9Ci0JcmV0dXJuIE5VTEw7Ci19Ci0KLXN0YXRpYyB2b2lkIG5laWdoX3N0 YXRfc2VxX3N0b3Aoc3RydWN0IHNlcV9maWxlICpzZXEsIHZvaWQgKnYpCi17Ci0KLX0KLQotc3Rh dGljIGludCBuZWlnaF9zdGF0X3NlcV9zaG93KHN0cnVjdCBzZXFfZmlsZSAqc2VxLCB2b2lkICp2 KQotewotCXN0cnVjdCBwcm9jX2Rpcl9lbnRyeSAqcGRlID0gc2VxLT5wcml2YXRlOwotCXN0cnVj dCBuZWlnaF90YWJsZSAqdGJsID0gcGRlLT5kYXRhOwotCXN0cnVjdCBuZWlnaF9zdGF0aXN0aWNz 
ICpzdCA9IHY7Ci0KLQlpZiAodiA9PSBTRVFfU1RBUlRfVE9LRU4pIHsKLQkJc2VxX3ByaW50Zihz ZXEsICJlbnRyaWVzICBhbGxvY3MgZGVzdHJveXMgaGFzaF9ncm93cyAgbG9va3VwcyBoaXRzICBy ZXNfZmFpbGVkICByY3ZfcHJvYmVzX21jYXN0IHJjdl9wcm9iZXNfdWNhc3QgIHBlcmlvZGljX2dj X3J1bnMgZm9yY2VkX2djX3J1bnMgZm9yY2VkX2djX2dvYWxfbWlzc1xuIik7Ci0JCXJldHVybiAw OwotCX0KLQotCXNlcV9wcmludGYoc2VxLCAiJTA4eCAgJTA4bHggJTA4bHggJTA4bHggICUwOGx4 ICUwOGx4ICAlMDhseCAgIgotCQkJIiUwOGx4ICUwOGx4ICAlMDhseCAlMDhseFxuIiwKLQkJICAg YXRvbWljX3JlYWQoJnRibC0+ZW50cmllcyksCi0KLQkJICAgc3QtPmFsbG9jcywKLQkJICAgc3Qt PmRlc3Ryb3lzLAotCQkgICBzdC0+aGFzaF9ncm93cywKLQotCQkgICBzdC0+bG9va3VwcywKLQkJ ICAgc3QtPmhpdHMsCi0KLQkJICAgc3QtPnJlc19mYWlsZWQsCi0KLQkJICAgc3QtPnJjdl9wcm9i ZXNfbWNhc3QsCi0JCSAgIHN0LT5yY3ZfcHJvYmVzX3VjYXN0LAotCi0JCSAgIHN0LT5wZXJpb2Rp Y19nY19ydW5zLAotCQkgICBzdC0+Zm9yY2VkX2djX3J1bnMKLQkJICAgKTsKLQotCXJldHVybiAw OwotfQotCi1zdGF0aWMgc3RydWN0IHNlcV9vcGVyYXRpb25zIG5laWdoX3N0YXRfc2VxX29wcyA9 IHsKLQkuc3RhcnQJPSBuZWlnaF9zdGF0X3NlcV9zdGFydCwKLQkubmV4dAk9IG5laWdoX3N0YXRf c2VxX25leHQsCi0JLnN0b3AJPSBuZWlnaF9zdGF0X3NlcV9zdG9wLAotCS5zaG93CT0gbmVpZ2hf c3RhdF9zZXFfc2hvdywKLX07Ci0KLXN0YXRpYyBpbnQgbmVpZ2hfc3RhdF9zZXFfb3BlbihzdHJ1 Y3QgaW5vZGUgKmlub2RlLCBzdHJ1Y3QgZmlsZSAqZmlsZSkKLXsKLQlpbnQgcmV0ID0gc2VxX29w ZW4oZmlsZSwgJm5laWdoX3N0YXRfc2VxX29wcyk7Ci0KLQlpZiAoIXJldCkgewotCQlzdHJ1Y3Qg c2VxX2ZpbGUgKnNmID0gZmlsZS0+cHJpdmF0ZV9kYXRhOwotCQlzZi0+cHJpdmF0ZSA9IFBERShp bm9kZSk7Ci0JfQotCXJldHVybiByZXQ7Ci19OwotCi1zdGF0aWMgc3RydWN0IGZpbGVfb3BlcmF0 aW9ucyBuZWlnaF9zdGF0X3NlcV9mb3BzID0gewotCS5vd25lcgkgPSBUSElTX01PRFVMRSwKLQku b3BlbiAJID0gbmVpZ2hfc3RhdF9zZXFfb3BlbiwKLQkucmVhZAkgPSBzZXFfcmVhZCwKLQkubGxz ZWVrCSA9IHNlcV9sc2VlaywKLQkucmVsZWFzZSA9IHNlcV9yZWxlYXNlLAotfTsKLQotI2VuZGlm IC8qIENPTkZJR19QUk9DX0ZTICovCi0KICNpZmRlZiBDT05GSUdfQVJQRAogdm9pZCBuZWlnaF9h cHBfbnMoc3RydWN0IG5laWdoYm91ciAqbikKIHsKZGlmZiAtTnVyIGxpbnV4LTIuNC4yOC9uZXQv aXB2NC9hcnAuYyBsaW51eC0yLjQuMjgubmV3L25ldC9pcHY0L2FycC5jCi0tLSBsaW51eC0yLjQu 
MjgvbmV0L2lwdjQvYXJwLmMJMjAwNC0xMS0xNyAxMjoxNTozMi4wMDAwMDAwMDAgLTA1MDAKKysr IGxpbnV4LTIuNC4yOC5uZXcvbmV0L2lwdjQvYXJwLmMJMjAwNC0xMi0wOSAwOTo1NDoyNy4wMDAw MDAwMDAgLTA1MDAKQEAgLTcwLDEzICs3MCwxMSBAQAogICoJCQkJCWFycF94bWl0IHNvIGludGVy bWVkaWF0ZSBkcml2ZXJzIGxpa2UKICAqCQkJCQlib25kaW5nIGNhbiBjaGFuZ2UgdGhlIHNrYiBi ZWZvcmUKICAqCQkJCQlzZW5kaW5nIChlLmcuIGluc2VydCA4MDIxcSB0YWcpLgotICoJCUhhcmFs ZCBXZWx0ZQk6CWNvbnZlcnQgdG8gbWFrZSB1c2Ugb2YgamVua2lucyBoYXNoCiAgKi8KIAogI2lu Y2x1ZGUgPGxpbnV4L3R5cGVzLmg+CiAjaW5jbHVkZSA8bGludXgvc3RyaW5nLmg+CiAjaW5jbHVk ZSA8bGludXgva2VybmVsLmg+Ci0jaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+CiAjaW5jbHVkZSA8 bGludXgvc2NoZWQuaD4KICNpbmNsdWRlIDxsaW51eC9jb25maWcuaD4KICNpbmNsdWRlIDxsaW51 eC9zb2NrZXQuaD4KQEAgLTk0LDggKzkyLDYgQEAKICNpbmNsdWRlIDxsaW51eC9wcm9jX2ZzLmg+ CiAjaW5jbHVkZSA8bGludXgvc3RhdC5oPgogI2luY2x1ZGUgPGxpbnV4L2luaXQuaD4KLSNpbmNs dWRlIDxsaW51eC9qaGFzaC5oPgotI2luY2x1ZGUgPGxpbnV4L21vZHVsZS5oPgogI2lmZGVmIENP TkZJR19TWVNDVEwKICNpbmNsdWRlIDxsaW51eC9zeXNjdGwuaD4KICNlbmRpZgpAQCAtMjIyLDcg KzIxOCwxNSBAQAogCiBzdGF0aWMgdTMyIGFycF9oYXNoKGNvbnN0IHZvaWQgKnBrZXksIGNvbnN0 IHN0cnVjdCBuZXRfZGV2aWNlICpkZXYpCiB7Ci0JcmV0dXJuIGpoYXNoXzJ3b3JkcygqKHUzMiAq KXBrZXksIGRldi0+aWZpbmRleCwgYXJwX3RibC5oYXNoX3JuZCk7CisJdTMyIGhhc2hfdmFsOwor CisJaGFzaF92YWwgPSAqKHUzMiopcGtleTsKKwloYXNoX3ZhbCBePSAoaGFzaF92YWw+PjE2KTsK KwloYXNoX3ZhbCBePSBoYXNoX3ZhbD4+ODsKKwloYXNoX3ZhbCBePSBoYXNoX3ZhbD4+MzsKKwlo YXNoX3ZhbCA9IChoYXNoX3ZhbF5kZXYtPmlmaW5kZXgpJk5FSUdIX0hBU0hNQVNLOworCisJcmV0 dXJuIGhhc2hfdmFsOwogfQogCiBzdGF0aWMgaW50IGFycF9jb25zdHJ1Y3RvcihzdHJ1Y3QgbmVp Z2hib3VyICpuZWlnaCkKQEAgLTEwMjAsMjYgKzEwMjQsOCBAQAogCQlpZiAoIWRldikKIAkJCXJl dHVybiAtRUlOVkFMOwogCX0KLQlzd2l0Y2ggKGRldi0+dHlwZSkgewotI2lmZGVmIENPTkZJR19G RERJCi0JY2FzZSBBUlBIUkRfRkRESToKLQkJLyoKLQkJICogQWNjb3JkaW5nIHRvIFJGQyAxMzkw LCBGRERJIGRldmljZXMgc2hvdWxkIGFjY2VwdCBBUlAKLQkJICogaGFyZHdhcmUgdHlwZXMgb2Yg MSAoRXRoZXJuZXQpLiAgSG93ZXZlciwgdG8gYmUgbW9yZQotCQkgKiByb2J1c3QsIHdlJ2xsIGFj 
Y2VwdCBoYXJkd2FyZSB0eXBlcyBvZiBlaXRoZXIgMSAoRXRoZXJuZXQpCi0JCSAqIG9yIDYgKElF RUUgODAyLjIpLgotCQkgKi8KLQkJaWYgKHItPmFycF9oYS5zYV9mYW1pbHkgIT0gQVJQSFJEX0ZE REkgJiYKLQkJICAgIHItPmFycF9oYS5zYV9mYW1pbHkgIT0gQVJQSFJEX0VUSEVSICYmCi0JCSAg ICByLT5hcnBfaGEuc2FfZmFtaWx5ICE9IEFSUEhSRF9JRUVFODAyKQotCQkJcmV0dXJuIC1FSU5W QUw7Ci0JCWJyZWFrOwotI2VuZGlmCi0JZGVmYXVsdDoKLQkJaWYgKHItPmFycF9oYS5zYV9mYW1p bHkgIT0gZGV2LT50eXBlKQotCQkJcmV0dXJuIC1FSU5WQUw7Ci0JCWJyZWFrOwotCX0KKwlpZiAo ci0+YXJwX2hhLnNhX2ZhbWlseSAhPSBkZXYtPnR5cGUpCQorCQlyZXR1cm4gLUVJTlZBTDsKIAog CW5laWdoID0gX19uZWlnaF9sb29rdXBfZXJybm8oJmFycF90YmwsICZpcCwgZGV2KTsKIAllcnIg PSBQVFJfRVJSKG5laWdoKTsKQEAgLTExOTksMTU1ICsxMTg1LDEyOSBAQAogCXJldHVybiBlcnI7 CiB9CiAKLSNpZmRlZiBDT05GSUdfUFJPQ19GUwotI2lmIGRlZmluZWQoQ09ORklHX0FYMjUpIHx8 IGRlZmluZWQoQ09ORklHX0FYMjVfTU9EVUxFKQotCi0vKiAtLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0gKi8KIC8q Ci0gKglheDI1IC0+IEFTQ0lJIGNvbnZlcnNpb24KKyAqCVdyaXRlIHRoZSBjb250ZW50cyBvZiB0 aGUgQVJQIGNhY2hlIHRvIGEgUFJPQ2ZzIGZpbGUuCiAgKi8KLXN0YXRpYyBjaGFyICpheDJhc2My KGF4MjVfYWRkcmVzcyAqYSwgY2hhciAqYnVmKQotewotCWNoYXIgYywgKnM7Ci0JaW50IG47Ci0K LQlmb3IgKG4gPSAwLCBzID0gYnVmOyBuIDwgNjsgbisrKSB7Ci0JCWMgPSAoYS0+YXgyNV9jYWxs W25dID4+IDEpICYgMHg3RjsKLQotCQlpZiAoYyAhPSAnICcpICpzKysgPSBjOwotCX0KLQkKLQkq cysrID0gJy0nOwotCi0JaWYgKChuID0gKChhLT5heDI1X2NhbGxbNl0gPj4gMSkgJiAweDBGKSkg PiA5KSB7Ci0JCSpzKysgPSAnMSc7Ci0JCW4gLT0gMTA7Ci0JfQotCQotCSpzKysgPSBuICsgJzAn OwotCSpzKysgPSAnXDAnOwotCi0JaWYgKCpidWYgPT0gJ1wwJyB8fCAqYnVmID09ICctJykKLQkg ICByZXR1cm4gIioiOwotCi0JcmV0dXJuIGJ1ZjsKLQotfQotI2VuZGlmIC8qIENPTkZJR19BWDI1 ICovCi0KKyNpZm5kZWYgQ09ORklHX1BST0NfRlMKK3N0YXRpYyBpbnQgYXJwX2dldF9pbmZvKGNo YXIgKmJ1ZmZlciwgY2hhciAqKnN0YXJ0LCBvZmZfdCBvZmZzZXQsIGludCBsZW5ndGgpIHsgcmV0 dXJuIDA7IH0KKyNlbHNlCisjaWYgZGVmaW5lZChDT05GSUdfQVgyNSkgfHwgZGVmaW5lZChDT05G SUdfQVgyNV9NT0RVTEUpCitzdGF0aWMgY2hhciAqYXgyYXNjMihheDI1X2FkZHJlc3MgKmEsIGNo 
YXIgKmJ1Zik7CisjZW5kaWYKICNkZWZpbmUgSEJVRkZFUkxFTiAzMAogCi1zdGF0aWMgdm9pZCBh cnBfZm9ybWF0X25laWdoX2VudHJ5KHN0cnVjdCBzZXFfZmlsZSAqc2VxLAotCQkJCSAgIHN0cnVj dCBuZWlnaGJvdXIgKm4pCitzdGF0aWMgaW50IGFycF9nZXRfaW5mbyhjaGFyICpidWZmZXIsIGNo YXIgKipzdGFydCwgb2ZmX3Qgb2Zmc2V0LCBpbnQgbGVuZ3RoKQogeworCWludCBsZW49MDsKKwlv ZmZfdCBwb3M9MDsKKwlpbnQgc2l6ZTsKIAljaGFyIGhidWZmZXJbSEJVRkZFUkxFTl07CisJaW50 IGksaixrOwogCWNvbnN0IGNoYXIgaGV4YnVmW10gPSAgIjAxMjM0NTY3ODlBQkNERUYiOwotCWlu dCBrLCBqOwotCWNoYXIgdGJ1ZlsxNl07Ci0Jc3RydWN0IG5ldF9kZXZpY2UgKmRldiA9IG4tPmRl djsKLQlpbnQgaGF0eXBlID0gZGV2LT50eXBlOwogCi0JcmVhZF9sb2NrKCZuLT5sb2NrKTsKKwlz aXplID0gc3ByaW50ZihidWZmZXIsIklQIGFkZHJlc3MgICAgICAgSFcgdHlwZSAgICAgRmxhZ3Mg ICAgICAgSFcgYWRkcmVzcyAgICAgICAgICAgIE1hc2sgICAgIERldmljZVxuIik7CiAKLQkvKiBD b252ZXJ0IGhhcmR3YXJlIGFkZHJlc3MgdG8gWFg6WFg6WFg6WFggLi4uIGZvcm0uICovCi0jaWYg ZGVmaW5lZChDT05GSUdfQVgyNSkgfHwgZGVmaW5lZChDT05GSUdfQVgyNV9NT0RVTEUpCi0JaWYg KGhhdHlwZSA9PSBBUlBIUkRfQVgyNSB8fCBoYXR5cGUgPT0gQVJQSFJEX05FVFJPTSkKLQkJYXgy YXNjMigoYXgyNV9hZGRyZXNzICopbi0+aGEsIGhidWZmZXIpOwotCWVsc2UgewotI2VuZGlmCi0J Zm9yIChrPTAsaj0wO2s8SEJVRkZFUkxFTi0zICYmIGo8ZGV2LT5hZGRyX2xlbjtqKyspIHsKLQkJ aGJ1ZmZlcltrKytdPWhleGJ1Zlsobi0+aGFbal0+PjQpJjE1IF07Ci0JCWhidWZmZXJbaysrXT1o ZXhidWZbbi0+aGFbal0mMTUgICAgIF07Ci0JCWhidWZmZXJbaysrXT0nOic7Ci0JfQotCWhidWZm ZXJbLS1rXT0wOwotI2lmIGRlZmluZWQoQ09ORklHX0FYMjUpIHx8IGRlZmluZWQoQ09ORklHX0FY MjVfTU9EVUxFKQotCX0KLSNlbmRpZgotCXNwcmludGYodGJ1ZiwgIiV1LiV1LiV1LiV1IiwgTklQ UVVBRCgqKHUzMiopbi0+cHJpbWFyeV9rZXkpKTsKLQlzZXFfcHJpbnRmKHNlcSwgIiUtMTZzIDB4 JS0xMHgweCUtMTB4JXMgICAgICogICAgICAgICVzXG4iLAotCQkgICB0YnVmLCBoYXR5cGUsIGFy cF9zdGF0ZV90b19mbGFncyhuKSwgaGJ1ZmZlciwgZGV2LT5uYW1lKTsKLQlyZWFkX3VubG9jaygm bi0+bG9jayk7Ci19CisJcG9zKz1zaXplOworCWxlbis9c2l6ZTsKIAotc3RhdGljIHZvaWQgYXJw X2Zvcm1hdF9wbmVpZ2hfZW50cnkoc3RydWN0IHNlcV9maWxlICpzZXEsCi0JCQkJICAgIHN0cnVj dCBwbmVpZ2hfZW50cnkgKm4pCi17Ci0Jc3RydWN0IG5ldF9kZXZpY2UgKmRldiA9IG4tPmRldjsK 
LQlpbnQgaGF0eXBlID0gZGV2ID8gZGV2LT50eXBlIDogMDsKLQljaGFyIHRidWZbMTZdOworCWZv cihpPTA7IGk8PU5FSUdIX0hBU0hNQVNLOyBpKyspIHsKKwkJc3RydWN0IG5laWdoYm91ciAqbjsK KwkJcmVhZF9sb2NrX2JoKCZhcnBfdGJsLmxvY2spOworCQlmb3IgKG49YXJwX3RibC5oYXNoX2J1 Y2tldHNbaV07IG47IG49bi0+bmV4dCkgeworCQkJc3RydWN0IG5ldF9kZXZpY2UgKmRldiA9IG4t PmRldjsKKwkJCWludCBoYXR5cGUgPSBkZXYtPnR5cGU7CisKKwkJCS8qIERvIG5vdCBjb25mdXNl IHVzZXJzICJhcnAgLWEiIHdpdGggbWFnaWMgZW50cmllcyAqLworCQkJaWYgKCEobi0+bnVkX3N0 YXRlJn5OVURfTk9BUlApKQorCQkJCWNvbnRpbnVlOwogCi0Jc3ByaW50Zih0YnVmLCAiJXUuJXUu JXUuJXUiLCBOSVBRVUFEKCoodTMyKiluLT5rZXkpKTsKLQlzZXFfcHJpbnRmKHNlcSwgIiUtMTZz IDB4JS0xMHgweCUtMTB4JXMgICAgICogICAgICAgICVzXG4iLAotCQkgICB0YnVmLCBoYXR5cGUs IEFURl9QVUJMIHwgQVRGX1BFUk0sICIwMDowMDowMDowMDowMDowMCIsCi0JCSAgIGRldiA/IGRl di0+bmFtZSA6ICIqIik7Ci19CisJCQlyZWFkX2xvY2soJm4tPmxvY2spOwogCi1zdGF0aWMgaW50 IGFycF9zZXFfc2hvdyhzdHJ1Y3Qgc2VxX2ZpbGUgKnNlcSwgdm9pZCAqdikKLXsKLQlpZiAodiA9 PSBTRVFfU1RBUlRfVE9LRU4pIHsKLQkJc2VxX3B1dHMoc2VxLCAiSVAgYWRkcmVzcyAgICAgICBI VyB0eXBlICAgICBGbGFncyAgICAgICAiCi0JCQkgICAgICAiSFcgYWRkcmVzcyAgICAgICAgICAg IE1hc2sgICAgIERldmljZVxuIik7Ci0JfSBlbHNlIHsKLQkJc3RydWN0IG5laWdoX3NlcV9zdGF0 ZSAqc3RhdGUgPSBzZXEtPnByaXZhdGU7Ci0KLQkJaWYgKHN0YXRlLT5mbGFncyAmIE5FSUdIX1NF UV9JU19QTkVJR0gpCi0JCQlhcnBfZm9ybWF0X3BuZWlnaF9lbnRyeShzZXEsIHYpOwotCQllbHNl Ci0JCQlhcnBfZm9ybWF0X25laWdoX2VudHJ5KHNlcSwgdik7Ci0JfQotCi0JcmV0dXJuIDA7Ci19 CisvKgorICoJQ29udmVydCBoYXJkd2FyZSBhZGRyZXNzIHRvIFhYOlhYOlhYOlhYIC4uLiBmb3Jt LgorICovCisjaWYgZGVmaW5lZChDT05GSUdfQVgyNSkgfHwgZGVmaW5lZChDT05GSUdfQVgyNV9N T0RVTEUpCisJCQlpZiAoaGF0eXBlID09IEFSUEhSRF9BWDI1IHx8IGhhdHlwZSA9PSBBUlBIUkRf TkVUUk9NKQorCQkJCWF4MmFzYzIoKGF4MjVfYWRkcmVzcyAqKW4tPmhhLCBoYnVmZmVyKTsKKwkJ CWVsc2UgeworI2VuZGlmCisJCQlmb3IgKGs9MCxqPTA7azxIQlVGRkVSTEVOLTMgJiYgajxkZXYt PmFkZHJfbGVuO2orKykgeworCQkJCWhidWZmZXJbaysrXT1oZXhidWZbKG4tPmhhW2pdPj40KSYx NSBdOworCQkJCWhidWZmZXJbaysrXT1oZXhidWZbbi0+aGFbal0mMTUgICAgIF07CisJCQkJaGJ1 
ZmZlcltrKytdPSc6JzsKKwkJCX0KKwkJCWhidWZmZXJbLS1rXT0wOwogCi1zdGF0aWMgdm9pZCAq YXJwX3NlcV9zdGFydChzdHJ1Y3Qgc2VxX2ZpbGUgKnNlcSwgbG9mZl90ICpwb3MpCi17Ci0JLyog RG9uJ3Qgd2FudCB0byBjb25mdXNlICJhcnAgLWEiIHcvIG1hZ2ljIGVudHJpZXMsCi0JICogc28g d2UgdGVsbCB0aGUgZ2VuZXJpYyBpdGVyYXRvciB0byBza2lwIE5VRF9OT0FSUC4KLQkgKi8KLQly ZXR1cm4gbmVpZ2hfc2VxX3N0YXJ0KHNlcSwgcG9zLCAmYXJwX3RibCwgTkVJR0hfU0VRX1NLSVBf Tk9BUlApOwotfQorI2lmIGRlZmluZWQoQ09ORklHX0FYMjUpIHx8IGRlZmluZWQoQ09ORklHX0FY MjVfTU9EVUxFKQorCQl9CisjZW5kaWYKIAotLyogLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tICovCisJCQl7CisJ CQkJY2hhciB0YnVmWzE2XTsKKwkJCQlzcHJpbnRmKHRidWYsICIldS4ldS4ldS4ldSIsIE5JUFFV QUQoKih1MzIqKW4tPnByaW1hcnlfa2V5KSk7CisJCQkJc2l6ZSA9IHNwcmludGYoYnVmZmVyK2xl biwgIiUtMTZzIDB4JS0xMHgweCUtMTB4JXMiCisJCQkJCQkJIiAgICAgKiAgICAgICAgJXNcbiIs CisJCQkJCXRidWYsCisJCQkJCWhhdHlwZSwKKwkJCQkJYXJwX3N0YXRlX3RvX2ZsYWdzKG4pLCAK KwkJCQkJaGJ1ZmZlciwKKwkJCQkJZGV2LT5uYW1lKTsKKwkJCX0KIAotc3RhdGljIHN0cnVjdCBz ZXFfb3BlcmF0aW9ucyBhcnBfc2VxX29wcyA9IHsKLQkuc3RhcnQJPSBhcnBfc2VxX3N0YXJ0LAot CS5uZXh0CT0gbmVpZ2hfc2VxX25leHQsCi0JLnN0b3AJPSBuZWlnaF9zZXFfc3RvcCwKLQkuc2hv dwk9IGFycF9zZXFfc2hvdywKLX07CisJCQlyZWFkX3VubG9jaygmbi0+bG9jayk7CiAKLXN0YXRp YyBpbnQgYXJwX3NlcV9vcGVuKHN0cnVjdCBpbm9kZSAqaW5vZGUsIHN0cnVjdCBmaWxlICpmaWxl KQotewotCXN0cnVjdCBzZXFfZmlsZSAqc2VxOwotCWludCByYyA9IC1FTk9NRU07Ci0Jc3RydWN0 IG5laWdoX3NlcV9zdGF0ZSAqcyA9IGttYWxsb2Moc2l6ZW9mKCpzKSwgR0ZQX0tFUk5FTCk7CisJ CQlsZW4gKz0gc2l6ZTsKKwkJCXBvcyArPSBzaXplOworCQkgIAorCQkJaWYgKHBvcyA8PSBvZmZz ZXQpCisJCQkJbGVuPTA7CisJCQlpZiAocG9zID49IG9mZnNldCtsZW5ndGgpIHsKKwkJCQlyZWFk X3VubG9ja19iaCgmYXJwX3RibC5sb2NrKTsKKyAJCQkJZ290byBkb25lOworCQkJfQorCQl9CisJ CXJlYWRfdW5sb2NrX2JoKCZhcnBfdGJsLmxvY2spOworCX0KIAotCWlmICghcykKLQkJZ290byBv dXQ7CisJZm9yIChpPTA7IGk8PVBORUlHSF9IQVNITUFTSzsgaSsrKSB7CisJCXN0cnVjdCBwbmVp Z2hfZW50cnkgKm47CisJCWZvciAobj1hcnBfdGJsLnBoYXNoX2J1Y2tldHNbaV07IG47IG49bi0+ 
bmV4dCkgeworCQkJc3RydWN0IG5ldF9kZXZpY2UgKmRldiA9IG4tPmRldjsKKwkJCWludCBoYXR5 cGUgPSBkZXYgPyBkZXYtPnR5cGUgOiAwOworCisJCQl7CisJCQkJY2hhciB0YnVmWzE2XTsKKwkJ CQlzcHJpbnRmKHRidWYsICIldS4ldS4ldS4ldSIsIE5JUFFVQUQoKih1MzIqKW4tPmtleSkpOwor CQkJCXNpemUgPSBzcHJpbnRmKGJ1ZmZlcitsZW4sICIlLTE2cyAweCUtMTB4MHglLTEweCVzIgor CQkJCQkJCSIgICAgICogICAgICAgICVzXG4iLAorCQkJCQl0YnVmLAorCQkJCQloYXR5cGUsCisg CQkJCQlBVEZfUFVCTHxBVEZfUEVSTSwKKwkJCQkJIjAwOjAwOjAwOjAwOjAwOjAwIiwKKwkJCQkJ ZGV2ID8gZGV2LT5uYW1lIDogIioiKTsKKwkJCX0KIAotCW1lbXNldChzLCAwLCBzaXplb2YoKnMp KTsKLQlyYyA9IHNlcV9vcGVuKGZpbGUsICZhcnBfc2VxX29wcyk7Ci0JaWYgKHJjKQotCQlnb3Rv IG91dF9rZnJlZTsKKwkJCWxlbiArPSBzaXplOworCQkJcG9zICs9IHNpemU7CisJCSAgCisJCQlp ZiAocG9zIDw9IG9mZnNldCkKKwkJCQlsZW49MDsKKwkJCWlmIChwb3MgPj0gb2Zmc2V0K2xlbmd0 aCkKKwkJCQlnb3RvIGRvbmU7CisJCX0KKwl9CiAKLQlzZXEgPSBmaWxlLT5wcml2YXRlX2RhdGE7 Ci0Jc2VxLT5wcml2YXRlID0gczsKLW91dDoKLQlyZXR1cm4gcmM7Ci1vdXRfa2ZyZWU6Ci0Ja2Zy ZWUocyk7Ci0JZ290byBvdXQ7Citkb25lOgorICAKKwkqc3RhcnQgPSBidWZmZXIrbGVuLShwb3Mt b2Zmc2V0KTsJLyogU3RhcnQgb2Ygd2FudGVkIGRhdGEgKi8KKwlsZW4gPSBwb3Mtb2Zmc2V0OwkJ CS8qIFN0YXJ0IHNsb3AgKi8KKwlpZiAobGVuPmxlbmd0aCkKKwkJbGVuID0gbGVuZ3RoOwkJCS8q IEVuZGluZyBzbG9wICovCisJaWYgKGxlbjwwKQorCQlsZW4gPSAwOworCXJldHVybiBsZW47CiB9 Ci0KLXN0YXRpYyBzdHJ1Y3QgZmlsZV9vcGVyYXRpb25zIGFycF9zZXFfZm9wcyA9IHsKLQkub3du ZXIJCT0gVEhJU19NT0RVTEUsCi0JLm9wZW4JCT0gYXJwX3NlcV9vcGVuLAotCS5yZWFkCQk9IHNl cV9yZWFkLAotCS5sbHNlZWsJCT0gc2VxX2xzZWVrLAotCS5yZWxlYXNlCT0gc2VxX3JlbGVhc2Vf cHJpdmF0ZSwKLX07Ci0jZW5kaWYgLyogQ09ORklHX1BST0NfRlMgKi8KKyNlbmRpZgogCiBzdGF0 aWMgaW50IGFycF9uZXRkZXZfZXZlbnQoc3RydWN0IG5vdGlmaWVyX2Jsb2NrICp0aGlzLCB1bnNp Z25lZCBsb25nIGV2ZW50LCB2b2lkICpwdHIpCiB7CkBAIC0xMzk1LDEwICsxMzU1LDggQEAKIAog CWRldl9hZGRfcGFjaygmYXJwX3BhY2tldF90eXBlKTsKIAotI2lmZGVmIENPTkZJR19QUk9DX0ZT Ci0JaWYgKCFwcm9jX25ldF9mb3BzX2NyZWF0ZSgiYXJwIiwgU19JUlVHTywgJmFycF9zZXFfZm9w cykpCi0JCXBhbmljKCJ1bmFibGUgdG8gY3JlYXRlIGFycCBwcm9jIGVudHJ5Iik7Ci0jZW5kaWYK 
Kwlwcm9jX25ldF9jcmVhdGUgKCJhcnAiLCAwLCBhcnBfZ2V0X2luZm8pOworCiAjaWZkZWYgQ09O RklHX1NZU0NUTAogCW5laWdoX3N5c2N0bF9yZWdpc3RlcihOVUxMLCAmYXJwX3RibC5wYXJtcywg TkVUX0lQVjQsIE5FVF9JUFY0X05FSUdILCAiaXB2NCIpOwogI2VuZGlmCkBAIC0xNDA2LDMgKzEz NjQsMzkgQEAKIH0KIAogCisjaWZkZWYgQ09ORklHX1BST0NfRlMKKyNpZiBkZWZpbmVkKENPTkZJ R19BWDI1KSB8fCBkZWZpbmVkKENPTkZJR19BWDI1X01PRFVMRSkKKworLyoKKyAqCWF4MjUgLT4g QVNDSUkgY29udmVyc2lvbgorICovCitjaGFyICpheDJhc2MyKGF4MjVfYWRkcmVzcyAqYSwgY2hh ciAqYnVmKQoreworCWNoYXIgYywgKnM7CisJaW50IG47CisKKwlmb3IgKG4gPSAwLCBzID0gYnVm OyBuIDwgNjsgbisrKSB7CisJCWMgPSAoYS0+YXgyNV9jYWxsW25dID4+IDEpICYgMHg3RjsKKwor CQlpZiAoYyAhPSAnICcpICpzKysgPSBjOworCX0KKwkKKwkqcysrID0gJy0nOworCisJaWYgKChu ID0gKChhLT5heDI1X2NhbGxbNl0gPj4gMSkgJiAweDBGKSkgPiA5KSB7CisJCSpzKysgPSAnMSc7 CisJCW4gLT0gMTA7CisJfQorCQorCSpzKysgPSBuICsgJzAnOworCSpzKysgPSAnXDAnOworCisJ aWYgKCpidWYgPT0gJ1wwJyB8fCAqYnVmID09ICctJykKKwkgICByZXR1cm4gIioiOworCisJcmV0 dXJuIGJ1ZjsKKworfQorCisjZW5kaWYKKyNlbmRpZgpkaWZmIC1OdXIgbGludXgtMi40LjI4L25l dC9uZXRzeW1zLmMgbGludXgtMi40LjI4Lm5ldy9uZXQvbmV0c3ltcy5jCi0tLSBsaW51eC0yLjQu MjgvbmV0L25ldHN5bXMuYwkyMDA0LTExLTE3IDEyOjE1OjMyLjAwMDAwMDAwMCAtMDUwMAorKysg bGludXgtMi40LjI4Lm5ldy9uZXQvbmV0c3ltcy5jCTIwMDQtMTItMDkgMDk6NTQ6MjcuMDAwMDAw MDAwIC0wNTAwCkBAIC0xNzksMTMgKzE3OSw5IEBACiBFWFBPUlRfU1lNQk9MKG5laWdoX3VwZGF0 ZSk7CiBFWFBPUlRfU1lNQk9MKG5laWdoX2NyZWF0ZSk7CiBFWFBPUlRfU1lNQk9MKG5laWdoX2xv b2t1cCk7Ci1FWFBPUlRfU1lNQk9MKG5laWdoX2xvb2t1cF9ub2Rldik7CiBFWFBPUlRfU1lNQk9M KF9fbmVpZ2hfZXZlbnRfc2VuZCk7CiBFWFBPUlRfU1lNQk9MKG5laWdoX2V2ZW50X25zKTsKIEVY UE9SVF9TWU1CT0wobmVpZ2hfaWZkb3duKTsKLUVYUE9SVF9TWU1CT0wobmVpZ2hfc2VxX3N0YXJ0 KTsKLUVYUE9SVF9TWU1CT0wobmVpZ2hfc2VxX25leHQpOwotRVhQT1JUX1NZTUJPTChuZWlnaF9z ZXFfc3RvcCk7CiAjaWZkZWYgQ09ORklHX0FSUEQKIEVYUE9SVF9TWU1CT0wobmVpZ2hfYXBwX25z KTsKICNlbmRpZgo= --b1_da999e80abd36631c9c60f8e897c4cde-- From hadi@cyberus.ca Mon Dec 13 11:24:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 11:24:52 -0800 (PST) Received: from mx02.cybersurf.com 
(mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDJOPew018077 for ; Mon, 13 Dec 2004 11:24:45 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cdvnb-0007Nj-6u for netdev@oss.sgi.com; Mon, 13 Dec 2004 14:23:59 -0500 Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CdvnP-0004Ik-SO; Mon, 13 Dec 2004 14:23:47 -0500 Subject: Re: [RFC] tcf_bind_filter failure handling From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , "David S. Miller" , Herbert Xu , netdev@oss.sgi.com In-Reply-To: <20041213185203.GF8493@postel.suug.ch> References: <20041110010113.GJ31969@postel.suug.ch> <41916A91.3080107@trash.net> <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> <41BC8819.7040501@trash.net> <20041213165302.GE8493@postel.suug.ch> <41BDDB5A.9000907@trash.net> <20041213185203.GF8493@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1102965823.1075.14.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 13 Dec 2004 14:23:43 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12713 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev When/why would binding fail? tcindex is an exception. I dont see binding as having any contribution to the error path. Additional locking is not advisable. The binding could happen while traffic is running. 

cheers,
jamal

On Mon, 2004-12-13 at 13:52, Thomas Graf wrote:
> * Patrick McHardy <41BDDB5A.9000907@trash.net> 2004-12-13 19:11
> > Thomas Graf wrote:
> >
> > >The handling of a failure in tcf_bind_filter is inconsistent.
> > >
> > >u32: ignore
> > >fw: ignore
> > >route: ignore
> > >rsvp: ignore
> > >tcindex: error
> > >
> > >It might be a good idea to make this consistent. So in order to validate
> > >the classid before making any changes we could simply lock it via get
> > >(see patch below), return an error if it fails and put it back in case
> > >of an error further in the path or after binding the filter.
> > >
> > >Binding not only locks the class from removal while a filter is
> > >pointing to it. It also speeds up classifying by saving a lookup for every
> > >tc_classify call. It's not really a problem if the class is not locked,
> > >the qdisc will look it up and fall back to a default class if it
> > >doesn't exist, so it's rather a cosmetic/policy thing.
> > >
> > You should just fix tcindex not to care about errors in tcf_bind_filter.
> > bind_tcf already locks the class. Some qdiscs (like prio) map bind_filter
> > to get, but others (HTB, HFSC, CBQ) use a separate counter because it is
> > legal to end up with a refcnt > 0 after delete. When an attempt is made to
> > destroy a class with filters pointing to it, they return -EBUSY, which
> > can't be done by looking at the refcnt.
>
> Little misunderstanding here. I'm not aiming at replacing tcf_bind_filter
> with get. My question is rather whether to regard tcf_bind_filter not setting
> tcf_result->class as an error or ignore it.
>
> I'm all for ignoring it in tcindex. It requires some changes, because
> tcindex checks the tcf_result.class field to see if a hash bucket is
> non-empty when perfect hashing is used, but that is not a problem at all.
>
> The tcf_class_get/put would be required to ensure proper locking during
> validation of parameters if validating the classid last, just before
> changing things, doesn't make sense due to the need to undo expensive
> operations required before binding.
>
> I will fix tcindex, since you also agree on simply ignoring it and regard
> the binding as an optional locking and performance increase possibility
> given to userspace.
>
>

From tgraf@suug.ch Mon Dec 13 11:32:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 11:33:00 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDJWWaw018779 for ; Mon, 13 Dec 2004 11:32:53 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 3C854F; Mon, 13 Dec 2004 20:31:47 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id D98231C0EA; Mon, 13 Dec 2004 20:32:28 +0100 (CET) Date: Mon, 13 Dec 2004 20:32:28 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , "David S.
Miller" , Herbert Xu , netdev@oss.sgi.com Subject: Re: [RFC] tcf_bind_filter failure handling Message-ID: <20041213193228.GG8493@postel.suug.ch> References: <20041110012251.GK31969@postel.suug.ch> <41916F0B.5010809@trash.net> <20041110013941.GL31969@postel.suug.ch> <41917330.6090002@trash.net> <20041212175736.GA8493@postel.suug.ch> <41BC8819.7040501@trash.net> <20041213165302.GE8493@postel.suug.ch> <41BDDB5A.9000907@trash.net> <20041213185203.GF8493@postel.suug.ch> <1102965823.1075.14.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1102965823.1075.14.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12714 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

* jamal <1102965823.1075.14.camel@jzny.localdomain> 2004-12-13 14:23
>
> When/why would binding fail? tcindex is an exception.
> I dont see binding as having any contribution to the error path.
> Additional locking is not advisable. The binding could happen while
> traffic is running.

The binding fails if the class does not exist at the time the classifier
is loaded. The original implementation regarded binding as optional: it
does the classid -> class lookup only once while loading instead of
every time classify() returns. tcindex does not do so because it uses the
class field to determine whether a perfect hash bucket is used or not.
I changed it to check classid || police || action because one of them is
definitely defined and enforced as a requirement during validation.

The locking I mentioned was not a spinlock but rather a refcnt++ in the
class intended to be bound later during change, i.e. to ensure a qdisc
destroy rcu callback can't destroy the class while we're about to bind to it.
But since we agree on changing tcindex this is not an issue and nothing will change. Sorry for the confusion, my word choice was rather bad for such a simple issue and produced more confusion than results. From halr@voltaire.com Mon Dec 13 12:22:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 12:22:48 -0800 (PST) Received: from taurus.voltaire.com (Volter-FW.ser.netvision.net.il [212.143.107.30]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDKMJiO024123 for ; Mon, 13 Dec 2004 12:22:41 -0800 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: RE: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand)driver X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Date: Mon, 13 Dec 2004 22:19:09 +0200 Message-ID: <5CE025EE7D88BA4599A2C8FEFCF226F5175B0C@taurus.voltaire.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand)driver Thread-Index: AcThP4iCp3eUgFs6Q32yzxzs66Z4HwAEXg88 From: "Hal Rosenstock" To: "Roland Dreier" , Cc: , X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBDKMJiO024123 X-archive-position: 12715 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: halr@voltaire.com Precedence: bulk X-list: netdev Hi Roland, [You wrote:] > The IPoIB protocol/encapsulation is described in the Internet-Drafts > http://www.ietf.org/internet-drafts/draft-ietf-ipoib-architecture-04.txt > http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-07.txt The latest I-D is now http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Also, isn't DHCP over IB 
(http://www.ietf.org/internet-drafts/draft-ietf-ipoib-dhcp-over-infiniband-07.txt)
also supported? If so, is that part of this or some other patch being submitted?

Thanks.

-- Hal

From kazunori@miyazawa.org Mon Dec 13 15:52:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 15:52:32 -0800 (PST) Received: from miyazawa.org (usen-221x116x13x66.ap-US01.usen.ad.jp [221.116.13.66]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBDNpsaW032223 for ; Mon, 13 Dec 2004 15:52:24 -0800 Received: from [10.21.7.64] (cp.64translator.com [::ffff:202.214.123.2]) (AUTH: LOGIN kazunori, SSL: TLSv1/SSLv3,128bits,RC4-MD5) by miyazawa.org with esmtp; Tue, 14 Dec 2004 08:48:05 +0900 id 00000916.41BE2A35.00004066 Message-ID: <41BE2AAB.1070904@miyazawa.org> Date: Tue, 14 Dec 2004 08:50:03 +0900 From: Kazunori Miyazawa User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913) X-Accept-Language: ja, en-us, en MIME-Version: 1.0 To: davem@redhat.com, herbert@gondor.apana.org.au CC: netdev@oss.sgi.com, usagi-core@linux-ipv6.org Subject: a question about XFRM_POLICY_FWD Content-Type: text/plain; charset=ISO-2022-JP Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12716 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kazunori@miyazawa.org Precedence: bulk X-list: netdev

Hello,

I would like to ask about the meaning of XFRM_POLICY_FWD, although this
question may have been asked before.

I configured an IPsec tunnel and thought XFRM_POLICY_FWD might be confusing.
A forwarding policy, which is represented by XFRM_POLICY_FWD in the kernel,
affects only in-coming forwarded packets. The policy does not affect
out-going forwarded packets or input packets to the security gateway itself.

If we want to connect the networks 2001:DB8:1::/64 and 2001:DB8:3::/64 with
SGW(A) and SGW(B),

---------------SGW(A)==================SGW(B)---------
2001:DB8:1::/64     2001:DB8:2::/64     2001:DB8:3::/64

=== represents the IPsec tunnel
--- represents a network behind an SGW.

The addresses of each SGW are:

SGW(A) internal address 2001:DB8:1::A/64
SGW(A) external address 2001:DB8:2::A/64
SGW(B) external address 2001:DB8:2::B/64
SGW(B) internal address 2001:DB8:3::B/64

The policies we need to configure in SGW(A), represented as setkey
commands, are

spdadd 2001:DB8:1::/64 2001:DB8:3::/64 any -P out ipsec
	esp/tunnel/2001:DB8:2::A-2001:DB8:2::B/require;
spdadd 2001:DB8:3::/64 2001:DB8:1::/64 any -P fwd ipsec
	esp/tunnel/2001:DB8:2::B-2001:DB8:2::A/require;

However, the above policies do not allow SGW(A) to receive packets from
2001:DB8:3::/64 to 2001:DB8:1::A/64 because there is no policy for "INPUT".
To let the packets reach 2001:DB8:1::A, we need an additional policy

spdadd 2001:DB8:3::/64 2001:DB8:1::A/128 any -P in ipsec
	esp/tunnel/2001:DB8:2::B-2001:DB8:2::A/require;

In total, we need three policies for this configuration in one SGW.

From the point of view of a user or administrator, I wonder why the forward
policy does not allow the packet even though 2001:DB8:1::A is included in
network 2001:DB8:1::/64. I also wonder why I cannot configure an out-going
forward policy with "fwd".

Anyway, XFRM_POLICY_FWD or "fwd" might be confusing.
What does XFRM_POLICY_FWD or direction="forward" mean in the
architecture design?

Of course I know the implementation :-p

P.S.
IMHO, we should either remove or obsolete XFRM_POLICY_FWD and use
XFRM_POLICY_IN instead, or look up out-going forwarded packets
with XFRM_POLICY_FWD.
-- Kazunori Miyazawa From roland@topspin.com Mon Dec 13 16:15:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 16:15:47 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE0FIJN001511 for ; Mon, 13 Dec 2004 16:15:38 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 16:14:55 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 16:14:55 -0800 Received: from roland by eddore with local (Exim 4.34) id 1Ce0L8-0007r8-QT; Mon, 13 Dec 2004 16:14:55 -0800 To: "Hal Rosenstock" Cc: , , X-Message-Flag: Warning: May contain useful information References: <5CE025EE7D88BA4599A2C8FEFCF226F5175B0C@taurus.voltaire.com> From: Roland Dreier Date: Mon, 13 Dec 2004 16:14:54 -0800 In-Reply-To: <5CE025EE7D88BA4599A2C8FEFCF226F5175B0C@taurus.voltaire.com> (Hal Rosenstock's message of "Mon, 13 Dec 2004 22:19:09 +0200") Message-ID: <52sm694txd.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [openib-general] [PATCH][v3][17/21] Add IPoIB (IP-over-InfiniBand)driver Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 14 Dec 2004 00:14:55.0311 (UTC) FILETIME=[F0589DF0:01C4E171] X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12717 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Hal> The latest I-D is now Hal> 
http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt

Thanks, I'll correct this.

Hal> Also, isn't DHCP over IB
Hal> (http://www.ietf.org/internet-drafts/draft-ietf-ipoib-dhcp-over-infiniband-07.txt)
Hal> also supported ? If so, is that part of this or some other
Hal> patch being submitted ?

DHCP should work but I don't think it's a kernel issue (I don't think kernel DHCP for NFS root over IPoIB will work unfortunately).

- R.

From random@72616e646f6d20323030342d30342d31360a.nosense.org Mon Dec 13 16:23:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 16:23:42 -0800 (PST) Received: from ubu.nosense.org (149.cust1.sa.dsl.ozemail.com.au [210.84.224.149]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE0N9a1010471 for ; Mon, 13 Dec 2004 16:23:30 -0800 Received: from ubu.nosense.org (ubu.nosense.org [127.0.0.1]) by ubu.nosense.org (Postfix) with SMTP id C374F62A9F for ; Tue, 14 Dec 2004 10:52:45 +1030 (CST) Date: Tue, 14 Dec 2004 10:52:45 +1030 From: Mark Smith To: netdev@oss.sgi.com Subject: [2.6.9] Networking crash, slightly exotic setup, bridged tap/tun interfaces Message-Id: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> Organization: The No Sense Organisation X-Mailer: Sylpheed version 1.0.0beta1 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/624/Thu Dec 9 11:01:06 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12718 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: random@72616e646f6d20323030342d30342d31360a.nosense.org Precedence: bulk X-list: netdev

Hi,

I'm playing around with using QEMU to emulate virtual routers. I'm using QEMU (0.6.1) in full PC emulation mode, running multiple instances of a base Ubuntu 4.10 install.
Within QEMU, the virtual PC emulates one or more PCI based NE2K NICs, which can be attached to tap/tun type devices on the host kernel.

In each instance of QEMU, I'm running Quagga 0.97.3. At the time of the crash I was running OSPF across the virtual PCI NE2K NICs. I don't know if it matters to this problem, but OSPF uses layer 2 and 3 unicast and multicast capabilities.

On my main host, I'd bridged together the multiple tun interfaces. I'm also running Quagga 0.97.3 on the host kernel, and had then enabled OSPF on the bridge interface. I was also running the command line version of ethereal, tethereal, to watch the OSPF traffic across the virtual LAN. It seemed to be working ok, although I was having some problems with OSPF database synchronisation across the virtual LAN. That could be a bug in Quagga OSPF though.

I do find it a bit odd that QEMU uses tun interfaces, supposedly for IP only, rather than tap interfaces to connect the virtual NE2K NICs to the host. These tun interfaces seem to support layer 2 fine - I was seeing layer 2 traffic across them, such as ARPs, IPv6 RS/ND traffic and STP BPDUs.

My kernel is only slightly modified from vanilla 2.6.9 - I've increased the kernel log buffer size to allow me to capture boot messages I lose because of my RAID1 arrays. I'm happy to provide a diff against vanilla 2.6.9 if people think that could be a problem.

After this setup had been operating for a while, I received the following Oops from the host kernel, and thought I should report it.

(As a side note, do I need to run it through ksymoops any more? I installed ksymoops, however I don't have a /proc/ksyms file. I tried to use /proc/kallsyms, but that didn't work either. My understanding is that the purpose of ksymoops is to list the functions being called at the time of the Oops; it would appear that the kernel does that automatically now.)

If there is anything further I can do to help diagnose this problem, please let me know.

Thanks, Mark.
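On the ksymoops side note: the log that follows already shows addresses resolved into entries of the form symbol+offset/size, for example fib_release_info+0x8b/0xf0, where the bracketed copy [fib_release_info+139/240] gives the same two numbers in decimal. As a small illustration of that format only, the sketch below splits one such entry into its parts. `struct trace_entry` and `parse_trace_entry()` are names made up for this example; they are not part of ksymoops or the kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for one decoded call-trace entry. */
struct trace_entry {
    char symbol[64];      /* function name */
    unsigned int offset;  /* bytes from the start of the function */
    unsigned int size;    /* total size of the function in bytes */
};

/*
 * Parse a string such as "fib_release_info+0x8b/0xf0".
 * Returns 0 on success, -1 if the text does not have that shape.
 */
static int parse_trace_entry(const char *text, struct trace_entry *out)
{
    assert(text != NULL && out != NULL);

    /* Start from a defined state so a failed parse leaves no stale data. */
    memset(out, 0, sizeof(*out));

    /* %63[^+] copies up to 63 characters before the '+';
     * %x accepts an optional 0x prefix on the hexadecimal fields. */
    if (sscanf(text, "%63[^+]+%x/%x",
               out->symbol, &out->offset, &out->size) != 3)
        return -1;
    return 0;
}
```

Feeding it "fib_release_info+0x8b/0xf0" yields offset 139 and size 240, matching the decimal form printed in the EIP line of the oops.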
-- 00:22:24 ubu kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000 00:22:24 ubu kernel: printing eip: 00:22:24 ubu kernel: c035322b 00:22:24 ubu kernel: *pde = 00000000 00:22:24 ubu kernel: Oops: 0002 [#1] 00:22:24 ubu kernel: PREEMPT 00:22:24 ubu kernel: Modules linked in: dummy bridge atm snd_seq_oss snd_seq_midi snd_seq_midi_event snd_opl3_synth snd_seq_instr snd_seq_midi_emul snd_ainstr_fm snd_seq tun nls_cp437 isofs udf mga cpufreq_userspace cpufreq_powersave cpufreq_ondemand thermal fan button processor pppoe pppox ppp_generic slhc irda_usb irda crc_ccitt snd_bt87x tuner tvaudio msp3400 bttv video_buf firmware_class i2c_algo_bit v4l2_common btcx_risc videodev natsemi epic100 mii crc32 snd_es18xx snd_pcm_oss snd_mixer_oss snd_pcm snd_page_alloc snd_opl3_lib snd_timer snd_hwdep snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore evdev eeprom i2c_sensor ir_kbd_i2c ir_common i2c_piix4 parport_pc lp parport 00:22:24 ubu kernel: CPU: 0 00:22:24 ubu kernel: EIP: 0060:[fib_release_info+139/240] Not tainted VLI 00:22:24 ubu kernel: EFLAGS: 00210246 (2.6.9-mrs) 00:22:24 ubu kernel: EIP is at fib_release_info+0x8b/0xf0 00:22:24 ubu kernel: eax: 00000000 ebx: ca4834a0 ecx: 00000000 edx: ca483504 00:22:24 ubu kernel: esi: ca483508 edi: 00000000 ebp: d6a32710 esp: cfe1fc60 00:22:24 ubu kernel: ds: 007b es: 007b ss: 0068 00:22:24 ubu kernel: Process zebra (pid: 14501, threadinfo=cfe1e000 task=d6a65560) 00:22:24 ubu kernel: Stack: 00000001 c6607bb0 d6fad5e8 c0355e3b 00000008 000000fe c6607ba0 c3435d78 00:22:24 ubu kernel: 003c7e00 c9103ca0 00000008 d6fad5e0 c1379e20 d69a5bc0 0000000a d69a5bc0 00:22:24 ubu kernel: c6607bb0 c6607ba0 c1379e20 c0352a95 c6607ba0 c3435d78 c3435d40 c3435d40 00:22:24 ubu kernel: Call Trace: 00:22:24 ubu kernel: [fn_hash_delete+443/608] fn_hash_delete+0x1bb/0x260 00:22:24 ubu kernel: [inet_rtm_delroute+101/128] inet_rtm_delroute+0x65/0x80 00:22:24 ubu kernel: [inet_rtm_delroute+0/128] inet_rtm_delroute+0x0/0x80 
00:22:24 ubu kernel: [rtnetlink_rcv+719/896] rtnetlink_rcv+0x2cf/0x380 00:22:24 ubu kernel: [copy_to_user+50/80] copy_to_user+0x32/0x50 00:22:24 ubu kernel: [rtnetlink_rcv+0/896] rtnetlink_rcv+0x0/0x380 00:22:24 ubu kernel: [netlink_data_ready+85/96] netlink_data_ready+0x55/0x60 00:22:24 ubu kernel: [netlink_sendskb+33/80] netlink_sendskb+0x21/0x50 00:22:24 ubu kernel: [netlink_sendmsg+465/720] netlink_sendmsg+0x1d1/0x2d0 00:22:24 ubu kernel: [activate_task+90/112] activate_task+0x5a/0x70 00:22:24 ubu kernel: [sock_sendmsg+190/224] sock_sendmsg+0xbe/0xe0 00:22:24 ubu kernel: [group_send_sig_info+141/144] group_send_sig_info+0x8d/0x90 00:22:24 ubu kernel: [__mod_timer+300/384] __mod_timer+0x12c/0x180 00:22:24 ubu kernel: [recalc_task_prio+151/400] recalc_task_prio+0x97/0x190 00:22:24 ubu kernel: [schedule+668/1248] schedule+0x29c/0x4e0 00:22:24 ubu kernel: [__do_softirq+121/144] __do_softirq+0x79/0x90 00:22:24 ubu kernel: [need_resched+39/50] need_resched+0x27/0x32 00:22:24 ubu kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50 00:22:24 ubu kernel: [copy_from_user+52/112] copy_from_user+0x34/0x70 00:22:24 ubu kernel: [verify_iovec+42/144] verify_iovec+0x2a/0x90 00:22:24 ubu kernel: [sys_sendmsg+337/416] sys_sendmsg+0x151/0x1a0 00:22:24 ubu kernel: [do_sync_write+154/208] do_sync_write+0x9a/0xd0 00:22:24 ubu kernel: [__group_send_sig_info+119/160] __group_send_sig_info+0x77/0xa0 00:22:24 ubu kernel: [__mod_timer+300/384] __mod_timer+0x12c/0x180 00:22:24 ubu kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50 00:22:24 ubu kernel: [sys_socketcall+562/592] sys_socketcall+0x232/0x250 00:22:24 ubu kernel: [capable+21/48] capable+0x15/0x30 00:22:24 ubu kernel: [sysenter_past_esp+82/113] sysenter_past_esp+0x52/0x71 00:22:24 ubu kernel: Code: 00 c7 41 04 00 02 20 00 31 ff 8d 53 64 3b 7b 5c 7d 37 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 8b 42 04 8d 72 04 8b 4e 04 85 c0 <89> 01 74 03 89 48 04 c7 42 04 00 01 10 00 47 83 c2 2c c7 46 
04 00:22:24 ubu kernel: <6>note: zebra[14501] exited with preempt_count 1 00:22:24 ubu kernel: bad: scheduling while atomic! 00:22:24 ubu kernel: [schedule+1228/1248] schedule+0x4cc/0x4e0 00:22:24 ubu kernel: [zap_pmd_range+75/112] zap_pmd_range+0x4b/0x70 00:22:24 ubu kernel: [unmap_page_range+61/112] unmap_page_range+0x3d/0x70 00:22:24 ubu kernel: [unmap_vmas+427/448] unmap_vmas+0x1ab/0x1c0 00:22:24 ubu kernel: [exit_mmap+116/336] exit_mmap+0x74/0x150 00:22:24 ubu kernel: [mmput+85/144] mmput+0x55/0x90 00:22:24 ubu kernel: [do_exit+323/1008] do_exit+0x143/0x3f0 00:22:24 ubu kernel: [die+365/368] die+0x16d/0x170 00:22:24 ubu kernel: [do_page_fault+0/1390] do_page_fault+0x0/0x56e 00:22:24 ubu kernel: [do_page_fault+0/1390] do_page_fault+0x0/0x56e 00:22:24 ubu kernel: [do_page_fault+543/1390] do_page_fault+0x21f/0x56e 00:22:24 ubu kernel: [schedule+668/1248] schedule+0x29c/0x4e0 00:22:24 ubu kernel: [netlink_broadcast+471/656] netlink_broadcast+0x1d7/0x290 00:22:24 ubu kernel: [do_page_fault+0/1390] do_page_fault+0x0/0x56e 00:22:24 ubu kernel: [error_code+45/56] error_code+0x2d/0x38 00:22:24 ubu kernel: [fib_release_info+139/240] fib_release_info+0x8b/0xf0 00:22:24 ubu kernel: [fn_hash_delete+443/608] fn_hash_delete+0x1bb/0x260 00:22:24 ubu kernel: [inet_rtm_delroute+101/128] inet_rtm_delroute+0x65/0x80 00:22:24 ubu kernel: [inet_rtm_delroute+0/128] inet_rtm_delroute+0x0/0x80 00:22:24 ubu kernel: [rtnetlink_rcv+719/896] rtnetlink_rcv+0x2cf/0x380 00:22:24 ubu kernel: [copy_to_user+50/80] copy_to_user+0x32/0x50 00:22:24 ubu kernel: [rtnetlink_rcv+0/896] rtnetlink_rcv+0x0/0x380 00:22:24 ubu kernel: [netlink_data_ready+85/96] netlink_data_ready+0x55/0x60 00:22:24 ubu kernel: [netlink_sendskb+33/80] netlink_sendskb+0x21/0x50 00:22:24 ubu kernel: [netlink_sendmsg+465/720] netlink_sendmsg+0x1d1/0x2d0 00:22:24 ubu kernel: [activate_task+90/112] activate_task+0x5a/0x70 00:22:24 ubu kernel: [sock_sendmsg+190/224] sock_sendmsg+0xbe/0xe0 00:22:24 ubu kernel: 
[group_send_sig_info+141/144] group_send_sig_info+0x8d/0x90 00:22:24 ubu kernel: [__mod_timer+300/384] __mod_timer+0x12c/0x180 00:22:24 ubu kernel: [recalc_task_prio+151/400] recalc_task_prio+0x97/0x190 00:22:24 ubu kernel: [schedule+668/1248] schedule+0x29c/0x4e0 00:22:24 ubu kernel: [__do_softirq+121/144] __do_softirq+0x79/0x90 00:22:24 ubu kernel: [need_resched+39/50] need_resched+0x27/0x32 00:22:24 ubu kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50 00:22:24 ubu kernel: [copy_from_user+52/112] copy_from_user+0x34/0x70 00:22:24 ubu kernel: [verify_iovec+42/144] verify_iovec+0x2a/0x90 00:22:24 ubu kernel: [sys_sendmsg+337/416] sys_sendmsg+0x151/0x1a0 00:22:24 ubu kernel: [do_sync_write+154/208] do_sync_write+0x9a/0xd0 00:22:24 ubu kernel: [__group_send_sig_info+119/160] __group_send_sig_info+0x77/0xa0 00:22:24 ubu kernel: [__mod_timer+300/384] __mod_timer+0x12c/0x180 00:22:24 ubu kernel: [autoremove_wake_function+0/80] autoremove_wake_function+0x0/0x50 00:22:24 ubu kernel: [sys_socketcall+562/592] sys_socketcall+0x232/0x250 00:22:24 ubu kernel: [capable+21/48] capable+0x15/0x30 00:22:24 ubu kernel: [sysenter_past_esp+82/113] sysenter_past_esp+0x52/0x71 -- From acme@conectiva.com.br Mon Dec 13 16:51:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 16:51:13 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE0oer5001889 for ; Mon, 13 Dec 2004 16:51:05 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1Ce0x4-0003Kr-00; Mon, 13 Dec 2004 22:54:06 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 127E11463D; Mon, 13 Dec 2004 22:50:13 -0200 (BRST) Message-ID: <41BE2B20.5080602@conectiva.com.br> Date: Mon, 13 Dec 2004 21:52:00 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. 
User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mark Smith Cc: netdev@oss.sgi.com Subject: Re: [2.6.9] Networking crash, slightly exotic setup, bridged tap/tun interfaces References: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> In-Reply-To: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12719 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Mark Smith wrote: > My kernel is only slightly modified from vanilla 2.6.9 - I've increased Can you reproduce this with 2.6.10-rc3? 2.6.9 had several networking bugs fixed already in 2.6.10-rc3. 
- Arnaldo From random@72616e646f6d20323030342d30342d31360a.nosense.org Mon Dec 13 17:01:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 17:01:29 -0800 (PST) Received: from ubu.nosense.org (149.cust1.sa.dsl.ozemail.com.au [210.84.224.149]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE1102P002724 for ; Mon, 13 Dec 2004 17:01:21 -0800 Received: from ubu.nosense.org (ubu.nosense.org [127.0.0.1]) by ubu.nosense.org (Postfix) with SMTP id 573F962A9F; Tue, 14 Dec 2004 11:30:36 +1030 (CST) Date: Tue, 14 Dec 2004 11:30:36 +1030 From: Mark Smith To: Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: Re: [2.6.9] Networking crash, slightly exotic setup, bridged tap/tun interfaces Message-Id: <20041214113036.3ccb3480.random@72616e646f6d20323030342d30342d31360a.nosense.org> In-Reply-To: <41BE2B20.5080602@conectiva.com.br> References: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> <41BE2B20.5080602@conectiva.com.br> Organization: The No Sense Organisation X-Mailer: Sylpheed version 1.0.0beta1 (GTK+ 1.2.10; i686-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12720 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: random@72616e646f6d20323030342d30342d31360a.nosense.org Precedence: bulk X-list: netdev Hi Arnaldo, Thanks for getting back to me so quickly. On Mon, 13 Dec 2004 21:52:00 -0200 Arnaldo Carvalho de Melo wrote: > Mark Smith wrote: > > My kernel is only slightly modified from vanilla 2.6.9 - I've > > increased > > Can you reproduce this with 2.6.10-rc3? 2.6.9 had several networking > bugs fixed already in 2.6.10-rc3. > I'll install that, and report back a bit later. I've tended to stick to "full figure" kernels to stay a bit more stable. 
Admittedly, I don't know exactly what caused the oops, other than the configuration I was playing with, so I don't think I can trigger it on demand. I'll see how I go. Thanks again, Mark. -- From acme@conectiva.com.br Mon Dec 13 17:20:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 17:20:36 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE1K8Ii003737 for ; Mon, 13 Dec 2004 17:20:28 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1Ce1Pc-0003Ps-00; Mon, 13 Dec 2004 23:23:36 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id CA1AE1463D; Mon, 13 Dec 2004 23:19:42 -0200 (BRST) Message-ID: <41BE320A.8090201@conectiva.com.br> Date: Mon, 13 Dec 2004 22:21:30 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Mark Smith Cc: netdev@oss.sgi.com Subject: Re: [2.6.9] Networking crash, slightly exotic setup, bridged tap/tun interfaces References: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> <41BE2B20.5080602@conectiva.com.br> <20041214113036.3ccb3480.random@72616e646f6d20323030342d30342d31360a.nosense.org> In-Reply-To: <20041214113036.3ccb3480.random@72616e646f6d20323030342d30342d31360a.nosense.org> Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12721 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Mark Smith wrote: > Hi Arnaldo, > > Thanks for getting back to me so 
quickly. :-) > On Mon, 13 Dec 2004 21:52:00 -0200 > Arnaldo Carvalho de Melo wrote: > > >>Mark Smith wrote: >> >>>My kernel is only slightly modified from vanilla 2.6.9 - I've >>>increased >> >>Can you reproduce this with 2.6.10-rc3? 2.6.9 had several networking >>bugs fixed already in 2.6.10-rc3. >> > > > I'll install that, and report back a bit later. I've tended to stick to > "full figure" kernels to stay a bit more stable. Yeah, I understand, 2.6.10 final is taking a long time to brew, hopefully this time around we'll manage to get something more solid than the previous ones. - Arnaldo From yoshfuji@linux-ipv6.org Mon Dec 13 18:29:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 18:29:56 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE2TNmq009099 for ; Mon, 13 Dec 2004 18:29:43 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id EE40033CE5; Tue, 14 Dec 2004 11:30:38 +0900 (JST) Date: Tue, 14 Dec 2004 11:30:23 +0900 (JST) Message-Id: <20041214.113023.65866789.yoshfuji@linux-ipv6.org> To: roland@topspin.com Cc: linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH][v3][16/21] IPoIB IPv6 support From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20041213109.iziHvQZqtmP83gmx@topspin.com> References: <20041213109.5NKezuGE9PMejMSM@topspin.com> <20041213109.iziHvQZqtmP83gmx@topspin.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; 
charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12722 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <20041213109.iziHvQZqtmP83gmx@topspin.com> (at Mon, 13 Dec 2004 10:09:46 -0800), Roland Dreier says: > } > @@ -1794,6 +1801,7 @@ > if ((dev->type != ARPHRD_ETHER) && > (dev->type != ARPHRD_FDDI) && > (dev->type != ARPHRD_IEEE802_TR) && > + (dev->type != ARPHRD_INFINIBAND) && > (dev->type != ARPHRD_ARCNET)) { > /* Alas, we support only Ethernet autoconfiguration. */ > return; Please put ARPHRD_INFINIBAND after ARPHRD_ARCNET like other places. --yoshfuji From roland@topspin.com Mon Dec 13 18:37:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 18:37:08 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE2aeZY009741 for ; Mon, 13 Dec 2004 18:37:00 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 18:36:17 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 13 Dec 2004 18:36:17 -0800 Received: from roland by eddore with local (Exim 4.34) id 1Ce2Xw-0007yj-VI; Mon, 13 Dec 2004 18:36:17 -0800 To: YOSHIFUJI Hideaki / =?iso-2022-jp?b?GyRCNUhGIzFRGyhC?= =?iso-2022-jp?b?GyRCTEAbKEI=?= Cc: linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com X-Message-Flag: Warning: May contain useful information References: <20041213109.5NKezuGE9PMejMSM@topspin.com> <20041213109.iziHvQZqtmP83gmx@topspin.com> <20041214.113023.65866789.yoshfuji@linux-ipv6.org> From: Roland Dreier Date: Mon, 13 Dec 2004 18:36:16 -0800 In-Reply-To: 
<20041214.113023.65866789.yoshfuji@linux-ipv6.org> (YOSHIFUJI Hideaki's message of "Tue, 14 Dec 2004 11:30:23 +0900 (JST)") Message-ID: <52y8g138tb.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [PATCH][v3][16/21] IPoIB IPv6 support Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 14 Dec 2004 02:36:17.0483 (UTC) FILETIME=[B01D59B0:01C4E185] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12723 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev YOSHIFUJI> Please put ARPHRD_INFINIBAND after ARPHRD_ARCNET like YOSHIFUJI> other places. Thanks, I've made this update. 
Thanks, Roland From bunk@stusta.de Mon Dec 13 20:14:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 20:14:52 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBE4EM9B015194 for ; Mon, 13 Dec 2004 20:14:43 -0800 Received: (qmail 18801 invoked from network); 14 Dec 2004 04:13:54 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 14 Dec 2004 04:13:54 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id D9F8ABBC94; Tue, 14 Dec 2004 05:13:52 +0100 (CET) Date: Tue, 14 Dec 2004 05:13:52 +0100 From: Adrian Bunk To: marcel@holtmann.org, maxk@qualcomm.com Cc: bluez-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [2.6 patch] net/bluetooth/: misc possible cleanups Message-ID: <20041214041352.GZ23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12724 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make needlessly global code static - remove the following EXPORT_SYMBOL'ed but unused functions in hci_core.c: - hci_suspend_dev - hci_resume_dev - hci_register_cb - hci_unregister_cb Please comment on which of these changes are correct and which conflict with pending patches. 
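To illustrate the first kind of cleanup in isolation: a function used only inside one .c file can be given internal linkage with `static`, and if it is called before its definition it then needs a `static` forward declaration in that same file, which is what the cmtp/capi.c hunk in this patch adds for cmtp_send_capimsg(). The sketch below is a user-space toy with invented names (`queue_len`, `last_payload`, `send_msg`, `handle_request`), not Bluetooth code.

```c
#include <assert.h>

/* File-local state: 'static' keeps these symbols out of the global namespace. */
static int queue_len;
static int last_payload;

/* Forward declaration, needed because handle_request() calls send_msg()
 * before its definition appears in the file. */
static void send_msg(int payload);

/* The only function other files are meant to call. */
int handle_request(int payload)
{
    assert(payload >= 0);   /* toy precondition for the example */
    send_msg(payload);
    return queue_len;
}

/* Internal helper: with 'static' it is no longer visible to the linker
 * outside this translation unit, so its prototype can leave the header. */
static void send_msg(int payload)
{
    last_payload = payload;
    queue_len++;
}
```

The trade-off is the usual one: the header shrinks and the compiler can warn about unused helpers, but any out-of-file caller (including out-of-tree code) stops linking, which is why the patch asks which changes conflict with pending work.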
diffstat output: include/net/bluetooth/hci_core.h | 7 ---- include/net/bluetooth/rfcomm.h | 27 ------------------- net/bluetooth/cmtp/capi.c | 4 ++ net/bluetooth/cmtp/cmtp.h | 1 net/bluetooth/hci_conn.c | 2 - net/bluetooth/hci_core.c | 44 +------------------------------ net/bluetooth/hci_sock.c | 10 +++---- net/bluetooth/l2cap.c | 2 - net/bluetooth/rfcomm/core.c | 37 +++++++++++++++++++++----- net/bluetooth/rfcomm/sock.c | 4 +- 10 files changed, 44 insertions(+), 94 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/cmtp.h.old 2004-12-14 02:12:11.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/cmtp.h 2004-12-14 02:12:38.000000000 +0100 @@ -120,7 +120,6 @@ void cmtp_detach_device(struct cmtp_session *session); void cmtp_recv_capimsg(struct cmtp_session *session, struct sk_buff *skb); -void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb); static inline void cmtp_schedule(struct cmtp_session *session) { --- linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/capi.c.old 2004-12-14 02:12:47.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/capi.c 2004-12-14 02:13:38.000000000 +0100 @@ -74,6 +74,8 @@ #define CMTP_APPLID 2 #define CMTP_MAPPING 3 +static void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb); + static struct cmtp_application *cmtp_application_add(struct cmtp_session *session, __u16 appl) { struct cmtp_application *app = kmalloc(sizeof(*app), GFP_KERNEL); @@ -337,7 +339,7 @@ capi_ctr_handle_message(ctrl, appl, skb); } -void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb) +static void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb) { struct cmtp_scb *scb = (void *) skb->cb; --- linux-2.6.10-rc3-mm1-full/include/net/bluetooth/hci_core.h.old 2004-12-14 02:13:54.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/bluetooth/hci_core.h 2004-12-14 02:31:34.000000000 +0100 @@ -277,7 +277,6 @@ return NULL; } -void 
hci_acl_connect(struct hci_conn *conn); void hci_acl_disconn(struct hci_conn *conn, __u8 reason); void hci_add_sco(struct hci_conn *conn, __u16 handle); @@ -372,8 +371,6 @@ void hci_free_dev(struct hci_dev *hdev); int hci_register_dev(struct hci_dev *hdev); int hci_unregister_dev(struct hci_dev *hdev); -int hci_suspend_dev(struct hci_dev *hdev); -int hci_resume_dev(struct hci_dev *hdev); int hci_dev_open(__u16 dev); int hci_dev_close(__u16 dev); int hci_dev_reset(__u16 dev); @@ -546,9 +543,6 @@ read_unlock_bh(&hci_cb_list_lock); } -int hci_register_cb(struct hci_cb *hcb); -int hci_unregister_cb(struct hci_cb *hcb); - int hci_register_notifier(struct notifier_block *nb); int hci_unregister_notifier(struct notifier_block *nb); @@ -589,6 +583,5 @@ #define hci_req_unlock(d) up(&d->req_lock) void hci_req_complete(struct hci_dev *hdev, int result); -void hci_req_cancel(struct hci_dev *hdev, int err); #endif /* __HCI_CORE_H */ --- linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_conn.c.old 2004-12-14 02:14:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_conn.c 2004-12-14 02:14:15.000000000 +0100 @@ -53,7 +53,7 @@ #define BT_DBG(D...) 
#endif -void hci_acl_connect(struct hci_conn *conn) +static void hci_acl_connect(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev; struct inquiry_entry *ie; --- linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_core.c.old 2004-12-14 02:14:53.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_core.c 2004-12-14 02:31:09.000000000 +0100 @@ -59,7 +59,7 @@ static void hci_tx_task(unsigned long arg); static void hci_notify(struct hci_dev *hdev, int event); -rwlock_t hci_task_lock = RW_LOCK_UNLOCKED; +static rwlock_t hci_task_lock = RW_LOCK_UNLOCKED; /* HCI device list */ LIST_HEAD(hci_dev_list); @@ -106,7 +106,7 @@ } } -void hci_req_cancel(struct hci_dev *hdev, int err) +static void hci_req_cancel(struct hci_dev *hdev, int err) { BT_DBG("%s err 0x%2.2x", hdev->name, err); @@ -878,22 +878,6 @@ } EXPORT_SYMBOL(hci_unregister_dev); -/* Suspend HCI device */ -int hci_suspend_dev(struct hci_dev *hdev) -{ - hci_notify(hdev, HCI_DEV_SUSPEND); - return 0; -} -EXPORT_SYMBOL(hci_suspend_dev); - -/* Resume HCI device */ -int hci_resume_dev(struct hci_dev *hdev) -{ - hci_notify(hdev, HCI_DEV_RESUME); - return 0; -} -EXPORT_SYMBOL(hci_resume_dev); - /* ---- Interface to upper protocols ---- */ /* Register/Unregister protocols. 
@@ -942,30 +926,6 @@ } EXPORT_SYMBOL(hci_unregister_proto); -int hci_register_cb(struct hci_cb *cb) -{ - BT_DBG("%p name %s", cb, cb->name); - - write_lock_bh(&hci_cb_list_lock); - list_add(&cb->list, &hci_cb_list); - write_unlock_bh(&hci_cb_list_lock); - - return 0; -} -EXPORT_SYMBOL(hci_register_cb); - -int hci_unregister_cb(struct hci_cb *cb) -{ - BT_DBG("%p name %s", cb, cb->name); - - write_lock_bh(&hci_cb_list_lock); - list_del(&cb->list); - write_unlock_bh(&hci_cb_list_lock); - - return 0; -} -EXPORT_SYMBOL(hci_unregister_cb); - static int hci_send_frame(struct sk_buff *skb) { struct hci_dev *hdev = (struct hci_dev *) skb->dev; --- linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_sock.c.old 2004-12-14 02:16:59.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_sock.c 2004-12-14 02:17:59.000000000 +0100 @@ -447,7 +447,7 @@ goto done; } -int hci_sock_setsockopt(struct socket *sock, int level, int optname, char __user *optval, int len) +static int hci_sock_setsockopt(struct socket *sock, int level, int optname, char __user *optval, int len) { struct hci_ufilter uf = { .opcode = 0 }; struct sock *sk = sock->sk; @@ -514,7 +514,7 @@ return err; } -int hci_sock_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) +static int hci_sock_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) { struct hci_ufilter uf; struct sock *sk = sock->sk; @@ -567,7 +567,7 @@ return 0; } -struct proto_ops hci_sock_ops = { +static struct proto_ops hci_sock_ops = { .family = PF_BLUETOOTH, .owner = THIS_MODULE, .release = hci_sock_release, @@ -647,13 +647,13 @@ return NOTIFY_DONE; } -struct net_proto_family hci_sock_family_ops = { +static struct net_proto_family hci_sock_family_ops = { .family = PF_BLUETOOTH, .owner = THIS_MODULE, .create = hci_sock_create, }; -struct notifier_block hci_sock_nblock = { +static struct notifier_block hci_sock_nblock = { .notifier_call = hci_sock_dev_event }; 
--- linux-2.6.10-rc3-mm1-full/net/bluetooth/l2cap.c.old 2004-12-14 02:18:13.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/l2cap.c 2004-12-14 02:18:21.000000000 +0100 @@ -61,7 +61,7 @@ static struct proto_ops l2cap_sock_ops; -struct bt_sock_list l2cap_sk_list = { +static struct bt_sock_list l2cap_sk_list = { .lock = RW_LOCK_UNLOCKED }; --- linux-2.6.10-rc3-mm1-full/include/net/bluetooth/rfcomm.h.old 2004-12-14 02:19:37.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/bluetooth/rfcomm.h 2004-12-14 02:27:24.000000000 +0100 @@ -216,22 +216,6 @@ #define RFCOMM_CFC_DISABLED 0 #define RFCOMM_CFC_ENABLED RFCOMM_MAX_CREDITS -extern struct task_struct *rfcomm_thread; -extern unsigned long rfcomm_event; - -static inline void rfcomm_schedule(uint event) -{ - if (!rfcomm_thread) - return; - //set_bit(event, &rfcomm_event); - set_bit(RFCOMM_SCHED_WAKEUP, &rfcomm_event); - wake_up_process(rfcomm_thread); -} - -extern struct semaphore rfcomm_sem; -#define rfcomm_lock() down(&rfcomm_sem); -#define rfcomm_unlock() up(&rfcomm_sem); - /* ---- RFCOMM DLCs (channels) ---- */ struct rfcomm_dlc *rfcomm_dlc_alloc(int prio); void rfcomm_dlc_free(struct rfcomm_dlc *d); @@ -271,11 +255,6 @@ } /* ---- RFCOMM sessions ---- */ -struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state); -struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst); -struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err); -void rfcomm_session_del(struct rfcomm_session *s); -void rfcomm_session_close(struct rfcomm_session *s, int err); void rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src, bdaddr_t *dst); static inline void rfcomm_session_hold(struct rfcomm_session *s) @@ -283,12 +262,6 @@ atomic_inc(&s->refcnt); } -static inline void rfcomm_session_put(struct rfcomm_session *s) -{ - if (atomic_dec_and_test(&s->refcnt)) - rfcomm_session_del(s); -} - /* ---- RFCOMM chechsum ---- */ extern u8 rfcomm_crc_table[]; --- 
linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/core.c.old 2004-12-14 02:19:43.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/core.c 2004-12-14 02:27:41.000000000 +0100 @@ -61,8 +61,12 @@ struct proc_dir_entry *proc_bt_rfcomm; #endif -struct task_struct *rfcomm_thread; -DECLARE_MUTEX(rfcomm_sem); +static struct task_struct *rfcomm_thread; + +static DECLARE_MUTEX(rfcomm_sem); +#define rfcomm_lock() down(&rfcomm_sem); +#define rfcomm_unlock() up(&rfcomm_sem); + unsigned long rfcomm_event; static LIST_HEAD(session_list); @@ -81,6 +85,10 @@ static void rfcomm_process_connect(struct rfcomm_session *s); +static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err); +static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst); +static void rfcomm_session_del(struct rfcomm_session *s); + /* ---- RFCOMM frame parsing macros ---- */ #define __get_dlci(b) ((b & 0xfc) >> 2) #define __get_channel(b) ((b & 0xf8) >> 3) @@ -111,6 +119,21 @@ #define __get_rpn_stop_bits(line) (((line) >> 2) & 0x1) #define __get_rpn_parity(line) (((line) >> 3) & 0x3) +static inline void rfcomm_schedule(uint event) +{ + if (!rfcomm_thread) + return; + //set_bit(event, &rfcomm_event); + set_bit(RFCOMM_SCHED_WAKEUP, &rfcomm_event); + wake_up_process(rfcomm_thread); +} + +static inline void rfcomm_session_put(struct rfcomm_session *s) +{ + if (atomic_dec_and_test(&s->refcnt)) + rfcomm_session_del(s); +} + /* ---- RFCOMM FCS computation ---- */ /* CRC on 2 bytes */ @@ -458,7 +481,7 @@ } /* ---- RFCOMM sessions ---- */ -struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state) +static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state) { struct rfcomm_session *s = kmalloc(sizeof(*s), GFP_KERNEL); if (!s) @@ -487,7 +510,7 @@ return s; } -void rfcomm_session_del(struct rfcomm_session *s) +static void rfcomm_session_del(struct rfcomm_session *s) { int state = s->state; @@ -505,7 +528,7 @@ 
module_put(THIS_MODULE); } -struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst) +static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst) { struct rfcomm_session *s; struct list_head *p, *n; @@ -521,7 +544,7 @@ return NULL; } -void rfcomm_session_close(struct rfcomm_session *s, int err) +static void rfcomm_session_close(struct rfcomm_session *s, int err) { struct rfcomm_dlc *d; struct list_head *p, *n; @@ -542,7 +565,7 @@ rfcomm_session_put(s); } -struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err) +static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err) { struct rfcomm_session *s = NULL; struct sockaddr_l2 addr; --- linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/sock.c.old 2004-12-14 02:28:14.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/sock.c 2004-12-14 02:28:33.000000000 +0100 @@ -393,7 +393,7 @@ return err; } -int rfcomm_sock_listen(struct socket *sock, int backlog) +static int rfcomm_sock_listen(struct socket *sock, int backlog) { struct sock *sk = sock->sk; int err = 0; @@ -437,7 +437,7 @@ return err; } -int rfcomm_sock_accept(struct socket *sock, struct socket *newsock, int flags) +static int rfcomm_sock_accept(struct socket *sock, struct socket *newsock, int flags) { DECLARE_WAITQUEUE(wait, current); struct sock *sk = sock->sk, *nsk; From bunk@stusta.de Mon Dec 13 20:58:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 20:58:57 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBE4wSJj020652 for ; Mon, 13 Dec 2004 20:58:49 -0800 Received: (qmail 20619 invoked from network); 14 Dec 2004 04:58:01 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 14 Dec 2004 04:58:01 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 43D4CBBC67; Tue, 14 Dec 2004 05:57:59 
+0100 (CET) Date: Tue, 14 Dec 2004 05:57:58 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/core/: misc possible cleanups Message-ID: <20041214045758.GA23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12725 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make needlessly global code static - remove the following unused global functions: - datagram.c: skb_copy_datagram - iovec.c: memcpy_tokerneliovec - skbuff.c: skb_insert - skbuff.c: skb_iter_first - skbuff.c: skb_iter_next - skbuff.c: skb_iter_abort - remove the following unneeded EXPORT_SYMBOL's: - datagram.c: skb_copy_datagram - dev.c: ing_filter - iovec.c: memcpy_tokerneliovec - netpoll.c: netpoll_send_skb - rtnetlink.c: rtnetlink_dump_ifinfo - skbuff.c: skb_insert - skbuff.c: skb_iter_first - skbuff.c: skb_iter_next - skbuff.c: skb_iter_abort - sock.c: sock_alloc_send_pskb Please comment on which of these changes are correct and which conflict with pending patches. 
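[Editorial illustration, not part of the original message.] The two kinds of cleanup listed above can be sketched in plain userspace C. This is a minimal sketch: the function names sum_bytes and sum_two_buffers are hypothetical and do not appear in the patch. A helper used by a single file is marked static so that it gets internal linkage, while the entry point that other files call stays global. In the kernel, a global function that modules need must additionally be listed with EXPORT_SYMBOL, which is why the patch drops those lines for functions that become static or are removed.

```c
#include <stddef.h>

/*
 * Helper used only inside this file. 'static' gives it internal
 * linkage, so it cannot clash with, or be called from, other
 * object files. (Hypothetical name, for illustration only.)
 */
static unsigned int sum_bytes(const unsigned char *buf, size_t len,
			      unsigned int sum)
{
	for (size_t i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/*
 * Entry point that other files are assumed to call, so it keeps
 * external linkage. (Hypothetical name, for illustration only.)
 */
unsigned int sum_two_buffers(const unsigned char *a, size_t alen,
			     const unsigned char *b, size_t blen)
{
	/* Sum the first buffer, then continue the sum over the second. */
	return sum_bytes(b, blen, sum_bytes(a, alen, 0));
}
```

If another file tried to call sum_bytes() directly, the link would fail, which is the property the "make needlessly global code static" items rely on.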
diffstat output: include/linux/netdevice.h | 1 include/linux/netfilter.h | 4 - include/linux/netpoll.h | 1 include/linux/rtnetlink.h | 1 include/linux/skbuff.h | 12 ----- include/linux/socket.h | 1 include/net/iw_handler.h | 3 - include/net/pkt_cls.h | 1 include/net/sock.h | 7 --- net/core/datagram.c | 19 +------- net/core/dev.c | 13 +---- net/core/dst.c | 2 net/core/iovec.c | 23 ---------- net/core/netfilter.c | 2 net/core/netpoll.c | 5 -- net/core/rtnetlink.c | 3 - net/core/skbuff.c | 87 -------------------------------------- net/core/sock.c | 17 +++---- net/core/wireless.c | 2 19 files changed, 22 insertions(+), 182 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/linux/skbuff.h.old 2004-12-14 02:34:03.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/skbuff.h 2004-12-14 02:51:27.000000000 +0100 @@ -594,7 +594,6 @@ /* * Insert a packet on a list. */ -extern void skb_insert(struct sk_buff *old, struct sk_buff *newsk); static inline void __skb_insert(struct sk_buff *newsk, struct sk_buff *prev, struct sk_buff *next, struct sk_buff_head *list) @@ -1086,14 +1085,9 @@ int noblock, int *err); extern unsigned int datagram_poll(struct file *file, struct socket *sock, struct poll_table_struct *wait); -extern int skb_copy_datagram(const struct sk_buff *from, - int offset, char __user *to, int size); extern int skb_copy_datagram_iovec(const struct sk_buff *from, int offset, struct iovec *to, int size); -extern int skb_copy_and_csum_datagram(const struct sk_buff *skb, - int offset, u8 __user *to, - int len, unsigned int *csump); extern int skb_copy_and_csum_datagram_iovec(const struct sk_buff *skb, int hlen, @@ -1137,12 +1131,6 @@ struct sk_buff *fraglist; }; -/* Keep iterating until skb_iter_next returns false. 
*/ -extern void skb_iter_first(const struct sk_buff *skb, struct skb_iter *i); -extern int skb_iter_next(const struct sk_buff *skb, struct skb_iter *i); -/* Call this if aborting loop before !skb_iter_next */ -extern void skb_iter_abort(const struct sk_buff *skb, struct skb_iter *i); - #ifdef CONFIG_NETFILTER static inline void nf_conntrack_put(struct nf_conntrack *nfct) { --- linux-2.6.10-rc3-mm1-full/include/net/pkt_cls.h.old 2004-12-14 02:37:09.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/pkt_cls.h 2004-12-14 02:37:15.000000000 +0100 @@ -17,7 +17,6 @@ extern int register_tcf_proto_ops(struct tcf_proto_ops *ops); extern int unregister_tcf_proto_ops(struct tcf_proto_ops *ops); -extern int ing_filter(struct sk_buff *skb); static inline unsigned long __cls_set_class(unsigned long *clp, unsigned long cl) --- linux-2.6.10-rc3-mm1-full/include/linux/netdevice.h.old 2004-12-14 02:38:05.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netdevice.h 2004-12-14 02:38:11.000000000 +0100 @@ -522,7 +522,6 @@ extern struct net_device *dev_base; /* All devices */ extern rwlock_t dev_base_lock; /* Device list lock */ -extern int netdev_boot_setup_add(char *name, struct ifmap *map); extern int netdev_boot_setup_check(struct net_device *dev); extern unsigned long netdev_boot_base(const char *prefix, int unit); extern struct net_device *dev_getbyhwaddr(unsigned short type, char *hwaddr); --- linux-2.6.10-rc3-mm1-full/include/linux/socket.h.old 2004-12-14 02:39:18.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/socket.h 2004-12-14 02:39:24.000000000 +0100 @@ -286,7 +286,6 @@ extern int verify_iovec(struct msghdr *m, struct iovec *iov, char *address, int mode); extern int memcpy_toiovec(struct iovec *v, unsigned char *kdata, int len); -extern void memcpy_tokerneliovec(struct iovec *iov, unsigned char *kdata, int len); extern int move_addr_to_user(void *kaddr, int klen, void __user *uaddr, int __user *ulen); extern int move_addr_to_kernel(void __user 
*uaddr, int ulen, void *kaddr); extern int put_cmsg(struct msghdr*, int level, int type, int len, void *data); --- linux-2.6.10-rc3-mm1-full/include/linux/netfilter.h.old 2004-12-14 02:41:28.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter.h 2004-12-14 02:41:37.000000000 +0100 @@ -175,10 +175,6 @@ extern void (*ip_ct_attach)(struct sk_buff *, struct sk_buff *); extern void nf_ct_attach(struct sk_buff *, struct sk_buff *); -#ifdef CONFIG_NETFILTER_DEBUG -extern void nf_dump_skb(int pf, struct sk_buff *skb); -#endif - /* FIXME: Before cache is ever used, this must be implemented for real. */ extern void nf_invalidate_cache(int pf); --- linux-2.6.10-rc3-mm1-full/net/core/datagram.c.old 2004-12-14 04:22:37.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/datagram.c 2004-12-14 02:35:34.000000000 +0100 @@ -199,19 +199,6 @@ kfree_skb(skb); } -/* - * Copy a datagram to a linear buffer. - */ -int skb_copy_datagram(const struct sk_buff *skb, int offset, char __user *to, int size) -{ - struct iovec iov = { - .iov_base = to, - .iov_len =size, - }; - - return skb_copy_datagram_iovec(skb, offset, &iov, size); -} - /** * skb_copy_datagram_iovec - Copy a datagram to an iovec. 
* @skb - buffer to copy @@ -296,8 +283,9 @@ return -EFAULT; } -int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset, - u8 __user *to, int len, unsigned int *csump) +static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset, + u8 __user *to, int len, + unsigned int *csump) { int start = skb_headlen(skb); int pos = 0; @@ -489,7 +477,6 @@ EXPORT_SYMBOL(datagram_poll); EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec); -EXPORT_SYMBOL(skb_copy_datagram); EXPORT_SYMBOL(skb_copy_datagram_iovec); EXPORT_SYMBOL(skb_free_datagram); EXPORT_SYMBOL(skb_recv_datagram); --- linux-2.6.10-rc3-mm1-full/net/core/dev.c.old 2004-12-14 02:36:09.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/dev.c 2004-12-14 02:38:19.000000000 +0100 @@ -183,7 +183,7 @@ * semaphore held. */ struct net_device *dev_base; -struct net_device **dev_tail = &dev_base; +static struct net_device **dev_tail = &dev_base; rwlock_t dev_base_lock = RW_LOCK_UNLOCKED; EXPORT_SYMBOL(dev_base); @@ -361,7 +361,7 @@ * returns 0 on error and 1 on success. This is a generic routine to * all netdevices. */ -int netdev_boot_setup_add(char *name, struct ifmap *map) +static int netdev_boot_setup_add(char *name, struct ifmap *map) { struct netdev_boot_setup *s; int i; @@ -644,7 +644,7 @@ * Network device names need to be valid file names to * to allow sysfs to work */ -int dev_valid_name(const char *name) +static int dev_valid_name(const char *name) { return !(*name == '\0' || !strcmp(name, ".") @@ -1596,7 +1596,7 @@ * the ingress scheduler, you just cant add policies on ingress. 
* */ -int ing_filter(struct sk_buff *skb) +static int ing_filter(struct sk_buff *skb) { struct Qdisc *q; struct net_device *dev = skb->dev; @@ -3251,9 +3251,4 @@ EXPORT_SYMBOL(dev_load); #endif -#ifdef CONFIG_NET_CLS_ACT -EXPORT_SYMBOL(ing_filter); -#endif - - EXPORT_PER_CPU_SYMBOL(softnet_data); --- linux-2.6.10-rc3-mm1-full/net/core/dst.c.old 2004-12-14 02:38:35.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/dst.c 2004-12-14 02:38:45.000000000 +0100 @@ -264,7 +264,7 @@ return NOTIFY_DONE; } -struct notifier_block dst_dev_notifier = { +static struct notifier_block dst_dev_notifier = { .notifier_call = dst_dev_event, }; --- linux-2.6.10-rc3-mm1-full/net/core/iovec.c.old 2004-12-14 02:39:30.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/iovec.c 2004-12-14 02:39:52.000000000 +0100 @@ -99,28 +99,6 @@ } /* - * In kernel copy to iovec. Returns -EFAULT on error. - * - * Note: this modifies the original iovec. - */ - -void memcpy_tokerneliovec(struct iovec *iov, unsigned char *kdata, int len) -{ - while (len > 0) { - if (iov->iov_len) { - int copy = min_t(unsigned int, iov->iov_len, len); - memcpy(iov->iov_base, kdata, copy); - kdata += copy; - len -= copy; - iov->iov_len -= copy; - iov->iov_base += copy; - } - iov++; - } -} - - -/* * Copy iovec to kernel. Returns -EFAULT on error. * * Note: this modifies the original iovec. 
@@ -259,4 +237,3 @@ EXPORT_SYMBOL(memcpy_fromiovec); EXPORT_SYMBOL(memcpy_fromiovecend); EXPORT_SYMBOL(memcpy_toiovec); -EXPORT_SYMBOL(memcpy_tokerneliovec); --- linux-2.6.10-rc3-mm1-full/net/core/netfilter.c.old 2004-12-14 02:41:44.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/netfilter.c 2004-12-14 02:41:52.000000000 +0100 @@ -173,7 +173,7 @@ printk("\n"); } -void nf_dump_skb(int pf, struct sk_buff *skb) +static void nf_dump_skb(int pf, struct sk_buff *skb) { printk("skb: pf=%i %s dev=%s len=%u\n", pf, --- linux-2.6.10-rc3-mm1-full/include/linux/netpoll.h.old 2004-12-14 02:43:42.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netpoll.h 2004-12-14 02:43:47.000000000 +0100 @@ -24,7 +24,6 @@ }; void netpoll_poll(struct netpoll *np); -void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb); void netpoll_send_udp(struct netpoll *np, const char *msg, int len); int netpoll_parse_options(struct netpoll *np, char *opt); int netpoll_setup(struct netpoll *np); --- linux-2.6.10-rc3-mm1-full/net/core/netpoll.c.old 2004-12-14 02:43:06.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/netpoll.c 2004-12-14 02:43:59.000000000 +0100 @@ -39,7 +39,7 @@ static LIST_HEAD(rx_list); static atomic_t trapped; -spinlock_t netpoll_poll_lock = SPIN_LOCK_UNLOCKED; +static spinlock_t netpoll_poll_lock = SPIN_LOCK_UNLOCKED; #define NETPOLL_RX_ENABLED 1 #define NETPOLL_RX_DROP 2 @@ -178,7 +178,7 @@ return skb; } -void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) +static void netpoll_send_skb(struct netpoll *np, struct sk_buff *skb) { int status; @@ -676,6 +676,5 @@ EXPORT_SYMBOL(netpoll_parse_options); EXPORT_SYMBOL(netpoll_setup); EXPORT_SYMBOL(netpoll_cleanup); -EXPORT_SYMBOL(netpoll_send_skb); EXPORT_SYMBOL(netpoll_send_udp); EXPORT_SYMBOL(netpoll_poll); --- linux-2.6.10-rc3-mm1-full/include/linux/rtnetlink.h.old 2004-12-14 02:44:36.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/rtnetlink.h 2004-12-14 02:44:52.000000000 +0100 @@ 
-765,7 +765,6 @@ }; extern struct rtnetlink_link * rtnetlink_links[NPROTO]; -extern int rtnetlink_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb); extern int rtnetlink_send(struct sk_buff *skb, u32 pid, u32 group, int echo); extern int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics); --- linux-2.6.10-rc3-mm1-full/net/core/rtnetlink.c.old 2004-12-14 02:45:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/rtnetlink.c 2004-12-14 02:45:26.000000000 +0100 @@ -241,7 +241,7 @@ return -1; } -int rtnetlink_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) +static int rtnetlink_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) { int idx; int s_idx = cb->args[0]; @@ -676,7 +676,6 @@ EXPORT_SYMBOL(__rta_fill); EXPORT_SYMBOL(rtattr_parse); -EXPORT_SYMBOL(rtnetlink_dump_ifinfo); EXPORT_SYMBOL(rtnetlink_links); EXPORT_SYMBOL(rtnetlink_put_metrics); EXPORT_SYMBOL(rtnl); --- linux-2.6.10-rc3-mm1-full/net/core/skbuff.c.old 2004-12-14 02:45:55.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/skbuff.c 2004-12-14 04:40:27.000000000 +0100 @@ -984,70 +984,6 @@ return -EFAULT; } -/* Keep iterating until skb_iter_next returns false. */ -void skb_iter_first(const struct sk_buff *skb, struct skb_iter *i) -{ - i->len = skb_headlen(skb); - i->data = (unsigned char *)skb->data; - i->nextfrag = 0; - i->fraglist = NULL; -} - -int skb_iter_next(const struct sk_buff *skb, struct skb_iter *i) -{ - /* Unmap previous, if not head fragment. */ - if (i->nextfrag) - kunmap_skb_frag(i->data); - - if (i->fraglist) { - fraglist: - /* We're iterating through fraglist. */ - if (i->nextfrag < skb_shinfo(i->fraglist)->nr_frags) { - i->data = kmap_skb_frag(&skb_shinfo(i->fraglist) - ->frags[i->nextfrag]); - i->len = skb_shinfo(i->fraglist)->frags[i->nextfrag] - .size; - i->nextfrag++; - return 1; - } - /* Fragments with fragments? Too hard! 
*/ - BUG_ON(skb_shinfo(i->fraglist)->frag_list); - i->fraglist = i->fraglist->next; - if (!i->fraglist) - goto end; - - i->len = skb_headlen(i->fraglist); - i->data = i->fraglist->data; - i->nextfrag = 0; - return 1; - } - - if (i->nextfrag < skb_shinfo(skb)->nr_frags) { - i->data = kmap_skb_frag(&skb_shinfo(skb)->frags[i->nextfrag]); - i->len = skb_shinfo(skb)->frags[i->nextfrag].size; - i->nextfrag++; - return 1; - } - - i->fraglist = skb_shinfo(skb)->frag_list; - if (i->fraglist) - goto fraglist; - -end: - /* Bug trap for callers */ - i->data = NULL; - return 0; -} - -void skb_iter_abort(const struct sk_buff *skb, struct skb_iter *i) -{ - /* Unmap previous, if not head fragment. */ - if (i->data && i->nextfrag) - kunmap_skb_frag(i->data); - /* Bug trap for callers */ - i->data = NULL; -} - /* Checksum skb data. */ unsigned int skb_checksum(const struct sk_buff *skb, int offset, @@ -1373,25 +1309,6 @@ } -/** - * skb_insert - insert a buffer - * @old: buffer to insert before - * @newsk: buffer to insert - * - * Place a packet before a given packet in a list. The list locks are taken - * and this function is atomic with respect to other list locked calls - * A buffer cannot be placed on two lists at the same time. - */ - -void skb_insert(struct sk_buff *old, struct sk_buff *newsk) -{ - unsigned long flags; - - spin_lock_irqsave(&old->list->lock, flags); - __skb_insert(newsk, old->prev, old, old->list); - spin_unlock_irqrestore(&old->list->lock, flags); -} - #if 0 /* * Tune the memory allocator for a new MTU size. 
@@ -1511,13 +1428,9 @@ EXPORT_SYMBOL(skb_under_panic); EXPORT_SYMBOL(skb_dequeue); EXPORT_SYMBOL(skb_dequeue_tail); -EXPORT_SYMBOL(skb_insert); EXPORT_SYMBOL(skb_queue_purge); EXPORT_SYMBOL(skb_queue_head); EXPORT_SYMBOL(skb_queue_tail); EXPORT_SYMBOL(skb_unlink); EXPORT_SYMBOL(skb_append); EXPORT_SYMBOL(skb_split); -EXPORT_SYMBOL(skb_iter_first); -EXPORT_SYMBOL(skb_iter_next); -EXPORT_SYMBOL(skb_iter_abort); --- linux-2.6.10-rc3-mm1-full/include/net/sock.h.old 2004-12-14 02:56:46.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/sock.h 2004-12-14 02:53:27.000000000 +0100 @@ -733,11 +733,6 @@ unsigned long size, int noblock, int *errcode); -extern struct sk_buff *sock_alloc_send_pskb(struct sock *sk, - unsigned long header_len, - unsigned long data_len, - int noblock, - int *errcode); extern void *sock_kmalloc(struct sock *sk, int size, int priority); extern void sock_kfree_s(struct sock *sk, void *mem, int size); extern void sk_send_sigurg(struct sock *sk); @@ -795,8 +790,6 @@ * Default socket callbacks and setup code */ -extern void sock_def_destruct(struct sock *); - /* Initialise core socket variables */ extern void sock_init_data(struct socket *sock, struct sock *sk); --- linux-2.6.10-rc3-mm1-full/net/core/sock.c.old 2004-12-14 02:52:27.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/sock.c 2004-12-14 03:12:56.000000000 +0100 @@ -825,8 +825,10 @@ * Generic send/receive buffer handlers */ -struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len, - unsigned long data_len, int noblock, int *errcode) +static struct sk_buff *sock_alloc_send_pskb(struct sock *sk, + unsigned long header_len, + unsigned long data_len, + int noblock, int *errcode) { struct sk_buff *skb; unsigned int gfp_mask; @@ -1084,7 +1086,7 @@ * Default Socket Callbacks */ -void sock_def_wakeup(struct sock *sk) +static void sock_def_wakeup(struct sock *sk) { read_lock(&sk->sk_callback_lock); if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) @@ -1092,7 +1094,7 
@@ read_unlock(&sk->sk_callback_lock); } -void sock_def_error_report(struct sock *sk) +static void sock_def_error_report(struct sock *sk) { read_lock(&sk->sk_callback_lock); if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) @@ -1101,7 +1103,7 @@ read_unlock(&sk->sk_callback_lock); } -void sock_def_readable(struct sock *sk, int len) +static void sock_def_readable(struct sock *sk, int len) { read_lock(&sk->sk_callback_lock); if (sk->sk_sleep && waitqueue_active(sk->sk_sleep)) @@ -1110,7 +1112,7 @@ read_unlock(&sk->sk_callback_lock); } -void sock_def_write_space(struct sock *sk) +static void sock_def_write_space(struct sock *sk) { read_lock(&sk->sk_callback_lock); @@ -1129,7 +1131,7 @@ read_unlock(&sk->sk_callback_lock); } -void sock_def_destruct(struct sock *sk) +static void sock_def_destruct(struct sock *sk) { if (sk->sk_protinfo) kfree(sk->sk_protinfo); @@ -1368,7 +1370,6 @@ EXPORT_SYMBOL(sk_alloc); EXPORT_SYMBOL(sk_free); EXPORT_SYMBOL(sk_send_sigurg); -EXPORT_SYMBOL(sock_alloc_send_pskb); EXPORT_SYMBOL(sock_alloc_send_skb); EXPORT_SYMBOL(sock_init_data); EXPORT_SYMBOL(sock_kfree_s); --- linux-2.6.10-rc3-mm1-full/include/net/iw_handler.h.old 2004-12-14 02:54:50.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/iw_handler.h 2004-12-14 02:54:57.000000000 +0100 @@ -418,9 +418,6 @@ * Those may be called only within the kernel. 
*/ -/* Data needed by fs/compat_ioctl.c for 32->64 bit conversion */ -extern const char iw_priv_type_size[]; - /* First : function strictly used inside the kernel */ /* Handle /proc/net/wireless, called in net/code/dev.c */ --- linux-2.6.10-rc3-mm1-full/net/core/wireless.c.old 2004-12-14 02:55:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/core/wireless.c 2004-12-14 02:55:12.000000000 +0100 @@ -304,7 +304,7 @@ sizeof(struct iw_ioctl_description)); /* Size (in bytes) of the various private data types */ -const char iw_priv_type_size[] = { +static const char iw_priv_type_size[] = { 0, /* IW_PRIV_TYPE_NONE */ 1, /* IW_PRIV_TYPE_BYTE */ 1, /* IW_PRIV_TYPE_CHAR */ From jmorris@redhat.com Mon Dec 13 22:57:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 22:57:15 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE6ulZp028674 for ; Mon, 13 Dec 2004 22:57:08 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBE6u4uE015134; Tue, 14 Dec 2004 01:56:09 -0500 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBE6u4r20749; Tue, 14 Dec 2004 01:56:04 -0500 Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id iBE6txZV028913; Tue, 14 Dec 2004 01:55:59 -0500 Date: Tue, 14 Dec 2004 01:56:01 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Evgeniy Polyakov cc: netdev@oss.sgi.com, , "David S. Miller" Subject: Re: Asynchronous crypto layer. 
In-Reply-To: <1099288738.5070.71.camel@uganda>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12726
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@redhat.com
Precedence: bulk
X-list: netdev

On Mon, 1 Nov 2004, Evgeniy Polyakov wrote:

[Sorry for taking so long to get back to this].

> > c) Async engine: scheduling/batching/routing etc.
> >
> > This seems to be the core of what you've developed so far. I'm not sure
> > if we need pluggable load balancers. How would the sysadmin select the
> > correct one? The simpler this code is the better.
>
> By following command:
> echo -en "simple_lb" > /sys/class/crypto_lb/lbs
> or by sending [still not implemented] connector's(netlink) command.

The real question was how would the sysadmin know which is the best load
balancer to configure. Start simple, make one good load balancer, and
create a pluggable framework only if a demonstrable need arises.

> > Overall I think it's a good start. There are some chicken & egg type
> > problems when you don't have GPL drivers, hardware or an existing async
> > API, so I'd imagine that this will all continue to evolve: with more
> > hardware we can write/incorporate more drivers, with more drivers we can
> > further refine the async support and API etc.
>
> That is true, but I think not all parts of API should be exported as GPL
> only.

Why do you think this?

> > Here are some issues that I noticed:
> >
> >
> > What is main_crypto_device? Is this a placeholder for when there are no
> > other devices registered?
>
> main_crypto_device is just virtual device into which all sessions are
> placed along with specific crypto device queue.
> it is useful for cases when some driver decides that HW is broken and
> marks itself like not working, then scheduler _may_ (simple_lb does not
> have such feature) and actually _should_ move all sessions that are
> placed into broken device's queue into other devices or drop them (call
> callback with special session flag).

No reason to drop them, just fall back to software.

> > Async & sync drivers need to both be treated as crypto drivers.
> >
> >
> > static inline void crypto_route_free(struct crypto_route *rt)
> > {
> > 	crypto_device_put(rt->dev);
> > 	rt->dev = NULL;
> > 	kfree(rt);
> > }
> >
> > Why do you set rt->dev to NULL here? It should not still be referenceable
> > at this point.
>
> crypto_route_free() can be called only from crypto_route_del() which
> unlinks given route thus no one can reference rt->dev (but of course
> anyone can have its private copy, obtained under the lock).
> Any route access is guarded by route list lock.

No, if you're about to kfree(rt), you cannot also be able to dereference
it.

> > __crypto_route_del() etc:
> >
> > Why are you rolling your own list management and not using the list.h
> > functions?
>
> It is more convenient here to use queue like sk_buf does.

Use list.h or put what you need into list.h.

> > +struct crypto_device_stat
> > +{
> > +	__u64 scompleted;
> > +	__u64 sfinished;
> > ...
> >
> > Please see how networking stats are implemented (e.g. netif_rx_stats) and
> > previous netdev discussions on 64-bit counters. Please only use __u64 and
> > similar when exporting structures to userspace.
>
> I use __ prefix for any parameter that is exported to userspace.

But this structure is not.

> I've seen several attempts to convert network statistic to 64bit values,
> with backward compatibility it is not an easy task. Let's do not
> catch this again.

To clarify: the best way to do this is to use per-cpu counters of unsigned
long and add them together in userspace.
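[Editorial illustration, not part of the original message.] The per-cpu counter suggestion can be sketched in userspace C. This is a minimal sketch under stated assumptions: NR_CPUS_SKETCH, struct crypto_stat_sketch, stat_inc_completed() and stat_sum_completed() are hypothetical names, and a plain array indexed by CPU number stands in for the kernel's per-CPU variable facilities. Only the field names scompleted and sfinished come from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed CPU count; a real kernel sizes this itself. */
#define NR_CPUS_SKETCH 4

/* One slot per CPU, using native-width counters instead of __u64. */
struct crypto_stat_sketch {
	unsigned long scompleted;
	unsigned long sfinished;
};

static struct crypto_stat_sketch per_cpu_stat[NR_CPUS_SKETCH];

/* Writer side: each CPU touches only its own slot, so no lock is needed. */
static inline void stat_inc_completed(unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	per_cpu_stat[cpu].scompleted++;
}

/* Reader side: stands in for the userspace consumer adding the slots up. */
static unsigned long long stat_sum_completed(void)
{
	unsigned long long total = 0;

	for (size_t i = 0; i < NR_CPUS_SKETCH; i++)
		total += per_cpu_stat[i].scompleted;
	return total;
}
```

The design point is that the hot path increments a word-sized counter local to one CPU, and the wider total is computed only when somebody reads the statistics.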
- James
--
James Morris

From jmorris@redhat.com Mon Dec 13 23:25:05 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 23:25:12 -0800 (PST)
Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE7OeZg031240 for ; Mon, 13 Dec 2004 23:25:04 -0800
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBE7O1UK020982; Tue, 14 Dec 2004 02:24:01 -0500
Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBE7Nur25157; Tue, 14 Dec 2004 02:23:56 -0500
Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id iBE7NrZV030187; Tue, 14 Dec 2004 02:23:53 -0500
Date: Tue, 14 Dec 2004 02:23:55 -0500 (EST)
From: James Morris
X-X-Sender: jmorris@thoron.boston.redhat.com
To: Evgeniy Polyakov
cc: jamal , Michal Ludvig , ,
Subject: Re: Asynchronous crypto layer.
In-Reply-To: <20041102191235.609efde6@zanzibar.2ka.mipt.ru>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12727
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@redhat.com
Precedence: bulk
X-list: netdev

On Tue, 2 Nov 2004, Evgeniy Polyakov wrote:

Some feedback:

+#define session_completed(s) (s->ci.flags & SESSION_COMPLETED)
+#define complete_session(s) do {s->ci.flags |= SESSION_COMPLETED;} while(0)
+#define uncomplete_session(s) do {s->ci.flags &= ~SESSION_COMPLETED;} while (0)

etc.

Please use static inlines for all of these things.

Are these supposed to be exported to userspace? Why?
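[Editorial illustration, not part of the original message.] The static inline request can be sketched as follows. This is a minimal sketch: the three function names and the ci.flags field come from the quoted macros, while the struct layouts and the value of SESSION_COMPLETED are assumptions made so that the example compiles on its own.

```c
/* Assumed flag value; the quoted patch does not show the real one. */
#define SESSION_COMPLETED (1u << 0)

/* Assumed layouts: only the 'ci.flags' member is taken from the quote. */
struct crypto_session_initializer {
	unsigned int flags;
};

struct crypto_session {
	struct crypto_session_initializer ci;
};

/* Type-checked replacement for the session_completed() macro. */
static inline int session_completed(const struct crypto_session *s)
{
	return (s->ci.flags & SESSION_COMPLETED) != 0;
}

/* Type-checked replacement for the complete_session() macro. */
static inline void complete_session(struct crypto_session *s)
{
	s->ci.flags |= SESSION_COMPLETED;
}

/* Type-checked replacement for the uncomplete_session() macro. */
static inline void uncomplete_session(struct crypto_session *s)
{
	s->ci.flags &= ~SESSION_COMPLETED;
}
```

Compared with the macros, the inline functions evaluate their argument exactly once and let the compiler reject a pointer of the wrong type.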
+struct crypto_conn_data
+{
+	char name[SCACHE_NAMELEN];
+	__u16 cmd;

Again, exporting to userspace?

+	list_for_each_entry_safe(__dev, n, &cdev_list, cdev_entry)
+	{
+		if (compare_device(__dev, dev))
+		{

Incorrect coding style (and many more).

As mentioned before, pluggable load balancers are not needed now, and
complicate the code.

- James
--
James Morris

From marcel@holtmann.org Mon Dec 13 23:34:56 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 23:35:04 -0800 (PST)
Received: from mail.holtmann.net (root@coyote.holtmann.net [217.160.111.169]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE7YZvs032474 for ; Mon, 13 Dec 2004 23:34:56 -0800
Received: from pegasus (pD9525B82.dip.t-dialin.net [217.82.91.130]) by mail.holtmann.net (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBE7YVCJ029261; Tue, 14 Dec 2004 08:34:31 +0100
Subject: Re: [2.6 patch] net/bluetooth/: misc possible cleanups
From: Marcel Holtmann
To: Adrian Bunk
Cc: Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List
In-Reply-To: <20041214041352.GZ23151@stusta.de>
References: <20041214041352.GZ23151@stusta.de>
Content-Type: text/plain
Date: Tue, 14 Dec 2004 08:34:08 +0100
Message-Id: <1103009649.2143.65.camel@pegasus>
Mime-Version: 1.0
X-Mailer: Evolution 2.0.3
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Scanned: ClamAV 0.80/601/Mon Nov 22 14:40:21 2004 clamav-milter version 0.80j on coyote.holtmann.net
X-Virus-Status: Clean
X-Virus-Status: Clean
X-archive-position: 12728
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: marcel@holtmann.org
Precedence: bulk
X-list: netdev

Hi Adrian,

> The patch below contains the following possible cleanups:
> - make needlessly global code static
> - remove the following EXPORT_SYMBOL'ed but unused functions in
>   hci_core.c:
>   - hci_suspend_dev
>   - hci_resume_dev
>   - hci_register_cb
>   - hci_unregister_cb

these functions must stay. They have users outside the mainline kernel
that are not merged back yet. Otherwise they won't be exported ;)

> Please comment on which of these changes are correct and which conflict
> with pending patches.

Please send a separate patch for all the RFCOMM changes, because these
conflict with some pending patches and then it will make it easier for
me to merge them.

The rest of the changes are fine with me, but I would also like to see a
separate patch for the CMTP stuff and cmtp_send_capimsg() doesn't need a
forward declaration. Simply move the function to another place in the
source code.

Regards

Marcel

From thomas.spatzier@de.ibm.com Mon Dec 13 23:41:27 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 23:41:42 -0800 (PST)
Received: from mtagate3.de.ibm.com (mtagate3.de.ibm.com [195.212.29.152]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE7f64c000910 for ; Mon, 13 Dec 2004 23:41:27 -0800
Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate3.de.ibm.com (8.12.10/8.12.10) with ESMTP id iBE7eZdt106760 for ; Tue, 14 Dec 2004 07:40:35 GMT
Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBE7fETQ054124 for ; Tue, 14 Dec 2004 08:41:14 +0100
Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id iBE7eYbr022218 for ; Tue, 14 Dec 2004 08:40:34 +0100
Received: from d12ml061.megacenter.de.ibm.com ([9.149.165.51]) by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id iBE7eY3P022213; Tue, 14 Dec 2004 08:40:34 +0100
In-Reply-To:
Subject: Re: [patch 4/10] s390: network driver.
To: Paul Jakma Cc: hadi@cyberus.ca, Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com X-Mailer: Lotus Notes Release 6.0.2CF1 June 9, 2003 Message-ID: From: Thomas Spatzier Date: Tue, 14 Dec 2004 08:40:26 +0100 X-MIMETrack: Serialize by Router on D12ML061/12/M/IBM(Release 6.51HF338 | June 21, 2004) at 14/12/2004 08:41:05 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12729 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: thomas.spatzier@de.ibm.com Precedence: bulk X-list: netdev Paul Jakma wrote on 10.12.2004 16:37:15: > Thomas' original patch was to address this problem. I wonder could he > recap the kernel side of this problem? Here is why we submitted the original patch: We got reports from several customers that their dynamic routing daemons got hung when one network interface lost its physical connection. Some debugging showed that the write queues of sockets went full and got blocked. This was because we issued a netif_stop_queue when we detect a cable pull or something. As a solution, we removed the netif_stop_queue calls and just dropped the packets + we increment the respective error counts in the net_device_stats and call netif_carrier_off. This solved the customer problems and seems to be right thing for zebra etc. Regards, Thomas. From johnpol@2ka.mipt.ru Mon Dec 13 23:55:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 13 Dec 2004 23:55:21 -0800 (PST) Received: from localhost.localdomain ([217.67.177.50]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE7srxB002569 for ; Mon, 13 Dec 2004 23:55:14 -0800 Received: from uganda (uganda [127.0.0.1]) by localhost.localdomain (8.13.1/8.13.1) with ESMTP id iBE7wcA5004180; Tue, 14 Dec 2004 10:58:40 +0300 Subject: Re: Asynchronous crypto layer. 
From: Evgeniy Polyakov Reply-To: johnpol@2ka.mipt.ru To: James Morris Cc: netdev@oss.sgi.com, cryptoapi@lists.logix.cz, "David S. Miller" In-Reply-To: References: Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-aMNZjhWFP46+fE4KfCMG" Organization: MIPT Date: Tue, 14 Dec 2004 10:58:37 +0300 Message-Id: <1103011117.3430.24.camel@uganda> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12730 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev --=-aMNZjhWFP46+fE4KfCMG Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Tue, 2004-12-14 at 01:56 -0500, James Morris wrote: > On Mon, 1 Nov 2004, Evgeniy Polyakov wrote: > > [Sorry for taking so long to get back to this]. > > > > c) Async engine: scheduling/batching/routing etc. > > > > > > This seems to be the core of what you've developed so far. I'm not sure > > > if we need pluggable load balancers. How would the sysadmin select the > > > correct one? The simpler this code is the better. > > > > By the following command: > > echo -en "simple_lb" > /sys/class/crypto_lb/lbs > > or by sending a [still not implemented] connector (netlink) command. > > The real question was how would the sysadmin know which is the best load > balancer to configure. Start simple, make one good load balancer, and > only create a pluggable framework if a demonstrable need arises. By reading documentation which will be written some day... 
If it has only SW crypto it does not need a scheduler at all, only a very simple switch. But if the system has different kinds of HW crypto cards, and the sysadmin knows about their capabilities, it can be better to turn a more advanced scheduler on, for example one which can take packet size into account. It will be a bit slower while searching for the needed device, but will have a big win when doing the actual crypto processing. > > > Overall I think it's a good start. There are some chicken & egg type > > > problems when you don't have GPL drivers, hardware or an existing async > > > API, so I'd imagine that this will all continue to evolve: with more > > > hardware we can write/incorporate more drivers, with more drivers we can > > > further refine the async support and API etc. > > > > That is true, but I think not all parts of API should be exported as GPL > > only. > > Why do you think this? In my current acrypto tree I export all symbols as GPL except crypto_session_alloc(). Let greedy scum use at least bits of the new functionality. :) > > > > Here are some issues that I noticed: > > > > > > > > > What is main_crypto_device? Is this a placeholder for when there are no > > > other devices registered? > > > > main_crypto_device is just a virtual device into which all sessions are > > placed along with the specific crypto device queue. > > It is useful for cases when some driver decides that the HW is broken and > > marks itself as not working; then the scheduler _may_ (simple_lb does not > > have such a feature) and actually _should_ move all sessions that are > > placed into the broken device's queue into other devices or drop them (call > > the callback with a special session flag). > > No reason to drop them, just fall back to software. SW is just one of the crypto devices; consider an embedded system without gigantic processor time which can be spent doing crypto processing. Or what if it is asymmetric crypto? 
If SW exists and is loaded into acrypto it _will_ be called if needed. > > > Async & sync drivers need to both be treated as crypto drivers. > > > > > > > > > static inline void crypto_route_free(struct crypto_route *rt) > > > { > > > crypto_device_put(rt->dev); > > > rt->dev = NULL; > > > kfree(rt); > > > } > > > > > > Why do you set rt->dev to NULL here? It should not still be referenceable > > > at this point. > > > > crypto_route_free() can be called only from crypto_route_del() which > > unlinks the given route, thus no one can reference rt->dev (but of course > > anyone can have their own private copy, obtained under the lock). > > Any route access is guarded by the route list lock. > > No, if you're about to kfree(rt), you cannot also be able to dereference > it. I mean a reference to the device, not the route. The crypto route at this point can be accessed only through crypto_route_del(). Any routing _must_ be accessed _only_ by the crypto_route_* functions from crypto_route.h, which are always performed with proper locking. > > > __crypto_route_del() etc: > > > > > > Why are you rolling your own list management and not using the list.h > > > functions? > > > > It is more convenient here to use a queue like sk_buff does. > > Use list.h or put what you need into list.h. list.h will never have struct sk_buff_head or any other queueing primitives. crypto_route is not a list, it is a queue. > > > +struct crypto_device_stat > > > +{ > > > + __u64 scompleted; > > > + __u64 sfinished; > > > ... > > > > > > Please see how networking stats are implemented (e.g. netif_rx_stats) and > > > previous netdev discussions on 64-bit counters. Please only use __u64 and > > > similar when exporting structures to userspace. > > > > I use __ prefix for any parameter that is exported to userspace. > > But this structure is not. struct crypto_device_stat { __u64 scompleted; __u64 sfinished; __u64 sstarted; __u64 kmem_failed; }; Probably version mismatch... 
> > I've seen several attempts to convert network statistics to 64bit values, > > with backward compatibility it is not an easy task. Let's not get > > caught by this again. > > To clarify: the best way to do this is to use per-cpu counters of unsigned > long and add them together in userspace. It is the fastest way, sure. The current acrypto statistics code is not fast: it takes a stat lock to guarantee that fields are changed atomically, since some of them (per session_crypto_initializer) are used in the scheduler. > > - James -- Evgeniy Polyakov Crash is better than data corruption -- Arthur Grabowski --=-aMNZjhWFP46+fE4KfCMG Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQBBvp0tIKTPhE+8wY0RAu/MAJ9Iwicz9CTQ0kQ3eQXtLAVIFgvh9wCfSX/2 LeIoQyk3UWwdBI1B5lttMd4= =EG9R -----END PGP SIGNATURE----- --=-aMNZjhWFP46+fE4KfCMG-- From johnpol@2ka.mipt.ru Tue Dec 14 00:04:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 00:04:21 -0800 (PST) Received: from localhost.localdomain (dea.vocord.ru [217.67.177.50]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE83kjG003829 for ; Tue, 14 Dec 2004 00:04:12 -0800 Received: from uganda (uganda [127.0.0.1]) by localhost.localdomain (8.13.1/8.13.1) with ESMTP id iBE87JB0004213; Tue, 14 Dec 2004 11:07:21 +0300 Subject: Re: Asynchronous crypto layer. 
From: Evgeniy Polyakov Reply-To: johnpol@2ka.mipt.ru To: James Morris Cc: jamal , Michal Ludvig , netdev@oss.sgi.com, cryptoapi@lists.logix.cz In-Reply-To: References: Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-TJkbJY6fW7l4d8nLjy6X" Organization: MIPT Date: Tue, 14 Dec 2004 11:07:18 +0300 Message-Id: <1103011638.3430.33.camel@uganda> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12731 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev --=-TJkbJY6fW7l4d8nLjy6X Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Tue, 2004-12-14 at 02:23 -0500, James Morris wrote: > On Tue, 2 Nov 2004, Evgeniy Polyakov wrote: > > Some feedback: > > +#define session_completed(s) (s->ci.flags & SESSION_COMPLETED) > +#define complete_session(s) do {s->ci.flags |= SESSION_COMPLETED;} while(0) > +#define uncomplete_session(s) do {s->ci.flags &= ~SESSION_COMPLETED;} while (0) > etc. > > Please use static inlines for all of these things. > Are these supposed to be exported to userspace? Why? No, they are not supposed to be exported, I will convert them. > > +struct crypto_conn_data > +{ > + char name[SCACHE_NAMELEN]; > + __u16 cmd; > > Again, exporting to userspace? Yes, it is part of the acrypto control over connector (netlink) protocol. > > + list_for_each_entry_safe(__dev, n, &cdev_list, cdev_entry) > + { > + if (compare_device(__dev, dev)) > + { > > Incorrect coding style (and many more). Sigh... :) I know, I know, and will change it. > As mentioned before, pluggable load balancers are not needed now, and > complicate the code. 
I am still not giving up :) - we have several TCP congestion models, we have different IO schedulers, we can tune routing code (although we cannot change hash to tree at runtime). > > - James -- Evgeniy Polyakov Crash is better than data corruption -- Arthur Grabowski --=-TJkbJY6fW7l4d8nLjy6X Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQBBvp82IKTPhE+8wY0RAgsZAJ9pWJaeUDvKpgrfXDVHdf2UNRT8ngCdHwfN Jdzl57cr4awrEjvPcGJrO7Q= =y8zH -----END PGP SIGNATURE----- --=-TJkbJY6fW7l4d8nLjy6X-- From David.Nisbet@thalesgroup.com Tue Dec 14 00:36:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 00:36:24 -0800 (PST) Received: from GWOUT.thalesgroup.com (gwout.thalesgroup.com [195.101.39.227]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE8ZuU2010600 for ; Tue, 14 Dec 2004 00:36:16 -0800 Received: from thalescan.corp.thales (200.3.2.3) by GWOUT.thalesgroup.com (NPlex 6.5.026) id 41BBCA290014967D for netdev@oss.sgi.com; Tue, 14 Dec 2004 09:35:30 +0100 Received: from msw001.uk.trt.thales ([192.168.224.28]) by thalescan with InterScan Messaging Security Suite; Tue, 14 Dec 2004 09:35:23 +0100 Received: from NTS013.uk.trt.thales (unverified) by msw001.uk.trt.thales (Content Technologies SMTPRS 4.2.10) with ESMTP id for ; Tue, 14 Dec 2004 08:29:38 +0000 Received: by nts013.uk.trt.thales with Internet Mail Service (5.5.2656.59) id ; Tue, 14 Dec 2004 08:26:38 -0000 Message-ID: <51BF576D5A02CC4CB2591F50994FD76698B144@nts013.uk.trt.thales> From: "Nisbet, David " To: "'netdev@oss.sgi.com'" Subject: Locally generated multicast packets not routed Date: Tue, 14 Dec 2004 08:26:37 -0000 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2656.59) Content-Type: text/plain; charset="iso-8859-1" X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean 
X-archive-position: 12732 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: David.Nisbet@thalesgroup.com Precedence: bulk X-list: netdev Hi, I'm hoping someone can help me with the answer to this question... I have been experiencing problems with multicast traffic generated locally on a 2.4.17 machine configured for multicast routing. Traffic generated by my multicast source (vic) is correctly sent to the local ethernet but it is never sent to either of my two gre tunnels. I believe I should see at least some packets on all interfaces before the routes are pruned but I see nothing. This seems to parallel a problem that was originally fixed in the 2.0 kernel where multicast packets were not passed to the routing function. I know that the tunnels can handle multicast traffic because I am receiving packets from a remote machine. Similarly, multicast traffic received from the local ethernet is correctly routed over the tunnels. It is only the traffic generated on the router that is having problems. Using vic, I have set the source ip address to the ethernet port to ensure the data comes from a valid address and I have forced the ttl to 16 but neither has fixed the problem. So my question is: Are locally generated multicast packets passed throught the kernel routing functions? If not, were they previously? The kernel is 2.4.17 (although the same problem appears to occur on a 2.4.22 machine) and I am running mrouted-3.9-beta-3 for route transfer. 
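[Editorial sketch, not part of the original message.] For readers unfamiliar with the sender-side settings described above (outgoing interface address and a TTL of 16), the following is a minimal sketch of how an application could request them through the standard IPv4 socket options. It does not show how vic itself does this, and the helper name open_mcast_sender is hypothetical. It also says nothing about whether the kernel then passes the packets through the multicast routing code, which is the open question in this message.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket prepared for multicast sending.
 * ifaddr is the dotted-quad address of the outgoing interface
 * ("0.0.0.0" leaves the choice to the kernel); ttl is the multicast TTL.
 * Returns the socket descriptor, or -1 on failure. */
static int open_mcast_sender(const char *ifaddr, unsigned char ttl)
{
    struct in_addr local;
    int fd;

    if (inet_pton(AF_INET, ifaddr, &local) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* The multicast TTL defaults to 1, which never crosses a router. */
    if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) < 0 ||
        /* Select the interface (and thus source address) for outgoing packets. */
        setsockopt(fd, IPPROTO_IP, IP_MULTICAST_IF, &local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would pass the ethernet port's address and 16, then sendto() the group address on the returned descriptor.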
Thanks David 
From SRS0+5d25a609d40da48894fc+478+infradead.org+hch@pentafluge.srs.infradead.org Tue Dec 14 01:46:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 01:46:55 -0800 (PST) Received: from pentafluge.infradead.org ([213.146.154.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE9kJNQ017217 for ; Tue, 14 Dec 2004 01:46:42 -0800 Received: from hch by pentafluge.infradead.org with local (Exim 4.42 #1 (Red Hat Linux)) id 1Ce9FX-0000Fk-8M; Tue, 14 Dec 2004 09:45:43 +0000 Date: Tue, 14 Dec 2004 09:45:43 +0000 From: Christoph Hellwig To: Marcel Holtmann Cc: Adrian Bunk , Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List Subject: Re: [2.6 patch] net/bluetooth/: misc possible cleanups Message-ID: <20041214094543.GA963@infradead.org> Mail-Followup-To: Christoph Hellwig , Marcel Holtmann , Adrian Bunk , Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103009649.2143.65.camel@pegasus> User-Agent: Mutt/1.4.1i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12733 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hch@infradead.org Precedence: 
bulk X-list: netdev On Tue, Dec 14, 2004 at 08:34:08AM +0100, Marcel Holtmann wrote: > these functions must stay. They have users outside the mainline kernel > that are not merged back yet. Otherwise they won't be exported ;) But we traditionally don't keep APIs only for the sake of external modules. Exceptions are made if you have short- to mid-term plans to merge them. From marcel@holtmann.org Tue Dec 14 01:55:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 01:56:01 -0800 (PST) Received: from mail.holtmann.net (root@coyote.holtmann.net [217.160.111.169]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBE9tVDB019040 for ; Tue, 14 Dec 2004 01:55:51 -0800 Received: from pegasus (pD9525B82.dip.t-dialin.net [217.82.91.130]) by mail.holtmann.net (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBE9tMCJ030579; Tue, 14 Dec 2004 10:55:23 +0100 Subject: Re: [2.6 patch] net/bluetooth/: misc possible cleanups From: Marcel Holtmann To: Christoph Hellwig Cc: Adrian Bunk , Max Krasnyansky , BlueZ Mailing List , Linux Kernel Mailing List , Network Development Mailing List In-Reply-To: <20041214094543.GA963@infradead.org> References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> <20041214094543.GA963@infradead.org> Content-Type: text/plain Date: Tue, 14 Dec 2004 10:54:59 +0100 Message-Id: <1103018099.2143.127.camel@pegasus> Mime-Version: 1.0 X-Mailer: Evolution 2.0.3 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/601/Mon Nov 22 14:40:21 2004 clamav-milter version 0.80j on coyote.holtmann.net X-Virus-Status: Clean X-Virus-Status: Clean X-archive-position: 12735 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: marcel@holtmann.org Precedence: bulk X-list: netdev Hi Christoph, > > these functions must stay. 
They have users outside the mainline kernel > > that are not merged back yet. Otherwise they won't be exported ;) > > But we traditionally don't keep APIs only for the sake of external modules. > Exceptions are made if you have short- to mid-term plans to merge them. it is a short term plan, because otherwise I won't have submitted it in the first place. And as I said, they are not merged back yet. They are not tested enough for mainline at the moment. Regards Marcel From hanemade@gmail.com Tue Dec 14 02:43:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 02:43:21 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.195]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBEAgpNq027006 for ; Tue, 14 Dec 2004 02:43:14 -0800 Received: by rproxy.gmail.com with SMTP id b11so995927rne for ; Tue, 14 Dec 2004 02:42:26 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=nuUyfS/P6xPf1z1JbbnPvWhJOpYjjAnMXpWNRmyvHNhVwfrPOya8jrMMGQ9zhWbJJukXzshsOC1d3IoZLAbVrwjItozFKVmY4OxaywdzN5UrDJBmxMgxUBO9wmSaUdJJQPg2466KKUXRNeJxgrNUeEuviaUNv/Po7B8j4uKTb8g= Received: by 10.38.207.1 with SMTP id e1mr202129rng; Tue, 14 Dec 2004 02:42:26 -0800 (PST) Received: by 10.38.161.18 with HTTP; Tue, 14 Dec 2004 02:42:26 -0800 (PST) Message-ID: Date: Tue, 14 Dec 2004 16:12:26 +0530 From: Harshal Nemade Reply-To: Harshal Nemade To: netdev@oss.sgi.com Subject: loopback packet processing. 
Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12736 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hanemade@gmail.com Precedence: bulk X-list: netdev Hi all, I want to know that when i send a packet on loopback on 127.0.0.1 the packet is given to netif_rx without MAC header adding. I want to know Is that correct? If yes then what happen to that 14 byte MAC header bytes space in case Ping packet? Is that space free? Why eth_type_trans is called that checks eth->h_proto field ? Please correct me if i am wrong. Thanks in advance, harsh. From Robert.Olsson@data.slu.se Tue Dec 14 04:58:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 04:58:21 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBECvgM1010936 for ; Tue, 14 Dec 2004 04:58:03 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iBECvInN032007; Tue, 14 Dec 2004 13:57:19 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id EB006EC001; Tue, 14 Dec 2004 13:57:18 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16830.58158.882590.971712@robur.slu.se> Date: Tue, 14 Dec 2004 13:57:18 +0100 To: Lennert Buytenhek Cc: robert.olsson@data.slu.se, netdev@oss.sgi.com Subject: [PATCH,RFC] pktgen sleeping/timing rework In-Reply-To: <20041210222058.GA5984@xi.wantstofly.org> References: <20041210222058.GA5984@xi.wantstofly.org> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12737 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Lennert Buytenhek writes: > Below is a patch against your latest pktgen devel version Hello! It's nice. The timing code gets cleaner and a bit more predictable; at least we now know better how to arrange the delay for lower rates. It still needs some experimentation. schedule_timeout() is nice with very long delays too. I haven't been able to test with any variable-frequency system.

             Lennert patch           W/o patches             Estimated
Delay (ns)   run1    run2    run3    run1    run2    run3    1/delay*1000000000
0          807790  818194  818168  818048  817737  816892
5          807496  818167  818277  811374  814652  816474
10         807502  818180  818277  811368  815200  816333
50         807387  818218  818246  811308  815280  816431
100        796581  818152  818100  811369  814649  816405
500        805049  436554  426719  813738  814656  815095
1000       420252  654450  668470  767182  732713  734529    1000000
5000        99784   99987  122393  185104  182491  182572     200000
10000       78375   72364   73738   95545   95642   95633     100000
50000       19944   19942   19941   16934   16933   16934      20000
100000       9997    9987    9996    9166    9166    9163      10000
500000       1999    1999    1999    1964    1964    1964       2000
1000000       999     999     999     991     991     991       1000

Applied this to the development version and also renamed ipg to delay. It breaks some scripts. We need a variant of this and of the FCS patch for the kernel version. I also noticed a bug causing an oops while testing; it needs to be investigated, but it does not seem to be related to this patch. 
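[Editorial sketch, not part of the original message.] The "Estimated" column in the table above is simply the number of nanoseconds in a second divided by the configured delay. The small C sketch below reproduces that arithmetic and shows one way to advance a transmit deadline that is kept as microseconds plus leftover nanoseconds, as the quoted patch does. The names estimated_pps, struct tx_time and advance are hypothetical and do not come from pktgen; the carry test uses >= 1000 so that the nanosecond part always stays below one microsecond.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ideal packet rate for a per-packet delay in ns.
 * A delay of 0 means "send as fast as possible", so no estimate (0). */
static uint64_t estimated_pps(uint32_t delay_ns)
{
    if (delay_ns == 0)
        return 0;
    return UINT64_C(1000000000) / delay_ns;
}

/* Hypothetical deadline: whole microseconds plus leftover nanoseconds. */
struct tx_time {
    uint64_t us;
    uint32_t ns; /* invariant: always < 1000 */
};

/* Move the deadline forward by one inter-packet delay given in ns. */
static struct tx_time advance(struct tx_time t, uint32_t delay_ns)
{
    t.us += delay_ns / 1000;
    t.ns += delay_ns % 1000;
    if (t.ns >= 1000) { /* carry a full microsecond */
        t.us++;
        t.ns -= 1000;
    }
    assert(t.ns < 1000);
    return t;
}
```

For example, a delay of 5000 ns gives an estimate of 200000 packets per second, and two steps of 1500 ns move a deadline of 0 us + 0 ns to 3 us + 0 ns.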
--ro > --- pktgen.c.041209 2004-12-11 21:21:26.000000000 +0100 > +++ pktgen.c 2004-12-11 22:57:51.221340048 +0100 > @@ -222,14 +222,16 @@ > int min_pkt_size; /* = ETH_ZLEN; */ > int max_pkt_size; /* = ETH_ZLEN; */ > int nfrags; > - __u32 ipg; /* Default Interpacket gap in nsec */ > + __u32 ipg_us; /* Default Interpacket gap */ > + __u32 ipg_ns; > __u64 count; /* Default No packets to send */ > __u64 sofar; /* How many pkts we've sent so far */ > __u64 tx_bytes; /* How many bytes we've transmitted */ > __u64 errors; /* Errors when trying to transmit, pkts will be re-sent */ > > /* runtime counters relating to clone_skb */ > - __u64 next_tx_ns; /* timestamp of when to tx next, in nano-seconds */ > + __u64 next_tx_us; /* timestamp of when to tx next */ > + __u32 next_tx_ns; > > __u64 allocated_skbs; > __u32 clone_count; > @@ -239,7 +241,7 @@ > */ > __u64 started_at; /* micro-seconds */ > __u64 stopped_at; /* micro-seconds */ > - __u64 idle_acc; > + __u64 idle_acc; /* micro-seconds */ > __u32 seq_num; > > int clone_skb; /* Use multiple SKBs during packet gen. If this number > @@ -346,10 +348,6 @@ > #define REMOVE 1 > #define FIND 0 > > -static u32 pg_cycles_per_ns; > -static u32 pg_cycles_per_us; > -static u32 pg_cycles_per_ms; > - > /* This code works around the fact that do_div cannot handle two 64-bit > numbers, and regular 64-bit division doesn't work on x86 kernels. > --Ben > @@ -452,14 +450,6 @@ > #endif > } > > -/* Fast, not horribly accurate, since the machine started. */ > -static inline __u64 getRelativeCurMs(void) { > - return pg_div(get_cycles(), pg_cycles_per_ms); > -} > - > -/* Since the epoc. More precise over long periods of time than > - * getRelativeCurMs > - */ > static inline __u64 getCurMs(void) > { > struct timeval tv; > @@ -467,9 +457,6 @@ > return tv_to_ms(&tv); > } > > -/* Since the epoc. 
More precise over long periods of time than > - * getRelativeCurMs > - */ > static inline __u64 getCurUs(void) > { > struct timeval tv; > @@ -477,18 +464,6 @@ > return tv_to_us(&tv); > } > > -/* Since the machine booted. */ > -static inline __u64 getRelativeCurUs(void) > -{ > - return pg_div(get_cycles(), pg_cycles_per_us); > -} > - > -/* Since the machine booted. */ > -static inline __u64 getRelativeCurNs(void) > -{ > - return pg_div(get_cycles(), pg_cycles_per_ns); > -} > - > static inline __u64 tv_diff(const struct timeval* a, const struct timeval* b) > { > return tv_to_us(a) - tv_to_us(b); > @@ -523,11 +498,6 @@ > static unsigned int scan_ip6(const char *s,char ip[16]); > static unsigned int fmt_ip6(char *s,const char ip[16]); > > -/* cycles per micro-second */ > -static u32 pg_cycles_per_ns; > -static u32 pg_cycles_per_us; > -static u32 pg_cycles_per_ms; > - > /* Module parameters, defaults. */ > static int pg_count_d = 1000; /* 1000 pkts by default */ > static int pg_ipg_d = 0; > @@ -646,7 +616,7 @@ > pkt_dev->count, pkt_dev->min_pkt_size, pkt_dev->max_pkt_size); > > p += sprintf(p, " frags: %d ipg: %u clone_skb: %d ifname: %s\n", > - pkt_dev->nfrags, pkt_dev->ipg, pkt_dev->clone_skb, pkt_dev->ifname); > + pkt_dev->nfrags, 1000*pkt_dev->ipg_us+pkt_dev->ipg_ns, pkt_dev->clone_skb, pkt_dev->ifname); > > p += sprintf(p, " flows: %u flowlen: %u\n", pkt_dev->cflows, pkt_dev->lflow); > > @@ -718,7 +688,7 @@ > > p += sprintf(p, "Current:\n pkts-sofar: %llu errors: %llu\n started: %lluus stopped: %lluus idle: %lluus\n", > pkt_dev->sofar, pkt_dev->errors, sa, stopped, > - pg_div(pkt_dev->idle_acc, pg_cycles_per_us)); > + pkt_dev->idle_acc); > > p += sprintf(p, " seq_num: %d cur_dst_mac_offset: %d cur_src_mac_offset: %d\n", > pkt_dev->seq_num, pkt_dev->cur_dst_mac_offset, pkt_dev->cur_src_mac_offset); > @@ -932,11 +902,14 @@ > len = num_arg(&user_buffer[i], 10, &value); > if (len < 0) { return len; } > i += len; > - pkt_dev->ipg = value; > - if ((getRelativeCurNs() + 
pkt_dev->ipg) > pkt_dev->next_tx_ns) { > - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; > - } > - sprintf(pg_result, "OK: ipg=%u", pkt_dev->ipg); > + if (value == 0x7FFFFFFF) { > + pkt_dev->ipg_us = 0x7FFFFFFF; > + pkt_dev->ipg_ns = 0; > + } else { > + pkt_dev->ipg_us = value / 1000; > + pkt_dev->ipg_ns = value % 1000; > + } > + sprintf(pg_result, "OK: ipg=%u", 1000*pkt_dev->ipg_us+pkt_dev->ipg_ns); > return count; > } > if (!strcmp(name, "udp_src_min")) { > @@ -1732,108 +1705,32 @@ > pkt_dev->nflows = 0; > } > > -/* ipg is in nano-seconds */ > -static void nanospin(__u32 ipg, struct pktgen_dev *pkt_dev) > +static void spin(struct pktgen_dev *pkt_dev, __u64 spin_until_us) > { > - u64 idle_start = get_cycles(); > - u64 idle; > + __u64 start; > + __u64 now; > > - for (;;) { > - barrier(); > - idle = get_cycles() - idle_start; > - if (idle * 1000 >= ipg * pg_cycles_per_us) > - break; > - } > - pkt_dev->idle_acc += idle; > -} > - > - > -/* ipg is in micro-seconds (usecs) */ > -static void pg_udelay(__u32 delay_us, struct pktgen_dev *pkt_dev) > -{ > - u64 start = getRelativeCurUs(); > - u64 now; > - > - for (;;) { > - do_softirq(); > - now = getRelativeCurUs(); > - if (start + delay_us <= (now - 10)) > - break; > + start = now = getCurUs(); > + printk(KERN_INFO "sleeping for %d\n", (int)(spin_until_us - now)); > + while (now < spin_until_us) { > + /* TODO: optimise sleeping behavior */ > + if (spin_until_us - now > (1000000/HZ)+1) { > + current->state = TASK_INTERRUPTIBLE; > + schedule_timeout(1); > + } else if (spin_until_us - now > 100) { > + do_softirq(); > + if (!pkt_dev->running) > + return; > + if (need_resched()) > + schedule(); > + } > > - if (!pkt_dev->running) > - return; > - > - if (need_resched()) > - schedule(); > - > - now = getRelativeCurUs(); > - if (start + delay_us <= (now - 10)) > - break; > + now = getCurUs(); > } > > - pkt_dev->idle_acc += (1000 * (now - start)); > - > - /* We can break out of the loop up to 10us early, so spend the rest 
of > - * it spinning to increase accuracy. > - */ > - if (start + delay_us > now) > - nanospin((start + delay_us) - now, pkt_dev); > + pkt_dev->idle_acc += now - start; > } > > -/* Returns: cycles per micro-second */ > -static int calc_mhz(void) > -{ > - struct timeval start, stop; > - u64 start_s; > - u64 t1, t2; > - u32 elapsed; > - u32 clock_time = 0; > - > - do_gettimeofday(&start); > - start_s = get_cycles(); > - /* Spin for 50,000,000 cycles */ > - do { > - barrier(); > - elapsed = (u32)(get_cycles() - start_s); > - if (elapsed == 0) > - return 0; > - } while (elapsed < 50000000); > - > - do_gettimeofday(&stop); > - > - t1 = tv_to_us(&start); > - t2 = tv_to_us(&stop); > - > - clock_time = (u32)(t2 - t1); > - if (clock_time == 0) { > - printk("pktgen: ERROR: clock_time was zero..things may not work right, t1: %u t2: %u ...\n", > - (u32)(t1), (u32)(t2)); > - return 0x7FFFFFFF; > - } > - return elapsed / clock_time; > -} > - > -/* Calibrate cycles per micro-second */ > -static void cycles_calibrate(void) > -{ > - int i; > - > - for (i = 0; i < 3; i++) { > - u32 res = calc_mhz(); > - if (res > pg_cycles_per_us) > - pg_cycles_per_us = res; > - } > - > - /* Set these up too, only need to calculate these once. */ > - pg_cycles_per_ns = pg_cycles_per_us / 1000; > - if (pg_cycles_per_ns == 0) > - pg_cycles_per_ns = 1; > - > - pg_cycles_per_ms = pg_cycles_per_us * 1000; > - > - printk("pktgen: cycles_calibrate, cycles_per_ns: %d per_us: %d per_ms: %d\n", > - pg_cycles_per_ns, pg_cycles_per_us, pg_cycles_per_ms); > -} > > /* Increment/randomize headers according to flags and current values > * for IP src/dest, UDP src/dst port, MAC-Addr src/dst > @@ -2455,7 +2352,8 @@ > pkt_dev->running = 1; /* Cranke yeself! 
*/ > pkt_dev->skb = NULL; > pkt_dev->started_at = getCurUs(); > - pkt_dev->next_tx_ns = 0; /* Transmit immediately */ > + pkt_dev->next_tx_us = getCurUs(); /* Transmit immediately */ > + pkt_dev->next_tx_ns = 0; > > strcpy(pkt_dev->result, "Starting"); > started++; > @@ -2568,17 +2466,13 @@ > > total_us = pkt_dev->stopped_at - pkt_dev->started_at; > > - BUG_ON(pg_cycles_per_us == 0); > - > idle = pkt_dev->idle_acc; > - do_div(idle, pg_cycles_per_us); > > p += sprintf(p, "OK: %llu(c%llu+d%llu) usec, %llu (%dbyte,%dfrags)\n", > total_us, (unsigned long long)(total_us - idle), idle, > pkt_dev->sofar, pkt_dev->cur_pkt_size, nr_frags); > > pps = pkt_dev->sofar * USEC_PER_SEC; > - > > while ((total_us >> 32) != 0) { > pps >>= 1; > @@ -2626,7 +2520,7 @@ > for(next=t->if_list; next ; next=next->next) { > if(!next->running) continue; > if(best == NULL) best=next; > - else if ( next->next_tx_ns < best->next_tx_ns) > + else if ( next->next_tx_us < best->next_tx_us) > best = next; > } > if_unlock(t); > @@ -2692,46 +2586,29 @@ > { > struct net_device *odev = NULL; > __u64 idle_start = 0; > - u32 next_ipg = 0; > - u64 now = 0; /* in nano-seconds */ > int ret; > > odev = pkt_dev->odev; > > - if (pkt_dev->ipg) { > - now = getRelativeCurNs(); > - if (now < pkt_dev->next_tx_ns) { > - next_ipg = (u32)(pkt_dev->next_tx_ns - now); > - > - /* Try not to busy-spin if we have larger sleep times. > - * TODO: Investigate better ways to do this. > - */ > + if (pkt_dev->ipg_us || pkt_dev->ipg_ns) { > + u64 now; > > - /* 10 usecs or less */ > - if (next_ipg < 10000) > - nanospin(next_ipg, pkt_dev); > - > - /* 10ms or less */ > - else if (next_ipg < 10000000) > - pg_udelay(next_ipg / 1000, pkt_dev); > - > - /* fall asleep for a 10ms or more. 
*/ > - else > - pg_udelay(next_ipg / 1000, pkt_dev); > - } > + now = getCurUs(); > + if (now < pkt_dev->next_tx_us) > + spin(pkt_dev, pkt_dev->next_tx_us); > > /* This is max IPG, this has special meaning of > * "never transmit" > */ > - if (pkt_dev->ipg == 0x7FFFFFFF) { > - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; > + if (pkt_dev->ipg_us == 0x7FFFFFFF) { > + pkt_dev->next_tx_us = getCurUs() + pkt_dev->ipg_us; > + pkt_dev->next_tx_ns = pkt_dev->ipg_ns; > goto out; > } > } > > if (netif_queue_stopped(odev) || need_resched()) { > - > - idle_start = get_cycles(); > + idle_start = getCurUs(); > > if (!netif_running(odev)) { > pktgen_stop_device(pkt_dev); > @@ -2740,10 +2617,11 @@ > if (need_resched()) > schedule(); > > - pkt_dev->idle_acc += get_cycles() - idle_start; > + pkt_dev->idle_acc += getCurUs() - idle_start; > > if (netif_queue_stopped(odev)) { > - pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ > + pkt_dev->next_tx_us = getCurUs(); /* TODO */ > + pkt_dev->next_tx_ns = 0; > goto out; /* Try the next interface */ > } > } > @@ -2768,7 +2646,8 @@ > > spin_lock_bh(&odev->xmit_lock); > if (!netif_queue_stopped(odev)) { > - > + u64 now; > + > atomic_inc(&(pkt_dev->skb->users)); > retry_now: > ret = odev->hard_start_xmit(pkt_dev->skb, odev); > @@ -2789,16 +2668,32 @@ > if (debug && net_ratelimit()) > printk(KERN_INFO "pktgen: Hard xmit error\n"); > > - pkt_dev->errors++; > - pkt_dev->last_ok = 0; > - pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ > + pkt_dev->errors++; > + pkt_dev->last_ok = 0; > + pkt_dev->next_tx_us = getCurUs(); /* TODO */ > + pkt_dev->next_tx_ns = 0; > + } > + > + pkt_dev->next_tx_us += pkt_dev->ipg_us; > + pkt_dev->next_tx_ns += pkt_dev->ipg_ns; > + if (pkt_dev->next_tx_ns > 1000) { > + pkt_dev->next_tx_us++; > + pkt_dev->next_tx_ns -= 1000; > + } > + > + now = getCurUs(); > + if (now > pkt_dev->next_tx_us) { > + /* TODO: this code is slightly wonky. 
*/ > + pkt_dev->errors++; > + pkt_dev->next_tx_us = now - pkt_dev->ipg_us; > + pkt_dev->next_tx_ns = 0; > } > - pkt_dev->next_tx_ns = getRelativeCurNs() + pkt_dev->ipg; > } > > else { /* Retry it next time */ > pkt_dev->last_ok = 0; > - pkt_dev->next_tx_ns = getRelativeCurNs(); /* TODO */ > + pkt_dev->next_tx_us = getCurUs(); /* TODO */ > + pkt_dev->next_tx_ns = 0; > } > > spin_unlock_bh(&odev->xmit_lock); > @@ -2806,14 +2701,14 @@ > /* If pkt_dev->count is zero, then run forever */ > if ((pkt_dev->count != 0) && (pkt_dev->sofar >= pkt_dev->count)) { > if (atomic_read(&(pkt_dev->skb->users)) != 1) { > - idle_start = get_cycles(); > + idle_start = getCurUs(); > while (atomic_read(&(pkt_dev->skb->users)) != 1) { > if (signal_pending(current)) { > break; > } > schedule(); > } > - pkt_dev->idle_acc += get_cycles() - idle_start; > + pkt_dev->idle_acc += getCurUs() - idle_start; > } > > /* Done with this */ > @@ -3006,7 +2901,8 @@ > pkt_dev->max_pkt_size = ETH_ZLEN; > pkt_dev->nfrags = 0; > pkt_dev->clone_skb = pg_clone_skb_d; > - pkt_dev->ipg = pg_ipg_d; > + pkt_dev->ipg_us = pg_ipg_d / 1000; > + pkt_dev->ipg_ns = pg_ipg_d % 1000; > pkt_dev->count = pg_count_d; > pkt_dev->sofar = 0; > pkt_dev->udp_src_min = 9; /* sink port */ > @@ -3169,12 +3065,6 @@ > > module_fname[0] = 0; > > - cycles_calibrate(); > - if (pg_cycles_per_us == 0) { > - printk("pktgen: ERROR: your machine does not have working cycle counter.\n"); > - return -EINVAL; > - } > - > create_proc_dir(); > > sprintf(module_fname, "net/%s/pgctrl", PG_PROC_DIR); From bunk@stusta.de Tue Dec 14 04:59:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 04:59:47 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBECx9dt011181 for ; Tue, 14 Dec 2004 04:59:29 -0800 Received: (qmail 17903 invoked from network); 14 Dec 2004 12:58:41 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de 
with SMTP; 14 Dec 2004 12:58:41 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 02B53BBC88; Tue, 14 Dec 2004 13:58:38 +0100 (CET) Date: Tue, 14 Dec 2004 13:58:38 +0100 From: Adrian Bunk To: patrick@tykepenguin.com, Steve Whitehouse Cc: linux-decnet-user@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/decnet/: misc possible cleanups Message-ID: <20041214125838.GC23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12738 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make needlessly global code static - dn_fib.c: remove the write-only global variable dn_fib_info_cnt - dn_fib.c: remove the unused global function dn_fib_rt_message - dn_neigh.c: remove the unused global function dn_neigh_pointopoint_notify - dn_timer.c: remove the fast timer code that isn't used Please review and comment on this patch. 
diffstat output: include/net/dn.h | 2 - include/net/dn_fib.h | 1 net/decnet/af_decnet.c | 6 +---- net/decnet/dn_fib.c | 15 ------------- net/decnet/dn_neigh.c | 8 ------ net/decnet/dn_route.c | 6 ++--- net/decnet/dn_timer.c | 47 ----------------------------------------- 7 files changed, 5 insertions(+), 80 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/decnet/af_decnet.c.old 2004-12-14 03:30:13.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/decnet/af_decnet.c 2004-12-14 03:47:00.000000000 +0100 @@ -246,7 +246,7 @@ write_unlock_bh(&dn_hash_lock); } -struct hlist_head *listen_hash(struct sockaddr_dn *addr) +static struct hlist_head *listen_hash(struct sockaddr_dn *addr) { int i; unsigned hash = addr->sdn_objnum; @@ -447,7 +447,7 @@ dst_release(xchg(&sk->sk_dst_cache, NULL)); } -struct sock *dn_alloc_sock(struct socket *sock, int gfp) +static struct sock *dn_alloc_sock(struct socket *sock, int gfp) { struct dn_scp *scp; struct sock *sk = sk_alloc(PF_DECnet, gfp, sizeof(struct dn_sock), @@ -578,7 +578,6 @@ if (sk->sk_socket) return 0; - dn_stop_fast_timer(sk); /* unlikely, but possible that this is runninng */ if ((jiffies - scp->stamp) >= (HZ * decnet_time_wait)) { dn_unhash_sock(sk); sock_put(sk); @@ -631,7 +630,6 @@ default: printk(KERN_DEBUG "DECnet: dn_destroy_sock passed socket in invalid state\n"); case DN_O: - dn_stop_fast_timer(sk); dn_stop_slow_timer(sk); dn_unhash_sock_bh(sk); --- linux-2.6.10-rc3-mm1-full/include/net/dn_fib.h.old 2004-12-14 03:32:14.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/dn_fib.h 2004-12-14 03:32:19.000000000 +0100 @@ -117,7 +117,6 @@ extern void dn_fib_init(void); extern void dn_fib_cleanup(void); -extern int dn_fib_rt_message(struct sk_buff *skb); extern int dn_fib_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg); extern struct dn_fib_info *dn_fib_create_info(const struct rtmsg *r, --- linux-2.6.10-rc3-mm1-full/net/decnet/dn_fib.c.old 2004-12-14 03:31:33.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/decnet/dn_fib.c 2004-12-14 03:32:31.000000000 +0100 @@ -60,7 +60,6 @@ static spinlock_t dn_fib_multipath_lock = SPIN_LOCK_UNLOCKED; static struct dn_fib_info *dn_fib_info_list; static rwlock_t dn_fib_info_lock = RW_LOCK_UNLOCKED; -int dn_fib_info_cnt; static struct { @@ -93,7 +92,6 @@ dev_put(nh->nh_dev); nh->nh_dev = NULL; } endfor_nexthops(fi); - dn_fib_info_cnt--; kfree(fi); } @@ -388,7 +386,6 @@ if (dn_fib_info_list) dn_fib_info_list->fib_prev = fi; dn_fib_info_list = fi; - dn_fib_info_cnt++; write_unlock(&dn_fib_info_lock); return fi; @@ -486,18 +483,6 @@ } -/* - * Punt to user via netlink for example, but for now - * we just drop it. - */ -int dn_fib_rt_message(struct sk_buff *skb) -{ - kfree_skb(skb); - - return 0; -} - - static int dn_fib_check_attr(struct rtmsg *r, struct rtattr **rta) { int i; --- linux-2.6.10-rc3-mm1-full/net/decnet/dn_neigh.c.old 2004-12-14 03:32:55.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/decnet/dn_neigh.c 2004-12-14 03:33:29.000000000 +0100 @@ -355,14 +355,6 @@ * basically does a neigh_lookup(), but without comparing the device * field. This is required for the On-Ethernet cache */ -/* - * Any traffic on a pointopoint link causes the timer to be reset - * for the entry in the neighbour table. 
- */ -void dn_neigh_pointopoint_notify(struct sk_buff *skb) -{ - return; -} /* * Pointopoint link receives a hello message --- linux-2.6.10-rc3-mm1-full/net/decnet/dn_route.c.old 2004-12-14 03:33:53.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/decnet/dn_route.c 2004-12-14 03:34:54.000000000 +0100 @@ -99,9 +99,9 @@ static unsigned char dn_hiord_addr[6] = {0xAA,0x00,0x04,0x00,0x00,0x00}; -int dn_rt_min_delay = 2 * HZ; -int dn_rt_max_delay = 10 * HZ; -int dn_rt_mtu_expires = 10 * 60 * HZ; +static const int dn_rt_min_delay = 2 * HZ; +static const int dn_rt_max_delay = 10 * HZ; +static const int dn_rt_mtu_expires = 10 * 60 * HZ; static unsigned long dn_rt_deadline; --- linux-2.6.10-rc3-mm1-full/include/net/dn.h.old 2004-12-14 03:35:13.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/dn.h 2004-12-14 03:46:38.000000000 +0100 @@ -220,8 +220,6 @@ extern void dn_start_slow_timer(struct sock *sk); extern void dn_stop_slow_timer(struct sock *sk); -extern void dn_start_fast_timer(struct sock *sk); -extern void dn_stop_fast_timer(struct sock *sk); extern dn_address decnet_address; extern int decnet_debug_level; --- linux-2.6.10-rc3-mm1-full/net/decnet/dn_timer.c.old 2004-12-14 03:35:29.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/decnet/dn_timer.c 2004-12-14 03:48:49.000000000 +0100 @@ -27,11 +27,9 @@ #include /* - * Fast timer is for delayed acks (200mS max) * Slow timer is for everything else (n * 500mS) */ -#define FAST_INTERVAL (HZ/5) #define SLOW_INTERVAL (HZ/2) static void dn_slow_timer(unsigned long arg); @@ -109,48 +107,3 @@ bh_unlock_sock(sk); sock_put(sk); } - -static void dn_fast_timer(unsigned long arg) -{ - struct sock *sk = (struct sock *)arg; - struct dn_scp *scp = DN_SK(sk); - - bh_lock_sock(sk); - if (sock_owned_by_user(sk)) { - scp->delack_timer.expires = jiffies + HZ / 20; - add_timer(&scp->delack_timer); - goto out; - } - - scp->delack_pending = 0; - - if (scp->delack_fxn) - scp->delack_fxn(sk); -out: - bh_unlock_sock(sk); -} - -void 
dn_start_fast_timer(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - if (!scp->delack_pending) { - scp->delack_pending = 1; - init_timer(&scp->delack_timer); - scp->delack_timer.expires = jiffies + FAST_INTERVAL; - scp->delack_timer.data = (unsigned long)sk; - scp->delack_timer.function = dn_fast_timer; - add_timer(&scp->delack_timer); - } -} - -void dn_stop_fast_timer(struct sock *sk) -{ - struct dn_scp *scp = DN_SK(sk); - - if (scp->delack_pending) { - scp->delack_pending = 0; - del_timer(&scp->delack_timer); - } -} - From steve@souterrain.chygwyn.com Tue Dec 14 05:30:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 05:31:03 -0800 (PST) Received: from souterrain.chygwyn.com (souterrain.chygwyn.com [194.39.143.233]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBEDUXsE014649 for ; Tue, 14 Dec 2004 05:30:56 -0800 Received: from localhost ([127.0.0.1] helo=souterrain.chygwyn.com) by souterrain.chygwyn.com with esmtp (Exim 4.41) id 1CeCnE-0002iG-B0; Tue, 14 Dec 2004 13:32:44 +0000 Received: (from steve@localhost) by souterrain.chygwyn.com (8.13.0/8.13.0/Submit) id iBEDWZie010430; Tue, 14 Dec 2004 13:32:35 GMT Date: Tue, 14 Dec 2004 13:32:35 +0000 From: Steven Whitehouse To: Adrian Bunk Cc: patrick@tykepenguin.com, Steve Whitehouse , linux-decnet-user@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/decnet/: misc possible cleanups Message-ID: <20041214133235.GB10131@souterrain.chygwyn.com> References: <20041214125838.GC23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041214125838.GC23151@stusta.de> User-Agent: Mutt/1.4.1i Organization: ChyGwyn Limited X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12739 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: steve@chygwyn.com Precedence: bulk X-list: netdev

Hi,

In general this looks fine to me. Patrick has the last word though.

I would suggest hanging on to the fast timer code for now, though. It's there so that delayed acks can be used, and the rest of the code is very close to actually allowing that to work. The DECnet code has a habit of sending lots of (rather small) ack packets and it would be nice if that could be fixed at some stage. The issue is that delayed acks would interact, to a certain extent, with the code that's working out what the correct send window is. I know there is also the argument that it can always be added back should someone have time to add delayed acks, so I've no strong opinion either way.

Also, when I was writing the routing code, a lot of the design was "borrowed" from the ipv4 routing code. If you are working on cleanups in this area, it might be worth doing a comparison to see where the two have diverged (something I used to do now and again) to pick up any bugs I'd inadvertently copied over.

Steve.

On Tue, Dec 14, 2004 at 01:58:38PM +0100, Adrian Bunk wrote:
> The patch below contains the following possible cleanups:
> - make needlessly global code static
> - dn_fib.c: remove the write-only global variable dn_fib_info_cnt
> - dn_fib.c: remove the unused global function dn_fib_rt_message
> - dn_neigh.c: remove the unused global function dn_neigh_pointopoint_notify
> - dn_timer.c: remove the fast timer code that isn't used
>
> Please review and comment on this patch.
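To make the delayed-ack idea discussed above concrete, here is a minimal user-space sketch of the start/stop/expire behaviour such a fast timer would provide. It is only a model under stated assumptions, not the DECnet code: the names `struct delack`, `delack_start`, `delack_stop` and `delack_tick` are hypothetical, time is passed in as plain milliseconds instead of jiffies, and the 200 ms bound is taken from the "200mS max" comment in dn_timer.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper bound on how long an ack may be held back (assumption: taken
 * from the "200mS max" comment in dn_timer.c). */
#define DELACK_MAX_MS 200u

/* Hypothetical per-socket delayed-ack state for this sketch. */
struct delack {
	bool pending;           /* an ack is owed but not yet sent */
	uint64_t expires_ms;    /* deadline for sending a standalone ack */
	unsigned int acks_sent; /* standalone acks emitted so far */
};

/* Data arrived: arm the timer, but only if no ack is already pending,
 * so that several segments share one deadline and one ack. */
void delack_start(struct delack *d, uint64_t now_ms)
{
	if (!d->pending) {
		d->pending = true;
		d->expires_ms = now_ms + DELACK_MAX_MS;
	}
}

/* The ack was piggybacked on outgoing data: cancel the pending timer. */
void delack_stop(struct delack *d)
{
	d->pending = false;
}

/* Periodic check: returns true when a standalone ack is sent because
 * the deadline has passed. */
bool delack_tick(struct delack *d, uint64_t now_ms)
{
	if (!d->pending || now_ms < d->expires_ms)
		return false;
	d->pending = false;
	d->acks_sent++;
	return true;
}
```

In this model three segments arriving at 0, 50 and 120 ms produce a single ack at 200 ms rather than three small ack packets, which is the saving described above. How this should interact with the send-window calculation is left open here, as it is in the message.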
From random@72616e646f6d20323030342d30342d31360a.nosense.org Tue Dec 14 05:33:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 05:33:08 -0800 (PST) Received: from ubu.nosense.org (169.cust10.sa.dsl.ozemail.com.au [210.84.233.169]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBEDWcq8015412 for ; Tue, 14 Dec 2004 05:32:59 -0800 Received: from ubu.nosense.org (ubu.nosense.org [127.0.0.1]) by ubu.nosense.org (Postfix) with SMTP id 404A162A9F; Wed, 15 Dec 2004 00:02:14 +1030 (CST) Date: Wed, 15 Dec 2004 00:02:13 +1030 From: Mark Smith To: Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: Re: [2.6.9] Networking crash, slightly exotic setup, bridged tap/tun interfaces Message-Id: <20041215000213.77cc0fa0.random@72616e646f6d20323030342d30342d31360a.nosense.org> In-Reply-To: <41BE320A.8090201@conectiva.com.br> References: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> <41BE2B20.5080602@conectiva.com.br> <20041214113036.3ccb3480.random@72616e646f6d20323030342d30342d31360a.nosense.org> <41BE320A.8090201@conectiva.com.br> Organization: The No Sense Organisation X-Mailer: Sylpheed version 1.0.0beta1 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Location: Adelaide, Australia Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12740 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: random@72616e646f6d20323030342d30342d31360a.nosense.org Precedence: bulk X-list: netdev Hi Arnaldo, On Mon, 13 Dec 2004 22:21:30 -0200 Arnaldo Carvalho de Melo wrote: > Mark Smith wrote: > > Hi Arnaldo, > > > >> > >>Can you reproduce this with 2.6.10-rc3? 2.6.9 had several networking > >>bugs fixed already in 2.6.10-rc3. 
> >> > >

2.6.10-rc3 appears to have "fixed" it, although I wasn't all that sure what caused it in the first place.

I'm running 2.6.10-rc3 on both the main host and the Qemu virtual PCs. I've been playing with the setup for the last couple of hours. I now have 5 Qemu virtual PCs, each with 3 virtual NE2K NICs backed by 3 real host tun interfaces, for a total of 15 host tun interfaces. I've now got them interconnected via 6 separate (internal) bridges.

[root@ubu] > brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.00ff3761a7da       no              tun2
                                                        tun3
br1,3,5         8000.00ff3ae4ea7f       no              tun0
                                                        tun14
                                                        tun7
br1,4           8000.00ff17200d92       no              tun10
                                                        tun1
br1             8000.00ff08da3925       no              tun5
                                                        tun6
br2             8000.00ff312eed9e       no              tun8
                                                        tun9
br3             8000.00ff7a131b45       no              tun11
                                                        tun12
[root@ubu] >

The 5 Qemu instances are all now running OSPF. The main host (ubu), where the Oops occurred, is also running OSPF, and is attached to the topology via bridges br1,3,5 and br1,4.

I'll leave it running overnight in this configuration to see how it goes.

Thanks,
Mark.
-- From bunk@stusta.de Tue Dec 14 05:49:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 05:49:48 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBEDnEGi018801 for ; Tue, 14 Dec 2004 05:49:35 -0800 Received: (qmail 20752 invoked from network); 14 Dec 2004 13:48:46 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 14 Dec 2004 13:48:46 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 1ADD2C7650; Tue, 14 Dec 2004 14:48:43 +0100 (CET) Date: Tue, 14 Dec 2004 14:48:42 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/ethernet/eth.c: make a function static Message-ID: <20041214134842.GD23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12741 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes a needlessly global function static. 
diffstat output: include/linux/etherdevice.h | 2 -- net/ethernet/eth.c | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/linux/etherdevice.h.old 2004-12-14 03:38:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/etherdevice.h 2004-12-14 03:38:20.000000000 +0100 @@ -37,8 +37,6 @@ unsigned char * haddr); extern int eth_header_cache(struct neighbour *neigh, struct hh_cache *hh); -extern int eth_header_parse(struct sk_buff *skb, - unsigned char *haddr); extern struct net_device *alloc_etherdev(int sizeof_priv); static inline void eth_copy_and_sum (struct sk_buff *dest, --- linux-2.6.10-rc3-mm1-full/net/ethernet/eth.c.old 2004-12-14 03:38:29.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ethernet/eth.c 2004-12-14 03:38:34.000000000 +0100 @@ -208,7 +208,7 @@ return htons(ETH_P_802_2); } -int eth_header_parse(struct sk_buff *skb, unsigned char *haddr) +static int eth_header_parse(struct sk_buff *skb, unsigned char *haddr) { struct ethhdr *eth = eth_hdr(skb); memcpy(haddr, eth->h_source, ETH_ALEN); From johnpol@2ka.mipt.ru Tue Dec 14 08:01:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 08:01:39 -0800 (PST) Received: from vocord.com (dea.vocord.ru [217.67.177.50]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBEG0cj7031891 for ; Tue, 14 Dec 2004 08:01:00 -0800 Received: from uganda.factory.vocord.ru (uganda.factory.vocord.ru [192.168.0.48]) by vocord.com (8.13.1/8.13.1) with ESMTP id iBEFvYwY019799; Tue, 14 Dec 2004 18:57:34 +0300 Subject: Asynchronous crypto layer. Reincarnation #1. 
From: Evgeniy Polyakov Reply-To: johnpol@2ka.mipt.ru To: James Morris Cc: netdev@oss.sgi.com, cryptoapi@lists.logix.cz, jamal , Michal Ludvig Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-84FmUi6+bypRF6VRBLEL" Organization: MIPT Date: Tue, 14 Dec 2004 19:03:56 +0300 Message-Id: <1103040236.3430.54.camel@uganda> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 22:53:11 2004 clamav-milter version 0.80j on dea.vocord.com X-Virus-Status: Clean X-Virus-Status: Clean X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-1.4 (vocord.com [192.168.0.1]); Tue, 14 Dec 2004 18:57:40 +0300 (MSK) X-archive-position: 12742 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: johnpol@2ka.mipt.ru Precedence: bulk X-list: netdev --=-84FmUi6+bypRF6VRBLEL Content-Type: multipart/mixed; boundary="=-6hXhvAybrmqrLjuzQFoX" --=-6hXhvAybrmqrLjuzQFoX Content-Type: text/plain Content-Transfer-Encoding: quoted-printable

Changes: locking scheme, coding style, symbol exporting.

The HIFN driver is ready and is awaiting a final test.

It has run for several hours on a 4-way system without any problems (a test crypto_provider and a test crypto consumer are also in the patch).

BTW, a small note: the system encrypts with 128-bit AES in ECB mode. Since only one CPU at a time performs crypto operations (crypto_provider is a wrapper over the synchronous crypto layer), the numbers are not so exciting: about 250-350 sessions per second, each one working on a 32-byte block. The test crypto consumer injects a new session every 3 milliseconds.
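As a rough sanity check on the figures above, the injection interval alone caps the achievable rate. The helper below is a sketch only and not part of the patch; `max_sessions_per_sec` is a hypothetical name, and it assumes exactly one session is injected per interval with no other bottleneck.

```c
#include <stdint.h>

/* Upper bound on completed sessions per second when exactly one session
 * is injected every interval_ms milliseconds (hypothetical helper).
 * Integer division rounds down, e.g. 1000 / 3 gives 333. */
uint32_t max_sessions_per_sec(uint32_t interval_ms)
{
	if (interval_ms == 0)
		return 0; /* no pacing interval: the bound is undefined here */
	return 1000u / interval_ms;
}
```

With a 3 ms interval the bound is 333 sessions per second, so the reported 250-350 sessions per second suggests the test is limited at least as much by the injection rate as by the crypto provider.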
An interesting note about linux work_queues: when I inject new crypto_session each 2 msecs - work_queue's threads for 2 of 4 CPUs are loaded about the same 3-10% of the CPU time, but when I inject sessions each 3 or 1 msec - only one queue's thread gets all sessions callbacks and takes about 100% of the CPU time. With 2msec delay I get about 500 sessions per second, with 1msec - little more than 1000. pcix$ cat /sys/class/acrypto/crypto_provider/scompleted && sleep 1 &&cat /sys/class/acrypto/crypto_provider/scompleted 7044717 7045728 Please test, review and comment. Signed-off-by: Evgeniy Polyakov --=20 Evgeniy Polyakov Crash is better than data corruption -- Arthur Grabowski --=-6hXhvAybrmqrLjuzQFoX Content-Disposition: attachment; filename=acrypto.patch Content-Type: text/x-patch; name=acrypto.patch; charset=KOI8-R Content-Transfer-Encoding: base64 ZGlmZiAtTnJ1IC90bXAvZW1wdHkvYWNyeXB0by5oIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8v YWNyeXB0by5oDQotLS0gL3RtcC9lbXB0eS9hY3J5cHRvLmgJMTk3MC0wMS0wMSAwMzowMDowMC4w MDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvL2FjcnlwdG8uaAky MDA0LTEyLTE0IDE4OjUzOjExLjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSwyMjYgQEANCisv Kg0KKyAqIAlhY3J5cHRvLmgNCisgKg0KKyAqIENvcHlyaWdodCAoYykgMjAwNCBFdmdlbml5IFBv bHlha292IDxqb2hucG9sQDJrYS5taXB0LnJ1Pg0KKyAqIA0KKyAqDQorICogVGhpcyBwcm9ncmFt IGlzIGZyZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkN CisgKiBpdCB1bmRlciB0aGUgdGVybXMgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNl IGFzIHB1Ymxpc2hlZCBieQ0KKyAqIHRoZSBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhl ciB2ZXJzaW9uIDIgb2YgdGhlIExpY2Vuc2UsIG9yDQorICogKGF0IHlvdXIgb3B0aW9uKSBhbnkg bGF0ZXIgdmVyc2lvbi4NCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBkaXN0cmlidXRlZCBpbiB0 aGUgaG9wZSB0aGF0IGl0IHdpbGwgYmUgdXNlZnVsLA0KKyAqIGJ1dCBXSVRIT1VUIEFOWSBXQVJS QU5UWTsgd2l0aG91dCBldmVuIHRoZSBpbXBsaWVkIHdhcnJhbnR5IG9mDQorICogTUVSQ0hBTlRB QklMSVRZIG9yIEZJVE5FU1MgRk9SIEEgUEFSVElDVUxBUiBQVVJQT1NFLiAgU2VlIHRoZQ0KKyAq 
IEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGZvciBtb3JlIGRldGFpbHMuDQorICoNCisgKiBZ b3Ugc2hvdWxkIGhhdmUgcmVjZWl2ZWQgYSBjb3B5IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMg TGljZW5zZQ0KKyAqIGFsb25nIHdpdGggdGhpcyBwcm9ncmFtOyBpZiBub3QsIHdyaXRlIHRvIHRo ZSBGcmVlIFNvZnR3YXJlDQorICogRm91bmRhdGlvbiwgSW5jLiwgNTkgVGVtcGxlIFBsYWNlLCBT dWl0ZSAzMzAsIEJvc3RvbiwgTUEgMDIxMTEtMTMwNyBVU0ENCisgKi8NCisNCisjaWZuZGVmIF9f QUNSWVBUT19IDQorI2RlZmluZSBfX0FDUllQVE9fSA0KKw0KKyNkZWZpbmUgU0NBQ0hFX05BTUVM RU4JCTMyDQorDQorc3RydWN0IGNyeXB0b19zZXNzaW9uX2luaXRpYWxpemVyOw0KK3N0cnVjdCBj cnlwdG9fZGF0YTsNCit0eXBlZGVmIHZvaWQgKCpjcnlwdG9fY2FsbGJhY2tfdCkgKHN0cnVjdCBj cnlwdG9fc2Vzc2lvbl9pbml0aWFsaXplciAqLA0KKwkJCQkgICBzdHJ1Y3QgY3J5cHRvX2RhdGEg Kik7DQorDQorZXh0ZXJuIHZvaWQgY3J5cHRvX3dha2VfbGIodm9pZCk7DQorDQorI2RlZmluZSBT RVNTSU9OX0NPTVBMRVRFRAkoMTw8MTUpDQorI2RlZmluZSBTRVNTSU9OX0ZJTklTSEVECSgxPDwx NCkNCisjZGVmaW5lIFNFU1NJT05fU1RBUlRFRAkJKDE8PDEzKQ0KKyNkZWZpbmUgU0VTU0lPTl9Q Uk9DRVNTRUQJKDE8PDEyKQ0KKyNkZWZpbmUgU0VTU0lPTl9CSU5ERUQJCSgxPDwxMSkNCisjZGVm aW5lIFNFU1NJT05fQlJPS0VOCQkoMTw8MTApDQorDQorI2RlZmluZSBzZXNzaW9uX2NvbXBsZXRl ZChzKQkocy0+Y2kuZmxhZ3MgJiBTRVNTSU9OX0NPTVBMRVRFRCkNCisjZGVmaW5lIGNvbXBsZXRl X3Nlc3Npb24ocykJZG8ge3MtPmNpLmZsYWdzIHw9IFNFU1NJT05fQ09NUExFVEVEO30gd2hpbGUo MCkNCisjZGVmaW5lIHVuY29tcGxldGVfc2Vzc2lvbihzKQlkbyB7cy0+Y2kuZmxhZ3MgJj0gflNF U1NJT05fQ09NUExFVEVEO30gd2hpbGUgKDApDQorDQorI2RlZmluZSBzZXNzaW9uX2ZpbmlzaGVk KHMpCShzLT5jaS5mbGFncyAmIFNFU1NJT05fRklOSVNIRUQpDQorI2RlZmluZSBmaW5pc2hfc2Vz c2lvbihzKQlkbyB7cy0+Y2kuZmxhZ3MgfD0gU0VTU0lPTl9GSU5JU0hFRDt9IHdoaWxlKDApDQor I2RlZmluZSB1bmZpbmlzaF9zZXNzaW9uKHMpCWRvIHtzLT5jaS5mbGFncyAmPSB+U0VTU0lPTl9G SU5JU0hFRDt9IHdoaWxlICgwKQ0KKw0KKyNkZWZpbmUgc2Vzc2lvbl9zdGFydGVkKHMpCShzLT5j aS5mbGFncyAmIFNFU1NJT05fU1RBUlRFRCkNCisjZGVmaW5lIHN0YXJ0X3Nlc3Npb24ocykJZG8g e3MtPmNpLmZsYWdzIHw9IFNFU1NJT05fU1RBUlRFRDt9IHdoaWxlKDApDQorI2RlZmluZSB1bnN0 YXJ0X3Nlc3Npb24ocykJZG8ge3MtPmNpLmZsYWdzICY9IH5TRVNTSU9OX1NUQVJURUQ7fSB3aGls 
ZSAoMCkNCisNCisjZGVmaW5lIHNlc3Npb25faXNfcHJvY2Vzc2VkKHMpCQkocy0+Y2kuZmxhZ3Mg JiBTRVNTSU9OX1BST0NFU1NFRCkNCisjZGVmaW5lIHN0YXJ0X3Byb2Nlc3Nfc2Vzc2lvbihzKQlk byB7cy0+Y2kuZmxhZ3MgfD0gU0VTU0lPTl9QUk9DRVNTRUQ7IHMtPmNpLnB0aW1lID0gamlmZmll czt9IHdoaWxlKDApDQorI2RlZmluZSBzdG9wX3Byb2Nlc3Nfc2Vzc2lvbihzKQkJZG8ge3MtPmNp LmZsYWdzICY9IH5TRVNTSU9OX1BST0NFU1NFRDsgcy0+Y2kucHRpbWUgPSBqaWZmaWVzIC0gcy0+ Y2kucHRpbWU7IGNyeXB0b193YWtlX2xiKCk7fSB3aGlsZSAoMCkNCisNCisjZGVmaW5lIHNlc3Np b25fYmluZGVkKHMpCShzLT5jaS5mbGFncyAmIFNFU1NJT05fQklOREVEKQ0KKyNkZWZpbmUgYmlu ZF9zZXNzaW9uKHMpCQlkbyB7cy0+Y2kuZmxhZ3MgfD0gU0VTU0lPTl9CSU5ERUQ7fSB3aGlsZSgw KQ0KKyNkZWZpbmUgdW5iaW5kX3Nlc3Npb24ocykJZG8ge3MtPmNpLmZsYWdzICY9IH5TRVNTSU9O X0JJTkRFRDt9IHdoaWxlICgwKQ0KKyNkZWZpbmUgc2NpX2JpbmRlZChjaSkJCShjaS0+ZmxhZ3Mg JiBTRVNTSU9OX0JJTkRFRCkNCisNCisjZGVmaW5lIHNlc3Npb25fYnJva2VuKHMpCShzLT5jaS5m bGFncyAmIFNFU1NJT05fQlJPS0VOKQ0KKyNkZWZpbmUgYnJva2Vfc2Vzc2lvbihzKQlkbyB7cy0+ Y2kuZmxhZ3MgfD0gU0VTU0lPTl9CUk9LRU47fSB3aGlsZSgwKQ0KKyNkZWZpbmUgdW5icm9rZV9z ZXNzaW9uKHMpCWRvIHtzLT5jaS5mbGFncyAmPSB+U0VTU0lPTl9CUk9LRU47fSB3aGlsZSAoMCkN CisNCitzdHJ1Y3QgY3J5cHRvX2RldmljZV9zdGF0IHsNCisJX191NjQgc2NvbXBsZXRlZDsNCisJ X191NjQgc2ZpbmlzaGVkOw0KKwlfX3U2NCBzc3RhcnRlZDsNCisJX191NjQga21lbV9mYWlsZWQ7 DQorfTsNCisNCisjaWZkZWYgX19LRVJORUxfXw0KKw0KKyNpbmNsdWRlIDxsaW51eC90eXBlcy5o Pg0KKyNpbmNsdWRlIDxsaW51eC9saXN0Lmg+DQorI2luY2x1ZGUgPGxpbnV4L3NsYWIuaD4NCisj aW5jbHVkZSA8bGludXgvc3BpbmxvY2suaD4NCisjaW5jbHVkZSA8bGludXgvZGV2aWNlLmg+DQor I2luY2x1ZGUgPGxpbnV4L3dvcmtxdWV1ZS5oPg0KKw0KKyNpbmNsdWRlIDxhc20vc2NhdHRlcmxp c3QuaD4NCisNCisjZGVmaW5lIERFQlVHDQorI2lmZGVmIERFQlVHDQorI2RlZmluZSBkcHJpbnRr KGYsIGEuLi4pIHByaW50aygiJWQgIiBmLCBzbXBfcHJvY2Vzc29yX2lkKCksICMjYSkNCisjZWxz ZQ0KKyNkZWZpbmUgZHByaW50ayhmLCBhLi4uKQ0KKyNlbmRpZg0KKw0KKyNkZWZpbmUgQ1JZUFRP X01BWF9QUklWX1NJWkUJMTAyNA0KKw0KKyNkZWZpbmUgREVWSUNFX0JST0tFTgkJKDE8PDApDQor DQorI2RlZmluZSBkZXZpY2VfYnJva2VuKGRldikJKGRldi0+ZmxhZ3MgJiBERVZJQ0VfQlJPS0VO 
KQ0KKyNkZWZpbmUgYnJva2VfZGV2aWNlKGRldikJZG8ge2Rldi0+ZmxhZ3MgfD0gREVWSUNFX0JS T0tFTjt9IHdoaWxlKDApDQorI2RlZmluZSByZXBhaXJfZGV2aWNlKGRldikJZG8ge2Rldi0+Zmxh Z3MgJj0gfkRFVklDRV9CUk9LRU47fSB3aGlsZSgwKQ0KKw0KK3N0cnVjdCBjcnlwdG9fY2FwYWJp bGl0eSB7DQorCXUxNiAJCQlvcGVyYXRpb247DQorCXUxNiAJCQl0eXBlOw0KKwl1MTYgCQkJbW9k ZTsNCisJdTE2IAkJCXFsZW47DQorCXU2NCAJCQlwdGltZTsNCisJdTY0IAkJCXNjb21wOw0KK307 DQorDQorc3RydWN0IGNyeXB0b19zZXNzaW9uX2luaXRpYWxpemVyIHsNCisJdTE2IAkJCW9wZXJh dGlvbjsNCisJdTE2IAkJCXR5cGU7DQorCXUxNiAJCQltb2RlOw0KKwl1MTYgCQkJcHJpb3JpdHk7 DQorDQorCXU2NCAJCQlpZDsNCisJdTY0IAkJCWRldl9pZDsNCisNCisJdTMyIAkJCWZsYWdzOw0K Kw0KKwl1MzIgCQkJYmRldjsNCisNCisJdTY0IAkJCXB0aW1lOw0KKw0KKwljcnlwdG9fY2FsbGJh Y2tfdCAJY2FsbGJhY2s7DQorfTsNCisNCitzdHJ1Y3QgY3J5cHRvX2RhdGEgew0KKwlpbnQgCQkJ c2dfc3JjX251bTsNCisJc3RydWN0CQkJc2NhdHRlcmxpc3Qgc2dfc3JjOw0KKwlzdHJ1Y3QJCQlz Y2F0dGVybGlzdCBzZ19kc3Q7DQorCXN0cnVjdAkJCXNjYXR0ZXJsaXN0IHNnX2tleTsNCisJc3Ry dWN0CQkJc2NhdHRlcmxpc3Qgc2dfaXY7DQorDQorCXZvaWQgCQkJKnByaXY7DQorCXVuc2lnbmVk IGludCAJCXByaXZfc2l6ZTsNCit9Ow0KKw0KK3N0cnVjdCBjcnlwdG9fZGV2aWNlIHsNCisJY2hh ciAJCQluYW1lW1NDQUNIRV9OQU1FTEVOXTsNCisNCisJc3BpbmxvY2tfdCAJCXNlc3Npb25fbG9j azsNCisJc3RydWN0IGxpc3RfaGVhZCAJc2Vzc2lvbl9saXN0Ow0KKw0KKwl1NjQgCQkJc2lkOw0K KwlzcGlubG9ja190IAkJbG9jazsNCisNCisJYXRvbWljX3QgCQlyZWZjbnQ7DQorDQorCXUzMiAJ CQlmbGFnczsNCisNCisJdTMyIAkJCWlkOw0KKw0KKwlzdHJ1Y3QgbGlzdF9oZWFkIAljZGV2X2Vu dHJ5Ow0KKw0KKwl2b2lkIAkJCSgqZGF0YV9yZWFkeSkoc3RydWN0IGNyeXB0b19kZXZpY2UgKik7 DQorDQorCXN0cnVjdCBkZXZpY2VfZHJpdmVyIAkqZHJpdmVyOw0KKwlzdHJ1Y3QgZGV2aWNlIAkJ ZGV2aWNlOw0KKwlzdHJ1Y3QgY2xhc3NfZGV2aWNlIAljbGFzc19kZXZpY2U7DQorCXN0cnVjdCBj b21wbGV0aW9uIAlkZXZfcmVsZWFzZWQ7DQorDQorCXNwaW5sb2NrX3QgCQlzdGF0X2xvY2s7DQor CXN0cnVjdCBjcnlwdG9fZGV2aWNlX3N0YXQgc3RhdDsNCisNCisJc3RydWN0IGNyeXB0b19jYXBh YmlsaXR5ICpjYXA7DQorCWludCAJCQljYXBfbnVtYmVyOw0KKw0KKwl2b2lkIAkJCSpwcml2Ow0K K307DQorDQorc3RydWN0IGNyeXB0b19yb3V0ZV9oZWFkIHsNCisJc3RydWN0IGNyeXB0b19yb3V0 
ZSAJKm5leHQ7DQorCXN0cnVjdCBjcnlwdG9fcm91dGUgCSpwcmV2Ow0KKw0KKwlfX3UzMiAJCQlx bGVuOw0KKwlzcGlubG9ja190IAkJbG9jazsNCit9Ow0KKw0KK3N0cnVjdCBjcnlwdG9fcm91dGUg ew0KKwlzdHJ1Y3QgY3J5cHRvX3JvdXRlIAkqbmV4dDsNCisJc3RydWN0IGNyeXB0b19yb3V0ZSAJ KnByZXY7DQorDQorCXN0cnVjdCBjcnlwdG9fcm91dGVfaGVhZCAqbGlzdDsNCisJc3RydWN0IGNy eXB0b19kZXZpY2UgCSpkZXY7DQorDQorCXN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9pbml0aWFsaXpl ciBjaTsNCit9Ow0KKw0KK3N0cnVjdCBjcnlwdG9fc2Vzc2lvbiB7DQorCXN0cnVjdCBsaXN0X2hl YWQgCWRldl9xdWV1ZV9lbnRyeTsNCisJc3RydWN0IGxpc3RfaGVhZAltYWluX3F1ZXVlX2VudHJ5 Ow0KKw0KKwlzdHJ1Y3QgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIgY2k7DQorDQorCXN0cnVj dCBjcnlwdG9fZGF0YSAJZGF0YTsNCisNCisJc3BpbmxvY2tfdCAJCWxvY2s7DQorDQorCXN0cnVj dCB3b3JrX3N0cnVjdCAJd29yazsNCisNCisJc3RydWN0IGNyeXB0b19yb3V0ZV9oZWFkIHJvdXRl X2xpc3Q7DQorfTsNCisNCitzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKmNyeXB0b19zZXNzaW9uX2Fs bG9jKHN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9pbml0aWFsaXplciAqLCBzdHJ1Y3QgY3J5cHRvX2Rh dGEgKik7DQorc3RydWN0IGNyeXB0b19zZXNzaW9uICpjcnlwdG9fc2Vzc2lvbl9jcmVhdGUoc3Ry dWN0IGNyeXB0b19zZXNzaW9uX2luaXRpYWxpemVyICosIHN0cnVjdCBjcnlwdG9fZGF0YSAqKTsN Cit2b2lkIGNyeXB0b19zZXNzaW9uX2FkZChzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKik7DQordm9p ZCBjcnlwdG9fc2Vzc2lvbl9kZXF1ZXVlX21haW4oc3RydWN0IGNyeXB0b19zZXNzaW9uICopOw0K K3ZvaWQgX19jcnlwdG9fc2Vzc2lvbl9kZXF1ZXVlX21haW4oc3RydWN0IGNyeXB0b19zZXNzaW9u ICopOw0KK3ZvaWQgX19jcnlwdG9fc2Vzc2lvbl9kZXF1ZXVlX3JvdXRlKHN0cnVjdCBjcnlwdG9f c2Vzc2lvbiAqKTsNCit2b2lkIGNyeXB0b19zZXNzaW9uX2RlcXVldWVfcm91dGUoc3RydWN0IGNy eXB0b19zZXNzaW9uICopOw0KKw0KK3ZvaWQgY3J5cHRvX2RldmljZV9nZXQoc3RydWN0IGNyeXB0 b19kZXZpY2UgKik7DQordm9pZCBjcnlwdG9fZGV2aWNlX3B1dChzdHJ1Y3QgY3J5cHRvX2Rldmlj ZSAqKTsNCitzdHJ1Y3QgY3J5cHRvX2RldmljZSAqY3J5cHRvX2RldmljZV9nZXRfbmFtZShjaGFy ICopOw0KKw0KK2ludCBfX2NyeXB0b19kZXZpY2VfYWRkKHN0cnVjdCBjcnlwdG9fZGV2aWNlICop Ow0KK2ludCBjcnlwdG9fZGV2aWNlX2FkZChzdHJ1Y3QgY3J5cHRvX2RldmljZSAqKTsNCit2b2lk IF9fY3J5cHRvX2RldmljZV9yZW1vdmUoc3RydWN0IGNyeXB0b19kZXZpY2UgKik7DQordm9pZCBj 
cnlwdG9fZGV2aWNlX3JlbW92ZShzdHJ1Y3QgY3J5cHRvX2RldmljZSAqKTsNCitpbnQgbWF0Y2hf aW5pdGlhbGl6ZXIoc3RydWN0IGNyeXB0b19kZXZpY2UgKiwgc3RydWN0IGNyeXB0b19zZXNzaW9u X2luaXRpYWxpemVyICopOw0KK2ludCBfX21hdGNoX2luaXRpYWxpemVyKHN0cnVjdCBjcnlwdG9f Y2FwYWJpbGl0eSAqLCBzdHJ1Y3QgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIgKik7DQorDQor dm9pZCBjcnlwdG9fc2Vzc2lvbl9pbnNlcnRfbWFpbihzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZGV2 LCBzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKnMpOw0KK3ZvaWQgY3J5cHRvX3Nlc3Npb25faW5zZXJ0 KHN0cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYsIHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcyk7DQor DQorI2VuZGlmCQkJCS8qIF9fS0VSTkVMX18gKi8NCisjZW5kaWYJCQkJLyogX19BQ1JZUFRPX0gg Ki8NCmRpZmYgLU5ydSAvdG1wL2VtcHR5Ly5hcmNoLWlkcy9hY3J5cHRvLmguaWQgbGludXgtMi42 L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvYWNyeXB0by5oLmlkDQotLS0gL3RtcC9lbXB0eS8u YXJjaC1pZHMvYWNyeXB0by5oLmlkCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAw DQorKysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvYWNyeXB0by5oLmlkCTIw MDQtMTItMTQgMTc6NTA6MDQuMDAwMDAwMDAwICswMzAwDQpAQCAtMCwwICsxIEBADQorRXZnZW5p eSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4gU3VuIE9jdCAgMyAyMDozMzoyNSAyMDA0 IDI5NTcuMA0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvLmFyY2gtaWRzL2NvbnN1bWVyLmMuaWQgbGlu dXgtMi42L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvY29uc3VtZXIuYy5pZA0KLS0tIC90bXAv ZW1wdHkvLmFyY2gtaWRzL2NvbnN1bWVyLmMuaWQJMTk3MC0wMS0wMSAwMzowMDowMC4wMDAwMDAw MDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNoLWlkcy9jb25zdW1l ci5jLmlkCTIwMDQtMTItMTQgMTc6NTA6MDQuMDAwMDAwMDAwICswMzAwDQpAQCAtMCwwICsxIEBA DQorRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4gV2VkIE9jdCAgNiAwMjoz NDowNCAyMDA0IDQ5MDIuMA0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19j b25uLmMuaWQgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvY3J5cHRvX2Nvbm4u Yy5pZA0KLS0tIC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19jb25uLmMuaWQJMTk3MC0wMS0w MSAwMzowMDowMC4wMDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRv Ly5hcmNoLWlkcy9jcnlwdG9fY29ubi5jLmlkCTIwMDQtMTItMTQgMTc6NTA6MDQuMDAwMDAwMDAw 
ICswMzAwDQpAQCAtMCwwICsxIEBADQorRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlw dC5ydT4gV2VkIE9jdCAgNiAwMzoxOTo0NCAyMDA0IDU1NTguMA0KZGlmZiAtTnJ1IC90bXAvZW1w dHkvLmFyY2gtaWRzL2NyeXB0b19jb25uLmguaWQgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8u YXJjaC1pZHMvY3J5cHRvX2Nvbm4uaC5pZA0KLS0tIC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0 b19jb25uLmguaWQJMTk3MC0wMS0wMSAwMzowMDowMC4wMDAwMDAwMDAgKzAzMDANCisrKyBsaW51 eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNoLWlkcy9jcnlwdG9fY29ubi5oLmlkCTIwMDQtMTIt MTQgMTc6NTA6MDQuMDAwMDAwMDAwICswMzAwDQpAQCAtMCwwICsxIEBADQorRXZnZW5peSBQb2x5 YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4gV2VkIE9jdCAgNiAwMzoxOTo0NCAyMDA0IDU1NTgu MQ0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19kZWYuaC5pZCBsaW51eC0y LjYvZHJpdmVycy9hY3J5cHRvLy5hcmNoLWlkcy9jcnlwdG9fZGVmLmguaWQNCi0tLSAvdG1wL2Vt cHR5Ly5hcmNoLWlkcy9jcnlwdG9fZGVmLmguaWQJMTk3MC0wMS0wMSAwMzowMDowMC4wMDAwMDAw MDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNoLWlkcy9jcnlwdG9f ZGVmLmguaWQJMjAwNC0xMi0xNCAxNzo1MDowNC4wMDAwMDAwMDAgKzAzMDANCkBAIC0wLDAgKzEg QEANCitFdmdlbml5IFBvbHlha292IDxqb2hucG9sQDJrYS5taXB0LnJ1PiBUdWUgT2N0IDI2IDIw OjI2OjUzIDIwMDQgMjkwMy4wDQpkaWZmIC1OcnUgL3RtcC9lbXB0eS8uYXJjaC1pZHMvY3J5cHRv X2Rldi5jLmlkIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL2NyeXB0b19kZXYu Yy5pZA0KLS0tIC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19kZXYuYy5pZAkxOTcwLTAxLTAx IDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8v LmFyY2gtaWRzL2NyeXB0b19kZXYuYy5pZAkyMDA0LTEyLTE0IDE3OjUwOjA0LjAwMDAwMDAwMCAr MDMwMA0KQEAgLTAsMCArMSBAQA0KK0V2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQu cnU+IFN1biBPY3QgIDMgMjI6Mjc6MTMgMjAwNCA0MjExLjANCmRpZmYgLU5ydSAvdG1wL2VtcHR5 Ly5hcmNoLWlkcy9jcnlwdG9fbGIuYy5pZCBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNo LWlkcy9jcnlwdG9fbGIuYy5pZA0KLS0tIC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19sYi5j LmlkCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGludXgtMi42L2Ry aXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvY3J5cHRvX2xiLmMuaWQJMjAwNC0xMi0xNCAxNzo1MDow 
NC4wMDAwMDAwMDAgKzAzMDANCkBAIC0wLDAgKzEgQEANCitFdmdlbml5IFBvbHlha292IDxqb2hu cG9sQDJrYS5taXB0LnJ1PiBTdW4gT2N0ICAzIDIyOjI3OjA5IDIwMDQgNDIxMC4wDQpkaWZmIC1O cnUgL3RtcC9lbXB0eS8uYXJjaC1pZHMvY3J5cHRvX2xiLmguaWQgbGludXgtMi42L2RyaXZlcnMv YWNyeXB0by8uYXJjaC1pZHMvY3J5cHRvX2xiLmguaWQNCi0tLSAvdG1wL2VtcHR5Ly5hcmNoLWlk cy9jcnlwdG9fbGIuaC5pZAkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysr IGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL2NyeXB0b19sYi5oLmlkCTIwMDQt MTItMTQgMTc6NTA6MDQuMDAwMDAwMDAwICswMzAwDQpAQCAtMCwwICsxIEBADQorRXZnZW5peSBQ b2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4gU3VuIE9jdCAgMyAyMjoyNzowOSAyMDA0IDQy MTAuMQ0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19tYWluLmMuaWQgbGlu dXgtMi42L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvY3J5cHRvX21haW4uYy5pZA0KLS0tIC90 bXAvZW1wdHkvLmFyY2gtaWRzL2NyeXB0b19tYWluLmMuaWQJMTk3MC0wMS0wMSAwMzowMDowMC4w MDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNoLWlkcy9j cnlwdG9fbWFpbi5jLmlkCTIwMDQtMTItMTQgMTc6NTA6MDQuMDAwMDAwMDAwICswMzAwDQpAQCAt MCwwICsxIEBADQorRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4gU3VuIE9j dCAgMyAyMDoyODowNSAyMDA0IDI5NTAuMQ0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvLmFyY2gtaWRz L2NyeXB0b19yb3V0ZS5oLmlkIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL2Ny eXB0b19yb3V0ZS5oLmlkDQotLS0gL3RtcC9lbXB0eS8uYXJjaC1pZHMvY3J5cHRvX3JvdXRlLmgu aWQJMTk3MC0wMS0wMSAwMzowMDowMC4wMDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJp dmVycy9hY3J5cHRvLy5hcmNoLWlkcy9jcnlwdG9fcm91dGUuaC5pZAkyMDA0LTEyLTE0IDE3OjUw OjA0LjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSBAQA0KK0V2Z2VuaXkgUG9seWFrb3YgPGpv aG5wb2xAMmthLm1pcHQucnU+IFR1ZSBPY3QgMTIgMTk6MjU6MjcgMjAwNCAyNTE4LjANCmRpZmYg LU5ydSAvdG1wL2VtcHR5Ly5hcmNoLWlkcy9jcnlwdG9fc3RhdC5jLmlkIGxpbnV4LTIuNi9kcml2 ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL2NyeXB0b19zdGF0LmMuaWQNCi0tLSAvdG1wL2VtcHR5Ly5h cmNoLWlkcy9jcnlwdG9fc3RhdC5jLmlkCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICsw MzAwDQorKysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvY3J5cHRvX3N0YXQu 
Yy5pZAkyMDA0LTEyLTE0IDE3OjUwOjA0LjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSBAQA0K K0V2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+IFRodSBPY3QgIDcgMjM6MDg6 NDggMjAwNCA0Mzg0LjANCmRpZmYgLU5ydSAvdG1wL2VtcHR5Ly5hcmNoLWlkcy9jcnlwdG9fc3Rh dC5oLmlkIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL2NyeXB0b19zdGF0Lmgu aWQNCi0tLSAvdG1wL2VtcHR5Ly5hcmNoLWlkcy9jcnlwdG9fc3RhdC5oLmlkCTE5NzAtMDEtMDEg MDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8u YXJjaC1pZHMvY3J5cHRvX3N0YXQuaC5pZAkyMDA0LTEyLTE0IDE3OjUwOjA0LjAwMDAwMDAwMCAr MDMwMA0KQEAgLTAsMCArMSBAQA0KK0V2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQu cnU+IFRodSBPY3QgIDcgMjM6MDg6NDggMjAwNCA0Mzg0LjENCmRpZmYgLU5ydSAvdG1wL2VtcHR5 Ly5hcmNoLWlkcy9NYWtlZmlsZS5pZCBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNoLWlk cy9NYWtlZmlsZS5pZA0KLS0tIC90bXAvZW1wdHkvLmFyY2gtaWRzL01ha2VmaWxlLmlkCTE5NzAt MDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGludXgtMi42L2RyaXZlcnMvYWNy eXB0by8uYXJjaC1pZHMvTWFrZWZpbGUuaWQJMjAwNC0xMi0xNCAxNzo1MDowNC4wMDAwMDAwMDAg KzAzMDANCkBAIC0wLDAgKzEgQEANCitFdmdlbml5IFBvbHlha292IDxqb2hucG9sQDJrYS5taXB0 LnJ1PiBTdW4gT2N0ICAzIDIwOjI4OjA1IDIwMDQgMjk1MC4wDQpkaWZmIC1OcnUgL3RtcC9lbXB0 eS8uYXJjaC1pZHMvcHJvdmlkZXIuYy5pZCBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvLy5hcmNo LWlkcy9wcm92aWRlci5jLmlkDQotLS0gL3RtcC9lbXB0eS8uYXJjaC1pZHMvcHJvdmlkZXIuYy5p ZAkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2 ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL3Byb3ZpZGVyLmMuaWQJMjAwNC0xMi0xNCAxNzo1MDowNC4w MDAwMDAwMDAgKzAzMDANCkBAIC0wLDAgKzEgQEANCitFdmdlbml5IFBvbHlha292IDxqb2hucG9s QDJrYS5taXB0LnJ1PiBXZWQgT2N0ICA2IDAyOjM0OjA0IDIwMDQgNDkwMi4xDQpkaWZmIC1OcnUg L3RtcC9lbXB0eS8uYXJjaC1pZHMvc2hhMV9wcm92aWRlci5jLmlkIGxpbnV4LTIuNi9kcml2ZXJz L2FjcnlwdG8vLmFyY2gtaWRzL3NoYTFfcHJvdmlkZXIuYy5pZA0KLS0tIC90bXAvZW1wdHkvLmFy Y2gtaWRzL3NoYTFfcHJvdmlkZXIuYy5pZAkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCAr MDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmFyY2gtaWRzL3NoYTFfcHJvdmlk 
ZXIuYy5pZAkyMDA0LTEyLTE0IDE3OjUwOjA0LjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSBA QA0KK0V2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+IFdlZCBPY3QgMjAgMjA6 MjY6NTMgMjAwNCAyNTY1NC4wDQpkaWZmIC1OcnUgL3RtcC9lbXB0eS8uYXJjaC1pZHMvc2ltcGxl X2xiLmMuaWQgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8uYXJjaC1pZHMvc2ltcGxlX2xiLmMu aWQNCi0tLSAvdG1wL2VtcHR5Ly5hcmNoLWlkcy9zaW1wbGVfbGIuYy5pZAkxOTcwLTAxLTAxIDAz OjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmFy Y2gtaWRzL3NpbXBsZV9sYi5jLmlkCTIwMDQtMTItMTQgMTc6NTA6MDQuMDAwMDAwMDAwICswMzAw DQpAQCAtMCwwICsxIEBADQorRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4g TW9uIE9jdCAgNCAwMDo1ODowNyAyMDA0IDQ2OTEuMA0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvY29u c3VtZXIuYyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvL2NvbnN1bWVyLmMNCi0tLSAvdG1wL2Vt cHR5L2NvbnN1bWVyLmMJMTk3MC0wMS0wMSAwMzowMDowMC4wMDAwMDAwMDAgKzAzMDANCisrKyBs aW51eC0yLjYvZHJpdmVycy9hY3J5cHRvL2NvbnN1bWVyLmMJMjAwNC0xMi0xNCAxODo1MzoxMS4w MDAwMDAwMDAgKzAzMDANCkBAIC0wLDAgKzEsMjg4IEBADQorLyoNCisgKiAJY29uc3VtZXIuYw0K KyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IEV2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmth Lm1pcHQucnU+DQorICogDQorICoNCisgKiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0d2FyZTsg eW91IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yIG1vZGlmeQ0KKyAqIGl0IHVuZGVyIHRoZSB0 ZXJtcyBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5DQor ICogdGhlIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiBvZiB0aGUg TGljZW5zZSwgb3INCisgKiAoYXQgeW91ciBvcHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0KKyAq DQorICogVGhpcyBwcm9ncmFtIGlzIGRpc3RyaWJ1dGVkIGluIHRoZSBob3BlIHRoYXQgaXQgd2ls bCBiZSB1c2VmdWwsDQorICogYnV0IFdJVEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRob3V0IGV2ZW4g dGhlIGltcGxpZWQgd2FycmFudHkgb2YNCisgKiBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVTUyBG T1IgQSBQQVJUSUNVTEFSIFBVUlBPU0UuICBTZWUgdGhlDQorICogR05VIEdlbmVyYWwgUHVibGlj IExpY2Vuc2UgZm9yIG1vcmUgZGV0YWlscy4NCisgKg0KKyAqIFlvdSBzaG91bGQgaGF2ZSByZWNl aXZlZCBhIGNvcHkgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlDQorICogYWxvbmcg 
d2l0aCB0aGlzIHByb2dyYW07IGlmIG5vdCwgd3JpdGUgdG8gdGhlIEZyZWUgU29mdHdhcmUNCisg KiBGb3VuZGF0aW9uLCBJbmMuLCA1OSBUZW1wbGUgUGxhY2UsIFN1aXRlIDMzMCwgQm9zdG9uLCBN QSAwMjExMS0xMzA3IFVTQQ0KKyAqLw0KKw0KKyNpbmNsdWRlIDxsaW51eC9rZXJuZWwuaD4NCisj aW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQorI2luY2x1ZGUgPGxpbnV4L21vZHVsZXBhcmFtLmg+ DQorI2luY2x1ZGUgPGxpbnV4L3R5cGVzLmg+DQorI2luY2x1ZGUgPGxpbnV4L2xpc3QuaD4NCisj aW5jbHVkZSA8bGludXgvc2xhYi5oPg0KKyNpbmNsdWRlIDxsaW51eC9zcGlubG9jay5oPg0KKyNp bmNsdWRlIDxsaW51eC93b3JrcXVldWUuaD4NCisjaW5jbHVkZSA8bGludXgvZXJyLmg+DQorI2lu Y2x1ZGUgPGxpbnV4L21tLmg+DQorI2luY2x1ZGUgPGxpbnV4L2NvbXBsZXRpb24uaD4NCisNCisj dW5kZWYgREVCVUcNCisjaW5jbHVkZSAiYWNyeXB0by5oIg0KKyNpbmNsdWRlICJjcnlwdG9fZGVm LmgiDQorI2luY2x1ZGUgInZpYS1wYWRsb2NrL3BhZGxvY2suaCINCisNCisjdW5kZWYgZHByaW50 aw0KKyNkZWZpbmUgZHByaW50ayhmLCBhLi4uKSBkbyB7fSB3aGlsZSgwKQ0KKw0KKyNkZWZpbmUg S0VZX0xFTkdUSAkJMTYNCitzdGF0aWMgY2hhciBja2V5W0tFWV9MRU5HVEhdOw0KK3N0YXRpYyBp bnQga2V5X2xlbmd0aCA9IHNpemVvZihja2V5KTsNCisNCitzdGF0aWMgdm9pZCBjdGVzdF9jYWxs YmFjayhzdHJ1Y3QgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIgKmNpLA0KKwkJCSAgIHN0cnVj dCBjcnlwdG9fZGF0YSAqZGF0YSk7DQorDQorc3RhdGljIHN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9p bml0aWFsaXplciBjaSA9IHsNCisJLm9wZXJhdGlvbiAJPSBDUllQVE9fT1BfRU5DUllQVCwNCisJ LnR5cGUgCQk9IENSWVBUT19UWVBFX0FFU18xMjgsDQorCS5tb2RlIAkJPSBDUllQVE9fTU9ERV9F Q0IsDQorCS5wcmlvcml0eSAJPSA0LA0KKwkuY2FsbGJhY2sgCT0gY3Rlc3RfY2FsbGJhY2ssDQor fTsNCisNCitzdGF0aWMgc3RydWN0IGNyeXB0b19kYXRhIGNkYXRhOw0KKw0KKyNkZWZpbmUgQ1NF U1NJT05fTUFYCTEwMDANCitzdGF0aWMgc3RydWN0IGNyeXB0b19zZXNzaW9uICpzOw0KK3N0YXRp YyBpbnQgd2F0ZXJtYXJrOw0KKy8vc3RhdGljIHN0cnVjdCBjb21wbGV0aW9uIGNhbGxiYWNrX2Nv bXBsZXRlZDsNCitzdGF0aWMgc3RydWN0IHRpbWVyX2xpc3QgY3RpbWVyOw0KKw0KK3N0YXRpYyB2 b2lkIGN0ZXN0X2NhbGxiYWNrKHN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9pbml0aWFsaXplciAqY2ks DQorCQkJICAgc3RydWN0IGNyeXB0b19kYXRhICpkYXRhKQ0KK3sNCisJaW50IGksIG9mZiwgc2l6 ZSwgc3NpemU7DQorCXN0cnVjdCBzY2F0dGVybGlzdCAqc2c7DQorCXVuc2lnbmVkIGNoYXIgKnB0 
cjsNCisNCisJd2F0ZXJtYXJrLS07DQorDQorCXByaW50aygiJXM6IHNlc3Npb24gJWxsdSBbJWxs dV1cbiIsIF9fZnVuY19fLCBjaS0+aWQsIGNpLT5kZXZfaWQpOw0KKwlkcHJpbnRrKCJzcmM9JXMs IGxlbj0lZC5cbiIsDQorCQkoKGNoYXIgKilwYWdlX2FkZHJlc3MoZGF0YS0+c2dfc3JjLnBhZ2Up KSArIGRhdGEtPnNnX3NyYy5vZmZzZXQsDQorCQlkYXRhLT5zZ19zcmMubGVuZ3RoKTsNCisNCisJ c2cgPSAmZGF0YS0+c2dfa2V5Ow0KKwlwdHIgPSAodW5zaWduZWQgY2hhciAqKXBhZ2VfYWRkcmVz cyhzZy0+cGFnZSk7DQorCW9mZiA9IHNnLT5vZmZzZXQ7DQorCXNpemUgPSBzZy0+bGVuZ3RoOw0K Kw0KKwlpZiAoc2l6ZSA+IDE2KSB7DQorCQlkcHJpbnRrKCJrZXkgc2cgaXMgYnJva2VuLCBzaXpl PSVkLlxuIiwgc2l6ZSk7DQorCQlnb3RvIGVycl9vdXQ7DQorCX0NCisNCisJZHByaW50aygia2V5 WyVkXT0iLCBzaXplKTsNCisJZm9yIChpID0gMDsgaSA8IHNpemU7ICsraSkNCisJCWRwcmludGso IjB4JTAyeCwgIiwgcHRyW2kgKyBvZmZdKTsNCisJZHByaW50aygiXG4iKTsNCisNCisJc2cgPSAm ZGF0YS0+c2dfc3JjOw0KKwlwdHIgPSAodW5zaWduZWQgY2hhciAqKXBhZ2VfYWRkcmVzcyhzZy0+ cGFnZSk7DQorCW9mZiA9IHNnLT5vZmZzZXQ7DQorCXNzaXplID0gc2l6ZSA9IHNnLT5sZW5ndGg7 DQorCWlmIChzaXplID4gMzIpIHsNCisJCWRwcmludGsoInNyYyBzZyBpcyBicm9rZW4sIHNpemU9 JWQuXG4iLCBzaXplKTsNCisJCWdvdG8gZXJyX291dDsNCisJfQ0KKw0KKwlkcHJpbnRrKCJzcmNb JWRdPSIsIHNpemUpOw0KKwlmb3IgKGkgPSAwOyBpIDwgc2l6ZTsgKytpKQ0KKwkJZHByaW50aygi MHglMDJ4LCAiLCBwdHJbaSArIG9mZl0pOw0KKwlkcHJpbnRrKCJcbiIpOw0KKw0KKwlzZyA9ICZk YXRhLT5zZ19kc3Q7DQorCXB0ciA9ICh1bnNpZ25lZCBjaGFyICopcGFnZV9hZGRyZXNzKHNnLT5w YWdlKTsNCisJb2ZmID0gc2ctPm9mZnNldDsNCisJc2l6ZSA9IHNnLT5sZW5ndGg7DQorCWlmIChz aXplID4gMzIpIHsNCisJCWRwcmludGsoImRzdCBzZyBpcyBicm9rZW4sIHNpemU9JWQuXG4iLCBz aXplKTsNCisJCWdvdG8gZXJyX291dDsNCisJfQ0KKw0KKwlpZiAoc2l6ZSA9PSAwKSB7DQorCQlk cHJpbnRrKCJzaXplPTAsIHNldHRpbmcgdG8gJWQuXG4iLCBzc2l6ZSk7DQorCQlzaXplID0gc3Np emU7DQorCX0NCisNCisJZHByaW50aygiZHN0WyVkXT0iLCBzaXplKTsNCisJZm9yIChpID0gMDsg aSA8IHNpemU7ICsraSkNCisJCWRwcmludGsoIjB4JTAyeCwgIiwgcHRyW2kgKyBvZmZdKTsNCisJ ZHByaW50aygiXG4iKTsNCisNCitlcnJfb3V0Og0KKwkvL2NvbXBsZXRlKCZjYWxsYmFja19jb21w bGV0ZWQpOw0KKwlyZXR1cm47DQorfQ0KKw0KK2ludCBjdGltZXJfZnVuYyh2b2lkICpkYXRhLCBp 
bnQgc2l6ZSwgaW50IG9wKQ0KK3sNCisJdTggKnB0cjsNCisNCisJaWYgKHNpemUgPiBQQUdFX1NJ WkUpDQorCQlzaXplID0gUEFHRV9TSVpFOw0KKw0KKwlwdHIgPSAodTggKilwYWdlX2FkZHJlc3Mo Y2RhdGEuc2dfc3JjLnBhZ2UpICsgY2RhdGEuc2dfc3JjLm9mZnNldDsNCisJbWVtY3B5KHB0ciwg ZGF0YSwgc2l6ZSk7DQorCWNkYXRhLnNnX3NyYy5sZW5ndGggPSBzaXplOw0KKwljZGF0YS5zZ19k c3QubGVuZ3RoID0gc2l6ZTsNCisNCisJY2kub3BlcmF0aW9uID0gb3A7DQorCXMgPSBjcnlwdG9f c2Vzc2lvbl9hbGxvYygmY2ksICZjZGF0YSk7DQorCWlmIChzKQ0KKwkJd2F0ZXJtYXJrKys7DQor CWVsc2UNCisJCWRwcmludGsoImFsbG9jYXRpb24gZmFpbGVkLlxuIik7DQorDQorCXJldHVybiAo cykgPyAwIDogLUVJTlZBTDsNCit9DQorDQorc3RhdGljIGludCBhbGxvY19zZyhzdHJ1Y3Qgc2Nh dHRlcmxpc3QgKnNnLCB2b2lkICpkYXRhLCBpbnQgc2l6ZSkNCit7DQorCXNnLT5vZmZzZXQgPSAw Ow0KKwlzZy0+cGFnZSA9IGFsbG9jX3BhZ2VzKEdGUF9LRVJORUwsIGdldF9vcmRlcihzaXplKSk7 DQorCWlmICghc2ctPnBhZ2UpIHsNCisJCXByaW50ayhLRVJOX0VSUiAiRmFpbGVkIHRvIGFsbG9j YXRlIHBhZ2UuXG4iKTsNCisJCXJldHVybiAtRU5PTUVNOw0KKwl9DQorDQorCW1lbXNldChwYWdl X2FkZHJlc3Moc2ctPnBhZ2UpLCAwLCBQQUdFX1NJWkUpOw0KKw0KKwlpZiAoZGF0YSkNCisJCW1l bWNweShwYWdlX2FkZHJlc3Moc2ctPnBhZ2UpLCBkYXRhLCBzaXplKTsNCisNCisJc2ctPmxlbmd0 aCA9IHNpemU7DQorDQorCXJldHVybiAwOw0KK30NCisNCitzdGF0aWMgdm9pZCBjdGltZXJmKHVu c2lnbmVkIGxvbmcgZGF0YSkNCit7DQorCWNoYXIgc3RyW10gPSAidGVzdCBtZXNzYWdlIHF3ZXJ0 eSAgYXNkenhjXG4iOw0KKwlpbnQgc2l6ZSA9IHNpemVvZihzdHIpLCBlcnI7DQorDQorCXByaW50 aygiJXMgc3RhcnRlZFxuIiwgX19mdW5jX18pOw0KKw0KKwllcnIgPSBjdGltZXJfZnVuYyhzdHIs IHNpemUsIENSWVBUT19PUF9FTkNSWVBUKTsNCisNCisJaWYgKCFlcnIpDQorCQltb2RfdGltZXIo JmN0aW1lciwgamlmZmllcyArIDMpOw0KKwllbHNlDQorCQltb2RfdGltZXIoJmN0aW1lciwgamlm ZmllcyArIEhaKTsNCisNCisJcHJpbnRrKCIlcyBmaW5pc2hlZC5cbiIsIF9fZnVuY19fKTsNCit9 DQorDQoraW50IGNvbnN1bWVyX2luaXQodm9pZCkNCit7DQorCWludCBlcnI7DQorDQorCWNkYXRh LnNnX3NyY19udW0gPSAxOw0KKw0KKwllcnIgPSBhbGxvY19zZygmY2RhdGEuc2dfc3JjLCBOVUxM LCBQQUdFX1NJWkUpOw0KKwlpZiAoZXJyKQ0KKwkJZ290byBlcnJfb3V0X3JldHVybjsNCisNCisJ ZXJyID0gYWxsb2Nfc2coJmNkYXRhLnNnX2RzdCwgTlVMTCwgUEFHRV9TSVpFKTsNCisJaWYgKGVy 
cikNCisJCWdvdG8gZXJyX291dF9zcmM7DQorDQorCWVyciA9IGFsbG9jX3NnKCZjZGF0YS5zZ19r ZXksIGNrZXksIGtleV9sZW5ndGgpOw0KKwlpZiAoZXJyKQ0KKwkJZ290byBlcnJfb3V0X2RzdDsN CisNCisJZXJyID0gYWxsb2Nfc2coJmNkYXRhLnNnX2l2LCBOVUxMLCBQQUdFX1NJWkUpOw0KKwlp ZiAoZXJyKQ0KKwkJZ290byBlcnJfb3V0X2tleTsNCisNCisJY2RhdGEucHJpdl9zaXplID0gc2l6 ZW9mKHN0cnVjdCBhZXNfY3R4KTsNCisNCisJaW5pdF90aW1lcigmY3RpbWVyKTsNCisJY3RpbWVy LmZ1bmN0aW9uID0gJmN0aW1lcmY7DQorCWN0aW1lci5leHBpcmVzID0gamlmZmllcyArIEhaOw0K KwljdGltZXIuZGF0YSA9IDA7DQorDQorCWFkZF90aW1lcigmY3RpbWVyKTsNCisNCisJcmV0dXJu IDA7DQorI2lmIDANCisJY2hhciBzdHJbXSA9ICJ0ZXN0IG1lc3NhZ2UgcXdlcnR5ICBhc2R6eGNc biI7DQorCWludCBzaXplOw0KKw0KKwlpbml0X2NvbXBsZXRpb24oJmNhbGxiYWNrX2NvbXBsZXRl ZCk7DQorDQorCXNpemUgPSAzMjsNCisJZXJyID0gY3RpbWVyX2Z1bmMoc3RyLCBzaXplLCBDUllQ VE9fT1BfRU5DUllQVCk7DQorCWlmIChlcnIpDQorCQlnb3RvIGVycl9vdXRfaXY7DQorDQorCXdh aXRfZm9yX2NvbXBsZXRpb24oJmNhbGxiYWNrX2NvbXBsZXRlZCk7DQorCWluaXRfY29tcGxldGlv bigmY2FsbGJhY2tfY29tcGxldGVkKTsNCisNCisJZXJyID0NCisJICAgIGN0aW1lcl9mdW5jKHBh Z2VfYWRkcmVzcyhjZGF0YS5zZ19kc3QucGFnZSksIGNkYXRhLnNnX2RzdC5sZW5ndGgsDQorCQkJ Q1JZUFRPX09QX0RFQ1JZUFQpOw0KKwlpZiAoZXJyKQ0KKwkJZ290byBlcnJfb3V0X2l2Ow0KKw0K Kwl3YWl0X2Zvcl9jb21wbGV0aW9uKCZjYWxsYmFja19jb21wbGV0ZWQpOw0KKw0KKwlkcHJpbnRr KCJkc3Q6ICVzXG4iLA0KKwkJKChjaGFyICopcGFnZV9hZGRyZXNzKGNkYXRhLnNnX2RzdC5wYWdl KSkgKw0KKwkJY2RhdGEuc2dfZHN0Lm9mZnNldCk7DQorICAgICAgZXJyX291dF9pdjoNCisjZW5k aWYNCisNCisJX19mcmVlX3BhZ2VzKGNkYXRhLnNnX2l2LnBhZ2UsIGdldF9vcmRlcigxKSk7DQor ICAgICAgZXJyX291dF9rZXk6DQorCV9fZnJlZV9wYWdlcyhjZGF0YS5zZ19rZXkucGFnZSwgZ2V0 X29yZGVyKGtleV9sZW5ndGgpKTsNCisgICAgICBlcnJfb3V0X2RzdDoNCisJX19mcmVlX3BhZ2Vz KGNkYXRhLnNnX2RzdC5wYWdlLCBnZXRfb3JkZXIoMSkpOw0KKyAgICAgIGVycl9vdXRfc3JjOg0K KwlfX2ZyZWVfcGFnZXMoY2RhdGEuc2dfc3JjLnBhZ2UsIGdldF9vcmRlcigxKSk7DQorICAgICAg ZXJyX291dF9yZXR1cm46DQorCXJldHVybiAtRU5PREVWOw0KK30NCisNCit2b2lkIGNvbnN1bWVy X2Zpbmkodm9pZCkNCit7DQorCWRlbF90aW1lcl9zeW5jKCZjdGltZXIpOw0KKwl3aGlsZSAod2F0 
ZXJtYXJrKSB7DQorCQlzZXRfY3VycmVudF9zdGF0ZShUQVNLX1VOSU5URVJSVVBUSUJMRSk7DQor CQlzY2hlZHVsZV90aW1lb3V0KEhaKTsNCisNCisJCXByaW50ayhLRVJOX0lORk8gIldhaXRpbmcg Zm9yIHNlc3Npb25zIHRvIGJlIGZyZWVkOiB3YXRlcm1hcms9JWQuXG4iLA0KKwkJCXdhdGVybWFy ayk7DQorCX0NCisNCisJaWYgKCFjZGF0YS5zZ19rZXkubGVuZ3RoKQ0KKwkJcHJpbnRrKCJCVUc6 IGtleSBsZW5ndGggaXMgMCBpbiAlcy5cbiIsIF9fZnVuY19fKTsNCisNCisJX19mcmVlX3BhZ2Vz KGNkYXRhLnNnX2l2LnBhZ2UsIGdldF9vcmRlcigxKSk7DQorCV9fZnJlZV9wYWdlcyhjZGF0YS5z Z19rZXkucGFnZSwgZ2V0X29yZGVyKGtleV9sZW5ndGgpKTsNCisJX19mcmVlX3BhZ2VzKGNkYXRh LnNnX2RzdC5wYWdlLCBnZXRfb3JkZXIoMSkpOw0KKwlfX2ZyZWVfcGFnZXMoY2RhdGEuc2dfc3Jj LnBhZ2UsIGdldF9vcmRlcigxKSk7DQorDQorCXByaW50ayhLRVJOX0lORk8gIlRlc3QgY3J5cHRv IG1vZHVsZSBjb25zdW1lciBpcyB1bmxvYWRlZC5cbiIpOw0KK30NCisNCittb2R1bGVfaW5pdChj b25zdW1lcl9pbml0KTsNCittb2R1bGVfZXhpdChjb25zdW1lcl9maW5pKTsNCisNCitNT0RVTEVf TElDRU5TRSgiR1BMIik7DQorTU9EVUxFX0FVVEhPUigiRXZnZW5peSBQb2x5YWtvdiA8am9obnBv bEAya2EubWlwdC5ydT4iKTsNCitNT0RVTEVfREVTQ1JJUFRJT04oIlRlc3QgY3J5cHRvIG1vZHVs ZSBjb25zdW1lci4iKTsNCkJpbmFyeSBmaWxlcyAvdG1wL2VtcHR5Ly5jb25zdW1lci5jLnN3cCBh bmQgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by8uY29uc3VtZXIuYy5zd3AgZGlmZmVyDQpkaWZm IC1OcnUgL3RtcC9lbXB0eS9jcnlwdG9fY29ubi5jIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8v Y3J5cHRvX2Nvbm4uYw0KLS0tIC90bXAvZW1wdHkvY3J5cHRvX2Nvbm4uYwkxOTcwLTAxLTAxIDAz OjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vY3J5 cHRvX2Nvbm4uYwkyMDA0LTEyLTE0IDE4OjUzOjExLjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCAr MSwxOTEgQEANCisvKg0KKyAqIAljcnlwdG9fY29ubi5jDQorICoNCisgKiBDb3B5cmlnaHQgKGMp IDIwMDQgRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4NCisgKiANCisgKg0K KyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3UgY2FuIHJlZGlzdHJpYnV0ZSBp dCBhbmQvb3IgbW9kaWZ5DQorICogaXQgdW5kZXIgdGhlIHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJh bCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkNCisgKiB0aGUgRnJlZSBTb2Z0d2FyZSBG b3VuZGF0aW9uOyBlaXRoZXIgdmVyc2lvbiAyIG9mIHRoZSBMaWNlbnNlLCBvcg0KKyAqIChhdCB5 
b3VyIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQorICoNCisgKiBUaGlzIHByb2dyYW0gaXMg ZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhhdCBpdCB3aWxsIGJlIHVzZWZ1bCwNCisgKiBidXQg V0lUSE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhvdXQgZXZlbiB0aGUgaW1wbGllZCB3YXJyYW50eSBv Zg0KKyAqIE1FUkNIQU5UQUJJTElUWSBvciBGSVRORVNTIEZPUiBBIFBBUlRJQ1VMQVIgUFVSUE9T RS4gIFNlZSB0aGUNCisgKiBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBmb3IgbW9yZSBkZXRh aWxzLg0KKyAqDQorICogWW91IHNob3VsZCBoYXZlIHJlY2VpdmVkIGEgY29weSBvZiB0aGUgR05V IEdlbmVyYWwgUHVibGljIExpY2Vuc2UNCisgKiBhbG9uZyB3aXRoIHRoaXMgcHJvZ3JhbTsgaWYg bm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZQ0KKyAqIEZvdW5kYXRpb24sIEluYy4sIDU5 IFRlbXBsZSBQbGFjZSwgU3VpdGUgMzMwLCBCb3N0b24sIE1BIDAyMTExLTEzMDcgVVNBDQorICov DQorDQorI2luY2x1ZGUgPGxpbnV4L2tlcm5lbC5oPg0KKyNpbmNsdWRlIDxsaW51eC9tb2R1bGUu aD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlcGFyYW0uaD4NCisjaW5jbHVkZSA8bGludXgvdHlw ZXMuaD4NCisjaW5jbHVkZSA8bGludXgvbGlzdC5oPg0KKyNpbmNsdWRlIDxsaW51eC9zbGFiLmg+ DQorI2luY2x1ZGUgPGxpbnV4L3NwaW5sb2NrLmg+DQorI2luY2x1ZGUgPGxpbnV4L3ZtYWxsb2Mu aD4NCisNCisjaW5jbHVkZSAiYWNyeXB0by5oIg0KKyNpbmNsdWRlICJjcnlwdG9fbGIuaCINCisN CisjaW5jbHVkZSAiLi4vY29ubmVjdG9yL2Nvbm5lY3Rvci5oIg0KKw0KKyNpbmNsdWRlICJjcnlw dG9fY29ubi5oIg0KKw0KK3N0YXRpYyBzdHJ1Y3QgY2JfaWQgY3J5cHRvX2Nvbm5faWQgPSB7IDB4 ZGVhZCwgMHgwMDAwIH07DQorc3RhdGljIGNoYXIgY3J5cHRvX2Nvbm5fbmFtZVtdID0gImNyY29u biI7DQorDQorc3RhdGljIHZvaWQgY3J5cHRvX2Nvbm5fY2FsbGJhY2sodm9pZCAqZGF0YSkNCit7 DQorCXN0cnVjdCBjbl9tc2cgKm1zZywgKnJlcGx5Ow0KKwlzdHJ1Y3QgY3J5cHRvX2Nvbm5fZGF0 YSAqZCwgKmNtZDsNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldjsNCisJdTMyIHNlc3Npb25z Ow0KKw0KKwltc2cgPSAoc3RydWN0IGNuX21zZyAqKWRhdGE7DQorCWQgPSAoc3RydWN0IGNyeXB0 b19jb25uX2RhdGEgKiltc2ctPmRhdGE7DQorDQorCWlmIChtc2ctPmxlbiA8IHNpemVvZigqZCkp IHsNCisJCWRwcmludGsoS0VSTl9FUlIgIldyb25nIG1lc3NhZ2UgdG8gY3J5cHRvIGNvbm5lY3Rv cjogbXNnLT5sZW49JXUgPCAldS5cbiIsDQorCQkJbXNnLT5sZW4sIHNpemVvZigqZCkpOw0KKwkJ cmV0dXJuOw0KKwl9DQorDQorCWlmIChtc2ctPmxlbiAhPSBzaXplb2YoKmQpICsgZC0+bGVuKSB7 
DQorCQlkcHJpbnRrKEtFUk5fRVJSICJXcm9uZyBtZXNzYWdlIHRvIGNyeXB0byBjb25uZWN0b3I6 IG1zZy0+bGVuPSV1ICE9ICV1LlxuIiwNCisJCQltc2ctPmxlbiwgc2l6ZW9mKCpkKSArIGQtPmxl bik7DQorCQlyZXR1cm47DQorCX0NCisNCisJZGV2ID0gY3J5cHRvX2RldmljZV9nZXRfbmFtZShk LT5uYW1lKTsNCisJaWYgKCFkZXYpIHsNCisJCWRwcmludGsoS0VSTl9JTkZPICJDcnlwdG8gZGV2 aWNlICVzIHdhcyBub3QgZm91bmQuXG4iLCBkLT5uYW1lKTsNCisJCXJldHVybjsNCisJfQ0KKw0K Kwlzd2l0Y2ggKGQtPmNtZCkgew0KKwljYXNlIENSWVBUT19DT05OX1JFQURfU0VTU0lPTlM6DQor CQlyZXBseSA9IGttYWxsb2Moc2l6ZW9mKCptc2cpICsgc2l6ZW9mKCpjbWQpICsgc2l6ZW9mKHNl c3Npb25zKSwgR0ZQX0FUT01JQyk7DQorCQlpZiAocmVwbHkpIHsNCisJCQltZW1jcHkocmVwbHks IG1zZywgc2l6ZW9mKCpyZXBseSkpOw0KKwkJCXJlcGx5LT5sZW4gPSBzaXplb2YoKmNtZCkgKyBz aXplb2Yoc2Vzc2lvbnMpOw0KKw0KKwkJCS8qDQorCQkJICogU2VlIHByb3RvY29sIGRlc2NyaXB0 aW9uIGluIGNvbm5lY3Rvci5jDQorCQkJICovDQorCQkJcmVwbHktPmFjaysrOw0KKw0KKwkJCWNt ZCA9IChzdHJ1Y3QgY3J5cHRvX2Nvbm5fZGF0YSAqKShyZXBseSArIDEpOw0KKwkJCW1lbWNweShj bWQsIGQsIHNpemVvZigqY21kKSk7DQorCQkJY21kLT5sZW4gPSBzaXplb2Yoc2Vzc2lvbnMpOw0K Kw0KKwkJCXNlc3Npb25zID0gYXRvbWljX3JlYWQoJmRldi0+cmVmY250KTsNCisNCisJCQltZW1j cHkoY21kICsgMSwgJnNlc3Npb25zLCBzaXplb2Yoc2Vzc2lvbnMpKTsNCisNCisJCQljbl9uZXRs aW5rX3NlbmQocmVwbHksIDApOw0KKw0KKwkJCWtmcmVlKHJlcGx5KTsNCisJCX0gZWxzZQ0KKwkJ CWRwcmludGsoS0VSTl9FUlIgIkZhaWxlZCB0byBhbGxvY2F0ZSAlZCBieXRlcyBpbiByZXBseSB0 byBjb21hbW5kIDB4JXguXG4iLA0KKwkJCQlzaXplb2YoKm1zZykgKyBzaXplb2YoKmNtZCksIGQt PmNtZCk7DQorCQlicmVhazsNCisJY2FzZSBDUllQVE9fQ09OTl9EVU1QX1FVRVVFOg0KKwkJcmVw bHkgPSBrbWFsbG9jKHNpemVvZigqbXNnKSArIHNpemVvZigqY21kKSArIA0KKwkJCQkxMDI0ICog c2l6ZW9mKHN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9pbml0aWFsaXplciksIEdGUF9BVE9NSUMpOw0K KwkJaWYgKHJlcGx5KSB7DQorCQkJc3RydWN0IGNyeXB0b19zZXNzaW9uICpzOw0KKwkJCXN0cnVj dCBjcnlwdG9fc2Vzc2lvbl9pbml0aWFsaXplciAqcHRyOw0KKw0KKwkJCW1lbWNweShyZXBseSwg bXNnLCBzaXplb2YoKnJlcGx5KSk7DQorDQorCQkJLyoNCisJCQkgKiBTZWUgcHJvdG9jb2wgZGVz Y3JpcHRpb24gaW4gY29ubmVjdG9yLmMNCisJCQkgKi8NCisJCQlyZXBseS0+YWNrKys7DQorDQor 
CQkJY21kID0gKHN0cnVjdCBjcnlwdG9fY29ubl9kYXRhICopKHJlcGx5ICsgMSk7DQorCQkJbWVt Y3B5KGNtZCwgZCwgc2l6ZW9mKCpjbWQpKTsNCisNCisJCQlwdHIgPSAoc3RydWN0IGNyeXB0b19z ZXNzaW9uX2luaXRpYWxpemVyICopKGNtZCArIDEpOw0KKw0KKwkJCXNlc3Npb25zID0gMDsNCisJ CQlzcGluX2xvY2tfaXJxKCZkZXYtPnNlc3Npb25fbG9jayk7DQorCQkJbGlzdF9mb3JfZWFjaF9l bnRyeShzLCAmZGV2LT5zZXNzaW9uX2xpc3QsIGRldl9xdWV1ZV9lbnRyeSkgew0KKwkJCQltZW1j cHkocHRyLCAmcy0+Y2ksIHNpemVvZigqcHRyKSk7DQorCQkJCXNlc3Npb25zKys7DQorCQkJCXB0 cisrOw0KKw0KKwkJCQlpZiAoc2Vzc2lvbnMgPj0gMTAyNCkNCisJCQkJCWJyZWFrOw0KKwkJCX0N CisJCQlzcGluX3VubG9ja19pcnEoJmRldi0+c2Vzc2lvbl9sb2NrKTsNCisNCisJCQljbWQtPmxl biA9IHNpemVvZigqcHRyKSAqIHNlc3Npb25zOw0KKwkJCXJlcGx5LT5sZW4gPSBzaXplb2YoKmNt ZCkgKyBjbWQtPmxlbjsNCisNCisJCQljbl9uZXRsaW5rX3NlbmQocmVwbHksIDApOw0KKw0KKwkJ CWtmcmVlKHJlcGx5KTsNCisJCX0gZWxzZQ0KKwkJCWRwcmludGsoS0VSTl9FUlIgIkZhaWxlZCB0 byBhbGxvY2F0ZSAlZCBieXRlcyBpbiByZXBseSB0byBjb21hbW5kIDB4JXguXG4iLA0KKwkJCQlz aXplb2YoKm1zZykgKyBzaXplb2YoKmNtZCksIGQtPmNtZCk7DQorCQlicmVhazsNCisJY2FzZSBD UllQVE9fR0VUX1NUQVQ6DQorCQlyZXBseSA9DQorCQkgICAga21hbGxvYyhzaXplb2YoKm1zZykg KyBzaXplb2YoKmNtZCkgKyBzaXplb2Yoc3RydWN0IGNyeXB0b19kZXZpY2Vfc3RhdCksIEdGUF9B VE9NSUMpOw0KKwkJaWYgKHJlcGx5KSB7DQorCQkJc3RydWN0IGNyeXB0b19kZXZpY2Vfc3RhdCAq cHRyOw0KKw0KKwkJCW1lbWNweShyZXBseSwgbXNnLCBzaXplb2YoKnJlcGx5KSk7DQorCQkJcmVw bHktPmxlbiA9IHNpemVvZigqY21kKSArIHNpemVvZigqcHRyKTsNCisNCisJCQkvKg0KKwkJCSAq IFNlZSBwcm90b2NvbCBkZXNjcmlwdGlvbiBpbiBjb25uZWN0b3IuYw0KKwkJCSAqLw0KKwkJCXJl cGx5LT5hY2srKzsNCisNCisJCQljbWQgPSAoc3RydWN0IGNyeXB0b19jb25uX2RhdGEgKikocmVw bHkgKyAxKTsNCisJCQltZW1jcHkoY21kLCBkLCBzaXplb2YoKmNtZCkpOw0KKwkJCWNtZC0+bGVu ID0gc2l6ZW9mKCpwdHIpOw0KKw0KKwkJCXB0ciA9IChzdHJ1Y3QgY3J5cHRvX2RldmljZV9zdGF0 ICopKGNtZCArIDEpOw0KKwkJCW1lbWNweShwdHIsICZkZXYtPnN0YXQsIHNpemVvZigqcHRyKSk7 DQorDQorCQkJY25fbmV0bGlua19zZW5kKHJlcGx5LCAwKTsNCisNCisJCQlrZnJlZShyZXBseSk7 DQorCQl9IGVsc2UNCisJCQlkcHJpbnRrKEtFUk5fRVJSICJGYWlsZWQgdG8gYWxsb2NhdGUgJWQg 
Ynl0ZXMgaW4gcmVwbHkgdG8gY29tYW1uZCAweCV4LlxuIiwNCisJCQkJc2l6ZW9mKCptc2cpICsg c2l6ZW9mKCpjbWQpLCBkLT5jbWQpOw0KKwkJYnJlYWs7DQorCWRlZmF1bHQ6DQorCQlkcHJpbnRr KEtFUk5fRVJSICJXcm9uZyBvcGVyYXRpb24gMHglMDR4IGZvciBjcnlwdG8gY29ubmVjdG9yLlxu IiwNCisJCQlkLT5jbWQpOw0KKwkJcmV0dXJuOw0KKwl9DQorDQorCWNyeXB0b19kZXZpY2VfcHV0 KGRldik7DQorfQ0KKw0KK2ludCBjcnlwdG9fY29ubl9pbml0KHZvaWQpDQorew0KKwlpbnQgZXJy Ow0KKw0KKwllcnIgPSBjbl9hZGRfY2FsbGJhY2soJmNyeXB0b19jb25uX2lkLCBjcnlwdG9fY29u bl9uYW1lLCBjcnlwdG9fY29ubl9jYWxsYmFjayk7DQorCWlmIChlcnIpDQorCQlyZXR1cm4gZXJy Ow0KKw0KKwlkcHJpbnRrKEtFUk5fSU5GTyAiQ3J5cHRvIGNvbm5lY3RvciBjYWxsYmFjayBpcyBy ZWdpc3RlcmVkLlxuIik7DQorDQorCXJldHVybiAwOw0KK30NCisNCit2b2lkIGNyeXB0b19jb25u X2Zpbmkodm9pZCkNCit7DQorCWNuX2RlbF9jYWxsYmFjaygmY3J5cHRvX2Nvbm5faWQpOw0KKwlk cHJpbnRrKEtFUk5fSU5GTyAiQ3J5cHRvIGNvbm5lY3RvciBjYWxsYmFjayBpcyB1bnJlZ2lzdGVy ZWQuXG4iKTsNCit9DQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9jcnlwdG9fY29ubi5oIGxpbnV4LTIu Ni9kcml2ZXJzL2FjcnlwdG8vY3J5cHRvX2Nvbm4uaA0KLS0tIC90bXAvZW1wdHkvY3J5cHRvX2Nv bm4uaAkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9k cml2ZXJzL2FjcnlwdG8vY3J5cHRvX2Nvbm4uaAkyMDA0LTEyLTE0IDE4OjUzOjExLjAwMDAwMDAw MCArMDMwMA0KQEAgLTAsMCArMSw0NCBAQA0KKy8qDQorICogCWNyeXB0b19jb25uLmgNCisgKg0K KyAqIENvcHlyaWdodCAoYykgMjAwNCBFdmdlbml5IFBvbHlha292IDxqb2hucG9sQDJrYS5taXB0 LnJ1Pg0KKyAqIA0KKyAqDQorICogVGhpcyBwcm9ncmFtIGlzIGZyZWUgc29mdHdhcmU7IHlvdSBj YW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkNCisgKiBpdCB1bmRlciB0aGUgdGVybXMg b2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGFzIHB1Ymxpc2hlZCBieQ0KKyAqIHRo ZSBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhlIExpY2Vu c2UsIG9yDQorICogKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXIgdmVyc2lvbi4NCisgKg0KKyAq IFRoaXMgcHJvZ3JhbSBpcyBkaXN0cmlidXRlZCBpbiB0aGUgaG9wZSB0aGF0IGl0IHdpbGwgYmUg dXNlZnVsLA0KKyAqIGJ1dCBXSVRIT1VUIEFOWSBXQVJSQU5UWTsgd2l0aG91dCBldmVuIHRoZSBp bXBsaWVkIHdhcnJhbnR5IG9mDQorICogTUVSQ0hBTlRBQklMSVRZIG9yIEZJVE5FU1MgRk9SIEEg 
UEFSVElDVUxBUiBQVVJQT1NFLiAgU2VlIHRoZQ0KKyAqIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNl bnNlIGZvciBtb3JlIGRldGFpbHMuDQorICoNCisgKiBZb3Ugc2hvdWxkIGhhdmUgcmVjZWl2ZWQg YSBjb3B5IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZQ0KKyAqIGFsb25nIHdpdGgg dGhpcyBwcm9ncmFtOyBpZiBub3QsIHdyaXRlIHRvIHRoZSBGcmVlIFNvZnR3YXJlDQorICogRm91 bmRhdGlvbiwgSW5jLiwgNTkgVGVtcGxlIFBsYWNlLCBTdWl0ZSAzMzAsIEJvc3RvbiwgTUEgMDIx MTEtMTMwNyBVU0ENCisgKi8NCisNCisjaWZuZGVmIF9fQ1JZUFRPX0NPTk5fSA0KKyNkZWZpbmUg X19DUllQVE9fQ09OTl9IDQorDQorI2luY2x1ZGUgImFjcnlwdG8uaCINCisNCisjZGVmaW5lIENS WVBUT19DT05OX1JFQURfU0VTU0lPTlMJCTANCisjZGVmaW5lIENSWVBUT19DT05OX0RVTVBfUVVF VUUJCQkxDQorI2RlZmluZSBDUllQVE9fR0VUX1NUQVQJCQkJMg0KKw0KK3N0cnVjdCBjcnlwdG9f Y29ubl9kYXRhIHsNCisJY2hhciAJCW5hbWVbU0NBQ0hFX05BTUVMRU5dOw0KKwlfX3UxNiAJCWNt ZDsNCisJX191MTYgCQlsZW47DQorCV9fdTggCQlkYXRhWzBdOw0KK307DQorDQorI2lmZGVmIF9f S0VSTkVMX18NCisNCitpbnQgY3J5cHRvX2Nvbm5faW5pdCh2b2lkKTsNCit2b2lkIGNyeXB0b19j b25uX2Zpbmkodm9pZCk7DQorDQorI2VuZGlmCQkJCS8qIF9fS0VSTkVMX18gKi8NCisjZW5kaWYJ CQkJLyogX19DUllQVE9fQ09OTl9IICovDQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9jcnlwdG9fZGVm LmggbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by9jcnlwdG9fZGVmLmgNCi0tLSAvdG1wL2VtcHR5 L2NyeXB0b19kZWYuaAkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxp bnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vY3J5cHRvX2RlZi5oCTIwMDQtMTItMTQgMTg6NTM6MTEu MDAwMDAwMDAwICswMzAwDQpAQCAtMCwwICsxLDM4IEBADQorLyoNCisgKiAJY3J5cHRvX2RlZi5o DQorICoNCisgKiBDb3B5cmlnaHQgKGMpIDIwMDQgRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAy a2EubWlwdC5ydT4NCisgKiANCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJl OyB5b3UgY2FuIHJlZGlzdHJpYnV0ZSBpdCBhbmQvb3IgbW9kaWZ5DQorICogaXQgdW5kZXIgdGhl IHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkN CisgKiB0aGUgRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uOyBlaXRoZXIgdmVyc2lvbiAyIG9mIHRo ZSBMaWNlbnNlLCBvcg0KKyAqIChhdCB5b3VyIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQor ICoNCisgKiBUaGlzIHByb2dyYW0gaXMgZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhhdCBpdCB3 
aWxsIGJlIHVzZWZ1bCwNCisgKiBidXQgV0lUSE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhvdXQgZXZl biB0aGUgaW1wbGllZCB3YXJyYW50eSBvZg0KKyAqIE1FUkNIQU5UQUJJTElUWSBvciBGSVRORVNT IEZPUiBBIFBBUlRJQ1VMQVIgUFVSUE9TRS4gIFNlZSB0aGUNCisgKiBHTlUgR2VuZXJhbCBQdWJs aWMgTGljZW5zZSBmb3IgbW9yZSBkZXRhaWxzLg0KKyAqDQorICogWW91IHNob3VsZCBoYXZlIHJl Y2VpdmVkIGEgY29weSBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UNCisgKiBhbG9u ZyB3aXRoIHRoaXMgcHJvZ3JhbTsgaWYgbm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZQ0K KyAqIEZvdW5kYXRpb24sIEluYy4sIDU5IFRlbXBsZSBQbGFjZSwgU3VpdGUgMzMwLCBCb3N0b24s IE1BIDAyMTExLTEzMDcgVVNBDQorICovDQorDQorI2lmbmRlZiBfX0NSWVBUT19ERUZfSA0KKyNk ZWZpbmUgX19DUllQVE9fREVGX0gNCisNCisjZGVmaW5lIENSWVBUT19PUF9ERUNSWVBUCTANCisj ZGVmaW5lIENSWVBUT19PUF9FTkNSWVBUCTENCisjZGVmaW5lIENSWVBUT19PUF9ITUFDCQkyDQor DQorI2RlZmluZSBDUllQVE9fTU9ERV9FQ0IJCTANCisjZGVmaW5lIENSWVBUT19NT0RFX0NCQwkJ MQ0KKyNkZWZpbmUgQ1JZUFRPX01PREVfQ0ZCCQkyDQorI2RlZmluZSBDUllQVE9fTU9ERV9PRkIJ CTMNCisNCisjZGVmaW5lIENSWVBUT19UWVBFX0FFU18xMjgJMA0KKyNkZWZpbmUgQ1JZUFRPX1RZ UEVfQUVTXzE5MgkxDQorI2RlZmluZSBDUllQVE9fVFlQRV9BRVNfMjU2CTINCisNCisjZW5kaWYJ CQkJLyogX19DUllQVE9fREVGX0ggKi8NCmRpZmYgLU5ydSAvdG1wL2VtcHR5L2NyeXB0b19kZXYu YyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvL2NyeXB0b19kZXYuYw0KLS0tIC90bXAvZW1wdHkv Y3J5cHRvX2Rldi5jCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGlu dXgtMi42L2RyaXZlcnMvYWNyeXB0by9jcnlwdG9fZGV2LmMJMjAwNC0xMi0xNCAxODo1MzoxMS4w MDAwMDAwMDAgKzAzMDANCkBAIC0wLDAgKzEsMzcxIEBADQorLyoNCisgKiAJY3J5cHRvX2Rldi5j DQorICoNCisgKiBDb3B5cmlnaHQgKGMpIDIwMDQgRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAy a2EubWlwdC5ydT4NCisgKiANCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJl OyB5b3UgY2FuIHJlZGlzdHJpYnV0ZSBpdCBhbmQvb3IgbW9kaWZ5DQorICogaXQgdW5kZXIgdGhl IHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkN CisgKiB0aGUgRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uOyBlaXRoZXIgdmVyc2lvbiAyIG9mIHRo ZSBMaWNlbnNlLCBvcg0KKyAqIChhdCB5b3VyIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQor 
ICoNCisgKiBUaGlzIHByb2dyYW0gaXMgZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhhdCBpdCB3 aWxsIGJlIHVzZWZ1bCwNCisgKiBidXQgV0lUSE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhvdXQgZXZl biB0aGUgaW1wbGllZCB3YXJyYW50eSBvZg0KKyAqIE1FUkNIQU5UQUJJTElUWSBvciBGSVRORVNT IEZPUiBBIFBBUlRJQ1VMQVIgUFVSUE9TRS4gIFNlZSB0aGUNCisgKiBHTlUgR2VuZXJhbCBQdWJs aWMgTGljZW5zZSBmb3IgbW9yZSBkZXRhaWxzLg0KKyAqDQorICogWW91IHNob3VsZCBoYXZlIHJl Y2VpdmVkIGEgY29weSBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UNCisgKiBhbG9u ZyB3aXRoIHRoaXMgcHJvZ3JhbTsgaWYgbm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZQ0K KyAqIEZvdW5kYXRpb24sIEluYy4sIDU5IFRlbXBsZSBQbGFjZSwgU3VpdGUgMzMwLCBCb3N0b24s IE1BIDAyMTExLTEzMDcgVVNBDQorICovDQorDQorI2luY2x1ZGUgPGxpbnV4L2tlcm5lbC5oPg0K KyNpbmNsdWRlIDxsaW51eC9tb2R1bGUuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlcGFyYW0u aD4NCisjaW5jbHVkZSA8bGludXgvdHlwZXMuaD4NCisjaW5jbHVkZSA8bGludXgvbGlzdC5oPg0K KyNpbmNsdWRlIDxsaW51eC9zbGFiLmg+DQorI2luY2x1ZGUgPGxpbnV4L2ludGVycnVwdC5oPg0K KyNpbmNsdWRlIDxsaW51eC9zcGlubG9jay5oPg0KKyNpbmNsdWRlIDxsaW51eC9kZXZpY2UuaD4N CisNCisjaW5jbHVkZSAiYWNyeXB0by5oIg0KKw0KK3N0YXRpYyBMSVNUX0hFQUQoY2Rldl9saXN0 KTsNCitzdGF0aWMgc3BpbmxvY2tfdCBjZGV2X2xvY2sgPSBTUElOX0xPQ0tfVU5MT0NLRUQ7DQor c3RhdGljIHUzMiBjZGV2X2lkczsNCisNCitzdHJ1Y3QgbGlzdF9oZWFkICpjcnlwdG9fZGV2aWNl X2xpc3QgPSAmY2Rldl9saXN0Ow0KK3NwaW5sb2NrX3QgKmNyeXB0b19kZXZpY2VfbG9jayA9ICZj ZGV2X2xvY2s7DQorDQorc3RhdGljIGludCBjcnlwdG9fbWF0Y2goc3RydWN0IGRldmljZSAqZGV2 LCBzdHJ1Y3QgZGV2aWNlX2RyaXZlciAqZHJ2KQ0KK3sNCisJcmV0dXJuIDE7DQorfQ0KKw0KK3N0 YXRpYyBpbnQgY3J5cHRvX3Byb2JlKHN0cnVjdCBkZXZpY2UgKmRldikNCit7DQorCXJldHVybiAt RU5PREVWOw0KK30NCisNCitzdGF0aWMgaW50IGNyeXB0b19yZW1vdmUoc3RydWN0IGRldmljZSAq ZGV2KQ0KK3sNCisJcmV0dXJuIDA7DQorfQ0KKw0KK3N0YXRpYyB2b2lkIGNyeXB0b19yZWxlYXNl KHN0cnVjdCBkZXZpY2UgKmRldikNCit7DQorCXN0cnVjdCBjcnlwdG9fZGV2aWNlICpkID0gY29u dGFpbmVyX29mKGRldiwgc3RydWN0IGNyeXB0b19kZXZpY2UsIGRldmljZSk7DQorDQorCWNvbXBs ZXRlKCZkLT5kZXZfcmVsZWFzZWQpOw0KK30NCisNCitzdGF0aWMgdm9pZCBjcnlwdG9fY2xhc3Nf 
cmVsZWFzZShzdHJ1Y3QgY2xhc3MgKmNsYXNzKQ0KK3sNCit9DQorDQorc3RhdGljIHZvaWQgY3J5 cHRvX2NsYXNzX3JlbGVhc2VfZGV2aWNlKHN0cnVjdCBjbGFzc19kZXZpY2UgKmNsYXNzX2RldikN Cit7DQorfQ0KKw0KK3N0cnVjdCBjbGFzcyBjcnlwdG9fY2xhc3MgPSB7DQorCS5uYW1lIAkJPSAi YWNyeXB0byIsDQorCS5jbGFzc19yZWxlYXNlIAk9IGNyeXB0b19jbGFzc19yZWxlYXNlLA0KKwku cmVsZWFzZSAJPSBjcnlwdG9fY2xhc3NfcmVsZWFzZV9kZXZpY2UNCit9Ow0KKw0KK3N0cnVjdCBi dXNfdHlwZSBjcnlwdG9fYnVzX3R5cGUgPSB7DQorCS5uYW1lIAkJPSAiYWNyeXB0byIsDQorCS5t YXRjaCAJCT0gY3J5cHRvX21hdGNoDQorfTsNCisNCitzdHJ1Y3QgZGV2aWNlX2RyaXZlciBjcnlw dG9fZHJpdmVyID0gew0KKwkubmFtZSAJCT0gImNyeXB0b19kcml2ZXIiLA0KKwkuYnVzIAkJPSAm Y3J5cHRvX2J1c190eXBlLA0KKwkucHJvYmUgCQk9IGNyeXB0b19wcm9iZSwNCisJLnJlbW92ZSAJ PSBjcnlwdG9fcmVtb3ZlLA0KK307DQorDQorc3RydWN0IGRldmljZSBjcnlwdG9fZGV2ID0gew0K KwkucGFyZW50IAk9IE5VTEwsDQorCS5idXMgCQk9ICZjcnlwdG9fYnVzX3R5cGUsDQorCS5idXNf aWQJCT0gIkFzeW5jaHJvbm91cyBjcnlwdG8iLA0KKwkuZHJpdmVyIAk9ICZjcnlwdG9fZHJpdmVy LA0KKwkucmVsZWFzZSAJPSAmY3J5cHRvX3JlbGVhc2UNCit9Ow0KKw0KK3N0YXRpYyBzc2l6ZV90 IHNlc3Npb25zX3Nob3coc3RydWN0IGNsYXNzX2RldmljZSAqZGV2LCBjaGFyICpidWYpDQorew0K KwlzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZCA9IGNvbnRhaW5lcl9vZihkZXYsIHN0cnVjdCBjcnlw dG9fZGV2aWNlLCBjbGFzc19kZXZpY2UpOw0KKw0KKwlyZXR1cm4gc3ByaW50ZihidWYsICIlZFxu IiwgYXRvbWljX3JlYWQoJmQtPnJlZmNudCkpOw0KK30NCitzdGF0aWMgc3NpemVfdCBuYW1lX3No b3coc3RydWN0IGNsYXNzX2RldmljZSAqZGV2LCBjaGFyICpidWYpDQorew0KKwlzdHJ1Y3QgY3J5 cHRvX2RldmljZSAqZCA9IGNvbnRhaW5lcl9vZihkZXYsIHN0cnVjdCBjcnlwdG9fZGV2aWNlLCBj bGFzc19kZXZpY2UpOw0KKw0KKwlyZXR1cm4gc3ByaW50ZihidWYsICIlc1xuIiwgZC0+bmFtZSk7 DQorfQ0KK3N0YXRpYyBzc2l6ZV90IGRldmljZXNfc2hvdyhzdHJ1Y3QgY2xhc3NfZGV2aWNlICpk ZXYsIGNoYXIgKmJ1ZikNCit7DQorCXN0cnVjdCBjcnlwdG9fZGV2aWNlICpkOw0KKwlpbnQgb2Zm ID0gMDsNCisNCisJc3Bpbl9sb2NrX2lycSgmY2Rldl9sb2NrKTsNCisJbGlzdF9mb3JfZWFjaF9l bnRyeShkLCAmY2Rldl9saXN0LCBjZGV2X2VudHJ5KSB7DQorCQlvZmYgKz0gc3ByaW50ZihidWYg KyBvZmYsICIlcyAiLCBkLT5uYW1lKTsNCisJfQ0KKwlzcGluX3VubG9ja19pcnEoJmNkZXZfbG9j 
ayk7DQorDQorCWlmICghb2ZmKQ0KKwkJb2ZmID0gc3ByaW50ZihidWYsICJObyBkZXZpY2VzIHJl Z2lzdGVyZWQgeWV0LiIpOw0KKw0KKwlvZmYgKz0gc3ByaW50ZihidWYgKyBvZmYsICJcbiIpOw0K Kw0KKwlyZXR1cm4gb2ZmOw0KK30NCisNCitzdGF0aWMgc3NpemVfdCBrbWVtX2ZhaWxlZF9zaG93 KHN0cnVjdCBjbGFzc19kZXZpY2UgKmRldiwgY2hhciAqYnVmKQ0KK3sNCisJc3RydWN0IGNyeXB0 b19kZXZpY2UgKmQgPSAgY29udGFpbmVyX29mKGRldiwgc3RydWN0IGNyeXB0b19kZXZpY2UsIGNs YXNzX2RldmljZSk7DQorDQorCXJldHVybiBzcHJpbnRmKGJ1ZiwgIiVsbHVcbiIsIGQtPnN0YXQu a21lbV9mYWlsZWQpOw0KK30NCitzdGF0aWMgc3NpemVfdCBzc3RhcnRlZF9zaG93KHN0cnVjdCBj bGFzc19kZXZpY2UgKmRldiwgY2hhciAqYnVmKQ0KK3sNCisJc3RydWN0IGNyeXB0b19kZXZpY2Ug KmQgPSBjb250YWluZXJfb2YoZGV2LCBzdHJ1Y3QgY3J5cHRvX2RldmljZSwgY2xhc3NfZGV2aWNl KTsNCisNCisJcmV0dXJuIHNwcmludGYoYnVmLCAiJWxsdVxuIiwgZC0+c3RhdC5zc3RhcnRlZCk7 DQorfQ0KK3N0YXRpYyBzc2l6ZV90IHNmaW5pc2hlZF9zaG93KHN0cnVjdCBjbGFzc19kZXZpY2Ug KmRldiwgY2hhciAqYnVmKQ0KK3sNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmQgPSBjb250YWlu ZXJfb2YoZGV2LCBzdHJ1Y3QgY3J5cHRvX2RldmljZSwgY2xhc3NfZGV2aWNlKTsNCisNCisJcmV0 dXJuIHNwcmludGYoYnVmLCAiJWxsdVxuIiwgZC0+c3RhdC5zZmluaXNoZWQpOw0KK30NCitzdGF0 aWMgc3NpemVfdCBzY29tcGxldGVkX3Nob3coc3RydWN0IGNsYXNzX2RldmljZSAqZGV2LCBjaGFy ICpidWYpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZCA9IGNvbnRhaW5lcl9vZihkZXYs IHN0cnVjdCBjcnlwdG9fZGV2aWNlLCBjbGFzc19kZXZpY2UpOw0KKw0KKwlyZXR1cm4gc3ByaW50 ZihidWYsICIlbGx1XG4iLCBkLT5zdGF0LnNjb21wbGV0ZWQpOw0KK30NCisNCitzdGF0aWMgQ0xB U1NfREVWSUNFX0FUVFIoc2Vzc2lvbnMsIDA0NDQsIHNlc3Npb25zX3Nob3csIE5VTEwpOw0KK3N0 YXRpYyBDTEFTU19ERVZJQ0VfQVRUUihuYW1lLCAwNDQ0LCBuYW1lX3Nob3csIE5VTEwpOw0KK0NM QVNTX0RFVklDRV9BVFRSKGRldmljZXMsIDA0NDQsIGRldmljZXNfc2hvdywgTlVMTCk7DQorc3Rh dGljIENMQVNTX0RFVklDRV9BVFRSKHNjb21wbGV0ZWQsIDA0NDQsIHNjb21wbGV0ZWRfc2hvdywg TlVMTCk7DQorc3RhdGljIENMQVNTX0RFVklDRV9BVFRSKHNzdGFydGVkLCAwNDQ0LCBzc3RhcnRl ZF9zaG93LCBOVUxMKTsNCitzdGF0aWMgQ0xBU1NfREVWSUNFX0FUVFIoc2ZpbmlzaGVkLCAwNDQ0 LCBzZmluaXNoZWRfc2hvdywgTlVMTCk7DQorc3RhdGljIENMQVNTX0RFVklDRV9BVFRSKGttZW1f 
ZmFpbGVkLCAwNDQ0LCBrbWVtX2ZhaWxlZF9zaG93LCBOVUxMKTsNCisNCitzdGF0aWMgaW50IGNv bXBhcmVfZGV2aWNlKHN0cnVjdCBjcnlwdG9fZGV2aWNlICpkMSwgc3RydWN0IGNyeXB0b19kZXZp Y2UgKmQyKQ0KK3sNCisJaWYgKCFzdHJuY21wKGQxLT5uYW1lLCBkMi0+bmFtZSwgc2l6ZW9mKGQx LT5uYW1lKSkpDQorCQlyZXR1cm4gMTsNCisNCisJcmV0dXJuIDA7DQorfQ0KKw0KK3N0YXRpYyB2 b2lkIGNyZWF0ZV9kZXZpY2VfYXR0cmlidXRlcyhzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZGV2KQ0K K3sNCisJY2xhc3NfZGV2aWNlX2NyZWF0ZV9maWxlKCZkZXYtPmNsYXNzX2RldmljZSwgJmNsYXNz X2RldmljZV9hdHRyX3Nlc3Npb25zKTsNCisJY2xhc3NfZGV2aWNlX2NyZWF0ZV9maWxlKCZkZXYt PmNsYXNzX2RldmljZSwgJmNsYXNzX2RldmljZV9hdHRyX25hbWUpOw0KKwljbGFzc19kZXZpY2Vf Y3JlYXRlX2ZpbGUoJmRldi0+Y2xhc3NfZGV2aWNlLCAmY2xhc3NfZGV2aWNlX2F0dHJfc2NvbXBs ZXRlZCk7DQorCWNsYXNzX2RldmljZV9jcmVhdGVfZmlsZSgmZGV2LT5jbGFzc19kZXZpY2UsICZj bGFzc19kZXZpY2VfYXR0cl9zc3RhcnRlZCk7DQorCWNsYXNzX2RldmljZV9jcmVhdGVfZmlsZSgm ZGV2LT5jbGFzc19kZXZpY2UsICZjbGFzc19kZXZpY2VfYXR0cl9zZmluaXNoZWQpOw0KKwljbGFz c19kZXZpY2VfY3JlYXRlX2ZpbGUoJmRldi0+Y2xhc3NfZGV2aWNlLCAmY2xhc3NfZGV2aWNlX2F0 dHJfa21lbV9mYWlsZWQpOw0KK30NCisNCitzdGF0aWMgdm9pZCByZW1vdmVfZGV2aWNlX2F0dHJp YnV0ZXMoc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldikNCit7DQorCWNsYXNzX2RldmljZV9yZW1v dmVfZmlsZSgmZGV2LT5jbGFzc19kZXZpY2UsICZjbGFzc19kZXZpY2VfYXR0cl9zZXNzaW9ucyk7 DQorCWNsYXNzX2RldmljZV9yZW1vdmVfZmlsZSgmZGV2LT5jbGFzc19kZXZpY2UsICZjbGFzc19k ZXZpY2VfYXR0cl9uYW1lKTsNCisJY2xhc3NfZGV2aWNlX3JlbW92ZV9maWxlKCZkZXYtPmNsYXNz X2RldmljZSwgJmNsYXNzX2RldmljZV9hdHRyX3Njb21wbGV0ZWQpOw0KKwljbGFzc19kZXZpY2Vf cmVtb3ZlX2ZpbGUoJmRldi0+Y2xhc3NfZGV2aWNlLCAmY2xhc3NfZGV2aWNlX2F0dHJfc3N0YXJ0 ZWQpOw0KKwljbGFzc19kZXZpY2VfcmVtb3ZlX2ZpbGUoJmRldi0+Y2xhc3NfZGV2aWNlLCAmY2xh c3NfZGV2aWNlX2F0dHJfc2ZpbmlzaGVkKTsNCisJY2xhc3NfZGV2aWNlX3JlbW92ZV9maWxlKCZk ZXYtPmNsYXNzX2RldmljZSwgJmNsYXNzX2RldmljZV9hdHRyX2ttZW1fZmFpbGVkKTsNCit9DQor DQoraW50IF9fbWF0Y2hfaW5pdGlhbGl6ZXIoc3RydWN0IGNyeXB0b19jYXBhYmlsaXR5ICpjYXAs IHN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9pbml0aWFsaXplciAqY2kpDQorew0KKwlpZiAoY2FwLT5v 
cGVyYXRpb24gPT0gY2ktPm9wZXJhdGlvbiAmJiBjYXAtPnR5cGUgPT0gY2ktPnR5cGUgJiYgDQor CQkJY2FwLT5tb2RlID09IChjaS0+bW9kZSAmIDB4MWZmZikpDQorCQlyZXR1cm4gMTsNCisNCisJ cmV0dXJuIDA7DQorfQ0KKw0KK2ludCBtYXRjaF9pbml0aWFsaXplcihzdHJ1Y3QgY3J5cHRvX2Rl dmljZSAqZGV2LCBzdHJ1Y3QgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIgKmNpKQ0KK3sNCisJ aW50IGk7DQorDQorCWZvciAoaSA9IDA7IGkgPCBkZXYtPmNhcF9udW1iZXI7ICsraSkgew0KKwkJ c3RydWN0IGNyeXB0b19jYXBhYmlsaXR5ICpjYXAgPSAmZGV2LT5jYXBbaV07DQorDQorCQlpZiAo X19tYXRjaF9pbml0aWFsaXplcihjYXAsIGNpKSkgew0KKwkJCWlmIChjYXAtPnFsZW4gPj0gYXRv bWljX3JlYWQoJmRldi0+cmVmY250KSArIDEpIHsNCisJCQkJZHByaW50aygiY2FwLT5sZW49JXUs IHJlcT0ldS5cbiIsDQorCQkJCQljYXAtPnFsZW4sIGF0b21pY19yZWFkKCZkZXYtPnJlZmNudCkg KyAxKTsNCisJCQkJcmV0dXJuIDE7DQorCQkJfQ0KKwkJfQ0KKwl9DQorDQorCXJldHVybiAwOw0K K30NCisNCit2b2lkIGNyeXB0b19kZXZpY2VfZ2V0KHN0cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYp DQorew0KKwlhdG9taWNfaW5jKCZkZXYtPnJlZmNudCk7DQorfQ0KKw0KK3N0cnVjdCBjcnlwdG9f ZGV2aWNlICpjcnlwdG9fZGV2aWNlX2dldF9uYW1lKGNoYXIgKm5hbWUpDQorew0KKwlzdHJ1Y3Qg Y3J5cHRvX2RldmljZSAqZGV2Ow0KKwlpbnQgZm91bmQgPSAwOw0KKw0KKwlzcGluX2xvY2tfaXJx KCZjZGV2X2xvY2spOw0KKwlsaXN0X2Zvcl9lYWNoX2VudHJ5KGRldiwgJmNkZXZfbGlzdCwgY2Rl dl9lbnRyeSkgew0KKwkJaWYgKCFzdHJjbXAoZGV2LT5uYW1lLCBuYW1lKSkgew0KKwkJCWZvdW5k ID0gMTsNCisJCQljcnlwdG9fZGV2aWNlX2dldChkZXYpOw0KKwkJCWJyZWFrOw0KKwkJfQ0KKwl9 DQorCXNwaW5fdW5sb2NrX2lycSgmY2Rldl9sb2NrKTsNCisNCisJaWYgKCFmb3VuZCkNCisJCXJl dHVybiBOVUxMOw0KKw0KKwlyZXR1cm4gZGV2Ow0KK30NCisNCit2b2lkIGNyeXB0b19kZXZpY2Vf cHV0KHN0cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYpDQorew0KKwlhdG9taWNfZGVjKCZkZXYtPnJl ZmNudCk7DQorfQ0KKw0KK2ludCBfX2NyeXB0b19kZXZpY2VfYWRkKHN0cnVjdCBjcnlwdG9fZGV2 aWNlICpkZXYpDQorew0KKwlpbnQgZXJyOw0KKw0KKwltZW1zZXQoJmRldi0+c3RhdCwgMCwgc2l6 ZW9mKGRldi0+c3RhdCkpOw0KKwlzcGluX2xvY2tfaW5pdCgmZGV2LT5zdGF0X2xvY2spOw0KKwlz cGluX2xvY2tfaW5pdCgmZGV2LT5sb2NrKTsNCisJc3Bpbl9sb2NrX2luaXQoJmRldi0+c2Vzc2lv bl9sb2NrKTsNCisJSU5JVF9MSVNUX0hFQUQoJmRldi0+c2Vzc2lvbl9saXN0KTsNCisJYXRvbWlj 
X3NldCgmZGV2LT5yZWZjbnQsIDApOw0KKwlkZXYtPnNpZCA9IDA7DQorCWRldi0+ZmxhZ3MgPSAw Ow0KKwlpbml0X2NvbXBsZXRpb24oJmRldi0+ZGV2X3JlbGVhc2VkKTsNCisJbWVtY3B5KCZkZXYt PmRldmljZSwgJmNyeXB0b19kZXYsIHNpemVvZihzdHJ1Y3QgZGV2aWNlKSk7DQorCWRldi0+ZHJp dmVyID0gJmNyeXB0b19kcml2ZXI7DQorDQorCXNucHJpbnRmKGRldi0+ZGV2aWNlLmJ1c19pZCwg c2l6ZW9mKGRldi0+ZGV2aWNlLmJ1c19pZCksICIlcyIsIGRldi0+bmFtZSk7DQorCWVyciA9IGRl dmljZV9yZWdpc3RlcigmZGV2LT5kZXZpY2UpOw0KKwlpZiAoZXJyKSB7DQorCQlkcHJpbnRrKEtF Uk5fRVJSICJGYWlsZWQgdG8gcmVnaXN0ZXIgY3J5cHRvIGRldmljZSAlczogZXJyPSVkLlxuIiwN CisJCQlkZXYtPm5hbWUsIGVycik7DQorCQlyZXR1cm4gZXJyOw0KKwl9DQorDQorCXNucHJpbnRm KGRldi0+Y2xhc3NfZGV2aWNlLmNsYXNzX2lkLCBzaXplb2YoZGV2LT5jbGFzc19kZXZpY2UuY2xh c3NfaWQpLCAiJXMiLCBkZXYtPm5hbWUpOw0KKwlkZXYtPmNsYXNzX2RldmljZS5kZXYgPSAmZGV2 LT5kZXZpY2U7DQorCWRldi0+Y2xhc3NfZGV2aWNlLmNsYXNzID0gJmNyeXB0b19jbGFzczsNCisN CisJZXJyID0gY2xhc3NfZGV2aWNlX3JlZ2lzdGVyKCZkZXYtPmNsYXNzX2RldmljZSk7DQorCWlm IChlcnIpIHsNCisJCWRwcmludGsoS0VSTl9FUlIgIkZhaWxlZCB0byByZWdpc3RlciBjcnlwdG8g Y2xhc3MgZGV2aWNlICVzOiBlcnI9JWQuXG4iLA0KKwkJCWRldi0+bmFtZSwgZXJyKTsNCisJCWRl dmljZV91bnJlZ2lzdGVyKCZkZXYtPmRldmljZSk7DQorCQlyZXR1cm4gZXJyOw0KKwl9DQorDQor CWNyZWF0ZV9kZXZpY2VfYXR0cmlidXRlcyhkZXYpOw0KKw0KKwlyZXR1cm4gMDsNCit9DQorDQor dm9pZCBfX2NyeXB0b19kZXZpY2VfcmVtb3ZlKHN0cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYpDQor ew0KKwlyZW1vdmVfZGV2aWNlX2F0dHJpYnV0ZXMoZGV2KTsNCisJY2xhc3NfZGV2aWNlX3VucmVn aXN0ZXIoJmRldi0+Y2xhc3NfZGV2aWNlKTsNCisJZGV2aWNlX3VucmVnaXN0ZXIoJmRldi0+ZGV2 aWNlKTsNCit9DQorDQoraW50IGNyeXB0b19kZXZpY2VfYWRkKHN0cnVjdCBjcnlwdG9fZGV2aWNl ICpkZXYpDQorew0KKwlpbnQgZXJyOw0KKw0KKwllcnIgPSBfX2NyeXB0b19kZXZpY2VfYWRkKGRl dik7DQorCWlmIChlcnIpDQorCQlyZXR1cm4gZXJyOw0KKw0KKwlzcGluX2xvY2tfaXJxKCZjZGV2 X2xvY2spOw0KKwlsaXN0X2FkZCgmZGV2LT5jZGV2X2VudHJ5LCAmY2Rldl9saXN0KTsNCisJZGV2 LT5pZCA9ICsrY2Rldl9pZHM7DQorCXNwaW5fdW5sb2NrX2lycSgmY2Rldl9sb2NrKTsNCisNCisJ ZHByaW50ayhLRVJOX0lORk8gIkNyeXB0byBkZXZpY2UgJXMgd2FzIHJlZ2lzdGVyZWQgd2l0aCBJ 
RD0leC5cbiIsDQorCQlkZXYtPm5hbWUsIGRldi0+aWQpOw0KKw0KKwlyZXR1cm4gMDsNCit9DQor DQordm9pZCBjcnlwdG9fZGV2aWNlX3JlbW92ZShzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZGV2KQ0K K3sNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKl9fZGV2LCAqbjsNCisNCisJX19jcnlwdG9fZGV2 aWNlX3JlbW92ZShkZXYpOw0KKw0KKwlzcGluX2xvY2tfaXJxKCZjZGV2X2xvY2spOw0KKwlsaXN0 X2Zvcl9lYWNoX2VudHJ5X3NhZmUoX19kZXYsIG4sICZjZGV2X2xpc3QsIGNkZXZfZW50cnkpIHsN CisJCWlmIChjb21wYXJlX2RldmljZShfX2RldiwgZGV2KSkgew0KKwkJCWxpc3RfZGVsX2luaXQo Jl9fZGV2LT5jZGV2X2VudHJ5KTsNCisJCQlzcGluX3VubG9ja19pcnEoJmNkZXZfbG9jayk7DQor DQorCQkJLyoNCisJCQkgKiBJbiB0ZXN0IGNhc2VzIG9yIHdoZW4gY3J5cHRvIGRldmljZSBkcml2 ZXIgaXMgbm90IHdyaXR0ZW4gY29ycmVjdGx5LA0KKwkJCSAqIGl0J3MgLT5kYXRhX3JlYWR5KCkg bWV0aG9kIHdpbGwgbm90IGJlIGNhbGxlbiBhbnltb3JlDQorCQkJICogYWZ0ZXIgZGV2aWNlIGlz IHJlbW92ZWQgZnJvbSBjcnlwdG8gZGV2aWNlIGxpc3QuDQorCQkJICoNCisJCQkgKiBGb3Igc3Vj aCBjYXNlcyB3ZSBlaXRoZXIgc2hvdWxkIHByb3ZpZGUgLT5mbHVzaCgpIGNhbGwNCisJCQkgKiBv ciBwcm9wZXJseSB3cml0ZSAtPmRhdGFfcmVhZHkoKSBtZXRob2QuDQorCQkJICovDQorDQorCQkJ d2hpbGUgKGF0b21pY19yZWFkKCZfX2Rldi0+cmVmY250KSkgew0KKw0KKwkJCQlkcHJpbnRrKEtF Uk5fSU5GTyAiV2FpdGluZyBmb3IgJXMgdG8gYmVjb21lIGZyZWU6IHJlZmNudD0lZC5cbiIsDQor CQkJCQlfX2Rldi0+bmFtZSwgYXRvbWljX3JlYWQoJmRldi0+cmVmY250KSk7DQorDQorCQkJCS8q DQorCQkJCSAqIEhhY2sgem9uZTogeW91IG5lZWQgdG8gd3JpdGUgZ29vZCAtPmRhdGFfcmVhZHko KQ0KKwkJCQkgKiBhbmQgY3J5cHRvIGRldmljZSBkcml2ZXIgaXRzZWxmLg0KKwkJCQkgKg0KKwkJ CQkgKiBEcml2ZXIgc2hvdWQgbm90IGJ1enogaWYgaXQgaGFzIHBlbmRpbmcgc2Vzc2lvbnMNCisJ CQkJICogaW4gaXQncyBxdWV1ZSBhbmQgaXQgd2FzIHJlbW92ZWQgZnJvbSBnbG9iYWwgZGV2aWNl IGxpc3QuDQorCQkJCSAqDQorCQkJCSAqIEFsdGhvdWdoIEkgY2FuIHdvcmthcm91bmQgaXQgaGVy ZSwgZm9yIGV4YW1wbGUgYnkNCisJCQkJICogZmx1c2hpbmcgdGhlIHdob2xlIHF1ZXVlIGFuZCBk cm9wIGFsbCBwZW5kaW5nIHNlc3Npb25zLg0KKwkJCQkgKi8NCisNCisJCQkJX19kZXYtPmRhdGFf cmVhZHkoX19kZXYpOw0KKwkJCQlzZXRfY3VycmVudF9zdGF0ZShUQVNLX1VOSU5URVJSVVBUSUJM RSk7DQorCQkJCXNjaGVkdWxlX3RpbWVvdXQoSFopOw0KKwkJCX0NCisNCisJCQlkcHJpbnRrKEtF 
Uk5fRVJSICJDcnlwdG8gZGV2aWNlICVzIHdhcyB1bnJlZ2lzdGVyZWQuXG4iLA0KKwkJCQlkZXYt Pm5hbWUpOw0KKwkJCXJldHVybjsNCisJCX0NCisJfQ0KKwlzcGluX3VubG9ja19pcnEoJmNkZXZf bG9jayk7DQorDQorCWRwcmludGsoS0VSTl9FUlIgIkNyeXB0byBkZXZpY2UgJXMgd2FzIG5vdCBy ZWdpc3RlcmVkLlxuIiwgZGV2LT5uYW1lKTsNCit9DQorDQorRVhQT1JUX1NZTUJPTF9HUEwoY3J5 cHRvX2RldmljZV9hZGQpOw0KK0VYUE9SVF9TWU1CT0xfR1BMKGNyeXB0b19kZXZpY2VfcmVtb3Zl KTsNCitFWFBPUlRfU1lNQk9MX0dQTChjcnlwdG9fZGV2aWNlX2dldCk7DQorRVhQT1JUX1NZTUJP TF9HUEwoY3J5cHRvX2RldmljZV9nZXRfbmFtZSk7DQorRVhQT1JUX1NZTUJPTF9HUEwoY3J5cHRv X2RldmljZV9wdXQpOw0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvY3J5cHRvX2xiLmMgbGludXgtMi42 L2RyaXZlcnMvYWNyeXB0by9jcnlwdG9fbGIuYw0KLS0tIC90bXAvZW1wdHkvY3J5cHRvX2xiLmMJ MTk3MC0wMS0wMSAwMzowMDowMC4wMDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVy cy9hY3J5cHRvL2NyeXB0b19sYi5jCTIwMDQtMTItMTQgMTg6NTM6MTEuMDAwMDAwMDAwICswMzAw DQpAQCAtMCwwICsxLDYzMSBAQA0KKy8qDQorICogCWNyeXB0b19sYi5jDQorICoNCisgKiBDb3B5 cmlnaHQgKGMpIDIwMDQgRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4NCisg KiANCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3UgY2FuIHJlZGlz dHJpYnV0ZSBpdCBhbmQvb3IgbW9kaWZ5DQorICogaXQgdW5kZXIgdGhlIHRlcm1zIG9mIHRoZSBH TlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkNCisgKiB0aGUgRnJlZSBT b2Z0d2FyZSBGb3VuZGF0aW9uOyBlaXRoZXIgdmVyc2lvbiAyIG9mIHRoZSBMaWNlbnNlLCBvcg0K KyAqIChhdCB5b3VyIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQorICoNCisgKiBUaGlzIHBy b2dyYW0gaXMgZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhhdCBpdCB3aWxsIGJlIHVzZWZ1bCwN CisgKiBidXQgV0lUSE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhvdXQgZXZlbiB0aGUgaW1wbGllZCB3 YXJyYW50eSBvZg0KKyAqIE1FUkNIQU5UQUJJTElUWSBvciBGSVRORVNTIEZPUiBBIFBBUlRJQ1VM QVIgUFVSUE9TRS4gIFNlZSB0aGUNCisgKiBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBmb3Ig bW9yZSBkZXRhaWxzLg0KKyAqDQorICogWW91IHNob3VsZCBoYXZlIHJlY2VpdmVkIGEgY29weSBv ZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UNCisgKiBhbG9uZyB3aXRoIHRoaXMgcHJv Z3JhbTsgaWYgbm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZQ0KKyAqIEZvdW5kYXRpb24s 
IEluYy4sIDU5IFRlbXBsZSBQbGFjZSwgU3VpdGUgMzMwLCBCb3N0b24sIE1BIDAyMTExLTEzMDcg VVNBDQorICovDQorDQorI2luY2x1ZGUgPGxpbnV4L2tlcm5lbC5oPg0KKyNpbmNsdWRlIDxsaW51 eC9tb2R1bGUuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlcGFyYW0uaD4NCisjaW5jbHVkZSA8 bGludXgvdHlwZXMuaD4NCisjaW5jbHVkZSA8bGludXgvbGlzdC5oPg0KKyNpbmNsdWRlIDxsaW51 eC9zbGFiLmg+DQorI2luY2x1ZGUgPGxpbnV4L2ludGVycnVwdC5oPg0KKyNpbmNsdWRlIDxsaW51 eC9zcGlubG9jay5oPg0KKyNpbmNsdWRlIDxsaW51eC93b3JrcXVldWUuaD4NCisjaW5jbHVkZSA8 bGludXgvZXJyLmg+DQorDQorI2luY2x1ZGUgImFjcnlwdG8uaCINCisjaW5jbHVkZSAiY3J5cHRv X2xiLmgiDQorI2luY2x1ZGUgImNyeXB0b19zdGF0LmgiDQorI2luY2x1ZGUgImNyeXB0b19yb3V0 ZS5oIg0KKw0KK3N0YXRpYyBMSVNUX0hFQUQoY3J5cHRvX2xiX2xpc3QpOw0KK3N0YXRpYyBzcGlu bG9ja190IGNyeXB0b19sYl9sb2NrID0gU1BJTl9MT0NLX1VOTE9DS0VEOw0KK3N0YXRpYyBpbnQg bGJfbnVtID0gMDsNCitzdGF0aWMgc3RydWN0IGNyeXB0b19sYiAqY3VycmVudF9sYiwgKmRlZmF1 bHRfbGI7DQorc3RhdGljIHN0cnVjdCBjb21wbGV0aW9uIHRocmVhZF9leGl0ZWQ7DQorc3RhdGlj IGludCBuZWVkX2V4aXQ7DQorc3RhdGljIHN0cnVjdCB3b3JrcXVldWVfc3RydWN0ICpjcnlwdG9f bGJfcXVldWU7DQorc3RhdGljIERFQ0xBUkVfV0FJVF9RVUVVRV9IRUFEKGNyeXB0b19sYl93YWl0 X3F1ZXVlKTsNCisNCitleHRlcm4gc3RydWN0IGxpc3RfaGVhZCAqY3J5cHRvX2RldmljZV9saXN0 Ow0KK2V4dGVybiBzcGlubG9ja190ICpjcnlwdG9fZGV2aWNlX2xvY2s7DQorDQorZXh0ZXJuIGlu dCBmb3JjZV9sYl9yZW1vdmU7DQorZXh0ZXJuIHN0cnVjdCBjcnlwdG9fZGV2aWNlIG1haW5fY3J5 cHRvX2RldmljZTsNCisNCitzdGF0aWMgaW50IGxiX2lzX2N1cnJlbnQoc3RydWN0IGNyeXB0b19s YiAqbCkNCit7DQorCXJldHVybiAobC0+Y3J5cHRvX2RldmljZV9saXN0ICE9IE5VTEwgJiYgbC0+ Y3J5cHRvX2RldmljZV9sb2NrICE9IE5VTEwpOw0KK30NCisNCitzdGF0aWMgaW50IGxiX2lzX2Rl ZmF1bHQoc3RydWN0IGNyeXB0b19sYiAqbCkNCit7DQorCXJldHVybiAobCA9PSBkZWZhdWx0X2xi KTsNCit9DQorDQorc3RhdGljIHZvaWQgX19sYl9zZXRfY3VycmVudChzdHJ1Y3QgY3J5cHRvX2xi ICpsKQ0KK3sNCisJc3RydWN0IGNyeXB0b19sYiAqYyA9IGN1cnJlbnRfbGI7DQorDQorCWlmIChj KSB7DQorCQlsLT5jcnlwdG9fZGV2aWNlX2xpc3QgPSBjcnlwdG9fZGV2aWNlX2xpc3Q7DQorCQls LT5jcnlwdG9fZGV2aWNlX2xvY2sgPSBjcnlwdG9fZGV2aWNlX2xvY2s7DQorCQljdXJyZW50X2xi 
ID0gbDsNCisJCWMtPmNyeXB0b19kZXZpY2VfbGlzdCA9IE5VTEw7DQorCQljLT5jcnlwdG9fZGV2 aWNlX2xvY2sgPSBOVUxMOw0KKwl9IGVsc2Ugew0KKwkJbC0+Y3J5cHRvX2RldmljZV9saXN0ID0g Y3J5cHRvX2RldmljZV9saXN0Ow0KKwkJbC0+Y3J5cHRvX2RldmljZV9sb2NrID0gY3J5cHRvX2Rl dmljZV9sb2NrOw0KKwkJY3VycmVudF9sYiA9IGw7DQorCX0NCit9DQorDQorc3RhdGljIHZvaWQg bGJfc2V0X2N1cnJlbnQoc3RydWN0IGNyeXB0b19sYiAqbCkNCit7DQorCXN0cnVjdCBjcnlwdG9f bGIgKmMgPSBjdXJyZW50X2xiOw0KKw0KKwlpZiAoYykgew0KKwkJc3Bpbl9sb2NrX2lycSgmYy0+ bG9jayk7DQorCQlfX2xiX3NldF9jdXJyZW50KGwpOw0KKwkJc3Bpbl91bmxvY2tfaXJxKCZjLT5s b2NrKTsNCisJfSBlbHNlDQorCQlfX2xiX3NldF9jdXJyZW50KGwpOw0KK30NCisNCitzdGF0aWMg dm9pZCBfX2xiX3NldF9kZWZhdWx0KHN0cnVjdCBjcnlwdG9fbGIgKmwpDQorew0KKwlkZWZhdWx0 X2xiID0gbDsNCit9DQorDQorc3RhdGljIHZvaWQgbGJfc2V0X2RlZmF1bHQoc3RydWN0IGNyeXB0 b19sYiAqbCkNCit7DQorCXN0cnVjdCBjcnlwdG9fbGIgKmMgPSBkZWZhdWx0X2xiOw0KKw0KKwlp ZiAoYykgew0KKwkJc3Bpbl9sb2NrX2lycSgmYy0+bG9jayk7DQorCQlfX2xiX3NldF9kZWZhdWx0 KGwpOw0KKwkJc3Bpbl91bmxvY2tfaXJxKCZjLT5sb2NrKTsNCisJfSBlbHNlDQorCQlfX2xiX3Nl dF9kZWZhdWx0KGwpOw0KK30NCisNCitzdGF0aWMgaW50IGNyeXB0b19sYl9tYXRjaChzdHJ1Y3Qg ZGV2aWNlICpkZXYsIHN0cnVjdCBkZXZpY2VfZHJpdmVyICpkcnYpDQorew0KKwlyZXR1cm4gMTsN Cit9DQorDQorc3RhdGljIGludCBjcnlwdG9fbGJfcHJvYmUoc3RydWN0IGRldmljZSAqZGV2KQ0K K3sNCisJcmV0dXJuIC1FTk9ERVY7DQorfQ0KKw0KK3N0YXRpYyBpbnQgY3J5cHRvX2xiX3JlbW92 ZShzdHJ1Y3QgZGV2aWNlICpkZXYpDQorew0KKwlyZXR1cm4gMDsNCit9DQorDQorc3RhdGljIHZv aWQgY3J5cHRvX2xiX3JlbGVhc2Uoc3RydWN0IGRldmljZSAqZGV2KQ0KK3sNCisJc3RydWN0IGNy eXB0b19sYiAqZCA9IGNvbnRhaW5lcl9vZihkZXYsIHN0cnVjdCBjcnlwdG9fbGIsIGRldmljZSk7 DQorDQorCWNvbXBsZXRlKCZkLT5kZXZfcmVsZWFzZWQpOw0KK30NCisNCitzdGF0aWMgdm9pZCBj cnlwdG9fbGJfY2xhc3NfcmVsZWFzZShzdHJ1Y3QgY2xhc3MgKmNsYXNzKQ0KK3sNCit9DQorDQor c3RhdGljIHZvaWQgY3J5cHRvX2xiX2NsYXNzX3JlbGVhc2VfZGV2aWNlKHN0cnVjdCBjbGFzc19k ZXZpY2UgKmNsYXNzX2RldikNCit7DQorfQ0KKw0KK3N0cnVjdCBjbGFzcyBjcnlwdG9fbGJfY2xh c3MgPSB7DQorCS5uYW1lIAkJPSAiY3J5cHRvX2xiIiwNCisJLmNsYXNzX3JlbGVhc2UgCT0gY3J5 
cHRvX2xiX2NsYXNzX3JlbGVhc2UsDQorCS5yZWxlYXNlIAk9IGNyeXB0b19sYl9jbGFzc19yZWxl YXNlX2RldmljZQ0KK307DQorDQorc3RydWN0IGJ1c190eXBlIGNyeXB0b19sYl9idXNfdHlwZSA9 IHsNCisJLm5hbWUgCQk9ICJjcnlwdG9fbGIiLA0KKwkubWF0Y2ggCQk9IGNyeXB0b19sYl9tYXRj aA0KK307DQorDQorc3RydWN0IGRldmljZV9kcml2ZXIgY3J5cHRvX2xiX2RyaXZlciA9IHsNCisJ Lm5hbWUgCQk9ICJjcnlwdG9fbGJfZHJpdmVyIiwNCisJLmJ1cyAJCT0gJmNyeXB0b19sYl9idXNf dHlwZSwNCisJLnByb2JlIAkJPSBjcnlwdG9fbGJfcHJvYmUsDQorCS5yZW1vdmUgCT0gY3J5cHRv X2xiX3JlbW92ZSwNCit9Ow0KKw0KK3N0cnVjdCBkZXZpY2UgY3J5cHRvX2xiX2RldiA9IHsNCisJ LnBhcmVudCAJPSBOVUxMLA0KKwkuYnVzIAkJPSAmY3J5cHRvX2xiX2J1c190eXBlLA0KKwkuYnVz X2lkIAk9ICJjcnlwdG8gbG9hZCBiYWxhbmNlciIsDQorCS5kcml2ZXIgCT0gJmNyeXB0b19sYl9k cml2ZXIsDQorCS5yZWxlYXNlIAk9ICZjcnlwdG9fbGJfcmVsZWFzZQ0KK307DQorDQorc3RhdGlj IHNzaXplX3QgbmFtZV9zaG93KHN0cnVjdCBjbGFzc19kZXZpY2UgKmRldiwgY2hhciAqYnVmKQ0K K3sNCisJc3RydWN0IGNyeXB0b19sYiAqbGIgPSBjb250YWluZXJfb2YoZGV2LCBzdHJ1Y3QgY3J5 cHRvX2xiLCBjbGFzc19kZXZpY2UpOw0KKw0KKwlyZXR1cm4gc3ByaW50ZihidWYsICIlc1xuIiwg bGItPm5hbWUpOw0KK30NCisNCitzdGF0aWMgc3NpemVfdCBjdXJyZW50X3Nob3coc3RydWN0IGNs YXNzX2RldmljZSAqZGV2LCBjaGFyICpidWYpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX2xiICpsYjsN CisJaW50IG9mZiA9IDA7DQorDQorCXNwaW5fbG9ja19pcnEoJmNyeXB0b19sYl9sb2NrKTsNCisN CisJbGlzdF9mb3JfZWFjaF9lbnRyeShsYiwgJmNyeXB0b19sYl9saXN0LCBsYl9lbnRyeSkgew0K KwkJaWYgKGxiX2lzX2N1cnJlbnQobGIpKQ0KKwkJCW9mZiArPSBzcHJpbnRmKGJ1ZiArIG9mZiwg IlsiKTsNCisJCWlmIChsYl9pc19kZWZhdWx0KGxiKSkNCisJCQlvZmYgKz0gc3ByaW50ZihidWYg KyBvZmYsICIoIik7DQorCQlvZmYgKz0gc3ByaW50ZihidWYgKyBvZmYsICIlcyIsIGxiLT5uYW1l KTsNCisJCWlmIChsYl9pc19kZWZhdWx0KGxiKSkNCisJCQlvZmYgKz0gc3ByaW50ZihidWYgKyBv ZmYsICIpIik7DQorCQlpZiAobGJfaXNfY3VycmVudChsYikpDQorCQkJb2ZmICs9IHNwcmludGYo YnVmICsgb2ZmLCAiXSIpOw0KKwl9DQorDQorCXNwaW5fdW5sb2NrX2lycSgmY3J5cHRvX2xiX2xv Y2spOw0KKw0KKwlpZiAoIW9mZikNCisJCW9mZiA9IHNwcmludGYoYnVmLCAiTm8gbG9hZCBiYWxh bmNlcnMgcmVnaXRlcmVkIHlldC4iKTsNCisNCisJb2ZmICs9IHNwcmludGYoYnVmICsgb2ZmLCAi 
XG4iKTsNCisNCisJcmV0dXJuIG9mZjsNCit9DQorc3RhdGljIHNzaXplX3QgY3VycmVudF9zdG9y ZShzdHJ1Y3QgY2xhc3NfZGV2aWNlICpkZXYsIGNvbnN0IGNoYXIgKmJ1Ziwgc2l6ZV90IGNvdW50 KQ0KK3sNCisJc3RydWN0IGNyeXB0b19sYiAqbGI7DQorDQorCXNwaW5fbG9ja19pcnEoJmNyeXB0 b19sYl9sb2NrKTsNCisNCisJbGlzdF9mb3JfZWFjaF9lbnRyeShsYiwgJmNyeXB0b19sYl9saXN0 LCBsYl9lbnRyeSkgew0KKwkJaWYgKGNvdW50ID09IHN0cmxlbihsYi0+bmFtZSkgJiYgIXN0cmNt cChidWYsIGxiLT5uYW1lKSkgew0KKwkJCWxiX3NldF9jdXJyZW50KGxiKTsNCisJCQlsYl9zZXRf ZGVmYXVsdChsYik7DQorDQorCQkJZHByaW50ayhLRVJOX0lORk8gIkxvYWQgYmFsYW5jZXIgJXMg aXMgc2V0IGFzIGN1cnJlbnQgYW5kIGRlZmF1bHQuXG4iLA0KKwkJCQlsYi0+bmFtZSk7DQorDQor CQkJYnJlYWs7DQorCQl9DQorCX0NCisJc3Bpbl91bmxvY2tfaXJxKCZjcnlwdG9fbGJfbG9jayk7 DQorDQorCXJldHVybiBjb3VudDsNCit9DQorDQorc3RhdGljIENMQVNTX0RFVklDRV9BVFRSKG5h bWUsIDA0NDQsIG5hbWVfc2hvdywgTlVMTCk7DQorQ0xBU1NfREVWSUNFX0FUVFIobGJzLCAwNjQ0 LCBjdXJyZW50X3Nob3csIGN1cnJlbnRfc3RvcmUpOw0KKw0KK3N0YXRpYyB2b2lkIGNyZWF0ZV9k ZXZpY2VfYXR0cmlidXRlcyhzdHJ1Y3QgY3J5cHRvX2xiICpsYikNCit7DQorCWNsYXNzX2Rldmlj ZV9jcmVhdGVfZmlsZSgmbGItPmNsYXNzX2RldmljZSwgJmNsYXNzX2RldmljZV9hdHRyX25hbWUp Ow0KK30NCisNCitzdGF0aWMgdm9pZCByZW1vdmVfZGV2aWNlX2F0dHJpYnV0ZXMoc3RydWN0IGNy eXB0b19sYiAqbGIpDQorew0KKwljbGFzc19kZXZpY2VfcmVtb3ZlX2ZpbGUoJmxiLT5jbGFzc19k ZXZpY2UsICZjbGFzc19kZXZpY2VfYXR0cl9uYW1lKTsNCit9DQorDQorc3RhdGljIGludCBjb21w YXJlX2xiKHN0cnVjdCBjcnlwdG9fbGIgKmwxLCBzdHJ1Y3QgY3J5cHRvX2xiICpsMikNCit7DQor CWlmICghc3RybmNtcChsMS0+bmFtZSwgbDItPm5hbWUsIHNpemVvZihsMS0+bmFtZSkpKQ0KKwkJ cmV0dXJuIDE7DQorDQorCXJldHVybiAwOw0KK30NCisNCit2b2lkIGNyeXB0b19sYl9yZWhhc2go dm9pZCkNCit7DQorCWlmICghY3VycmVudF9sYikNCisJCXJldHVybjsNCisNCisJc3Bpbl9sb2Nr X2lycSgmY3VycmVudF9sYi0+bG9jayk7DQorDQorCWN1cnJlbnRfbGItPnJlaGFzaChjdXJyZW50 X2xiKTsNCisNCisJc3Bpbl91bmxvY2tfaXJxKCZjdXJyZW50X2xiLT5sb2NrKTsNCisNCisJd2Fr ZV91cF9pbnRlcnJ1cHRpYmxlKCZjcnlwdG9fbGJfd2FpdF9xdWV1ZSk7DQorfQ0KKw0KK3N0cnVj dCBjcnlwdG9fZGV2aWNlICpjcnlwdG9fbGJfZmluZF9kZXZpY2Uoc3RydWN0IGNyeXB0b19zZXNz 
aW9uX2luaXRpYWxpemVyICpjaSwgc3RydWN0IGNyeXB0b19kYXRhICpkYXRhKQ0KK3sNCisJc3Ry dWN0IGNyeXB0b19kZXZpY2UgKmRldjsNCisNCisJaWYgKCFjdXJyZW50X2xiKQ0KKwkJcmV0dXJu IE5VTEw7DQorDQorCWlmIChzY2lfYmluZGVkKGNpKSkgew0KKwkJaW50IGZvdW5kID0gMDsNCisN CisJCXNwaW5fbG9ja19pcnEoY3J5cHRvX2RldmljZV9sb2NrKTsNCisNCisJCWxpc3RfZm9yX2Vh Y2hfZW50cnkoZGV2LCBjcnlwdG9fZGV2aWNlX2xpc3QsIGNkZXZfZW50cnkpIHsNCisJCQlpZiAo ZGV2LT5pZCA9PSBjaS0+YmRldikgew0KKwkJCQlmb3VuZCA9IDE7DQorCQkJCWJyZWFrOw0KKwkJ CX0NCisJCX0NCisNCisJCXNwaW5fdW5sb2NrX2lycShjcnlwdG9fZGV2aWNlX2xvY2spOw0KKw0K KwkJcmV0dXJuIChmb3VuZCkgPyBkZXYgOiBOVUxMOw0KKwl9DQorDQorCXNwaW5fbG9ja19pcnEo JmN1cnJlbnRfbGItPmxvY2spOw0KKw0KKwljdXJyZW50X2xiLT5yZWhhc2goY3VycmVudF9sYik7 DQorDQorCXNwaW5fbG9jayhjcnlwdG9fZGV2aWNlX2xvY2spOw0KKw0KKwlkZXYgPSBjdXJyZW50 X2xiLT5maW5kX2RldmljZShjdXJyZW50X2xiLCBjaSwgZGF0YSk7DQorCWlmIChkZXYpDQorCQlj cnlwdG9fZGV2aWNlX2dldChkZXYpOw0KKw0KKwlzcGluX3VubG9jayhjcnlwdG9fZGV2aWNlX2xv Y2spOw0KKw0KKwlzcGluX3VubG9ja19pcnEoJmN1cnJlbnRfbGItPmxvY2spOw0KKw0KKwl3YWtl X3VwX2ludGVycnVwdGlibGUoJmNyeXB0b19sYl93YWl0X3F1ZXVlKTsNCisNCisJcmV0dXJuIGRl djsNCit9DQorDQorc3RhdGljIGludCBfX2NyeXB0b19sYl9yZWdpc3RlcihzdHJ1Y3QgY3J5cHRv X2xiICpsYikNCit7DQorCWludCBlcnI7DQorDQorCXNwaW5fbG9ja19pbml0KCZsYi0+bG9jayk7 DQorDQorCWluaXRfY29tcGxldGlvbigmbGItPmRldl9yZWxlYXNlZCk7DQorCW1lbWNweSgmbGIt PmRldmljZSwgJmNyeXB0b19sYl9kZXYsIHNpemVvZihzdHJ1Y3QgZGV2aWNlKSk7DQorCWxiLT5k cml2ZXIgPSAmY3J5cHRvX2xiX2RyaXZlcjsNCisNCisJc25wcmludGYobGItPmRldmljZS5idXNf aWQsIHNpemVvZihsYi0+ZGV2aWNlLmJ1c19pZCksICIlcyIsIGxiLT5uYW1lKTsNCisJZXJyID0g ZGV2aWNlX3JlZ2lzdGVyKCZsYi0+ZGV2aWNlKTsNCisJaWYgKGVycikgew0KKwkJZHByaW50ayhL RVJOX0VSUg0KKwkJCSJGYWlsZWQgdG8gcmVnaXN0ZXIgY3J5cHRvIGxvYWQgYmFsYW5jZXIgZGV2 aWNlICVzOiBlcnI9JWQuXG4iLA0KKwkJCWxiLT5uYW1lLCBlcnIpOw0KKwkJcmV0dXJuIGVycjsN CisJfQ0KKw0KKwlzbnByaW50ZihsYi0+Y2xhc3NfZGV2aWNlLmNsYXNzX2lkLCBzaXplb2YobGIt PmNsYXNzX2RldmljZS5jbGFzc19pZCksICIlcyIsIGxiLT5uYW1lKTsNCisJbGItPmNsYXNzX2Rl 
dmljZS5kZXYgPSAmbGItPmRldmljZTsNCisJbGItPmNsYXNzX2RldmljZS5jbGFzcyA9ICZjcnlw dG9fbGJfY2xhc3M7DQorDQorCWVyciA9IGNsYXNzX2RldmljZV9yZWdpc3RlcigmbGItPmNsYXNz X2RldmljZSk7DQorCWlmIChlcnIpIHsNCisJCWRwcmludGsoS0VSTl9FUlIgIkZhaWxlZCB0byBy ZWdpc3RlciBjcnlwdG8gbG9hZCBiYWxhbmNlciBjbGFzcyBkZXZpY2UgJXM6IGVycj0lZC5cbiIs DQorCQkJbGItPm5hbWUsIGVycik7DQorCQlkZXZpY2VfdW5yZWdpc3RlcigmbGItPmRldmljZSk7 DQorCQlyZXR1cm4gZXJyOw0KKwl9DQorDQorCWNyZWF0ZV9kZXZpY2VfYXR0cmlidXRlcyhsYik7 DQorCXdha2VfdXBfaW50ZXJydXB0aWJsZSgmY3J5cHRvX2xiX3dhaXRfcXVldWUpOw0KKw0KKwly ZXR1cm4gMDsNCisNCit9DQorDQorc3RhdGljIHZvaWQgX19jcnlwdG9fbGJfdW5yZWdpc3Rlcihz dHJ1Y3QgY3J5cHRvX2xiICpsYikNCit7DQorCXdha2VfdXBfaW50ZXJydXB0aWJsZSgmY3J5cHRv X2xiX3dhaXRfcXVldWUpOw0KKwlyZW1vdmVfZGV2aWNlX2F0dHJpYnV0ZXMobGIpOw0KKwljbGFz c19kZXZpY2VfdW5yZWdpc3RlcigmbGItPmNsYXNzX2RldmljZSk7DQorCWRldmljZV91bnJlZ2lz dGVyKCZsYi0+ZGV2aWNlKTsNCit9DQorDQoraW50IGNyeXB0b19sYl9yZWdpc3RlcihzdHJ1Y3Qg Y3J5cHRvX2xiICpsYiwgaW50IHNldF9jdXJyZW50LCBpbnQgc2V0X2RlZmF1bHQpDQorew0KKwlz dHJ1Y3QgY3J5cHRvX2xiICpfX2xiOw0KKwlpbnQgZXJyOw0KKw0KKwlzcGluX2xvY2tfaXJxKCZj cnlwdG9fbGJfbG9jayk7DQorDQorCWxpc3RfZm9yX2VhY2hfZW50cnkoX19sYiwgJmNyeXB0b19s Yl9saXN0LCBsYl9lbnRyeSkgew0KKwkJaWYgKHVubGlrZWx5KGNvbXBhcmVfbGIoX19sYiwgbGIp KSkgew0KKwkJCXNwaW5fdW5sb2NrX2lycSgmY3J5cHRvX2xiX2xvY2spOw0KKw0KKwkJCWRwcmlu dGsoS0VSTl9FUlIgIkNyeXB0byBsb2FkIGJhbGFuY2VyICVzIGlzIGFscmVhZHkgcmVnaXN0ZXJl ZC5cbiIsDQorCQkJCWxiLT5uYW1lKTsNCisJCQlyZXR1cm4gLUVJTlZBTDsNCisJCX0NCisJfQ0K Kw0KKwlsaXN0X2FkZCgmbGItPmxiX2VudHJ5LCAmY3J5cHRvX2xiX2xpc3QpOw0KKw0KKwlzcGlu X3VubG9ja19pcnEoJmNyeXB0b19sYl9sb2NrKTsNCisNCisJZXJyID0gX19jcnlwdG9fbGJfcmVn aXN0ZXIobGIpOw0KKwlpZiAoZXJyKSB7DQorCQlzcGluX2xvY2tfaXJxKCZjcnlwdG9fbGJfbG9j ayk7DQorCQlsaXN0X2RlbF9pbml0KCZsYi0+bGJfZW50cnkpOw0KKwkJc3Bpbl91bmxvY2tfaXJx KCZjcnlwdG9fbGJfbG9jayk7DQorDQorCQlyZXR1cm4gZXJyOw0KKwl9DQorDQorCWlmICghZGVm YXVsdF9sYiB8fCBzZXRfZGVmYXVsdCkNCisJCWxiX3NldF9kZWZhdWx0KGxiKTsNCisNCisJaWYg 
KCFjdXJyZW50X2xiIHx8IHNldF9jdXJyZW50KQ0KKwkJbGJfc2V0X2N1cnJlbnQobGIpOw0KKw0K KwlkcHJpbnRrKEtFUk5fSU5GTyAiQ3J5cHRvIGxvYWQgYmFsYW5jZXIgJXMgd2FzIHJlZ2lzdGVy ZWQgYW5kIHNldCB0byBiZSBbJXMuJXNdLlxuIiwNCisJCWxiLT5uYW1lLCAobGJfaXNfY3VycmVu dChsYikpID8gImN1cnJlbnQiIDogIm5vdCBjdXJyZW50IiwNCisJCShsYl9pc19kZWZhdWx0KGxi KSkgPyAiZGVmYXVsdCIgOiAibm90IGRlZmF1bHQiKTsNCisNCisJbGJfbnVtKys7DQorDQorCXJl dHVybiAwOw0KK30NCisNCit2b2lkIGNyeXB0b19sYl91bnJlZ2lzdGVyKHN0cnVjdCBjcnlwdG9f bGIgKmxiKQ0KK3sNCisJc3RydWN0IGNyeXB0b19sYiAqX19sYiwgKm47DQorDQorCWlmIChsYl9u dW0gPT0gMSkgew0KKwkJZHByaW50ayhLRVJOX0lORk8gIllvdSBhcmUgcmVtb3ZpbmcgY3J5cHRv IGxvYWQgYmFsYW5jZXIgJXMgd2hpY2ggaXMgY3VycmVudCBhbmQgZGVmYXVsdC5cbiINCisJCQki VGhlcmUgaXMgbm8gb3RoZXIgY3J5cHRvIGxvYWQgYmFsYW5jZXJzLiAiDQorCQkJIlJlbW92aW5n ICVzIGRlbGF5ZWQgdW50aWxsIG5ldyBsb2FkIGJhbGFuY2VyIGlzIHJlZ2lzdGVyZWQuXG4iLA0K KwkJCWxiLT5uYW1lLCAoZm9yY2VfbGJfcmVtb3ZlKSA/ICJpcyBub3QiIDogImlzIik7DQorCQl3 aGlsZSAobGJfbnVtID09IDEgJiYgIWZvcmNlX2xiX3JlbW92ZSkgew0KKwkJCXNldF9jdXJyZW50 X3N0YXRlKFRBU0tfSU5URVJSVVBUSUJMRSk7DQorCQkJc2NoZWR1bGVfdGltZW91dChIWik7DQor DQorCQkJaWYgKHNpZ25hbF9wZW5kaW5nKGN1cnJlbnQpKQ0KKwkJCQlmbHVzaF9zaWduYWxzKGN1 cnJlbnQpOw0KKwkJfQ0KKwl9DQorDQorCV9fY3J5cHRvX2xiX3VucmVnaXN0ZXIobGIpOw0KKw0K KwlzcGluX2xvY2tfaXJxKCZjcnlwdG9fbGJfbG9jayk7DQorDQorCWxpc3RfZm9yX2VhY2hfZW50 cnlfc2FmZShfX2xiLCBuLCAmY3J5cHRvX2xiX2xpc3QsIGxiX2VudHJ5KSB7DQorCQlpZiAoY29t cGFyZV9sYihfX2xiLCBsYikpIHsNCisJCQlsYl9udW0tLTsNCisJCQlsaXN0X2RlbF9pbml0KCZf X2xiLT5sYl9lbnRyeSk7DQorDQorCQkJZHByaW50ayhLRVJOX0VSUiAiQ3J5cHRvIGxvYWQgYmFs YW5jZXIgJXMgd2FzIHVucmVnaXN0ZXJlZC5cbiIsDQorCQkJCWxiLT5uYW1lKTsNCisJCX0gZWxz ZSBpZiAobGJfbnVtKSB7DQorCQkJaWYgKGxiX2lzX2RlZmF1bHQobGIpKQ0KKwkJCQlsYl9zZXRf ZGVmYXVsdChfX2xiKTsNCisJCQlpZiAobGJfaXNfY3VycmVudChsYikpDQorCQkJCWxiX3NldF9j dXJyZW50KGRlZmF1bHRfbGIpOw0KKwkJfQ0KKwl9DQorDQorCXNwaW5fdW5sb2NrX2lycSgmY3J5 cHRvX2xiX2xvY2spOw0KK30NCisNCitzdGF0aWMgdm9pZCBjcnlwdG9fbGJfcXVldWVfd3JhcHBl 
cih2b2lkICpkYXRhKQ0KK3sNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldiA9ICZtYWluX2Ny eXB0b19kZXZpY2U7DQorCXN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcyA9IChzdHJ1Y3QgY3J5cHRv X3Nlc3Npb24gKilkYXRhOw0KKw0KKwlkcHJpbnRrKEtFUk5fSU5GTyAiJXM6IENhbGxpbmcgY2Fs bGJhY2sgZm9yIHNlc3Npb24gJWxsdSBbJWxsdV0gZmxhZ3M9JXgsICINCisJCSJvcD0lMDR1LCB0 eXBlPSUwNHgsIG1vZGU9JTA0eCwgcHJpb3JpdHk9JTA0eFxuIiwgX19mdW5jX18sDQorCQlzLT5j aS5pZCwgcy0+Y2kuZGV2X2lkLCBzLT5jaS5mbGFncywgcy0+Y2kub3BlcmF0aW9uLA0KKwkJcy0+ Y2kudHlwZSwgcy0+Y2kubW9kZSwgcy0+Y2kucHJpb3JpdHkpOw0KKw0KKwlzcGluX2xvY2tfaXJx KCZzLT5sb2NrKTsNCisJY3J5cHRvX3N0YXRfZmluaXNoX2luYyhzKTsNCisNCisJZmluaXNoX3Nl c3Npb24ocyk7DQorCXVuc3RhcnRfc2Vzc2lvbihzKTsNCisJc3Bpbl91bmxvY2tfaXJxKCZzLT5s b2NrKTsNCisNCisJcy0+Y2kuY2FsbGJhY2soJnMtPmNpLCAmcy0+ZGF0YSk7DQorDQorCWlmIChz ZXNzaW9uX2ZpbmlzaGVkKHMpKSB7DQorCQlrZnJlZShzKTsNCisJCXJldHVybjsNCisJfSBlbHNl IHsNCisJCS8qDQorCQkgKiBTcGVjaWFsIGNhc2U6IGNyeXB0byBjb25zdW1lciBtYXJrcyBzZXNz aW9uIGFzICJub3QgZmluaXNoZWQiDQorCQkgKiBpbiBpdCdzIGNhbGxiYWNrIC0gaXQgbWVhbnMg dGhhdCBjcnlwdG8gY29uc3VtZXIgd2FudHMgDQorCQkgKiB0aGlzIHNlc3Npb24gdG8gYmUgcHJv Y2Vzc2VkIGZ1cnRoZXIsIA0KKwkJICogZm9yIGV4YW1wbGUgY3J5cHRvIGNvbnN1bWVyIGNhbiBh ZGQgbmV3IHJvdXRlIGFuZCB0aGVuDQorCQkgKiBtYXJrIHNlc3Npb24gYXMgIm5vdCBmaW5pc2hl ZCIuDQorCQkgKi8NCisNCisJCXVuY29tcGxldGVfc2Vzc2lvbihzKTsNCisJCXVuc3RhcnRfc2Vz c2lvbihzKTsNCisJCWNyeXB0b19zZXNzaW9uX2luc2VydF9tYWluKGRldiwgcyk7DQorCX0NCisJ c3Bpbl91bmxvY2tfaXJxKCZzLT5sb2NrKTsNCit9DQorDQorc3RhdGljIHZvaWQgY3J5cHRvX2xi X3Byb2Nlc3NfbmV4dF9yb3V0ZShzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKnMpDQorew0KKwlzdHJ1 Y3QgY3J5cHRvX3JvdXRlICpydDsNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldjsNCisNCisJ cnQgPSBjcnlwdG9fcm91dGVfZGVxdWV1ZShzKTsNCisJaWYgKHJ0KSB7DQorCQlkZXYgPSBydC0+ ZGV2Ow0KKw0KKwkJc3Bpbl9sb2NrX2lycSgmZGV2LT5zZXNzaW9uX2xvY2spOw0KKwkJbGlzdF9k ZWxfaW5pdCgmcy0+ZGV2X3F1ZXVlX2VudHJ5KTsNCisJCXNwaW5fdW5sb2NrX2lycSgmZGV2LT5z ZXNzaW9uX2xvY2spOw0KKw0KKwkJY3J5cHRvX3JvdXRlX2ZyZWUocnQpOw0KKw0KKwkJZGV2ID0g 
Y3J5cHRvX3JvdXRlX2dldF9jdXJyZW50X2RldmljZShzKTsNCisJCWlmIChkZXYpIHsNCisJCQlk cHJpbnRrKEtFUk5fSU5GTyAiJXM6IHByb2Nlc3NpbmcgbmV3IHJvdXRlIHRvICVzLlxuIiwNCisJ CQkJX19mdW5jX18sIGRldi0+bmFtZSk7DQorDQorCQkJbWVtY3B5KCZzLT5jaSwgJnJ0LT5jaSwg c2l6ZW9mKHMtPmNpKSk7DQorCQkJY3J5cHRvX3Nlc3Npb25faW5zZXJ0KGRldiwgcyk7DQorDQor CQkJLyoNCisJCQkgKiBSZWZlcmVuY2UgdG8gdGhpcyBkZXZpY2Ugd2FzIGFscmVhZHkgaG9sZCB3 aGVuDQorCQkJICogbmV3IHJvdXRpbmcgd2FzIGFkZGVkLg0KKwkJCSAqLw0KKwkJCWNyeXB0b19k ZXZpY2VfcHV0KGRldik7DQorCQl9DQorCX0NCit9DQorDQordm9pZCBjcnlwdG9fd2FrZV9sYih2 b2lkKQ0KK3sNCisJd2FrZV91cF9pbnRlcnJ1cHRpYmxlKCZjcnlwdG9fbGJfd2FpdF9xdWV1ZSk7 DQorfQ0KKw0KK2ludCBjcnlwdG9fbGJfdGhyZWFkKHZvaWQgKmRhdGEpDQorew0KKwlzdHJ1Y3Qg Y3J5cHRvX3Nlc3Npb24gKnMsICpuOw0KKwlzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZGV2ID0gKHN0 cnVjdCBjcnlwdG9fZGV2aWNlICopZGF0YTsNCisNCisJZGFlbW9uaXplKCIlcyIsIGRldi0+bmFt ZSk7DQorCWFsbG93X3NpZ25hbChTSUdURVJNKTsNCisNCisJd2hpbGUgKCFuZWVkX2V4aXQpIHsN CisJCWxpc3RfZm9yX2VhY2hfZW50cnlfc2FmZShzLCBuLCAmZGV2LT5zZXNzaW9uX2xpc3QsDQor CQkJCQkgbWFpbl9xdWV1ZV9lbnRyeSkgew0KKwkJCWRwcmludGsoInNlc3Npb24gJWxsdSBbJWxs dV06IGZsYWdzPSV4LCByb3V0ZV9udW09JWQsICVzLCVzLCVzLCVzLlxuIiwNCisJCQkgICAgIHMt PmNpLmlkLCBzLT5jaS5kZXZfaWQsIHMtPmNpLmZsYWdzLA0KKwkJCSAgICAgY3J5cHRvX3JvdXRl X3F1ZXVlX2xlbihzKSwNCisJCQkgICAgIChzZXNzaW9uX2NvbXBsZXRlZChzKSkgPyAiY29tcGxl dGVkIiA6ICJub3QgY29tcGxldGVkIiwNCisJCQkgICAgIChzZXNzaW9uX2ZpbmlzaGVkKHMpKSA/ ICJmaW5pc2hlZCIgOiAibm90IGZpbmlzaGVkIiwNCisJCQkgICAgIChzZXNzaW9uX3N0YXJ0ZWQo cykpID8gInN0YXJ0ZWQiIDogIm5vdCBzdGFydGVkIiwNCisJCQkgICAgIChzZXNzaW9uX2lzX3By b2Nlc3NlZChzKSkgPyAiaXMgYmVpbmcgcHJvY2Vzc2VkIiA6ICJpcyBub3QgYmVpbmcgcHJvY2Vz c2VkIik7DQorDQorCQkJaWYgKCFzcGluX3RyeWxvY2soJnMtPmxvY2spKQ0KKwkJCQljb250aW51 ZTsNCisNCisJCQlpZiAoc2Vzc2lvbl9pc19wcm9jZXNzZWQocykpDQorCQkJCWdvdG8gdW5sb2Nr Ow0KKwkJCWlmIChzZXNzaW9uX3N0YXJ0ZWQocykpDQorCQkJCWdvdG8gdW5sb2NrOw0KKw0KKwkJ CWlmIChzZXNzaW9uX2NvbXBsZXRlZChzKSkgew0KKwkJCQljcnlwdG9fc3RhdF9wdGltZV9pbmMo 
cyk7DQorDQorCQkJCWlmIChjcnlwdG9fcm91dGVfcXVldWVfbGVuKHMpID4gMSkgew0KKwkJCQkJ Y3J5cHRvX2xiX3Byb2Nlc3NfbmV4dF9yb3V0ZShzKTsNCisJCQkJfSBlbHNlIHsNCisJCQkJCXN0 YXJ0X3Nlc3Npb24ocyk7DQorCQkJCQljcnlwdG9fc3RhdF9zdGFydF9pbmMocyk7DQorDQorCQkJ CQlkcHJpbnRrKCIlczogZ29pbmcgdG8gcmVtb3ZlIHNlc3Npb24gJWxsdSBbJWxsdV0uXG4iLA0K KwkJCQkJICAgICBfX2Z1bmNfXywgcy0+Y2kuaWQsIHMtPmNpLmRldl9pZCk7DQorDQorCQkJCQlj cnlwdG9fc2Vzc2lvbl9kZXF1ZXVlX21haW4ocyk7DQorCQkJCQlzcGluX3VubG9jaygmcy0+bG9j ayk7DQorDQorCQkJCQlJTklUX1dPUksoJnMtPndvcmssICZjcnlwdG9fbGJfcXVldWVfd3JhcHBl ciwgcyk7DQorCQkJCQlxdWV1ZV93b3JrKGNyeXB0b19sYl9xdWV1ZSwgJnMtPndvcmspOw0KKwkJ CQkJY29udGludWU7DQorCQkJCX0NCisJCQl9DQorCQkgICAgICB1bmxvY2s6DQorCQkJc3Bpbl91 bmxvY2soJnMtPmxvY2spOw0KKwkJfQ0KKw0KKwkJaW50ZXJydXB0aWJsZV9zbGVlcF9vbl90aW1l b3V0KCZjcnlwdG9fbGJfd2FpdF9xdWV1ZSwgMTAwKTsNCisJfQ0KKw0KKwlmbHVzaF93b3JrcXVl dWUoY3J5cHRvX2xiX3F1ZXVlKTsNCisJY29tcGxldGVfYW5kX2V4aXQoJnRocmVhZF9leGl0ZWQs IDApOw0KK30NCisNCitpbnQgY3J5cHRvX2xiX2luaXQodm9pZCkNCit7DQorCWludCBlcnI7DQor CWxvbmcgcGlkOw0KKw0KKwllcnIgPSBidXNfcmVnaXN0ZXIoJmNyeXB0b19sYl9idXNfdHlwZSk7 DQorCWlmIChlcnIpIHsNCisJCWRwcmludGsoS0VSTl9FUlIgIkZhaWxlZCB0byByZWdpc3RlciBj cnlwdG8gbG9hZCBiYWxhbmNlciBidXM6IGVycj0lZC5cbiIsIGVycik7DQorCQlnb3RvIGVycl9v dXRfZXhpdDsNCisJfQ0KKw0KKwllcnIgPSBkcml2ZXJfcmVnaXN0ZXIoJmNyeXB0b19sYl9kcml2 ZXIpOw0KKwlpZiAoZXJyKSB7DQorCQlkcHJpbnRrKEtFUk5fRVJSICJGYWlsZWQgdG8gcmVnaXN0 ZXIgY3J5cHRvIGxvYWQgYmFsYW5jZXIgZHJpdmVyOiBlcnI9JWQuXG4iLCBlcnIpOw0KKwkJZ290 byBlcnJfb3V0X2J1c191bnJlZ2lzdGVyOw0KKwl9DQorDQorCWNyeXB0b19sYl9jbGFzcy5jbGFz c19kZXZfYXR0cnMgPSAmY2xhc3NfZGV2aWNlX2F0dHJfbGJzOw0KKw0KKwllcnIgPSBjbGFzc19y ZWdpc3RlcigmY3J5cHRvX2xiX2NsYXNzKTsNCisJaWYgKGVycikgew0KKwkJZHByaW50ayhLRVJO X0VSUiAiRmFpbGVkIHRvIHJlZ2lzdGVyIGNyeXB0byBsb2FkIGJhbGFuY2VyIGNsYXNzOiBlcnI9 JWQuXG4iLCBlcnIpOw0KKwkJZ290byBlcnJfb3V0X2RyaXZlcl91bnJlZ2lzdGVyOw0KKwl9DQor DQorCWNyeXB0b19sYl9xdWV1ZSA9IGNyZWF0ZV93b3JrcXVldWUoImNsYnEiKTsNCisJaWYgKCFj 
cnlwdG9fbGJfcXVldWUpIHsNCisJCWRwcmludGsoS0VSTl9FUlIgIkZhaWxlZCB0byBjcmVhdGUg Y3J5cHRvIGxvYWQgYmFsYW5lciB3b3JrIHF1ZXVlLlxuIik7DQorCQlnb3RvIGVycl9vdXRfY2xh c3NfdW5yZWdpc3RlcjsNCisJfQ0KKw0KKwlpbml0X2NvbXBsZXRpb24oJnRocmVhZF9leGl0ZWQp Ow0KKwlwaWQgPSBrZXJuZWxfdGhyZWFkKGNyeXB0b19sYl90aHJlYWQsICZtYWluX2NyeXB0b19k ZXZpY2UsIENMT05FX0ZTIHwgQ0xPTkVfRklMRVMpOw0KKwlpZiAoSVNfRVJSKCh2b2lkICopcGlk KSkgew0KKwkJZHByaW50ayhLRVJOX0VSUiAiRmFpbGVkIHRvIGNyZWF0ZSBrZXJuZWwgbG9hZCBi YWxhbmNpbmcgdGhyZWFkLlxuIik7DQorCQlnb3RvIGVycl9vdXRfZGVzdHJveV93b3JrcXVldWU7 DQorCX0NCisNCisJcmV0dXJuIDA7DQorDQorZXJyX291dF9kZXN0cm95X3dvcmtxdWV1ZToNCisJ ZGVzdHJveV93b3JrcXVldWUoY3J5cHRvX2xiX3F1ZXVlKTsNCitlcnJfb3V0X2NsYXNzX3VucmVn aXN0ZXI6DQorCWNsYXNzX3VucmVnaXN0ZXIoJmNyeXB0b19sYl9jbGFzcyk7DQorZXJyX291dF9k cml2ZXJfdW5yZWdpc3RlcjoNCisJZHJpdmVyX3VucmVnaXN0ZXIoJmNyeXB0b19sYl9kcml2ZXIp Ow0KK2Vycl9vdXRfYnVzX3VucmVnaXN0ZXI6DQorCWJ1c191bnJlZ2lzdGVyKCZjcnlwdG9fbGJf YnVzX3R5cGUpOw0KK2Vycl9vdXRfZXhpdDoNCisJcmV0dXJuIGVycjsNCit9DQorDQordm9pZCBj cnlwdG9fbGJfZmluaSh2b2lkKQ0KK3sNCisJbmVlZF9leGl0ID0gMTsNCisJd2FpdF9mb3JfY29t cGxldGlvbigmdGhyZWFkX2V4aXRlZCk7DQorCWZsdXNoX3dvcmtxdWV1ZShjcnlwdG9fbGJfcXVl dWUpOw0KKwlkZXN0cm95X3dvcmtxdWV1ZShjcnlwdG9fbGJfcXVldWUpOw0KKwljbGFzc191bnJl Z2lzdGVyKCZjcnlwdG9fbGJfY2xhc3MpOw0KKwlkcml2ZXJfdW5yZWdpc3RlcigmY3J5cHRvX2xi X2RyaXZlcik7DQorCWJ1c191bnJlZ2lzdGVyKCZjcnlwdG9fbGJfYnVzX3R5cGUpOw0KK30NCisN CitFWFBPUlRfU1lNQk9MX0dQTChjcnlwdG9fbGJfcmVnaXN0ZXIpOw0KK0VYUE9SVF9TWU1CT0xf R1BMKGNyeXB0b19sYl91bnJlZ2lzdGVyKTsNCitFWFBPUlRfU1lNQk9MX0dQTChjcnlwdG9fbGJf cmVoYXNoKTsNCitFWFBPUlRfU1lNQk9MX0dQTChjcnlwdG9fbGJfZmluZF9kZXZpY2UpOw0KK0VY UE9SVF9TWU1CT0xfR1BMKGNyeXB0b193YWtlX2xiKTsNCkJpbmFyeSBmaWxlcyAvdG1wL2VtcHR5 Ly5jcnlwdG9fbGIuYy5zd24gYW5kIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmNyeXB0b19s Yi5jLnN3biBkaWZmZXINCkJpbmFyeSBmaWxlcyAvdG1wL2VtcHR5Ly5jcnlwdG9fbGIuYy5zd28g YW5kIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vLmNyeXB0b19sYi5jLnN3byBkaWZmZXINCkJp 
bmFyeSBmaWxlcyAvdG1wL2VtcHR5Ly5jcnlwdG9fbGIuYy5zd3AgYW5kIGxpbnV4LTIuNi9kcml2 ZXJzL2FjcnlwdG8vLmNyeXB0b19sYi5jLnN3cCBkaWZmZXINCmRpZmYgLU5ydSAvdG1wL2VtcHR5 L2NyeXB0b19sYi5oIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vY3J5cHRvX2xiLmgNCi0tLSAv dG1wL2VtcHR5L2NyeXB0b19sYi5oCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAw DQorKysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by9jcnlwdG9fbGIuaAkyMDA0LTEyLTE0IDE4 OjUzOjExLjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSw2MiBAQA0KKy8qDQorICogCWNyeXB0 b19sYi5oDQorICoNCisgKiBDb3B5cmlnaHQgKGMpIDIwMDQgRXZnZW5peSBQb2x5YWtvdiA8am9o bnBvbEAya2EubWlwdC5ydT4NCisgKiANCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNv ZnR3YXJlOyB5b3UgY2FuIHJlZGlzdHJpYnV0ZSBpdCBhbmQvb3IgbW9kaWZ5DQorICogaXQgdW5k ZXIgdGhlIHRlcm1zIG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNo ZWQgYnkNCisgKiB0aGUgRnJlZSBTb2Z0d2FyZSBGb3VuZGF0aW9uOyBlaXRoZXIgdmVyc2lvbiAy IG9mIHRoZSBMaWNlbnNlLCBvcg0KKyAqIChhdCB5b3VyIG9wdGlvbikgYW55IGxhdGVyIHZlcnNp b24uDQorICoNCisgKiBUaGlzIHByb2dyYW0gaXMgZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhh dCBpdCB3aWxsIGJlIHVzZWZ1bCwNCisgKiBidXQgV0lUSE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhv dXQgZXZlbiB0aGUgaW1wbGllZCB3YXJyYW50eSBvZg0KKyAqIE1FUkNIQU5UQUJJTElUWSBvciBG SVRORVNTIEZPUiBBIFBBUlRJQ1VMQVIgUFVSUE9TRS4gIFNlZSB0aGUNCisgKiBHTlUgR2VuZXJh bCBQdWJsaWMgTGljZW5zZSBmb3IgbW9yZSBkZXRhaWxzLg0KKyAqDQorICogWW91IHNob3VsZCBo YXZlIHJlY2VpdmVkIGEgY29weSBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UNCisg KiBhbG9uZyB3aXRoIHRoaXMgcHJvZ3JhbTsgaWYgbm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0 d2FyZQ0KKyAqIEZvdW5kYXRpb24sIEluYy4sIDU5IFRlbXBsZSBQbGFjZSwgU3VpdGUgMzMwLCBC b3N0b24sIE1BIDAyMTExLTEzMDcgVVNBDQorICovDQorDQorI2lmbmRlZiBfX0NSWVBUT19MQl9I DQorI2RlZmluZSBfX0NSWVBUT19MQl9IDQorDQorI2luY2x1ZGUgImFjcnlwdG8uaCINCisNCisj ZGVmaW5lIENSWVBUT19MQl9OQU1FTEVOCTMyDQorDQorc3RydWN0IGNyeXB0b19sYiB7DQorCXN0 cnVjdCBsaXN0X2hlYWQgCWxiX2VudHJ5Ow0KKw0KKwljaGFyIAkJCW5hbWVbQ1JZUFRPX0xCX05B TUVMRU5dOw0KKw0KKwl2b2lkIAkJCSgqcmVoYXNoKShzdHJ1Y3QgY3J5cHRvX2xiICopOw0KKwlz 
dHJ1Y3QgY3J5cHRvX2RldmljZSAqCSgqZmluZF9kZXZpY2UpIChzdHJ1Y3QgY3J5cHRvX2xiICos DQorCQkJCQkJc3RydWN0IGNyeXB0b19zZXNzaW9uX2luaXRpYWxpemVyICosIA0KKwkJCQkJCXN0 cnVjdCBjcnlwdG9fZGF0YSAqKTsNCisNCisJc3BpbmxvY2tfdCAJCWxvY2s7DQorDQorCXNwaW5s b2NrX3QgCQkqY3J5cHRvX2RldmljZV9sb2NrOw0KKwlzdHJ1Y3QgbGlzdF9oZWFkIAkqY3J5cHRv X2RldmljZV9saXN0Ow0KKw0KKwlzdHJ1Y3QgZGV2aWNlX2RyaXZlciAJKmRyaXZlcjsNCisJc3Ry dWN0IGRldmljZSAJCWRldmljZTsNCisJc3RydWN0IGNsYXNzX2RldmljZSAJY2xhc3NfZGV2aWNl Ow0KKwlzdHJ1Y3QgY29tcGxldGlvbiAJZGV2X3JlbGVhc2VkOw0KKw0KK307DQorDQoraW50IGNy eXB0b19sYl9yZWdpc3RlcihzdHJ1Y3QgY3J5cHRvX2xiICpsYiwgaW50IHNldF9jdXJyZW50LCBp bnQgc2V0X2RlZmF1bHQpOw0KK3ZvaWQgY3J5cHRvX2xiX3VucmVnaXN0ZXIoc3RydWN0IGNyeXB0 b19sYiAqKTsNCisNCitpbmxpbmUgdm9pZCBjcnlwdG9fbGJfcmVoYXNoKHZvaWQpOw0KK3N0cnVj dCBjcnlwdG9fZGV2aWNlICpjcnlwdG9fbGJfZmluZF9kZXZpY2Uoc3RydWN0IGNyeXB0b19zZXNz aW9uX2luaXRpYWxpemVyICosIHN0cnVjdCBjcnlwdG9fZGF0YSAqKTsNCisNCit2b2lkIGNyeXB0 b193YWtlX2xiKHZvaWQpOw0KKw0KK2ludCBjcnlwdG9fbGJfaW5pdCh2b2lkKTsNCit2b2lkIGNy eXB0b19sYl9maW5pKHZvaWQpOw0KKw0KKyNlbmRpZgkJCQkvKiBfX0NSWVBUT19MQl9IICovDQpk aWZmIC1OcnUgL3RtcC9lbXB0eS9jcnlwdG9fbWFpbi5jIGxpbnV4LTIuNi9kcml2ZXJzL2Fjcnlw dG8vY3J5cHRvX21haW4uYw0KLS0tIC90bXAvZW1wdHkvY3J5cHRvX21haW4uYwkxOTcwLTAxLTAx IDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8v Y3J5cHRvX21haW4uYwkyMDA0LTEyLTE0IDE4OjUzOjExLjAwMDAwMDAwMCArMDMwMA0KQEAgLTAs MCArMSwzNDAgQEANCisvKg0KKyAqIAljcnlwdG9fbWFpbi5jDQorICoNCisgKiBDb3B5cmlnaHQg KGMpIDIwMDQgRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4NCisgKiANCisg Kg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBmcmVlIHNvZnR3YXJlOyB5b3UgY2FuIHJlZGlzdHJpYnV0 ZSBpdCBhbmQvb3IgbW9kaWZ5DQorICogaXQgdW5kZXIgdGhlIHRlcm1zIG9mIHRoZSBHTlUgR2Vu ZXJhbCBQdWJsaWMgTGljZW5zZSBhcyBwdWJsaXNoZWQgYnkNCisgKiB0aGUgRnJlZSBTb2Z0d2Fy ZSBGb3VuZGF0aW9uOyBlaXRoZXIgdmVyc2lvbiAyIG9mIHRoZSBMaWNlbnNlLCBvcg0KKyAqIChh dCB5b3VyIG9wdGlvbikgYW55IGxhdGVyIHZlcnNpb24uDQorICoNCisgKiBUaGlzIHByb2dyYW0g 
aXMgZGlzdHJpYnV0ZWQgaW4gdGhlIGhvcGUgdGhhdCBpdCB3aWxsIGJlIHVzZWZ1bCwNCisgKiBi dXQgV0lUSE9VVCBBTlkgV0FSUkFOVFk7IHdpdGhvdXQgZXZlbiB0aGUgaW1wbGllZCB3YXJyYW50 eSBvZg0KKyAqIE1FUkNIQU5UQUJJTElUWSBvciBGSVRORVNTIEZPUiBBIFBBUlRJQ1VMQVIgUFVS UE9TRS4gIFNlZSB0aGUNCisgKiBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZSBmb3IgbW9yZSBk ZXRhaWxzLg0KKyAqDQorICogWW91IHNob3VsZCBoYXZlIHJlY2VpdmVkIGEgY29weSBvZiB0aGUg R05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UNCisgKiBhbG9uZyB3aXRoIHRoaXMgcHJvZ3JhbTsg aWYgbm90LCB3cml0ZSB0byB0aGUgRnJlZSBTb2Z0d2FyZQ0KKyAqIEZvdW5kYXRpb24sIEluYy4s IDU5IFRlbXBsZSBQbGFjZSwgU3VpdGUgMzMwLCBCb3N0b24sIE1BIDAyMTExLTEzMDcgVVNBDQor ICovDQorDQorI2luY2x1ZGUgPGxpbnV4L2tlcm5lbC5oPg0KKyNpbmNsdWRlIDxsaW51eC9tb2R1 bGUuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlcGFyYW0uaD4NCisjaW5jbHVkZSA8bGludXgv dHlwZXMuaD4NCisjaW5jbHVkZSA8bGludXgvbGlzdC5oPg0KKyNpbmNsdWRlIDxsaW51eC9zbGFi Lmg+DQorI2luY2x1ZGUgPGxpbnV4L2ludGVycnVwdC5oPg0KKyNpbmNsdWRlIDxsaW51eC9zcGlu bG9jay5oPg0KKw0KKyNpbmNsdWRlICJhY3J5cHRvLmgiDQorI2luY2x1ZGUgImNyeXB0b19sYi5o Ig0KKyNpbmNsdWRlICJjcnlwdG9fY29ubi5oIg0KKyNpbmNsdWRlICJjcnlwdG9fcm91dGUuaCIN CisNCitpbnQgZm9yY2VfbGJfcmVtb3ZlOw0KK21vZHVsZV9wYXJhbShmb3JjZV9sYl9yZW1vdmUs IGludCwgMCk7DQorDQorc3RydWN0IGNyeXB0b19kZXZpY2UgbWFpbl9jcnlwdG9fZGV2aWNlOw0K Kw0KK2V4dGVybiBzdHJ1Y3QgYnVzX3R5cGUgY3J5cHRvX2J1c190eXBlOw0KK2V4dGVybiBzdHJ1 Y3QgZGV2aWNlX2RyaXZlciBjcnlwdG9fZHJpdmVyOw0KK2V4dGVybiBzdHJ1Y3QgY2xhc3MgY3J5 cHRvX2NsYXNzOw0KK2V4dGVybiBzdHJ1Y3QgZGV2aWNlIGNyeXB0b19kZXY7DQorDQorZXh0ZXJu IHN0cnVjdCBjbGFzc19kZXZpY2VfYXR0cmlidXRlIGNsYXNzX2RldmljZV9hdHRyX2RldmljZXM7 DQorZXh0ZXJuIHN0cnVjdCBjbGFzc19kZXZpY2VfYXR0cmlidXRlIGNsYXNzX2RldmljZV9hdHRy X2xiczsNCisNCitzdGF0aWMgdm9pZCBkdW1wX2NpKHN0cnVjdCBjcnlwdG9fc2Vzc2lvbl9pbml0 aWFsaXplciAqY2kpDQorew0KKwlkcHJpbnRrKCIlbGx1IFslbGx1XSBvcD0lMDR1LCB0eXBlPSUw NHgsIG1vZGU9JTA0eCwgcHJpb3JpdHk9JTA0eCIsDQorCQljaS0+aWQsIGNpLT5kZXZfaWQsDQor CQljaS0+b3BlcmF0aW9uLCBjaS0+dHlwZSwgY2ktPm1vZGUsIGNpLT5wcmlvcml0eSk7DQorfQ0K 
Kw0KK3N0YXRpYyB2b2lkIF9fY3J5cHRvX3Nlc3Npb25faW5zZXJ0KHN0cnVjdCBjcnlwdG9fZGV2 aWNlICpkZXYsIHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcykNCit7DQorCXN0cnVjdCBjcnlwdG9f c2Vzc2lvbiAqX19zOw0KKw0KKwlpZiAodW5saWtlbHkobGlzdF9lbXB0eSgmZGV2LT5zZXNzaW9u X2xpc3QpKSkgew0KKwkJbGlzdF9hZGQoJnMtPmRldl9xdWV1ZV9lbnRyeSwgJmRldi0+c2Vzc2lv bl9saXN0KTsNCisJfSBlbHNlIHsNCisJCWludCBpbnNlcnRlZCA9IDA7DQorDQorCQlsaXN0X2Zv cl9lYWNoX2VudHJ5KF9fcywgJmRldi0+c2Vzc2lvbl9saXN0LCBkZXZfcXVldWVfZW50cnkpIHsN CisJCQlpZiAoX19zLT5jaS5wcmlvcml0eSA8IHMtPmNpLnByaW9yaXR5KSB7DQorCQkJCWxpc3Rf YWRkX3RhaWwoJnMtPmRldl9xdWV1ZV9lbnRyeSwgJl9fcy0+ZGV2X3F1ZXVlX2VudHJ5KTsNCisJ CQkJaW5zZXJ0ZWQgPSAxOw0KKwkJCQlicmVhazsNCisJCQl9DQorCQl9DQorDQorCQlpZiAoIWlu c2VydGVkKQ0KKwkJCWxpc3RfYWRkX3RhaWwoJnMtPmRldl9xdWV1ZV9lbnRyeSwgJmRldi0+c2Vz c2lvbl9saXN0KTsNCisJfQ0KKw0KKwlkdW1wX2NpKCZzLT5jaSk7DQorCWRwcmludGsoIiBhZGRl ZCB0byBjcnlwdG8gZGV2aWNlICVzIFslZF0uXG4iLCBkZXYtPm5hbWUsIGF0b21pY19yZWFkKCZk ZXYtPnJlZmNudCkpOw0KK30NCisNCit2b2lkIGNyeXB0b19zZXNzaW9uX2luc2VydF9tYWluKHN0 cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYsIHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcykNCit7DQor CXN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqX19zOw0KKw0KKwlzcGluX2xvY2tfaXJxKCZkZXYtPnNl c3Npb25fbG9jayk7DQorDQorCWNyeXB0b19kZXZpY2VfZ2V0KGRldik7DQorCWlmICh1bmxpa2Vs eShsaXN0X2VtcHR5KCZkZXYtPnNlc3Npb25fbGlzdCkpKSB7DQorCQlsaXN0X2FkZCgmcy0+bWFp bl9xdWV1ZV9lbnRyeSwgJmRldi0+c2Vzc2lvbl9saXN0KTsNCisJfSBlbHNlIHsNCisJCWludCBp bnNlcnRlZCA9IDA7DQorDQorCQlsaXN0X2Zvcl9lYWNoX2VudHJ5KF9fcywgJmRldi0+c2Vzc2lv bl9saXN0LCBtYWluX3F1ZXVlX2VudHJ5KSB7DQorCQkJaWYgKF9fcy0+Y2kucHJpb3JpdHkgPCBz LT5jaS5wcmlvcml0eSkgew0KKwkJCQlsaXN0X2FkZF90YWlsKCZzLT5tYWluX3F1ZXVlX2VudHJ5 LA0KKwkJCQkJICAgICAgJl9fcy0+bWFpbl9xdWV1ZV9lbnRyeSk7DQorCQkJCWluc2VydGVkID0g MTsNCisJCQkJYnJlYWs7DQorCQkJfQ0KKwkJfQ0KKw0KKwkJaWYgKCFpbnNlcnRlZCkNCisJCQls aXN0X2FkZF90YWlsKCZzLT5tYWluX3F1ZXVlX2VudHJ5LCAmZGV2LT5zZXNzaW9uX2xpc3QpOw0K Kwl9DQorDQorCXNwaW5fdW5sb2NrX2lycSgmZGV2LT5zZXNzaW9uX2xvY2spOw0KK30NCisNCit2 
b2lkIGNyeXB0b19zZXNzaW9uX2luc2VydChzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZGV2LCBzdHJ1 Y3QgY3J5cHRvX3Nlc3Npb24gKnMpDQorew0KKwlzcGluX2xvY2tfaXJxKCZkZXYtPnNlc3Npb25f bG9jayk7DQorCV9fY3J5cHRvX3Nlc3Npb25faW5zZXJ0KGRldiwgcyk7DQorCXNwaW5fdW5sb2Nr X2lycSgmZGV2LT5zZXNzaW9uX2xvY2spOw0KK30NCisNCitzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24g KmNyeXB0b19zZXNzaW9uX2NyZWF0ZShzdHJ1Y3QgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIg KmNpLCBzdHJ1Y3QgY3J5cHRvX2RhdGEgKmQpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX2RldmljZSAq ZGV2ID0gJm1haW5fY3J5cHRvX2RldmljZTsNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmxkZXY7 DQorCXN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqczsNCisJaW50IGVycjsNCisNCisJaWYgKGQtPnBy aXZfc2l6ZSA+IENSWVBUT19NQVhfUFJJVl9TSVpFKSB7DQorCQlkcHJpbnRrKCJwcml2X3NpemUg JXUgaXMgdG9vIGJpZywgbWF4aW11bSBhbGxvd2VkICV1LlxuIiwNCisJCQlkLT5wcml2X3NpemUs IENSWVBUT19NQVhfUFJJVl9TSVpFKTsNCisJCXJldHVybiBOVUxMOw0KKwl9DQorCQ0KKwlsZGV2 ID0gY3J5cHRvX2xiX2ZpbmRfZGV2aWNlKGNpLCBkKTsNCisJaWYgKCFsZGV2KSB7DQorCQlkcHJp bnRrKCJDYW5ub3QgZmluZCBzdWl0YWJsZSBkZXZpY2UuXG4iKTsNCisJCXJldHVybiBOVUxMOw0K Kwl9DQorCQ0KKwlzID0ga21hbGxvYyhzaXplb2YoKnMpICsgZC0+cHJpdl9zaXplLCBHRlBfQVRP TUlDKTsNCisJaWYgKCFzKSB7DQorCQlsZGV2LT5zdGF0LmttZW1fZmFpbGVkKys7DQorCQlnb3Rv IGVycl9vdXRfZGV2aWNlX3B1dDsNCisJfQ0KKw0KKwltZW1zZXQocywgMHhBQiwgc2l6ZW9mKCpz KSk7DQorDQorCWNyeXB0b19yb3V0ZV9oZWFkX2luaXQoJnMtPnJvdXRlX2xpc3QpOw0KKwlJTklU X0xJU1RfSEVBRCgmcy0+ZGV2X3F1ZXVlX2VudHJ5KTsNCisJSU5JVF9MSVNUX0hFQUQoJnMtPm1h aW5fcXVldWVfZW50cnkpOw0KKw0KKwlzcGluX2xvY2tfaW5pdCgmcy0+bG9jayk7DQorDQorCW1l bWNweSgmcy0+Y2ksIGNpLCBzaXplb2Yocy0+Y2kpKTsNCisJbWVtY3B5KCZzLT5kYXRhLCBkLCBz aXplb2Yocy0+ZGF0YSkpOw0KKwlzLT5kYXRhLnByaXYgPSBzICsgMTsNCisJaWYgKGQtPnByaXZf c2l6ZSkgew0KKwkJcy0+ZGF0YS5wcml2ID0gcyArIDE7DQorCQlpZiAoZC0+cHJpdikNCisJCQlt ZW1jcHkocy0+ZGF0YS5wcml2LCBkLT5wcml2LCBkLT5wcml2X3NpemUpOw0KKwl9DQorDQorCXMt PmNpLmlkID0gZGV2LT5zaWQrKzsNCisJcy0+Y2kuZGV2X2lkID0gbGRldi0+c2lkKys7DQorCXMt PmNpLmZsYWdzID0gMDsNCisNCisJZXJyID0gY3J5cHRvX3JvdXRlX2FkZF9kaXJlY3QobGRldiwg 
cywgY2kpOw0KKwlpZiAoZXJyKSB7DQorCQlkcHJpbnRrKCJDYW4gbm90IGFkZCByb3V0ZSB0byBk ZXZpY2UgJXMuXG4iLCBsZGV2LT5uYW1lKTsNCisJCWdvdG8gZXJyX291dF9zZXNzaW9uX2ZyZWU7 DQorCX0NCisNCisJcmV0dXJuIHM7DQorDQorZXJyX291dF9zZXNzaW9uX2ZyZWU6DQorCWtmcmVl KHMpOw0KK2Vycl9vdXRfZGV2aWNlX3B1dDoNCisJY3J5cHRvX2RldmljZV9wdXQobGRldik7DQor DQorCXJldHVybiBOVUxMOw0KK30NCisNCit2b2lkIGNyeXB0b19zZXNzaW9uX2FkZChzdHJ1Y3Qg Y3J5cHRvX3Nlc3Npb24gKnMpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX2RldmljZSAqbGRldjsNCisJ c3RydWN0IGNyeXB0b19kZXZpY2UgKmRldiA9ICZtYWluX2NyeXB0b19kZXZpY2U7DQorDQorCWxk ZXYgPSBjcnlwdG9fcm91dGVfZ2V0X2N1cnJlbnRfZGV2aWNlKHMpOw0KKwlCVUdfT04oIWxkZXYp OwkJLyogVGhpcyBjYW4gbm90IGhhcHBlbi4gKi8NCisNCisJc3Bpbl9sb2NrX2lycSgmcy0+bG9j ayk7DQorCWNyeXB0b19zZXNzaW9uX2luc2VydChsZGV2LCBzKTsNCisJY3J5cHRvX2RldmljZV9w dXQobGRldik7DQorCWNyeXB0b19zZXNzaW9uX2luc2VydF9tYWluKGRldiwgcyk7DQorCXNwaW5f dW5sb2NrX2lycSgmcy0+bG9jayk7DQorDQorCWlmIChsZGV2LT5kYXRhX3JlYWR5KQ0KKwkJbGRl di0+ZGF0YV9yZWFkeShsZGV2KTsNCit9DQorDQorc3RydWN0IGNyeXB0b19zZXNzaW9uICpjcnlw dG9fc2Vzc2lvbl9hbGxvYyhzdHJ1Y3QgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIgKmNpLCBz dHJ1Y3QgY3J5cHRvX2RhdGEgKmQpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKnM7DQor DQorCXMgPSBjcnlwdG9fc2Vzc2lvbl9jcmVhdGUoY2ksIGQpOw0KKwlpZiAoIXMpDQorCQlyZXR1 cm4gTlVMTDsNCisNCisJY3J5cHRvX3Nlc3Npb25fYWRkKHMpOw0KKw0KKwlyZXR1cm4gczsNCit9 DQorDQordm9pZCBjcnlwdG9fc2Vzc2lvbl9kZXF1ZXVlX3JvdXRlKHN0cnVjdCBjcnlwdG9fc2Vz c2lvbiAqcykNCit7DQorCXN0cnVjdCBjcnlwdG9fcm91dGUgKnJ0Ow0KKwlzdHJ1Y3QgY3J5cHRv X2RldmljZSAqZGV2Ow0KKw0KKwlCVUdfT04oY3J5cHRvX3JvdXRlX3F1ZXVlX2xlbihzKSA+IDEp Ow0KKw0KKwl3aGlsZSAoKHJ0ID0gY3J5cHRvX3JvdXRlX2RlcXVldWUocykpKSB7DQorCQlkZXYg PSBydC0+ZGV2Ow0KKw0KKwkJZHByaW50ayhLRVJOX0lORk8gIlJlbW92aW5nIHJvdXRlIGVudHJ5 IGZvciBkZXZpY2UgJXMuXG4iLCBkZXYtPm5hbWUpOw0KKw0KKwkJc3Bpbl9sb2NrX2lycSgmZGV2 LT5zZXNzaW9uX2xvY2spOw0KKw0KKwkJbGlzdF9kZWxfaW5pdCgmcy0+ZGV2X3F1ZXVlX2VudHJ5 KTsNCisNCisJCXNwaW5fdW5sb2NrX2lycSgmZGV2LT5zZXNzaW9uX2xvY2spOw0KKw0KKwkJY3J5 
cHRvX3JvdXRlX2ZyZWUocnQpOw0KKwl9DQorfQ0KKw0KK3ZvaWQgX19jcnlwdG9fc2Vzc2lvbl9k ZXF1ZXVlX21haW4oc3RydWN0IGNyeXB0b19zZXNzaW9uICpzKQ0KK3sNCisJc3RydWN0IGNyeXB0 b19kZXZpY2UgKmRldiA9ICZtYWluX2NyeXB0b19kZXZpY2U7DQorDQorCWxpc3RfZGVsKCZzLT5t YWluX3F1ZXVlX2VudHJ5KTsNCisJY3J5cHRvX2RldmljZV9wdXQoZGV2KTsNCit9DQorDQordm9p ZCBjcnlwdG9fc2Vzc2lvbl9kZXF1ZXVlX21haW4oc3RydWN0IGNyeXB0b19zZXNzaW9uICpzKQ0K K3sNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldiA9ICZtYWluX2NyeXB0b19kZXZpY2U7DQor DQorCXNwaW5fbG9ja19pcnEoJmRldi0+c2Vzc2lvbl9sb2NrKTsNCisNCisJX19jcnlwdG9fc2Vz c2lvbl9kZXF1ZXVlX21haW4ocyk7DQorDQorCXNwaW5fdW5sb2NrX2lycSgmZGV2LT5zZXNzaW9u X2xvY2spOw0KK30NCisNCitpbnQgX19kZXZpbml0IGNtYWluX2luaXQodm9pZCkNCit7DQorCXN0 cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYgPSAmbWFpbl9jcnlwdG9fZGV2aWNlOw0KKwlpbnQgZXJy Ow0KKw0KKwlzbnByaW50ZihkZXYtPm5hbWUsIHNpemVvZihkZXYtPm5hbWUpLCAiY3J5cHRvX3Nl c3Npb25zIik7DQorDQorCWVyciA9IGJ1c19yZWdpc3RlcigmY3J5cHRvX2J1c190eXBlKTsNCisJ aWYgKGVycikgew0KKwkJZHByaW50ayhLRVJOX0VSUiAiRmFpbGVkIHRvIHJlZ2lzdGVyIGNyeXB0 byBidXM6IGVycj0lZC5cbiIsDQorCQkJZXJyKTsNCisJCXJldHVybiBlcnI7DQorCX0NCisNCisJ ZXJyID0gZHJpdmVyX3JlZ2lzdGVyKCZjcnlwdG9fZHJpdmVyKTsNCisJaWYgKGVycikgew0KKwkJ ZHByaW50ayhLRVJOX0VSUiAiRmFpbGVkIHRvIHJlZ2lzdGVyIGNyeXB0byBkcml2ZXI6IGVycj0l ZC5cbiIsDQorCQkJZXJyKTsNCisJCWdvdG8gZXJyX291dF9idXNfdW5yZWdpc3RlcjsNCisJfQ0K Kw0KKwllcnIgPSBjbGFzc19yZWdpc3RlcigmY3J5cHRvX2NsYXNzKTsNCisJaWYgKGVycikgew0K KwkJZHByaW50ayhLRVJOX0VSUiAiRmFpbGVkIHRvIHJlZ2lzdGVyIGNyeXB0byBjbGFzczogZXJy PSVkLlxuIiwNCisJCQllcnIpOw0KKwkJZ290byBlcnJfb3V0X2RyaXZlcl91bnJlZ2lzdGVyOw0K Kwl9DQorDQorCWVyciA9IGNyeXB0b19sYl9pbml0KCk7DQorCWlmIChlcnIpDQorCQlnb3RvIGVy cl9vdXRfY2xhc3NfdW5yZWdpc3RlcjsNCisNCisJZXJyID0gY3J5cHRvX2Nvbm5faW5pdCgpOw0K KwlpZiAoZXJyKQ0KKwkJZ290byBlcnJfb3V0X2NyeXB0b19sYl9maW5pOw0KKw0KKwllcnIgPSBf X2NyeXB0b19kZXZpY2VfYWRkKGRldik7DQorCWlmIChlcnIpDQorCQlnb3RvIGVycl9vdXRfY3J5 cHRvX2Nvbm5fZmluaTsNCisNCisJZXJyID0gY2xhc3NfZGV2aWNlX2NyZWF0ZV9maWxlKCZkZXYt 
PmNsYXNzX2RldmljZSwgJmNsYXNzX2RldmljZV9hdHRyX2RldmljZXMpOw0KKwlpZiAoZXJyKQ0K KwkJZHByaW50aygiRmFpbGVkIHRvIGNyZWF0ZSBcImRldmljZXNcIiBhdHRyaWJ1dGU6IGVycj0l ZC5cbiIsDQorCQkJZXJyKTsNCisJZXJyID0gY2xhc3NfZGV2aWNlX2NyZWF0ZV9maWxlKCZkZXYt PmNsYXNzX2RldmljZSwgJmNsYXNzX2RldmljZV9hdHRyX2xicyk7DQorCWlmIChlcnIpDQorCQlk cHJpbnRrKCJGYWlsZWQgdG8gY3JlYXRlIFwibGJzXCIgYXR0cmlidXRlOiBlcnI9JWQuXG4iLCBl cnIpOw0KKw0KKwlyZXR1cm4gMDsNCisNCitlcnJfb3V0X2NyeXB0b19jb25uX2Zpbmk6DQorCWNy eXB0b19jb25uX2ZpbmkoKTsNCitlcnJfb3V0X2NyeXB0b19sYl9maW5pOg0KKwljcnlwdG9fbGJf ZmluaSgpOw0KK2Vycl9vdXRfY2xhc3NfdW5yZWdpc3RlcjoNCisJY2xhc3NfdW5yZWdpc3Rlcigm Y3J5cHRvX2NsYXNzKTsNCitlcnJfb3V0X2RyaXZlcl91bnJlZ2lzdGVyOg0KKwlkcml2ZXJfdW5y ZWdpc3RlcigmY3J5cHRvX2RyaXZlcik7DQorZXJyX291dF9idXNfdW5yZWdpc3RlcjoNCisJYnVz X3VucmVnaXN0ZXIoJmNyeXB0b19idXNfdHlwZSk7DQorDQorCXJldHVybiBlcnI7DQorfQ0KKw0K K3ZvaWQgX19kZXZleGl0IGNtYWluX2Zpbmkodm9pZCkNCit7DQorCXN0cnVjdCBjcnlwdG9fZGV2 aWNlICpkZXYgPSAmbWFpbl9jcnlwdG9fZGV2aWNlOw0KKw0KKwljbGFzc19kZXZpY2VfcmVtb3Zl X2ZpbGUoJmRldi0+Y2xhc3NfZGV2aWNlLCAmY2xhc3NfZGV2aWNlX2F0dHJfZGV2aWNlcyk7DQor CWNsYXNzX2RldmljZV9yZW1vdmVfZmlsZSgmZGV2LT5jbGFzc19kZXZpY2UsICZjbGFzc19kZXZp Y2VfYXR0cl9sYnMpOw0KKwlfX2NyeXB0b19kZXZpY2VfcmVtb3ZlKGRldik7DQorDQorCWNyeXB0 b19jb25uX2ZpbmkoKTsNCisJY3J5cHRvX2xiX2ZpbmkoKTsNCisNCisJY2xhc3NfdW5yZWdpc3Rl cigmY3J5cHRvX2NsYXNzKTsNCisJZHJpdmVyX3VucmVnaXN0ZXIoJmNyeXB0b19kcml2ZXIpOw0K KwlidXNfdW5yZWdpc3RlcigmY3J5cHRvX2J1c190eXBlKTsNCit9DQorDQorbW9kdWxlX2luaXQo Y21haW5faW5pdCk7DQorbW9kdWxlX2V4aXQoY21haW5fZmluaSk7DQorDQorTU9EVUxFX0xJQ0VO U0UoIkdQTCIpOw0KK01PRFVMRV9BVVRIT1IoIkV2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmth Lm1pcHQucnU+Iik7DQorTU9EVUxFX0RFU0NSSVBUSU9OKCJBc3luY2hyb25vdXMgY3J5cHRvIGxh eWVyLiIpOw0KKw0KK0VYUE9SVF9TWU1CT0woY3J5cHRvX3Nlc3Npb25fYWxsb2MpOw0KK0VYUE9S VF9TWU1CT0xfR1BMKGNyeXB0b19zZXNzaW9uX2NyZWF0ZSk7DQorRVhQT1JUX1NZTUJPTF9HUEwo Y3J5cHRvX3Nlc3Npb25fYWRkKTsNCitFWFBPUlRfU1lNQk9MX0dQTChjcnlwdG9fc2Vzc2lvbl9k 
ZXF1ZXVlX3JvdXRlKTsNCmRpZmYgLU5ydSAvdG1wL2VtcHR5L2NyeXB0b19wcm92aWRlci5jIGxp bnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vY3J5cHRvX3Byb3ZpZGVyLmMNCi0tLSAvdG1wL2VtcHR5 L2NyeXB0b19wcm92aWRlci5jCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQor KysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by9jcnlwdG9fcHJvdmlkZXIuYwkyMDA0LTEyLTE0 IDE4OjUzOjExLjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSwxODMgQEANCisvKg0KKyAqIAlj cnlwdG9fcHJvdmlkZXIuYw0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IEV2Z2VuaXkgUG9s eWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+DQorICogDQorICoNCisgKiBUaGlzIHByb2dyYW0g aXMgZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yIG1vZGlmeQ0K KyAqIGl0IHVuZGVyIHRoZSB0ZXJtcyBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2Ug YXMgcHVibGlzaGVkIGJ5DQorICogdGhlIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgZWl0aGVy IHZlcnNpb24gMiBvZiB0aGUgTGljZW5zZSwgb3INCisgKiAoYXQgeW91ciBvcHRpb24pIGFueSBs YXRlciB2ZXJzaW9uLg0KKyAqDQorICogVGhpcyBwcm9ncmFtIGlzIGRpc3RyaWJ1dGVkIGluIHRo ZSBob3BlIHRoYXQgaXQgd2lsbCBiZSB1c2VmdWwsDQorICogYnV0IFdJVEhPVVQgQU5ZIFdBUlJB TlRZOyB3aXRob3V0IGV2ZW4gdGhlIGltcGxpZWQgd2FycmFudHkgb2YNCisgKiBNRVJDSEFOVEFC SUxJVFkgb3IgRklUTkVTUyBGT1IgQSBQQVJUSUNVTEFSIFBVUlBPU0UuICBTZWUgdGhlDQorICog R05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgZm9yIG1vcmUgZGV0YWlscy4NCisgKg0KKyAqIFlv dSBzaG91bGQgaGF2ZSByZWNlaXZlZCBhIGNvcHkgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBM aWNlbnNlDQorICogYWxvbmcgd2l0aCB0aGlzIHByb2dyYW07IGlmIG5vdCwgd3JpdGUgdG8gdGhl IEZyZWUgU29mdHdhcmUNCisgKiBGb3VuZGF0aW9uLCBJbmMuLCA1OSBUZW1wbGUgUGxhY2UsIFN1 aXRlIDMzMCwgQm9zdG9uLCBNQSAwMjExMS0xMzA3IFVTQQ0KKyAqLw0KKw0KKyNpbmNsdWRlIDxs aW51eC9rZXJuZWwuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQorI2luY2x1ZGUgPGxp bnV4L21vZHVsZXBhcmFtLmg+DQorI2luY2x1ZGUgPGxpbnV4L3R5cGVzLmg+DQorI2luY2x1ZGUg PGxpbnV4L2xpc3QuaD4NCisjaW5jbHVkZSA8bGludXgvc2xhYi5oPg0KKyNpbmNsdWRlIDxsaW51 eC9pbnRlcnJ1cHQuaD4NCisjaW5jbHVkZSA8bGludXgvc3BpbmxvY2suaD4NCisjaW5jbHVkZSA8 bGludXgvd29ya3F1ZXVlLmg+DQorI2luY2x1ZGUgPGxpbnV4L2Vyci5oPg0KKyNpbmNsdWRlIDxs 
aW51eC9jcnlwdG8uaD4NCisjaW5jbHVkZSA8bGludXgvbW0uaD4NCisNCisjZGVmaW5lIERFQlVH DQorI2luY2x1ZGUgImFjcnlwdG8uaCINCisjaW5jbHVkZSAiY3J5cHRvX3N0YXQuaCINCisjaW5j bHVkZSAiY3J5cHRvX2RlZi5oIg0KKw0KK3N0YXRpYyB2b2lkIHByb3ZfZGF0YV9yZWFkeShzdHJ1 Y3QgY3J5cHRvX2RldmljZSAqZGV2KTsNCisNCitzdGF0aWMgc3RydWN0IGNyeXB0b19jYXBhYmls aXR5IHByb3ZfY2Fwc1tdID0gew0KKwl7Q1JZUFRPX09QX0VOQ1JZUFQsIENSWVBUT19UWVBFX0FF U18xMjgsIENSWVBUT19NT0RFX0VDQiwgMTAwMH0sDQorCXtDUllQVE9fT1BfREVDUllQVCwgQ1JZ UFRPX1RZUEVfQUVTXzEyOCwgQ1JZUFRPX01PREVfRUNCLCAxMDAwfSwNCit9Ow0KK3N0YXRpYyBp bnQgcHJvdl9jYXBfbnVtYmVyID0gc2l6ZW9mKHByb3ZfY2FwcykgLyBzaXplb2YocHJvdl9jYXBz WzBdKTsNCisNCitzdGF0aWMgc3RydWN0IGNvbXBsZXRpb24gdGhyZWFkX2V4aXRlZDsNCitzdGF0 aWMgREVDTEFSRV9XQUlUX1FVRVVFX0hFQUQoY3J5cHRvX3dhaXRfcXVldWUpOw0KK3N0YXRpYyBp bnQgbmVlZF9leGl0Ow0KK3N0YXRpYyBzdHJ1Y3QgY3J5cHRvX3RmbSAqdGZtOw0KKw0KK3N0YXRp YyBzdHJ1Y3QgY3J5cHRvX2RldmljZSBwZGV2ID0gew0KKwkubmFtZSA9ICJjcnlwdG9fcHJvdmlk ZXIiLA0KKwkuZGF0YV9yZWFkeSA9IHByb3ZfZGF0YV9yZWFkeSwNCisJLmNhcCA9ICZwcm92X2Nh cHNbMF0sDQorfTsNCisNCitzdGF0aWMgdm9pZCBwcm92X2RhdGFfcmVhZHkoc3RydWN0IGNyeXB0 b19kZXZpY2UgKmRldikNCit7DQorCXdha2VfdXAoJmNyeXB0b193YWl0X3F1ZXVlKTsNCit9DQor DQorc3RhdGljIGludCBjcnlwdG9fdGhyZWFkKHZvaWQgKmRhdGEpDQorew0KKwlzdHJ1Y3QgY3J5 cHRvX2RldmljZSAqZGV2ID0gKHN0cnVjdCBjcnlwdG9fZGV2aWNlICopZGF0YTsNCisJdTggKmtl eTsNCisJdW5zaWduZWQgaW50IGtleWxlbjsNCisJaW50IGVycjsNCisJc3RydWN0IGNyeXB0b19z ZXNzaW9uICpzLCAqbjsNCisNCisJZGFlbW9uaXplKCIlcyIsIGRldi0+bmFtZSk7DQorCWFsbG93 X3NpZ25hbChTSUdURVJNKTsNCisNCisJd2hpbGUgKCFuZWVkX2V4aXQpIHsNCisJCWludGVycnVw dGlibGVfc2xlZXBfb25fdGltZW91dCgmY3J5cHRvX3dhaXRfcXVldWUsIDEwMCk7DQorDQorCQlp ZiAobmVlZF9leGl0KQ0KKwkJCWJyZWFrOw0KKw0KKwkJbGlzdF9mb3JfZWFjaF9lbnRyeV9zYWZl KHMsIG4sICZkZXYtPnNlc3Npb25fbGlzdCwgZGV2X3F1ZXVlX2VudHJ5KSB7DQorCQkJaWYgKCFz cGluX3RyeWxvY2soJnMtPmxvY2spKQ0KKwkJCQljb250aW51ZTsNCisNCisJCQlpZiAoc2Vzc2lv bl9jb21wbGV0ZWQocykpIHsNCisJCQkJZ290byB1bmxvY2s7DQorCQkJfQ0KKwkJCXN0YXJ0X3By 
b2Nlc3Nfc2Vzc2lvbihzKTsNCisNCisJCQlkcHJpbnRrKCJCZWdpbiB0byBwcm9jZXNzIHNlc3Np b24gJWxsdSBbJWxsdV0gaW4gJXMuXG4iLA0KKwkJCQlzLT5jaS5pZCwgcy0+Y2kuZGV2X2lkLCBw ZGV2Lm5hbWUpOw0KKw0KKwkJCWtleSA9ICgodTggKikgcGFnZV9hZGRyZXNzKHMtPmRhdGEuc2df a2V5LnBhZ2UpKSArIHMtPmRhdGEuc2dfa2V5Lm9mZnNldDsNCisJCQlrZXlsZW4gPSBzLT5kYXRh LnNnX2tleS5sZW5ndGg7DQorDQorCQkJZXJyID0gY3J5cHRvX2NpcGhlcl9zZXRrZXkodGZtLCBr ZXksIGtleWxlbik7DQorCQkJaWYgKGVycikgew0KKwkJCQlkcHJpbnRrKEtFUk5fRVJSICJGYWls ZWQgdG8gc2V0IGtleSBba2V5bGVuPSVkXTogZXJyPSVkLlxuIiwNCisJCQkJICAgICBrZXlsZW4s IGVycik7DQorCQkJCWJyb2tlX3Nlc3Npb24ocyk7DQorCQkJCWdvdG8gb3V0Ow0KKwkJCX0NCisN CisJCQlpZiAocy0+Y2kub3BlcmF0aW9uID09IENSWVBUT19PUF9FTkNSWVBUKQ0KKwkJCQljcnlw dG9fY2lwaGVyX2VuY3J5cHQodGZtLCAmcy0+ZGF0YS5zZ19kc3QsICZzLT5kYXRhLnNnX3NyYywg cy0+ZGF0YS5zZ19zcmMubGVuZ3RoKTsNCisJCQllbHNlDQorCQkJCWNyeXB0b19jaXBoZXJfZGVj cnlwdCh0Zm0sICZzLT5kYXRhLnNnX2RzdCwgJnMtPmRhdGEuc2dfc3JjLCBzLT5kYXRhLnNnX3Ny Yy5sZW5ndGgpOw0KKw0KKwkJCXMtPmRhdGEuc2dfZHN0Lmxlbmd0aCA9IHMtPmRhdGEuc2dfc3Jj Lmxlbmd0aDsNCisJCQlzLT5kYXRhLnNnX2RzdC5vZmZzZXQgPSAwOw0KKw0KKwkJCWRwcmludGso S0VSTl9JTkZPICJDb21wbGV0aW5nIHNlc3Npb24gJWxsdSBbJWxsdV0gaW4gJXMuXG4iLA0KKwkJ CQlzLT5jaS5pZCwgcy0+Y2kuZGV2X2lkLCBwZGV2Lm5hbWUpOw0KKw0KK291dDoNCisJCQljcnlw dG9fc3RhdF9jb21wbGV0ZV9pbmMocyk7DQorCQkJY3J5cHRvX3Nlc3Npb25fZGVxdWV1ZV9yb3V0 ZShzKTsNCisJCQljb21wbGV0ZV9zZXNzaW9uKHMpOw0KKwkJCXN0b3BfcHJvY2Vzc19zZXNzaW9u KHMpOw0KK3VubG9jazoNCisJCQlzcGluX3VubG9jaygmcy0+bG9jayk7DQorCQl9DQorCX0NCisN CisJY29tcGxldGVfYW5kX2V4aXQoJnRocmVhZF9leGl0ZWQsIDApOw0KK30NCisNCitpbnQgcHJv dl9pbml0KHZvaWQpDQorew0KKwlpbnQgZXJyLCBwaWQ7DQorDQorCXRmbSA9IGNyeXB0b19hbGxv Y190Zm0oImFlcyIsIDApOw0KKwlpZiAoIXRmbSkgew0KKwkJcHJpbnRrKEtFUk5fRVJSICJGYWls ZWQgdG8gYWxsb2NhdGUgU0hBMSB0Zm0uXG4iKTsNCisJCXJldHVybiAtRUlOVkFMOw0KKwl9DQor DQorCWluaXRfY29tcGxldGlvbigmdGhyZWFkX2V4aXRlZCk7DQorCXBpZCA9IGtlcm5lbF90aHJl YWQoY3J5cHRvX3RocmVhZCwgJnBkZXYsIENMT05FX0ZTIHwgQ0xPTkVfRklMRVMpOw0KKwlpZiAo 
SVNfRVJSKCh2b2lkICopcGlkKSkgew0KKwkJZXJyID0gLUVJTlZBTDsNCisJCWRwcmludGsoS0VS Tl9FUlIgIkZhaWxlZCB0byBjcmVhdGUga2VybmVsIGxvYWQgYmFsYW5jaW5nIHRocmVhZC5cbiIp Ow0KKwkJZ290byBlcnJfb3V0X2ZyZWVfdGZtOw0KKwl9DQorDQorCXBkZXYuY2FwX251bWJlciA9 IHByb3ZfY2FwX251bWJlcjsNCisNCisJZXJyID0gY3J5cHRvX2RldmljZV9hZGQoJnBkZXYpOw0K KwlpZiAoZXJyKQ0KKwkJZ290byBlcnJfb3V0X3JlbW92ZV90aHJlYWQ7DQorDQorCWRwcmludGso S0VSTl9JTkZPICJUZXN0IGNyeXB0byBwcm92aWRlciBtb2R1bGUgJXMgaXMgbG9hZGVkLlxuIiwg cGRldi5uYW1lKTsNCisNCisJcmV0dXJuIDA7DQorDQorZXJyX291dF9yZW1vdmVfdGhyZWFkOg0K KwluZWVkX2V4aXQgPSAxOw0KKwl3YWtlX3VwKCZjcnlwdG9fd2FpdF9xdWV1ZSk7DQorCXdhaXRf Zm9yX2NvbXBsZXRpb24oJnRocmVhZF9leGl0ZWQpOw0KK2Vycl9vdXRfZnJlZV90Zm06DQorCWNy eXB0b19mcmVlX3RmbSh0Zm0pOw0KKw0KKwlyZXR1cm4gZXJyOw0KK30NCisNCit2b2lkIHByb3Zf ZmluaSh2b2lkKQ0KK3sNCisJbmVlZF9leGl0ID0gMTsNCisJd2FrZV91cCgmY3J5cHRvX3dhaXRf cXVldWUpOw0KKwl3YWl0X2Zvcl9jb21wbGV0aW9uKCZ0aHJlYWRfZXhpdGVkKTsNCisNCisJY3J5 cHRvX2RldmljZV9yZW1vdmUoJnBkZXYpOw0KKwljcnlwdG9fZnJlZV90Zm0odGZtKTsNCisNCisJ ZHByaW50ayhLRVJOX0lORk8gIlRlc3QgY3J5cHRvIHByb3ZpZGVyIG1vZHVsZSAlcyBpcyB1bmxv YWRlZC5cbiIsIHBkZXYubmFtZSk7DQorfQ0KKw0KK21vZHVsZV9pbml0KHByb3ZfaW5pdCk7DQor bW9kdWxlX2V4aXQocHJvdl9maW5pKTsNCisNCitNT0RVTEVfTElDRU5TRSgiR1BMIik7DQorTU9E VUxFX0FVVEhPUigiRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4iKTsNCitN T0RVTEVfREVTQ1JJUFRJT04oIlRlc3QgY3J5cHRvIG1vZHVsZSBwcm92aWRlci4iKTsNCkJpbmFy eSBmaWxlcyAvdG1wL2VtcHR5Ly5jcnlwdG9fcHJvdmlkZXIuYy5zd3AgYW5kIGxpbnV4LTIuNi9k cml2ZXJzL2FjcnlwdG8vLmNyeXB0b19wcm92aWRlci5jLnN3cCBkaWZmZXINCmRpZmYgLU5ydSAv dG1wL2VtcHR5L2NyeXB0b19yb3V0ZS5oIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vY3J5cHRv X3JvdXRlLmgNCi0tLSAvdG1wL2VtcHR5L2NyeXB0b19yb3V0ZS5oCTE5NzAtMDEtMDEgMDM6MDA6 MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by9jcnlwdG9f cm91dGUuaAkyMDA0LTEyLTE0IDE4OjUzOjExLjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSwy NDIgQEANCisvKg0KKyAqIAljcnlwdG9fcm91dGUuaA0KKyAqDQorICogQ29weXJpZ2h0IChjKSAy 
MDA0IEV2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+DQorICogDQorICoNCisg KiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgaXQg YW5kL29yIG1vZGlmeQ0KKyAqIGl0IHVuZGVyIHRoZSB0ZXJtcyBvZiB0aGUgR05VIEdlbmVyYWwg UHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5DQorICogdGhlIEZyZWUgU29mdHdhcmUgRm91 bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiBvZiB0aGUgTGljZW5zZSwgb3INCisgKiAoYXQgeW91 ciBvcHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0KKyAqDQorICogVGhpcyBwcm9ncmFtIGlzIGRp c3RyaWJ1dGVkIGluIHRoZSBob3BlIHRoYXQgaXQgd2lsbCBiZSB1c2VmdWwsDQorICogYnV0IFdJ VEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRob3V0IGV2ZW4gdGhlIGltcGxpZWQgd2FycmFudHkgb2YN CisgKiBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVTUyBGT1IgQSBQQVJUSUNVTEFSIFBVUlBPU0Uu ICBTZWUgdGhlDQorICogR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgZm9yIG1vcmUgZGV0YWls cy4NCisgKg0KKyAqIFlvdSBzaG91bGQgaGF2ZSByZWNlaXZlZCBhIGNvcHkgb2YgdGhlIEdOVSBH ZW5lcmFsIFB1YmxpYyBMaWNlbnNlDQorICogYWxvbmcgd2l0aCB0aGlzIHByb2dyYW07IGlmIG5v dCwgd3JpdGUgdG8gdGhlIEZyZWUgU29mdHdhcmUNCisgKiBGb3VuZGF0aW9uLCBJbmMuLCA1OSBU ZW1wbGUgUGxhY2UsIFN1aXRlIDMzMCwgQm9zdG9uLCBNQSAwMjExMS0xMzA3IFVTQQ0KKyAqLw0K Kw0KKyNpZm5kZWYgX19DUllQVE9fUk9VVEVfSA0KKyNkZWZpbmUgX19DUllQVE9fUk9VVEVfSA0K Kw0KKyNpbmNsdWRlIDxsaW51eC90eXBlcy5oPg0KKyNpbmNsdWRlIDxsaW51eC9zbGFiLmg+DQor I2luY2x1ZGUgPGxpbnV4L3NwaW5sb2NrLmg+DQorDQorI2luY2x1ZGUgImFjcnlwdG8uaCINCisN CitzdGF0aWMgaW5saW5lIHN0cnVjdCBjcnlwdG9fcm91dGUgKmNyeXB0b19yb3V0ZV9hbGxvY19k aXJlY3Qoc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldiwNCisJCQkJCQkJICAgICBzdHJ1Y3QgY3J5 cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXIgKmNpKQ0KK3sNCisJc3RydWN0IGNyeXB0b19yb3V0ZSAq cnQ7DQorDQorCXJ0ID0ga21hbGxvYyhzaXplb2YoKnJ0KSwgR0ZQX0FUT01JQyk7DQorCWlmICgh cnQpIHsNCisJCWNyeXB0b19kZXZpY2VfcHV0KGRldik7DQorCQlyZXR1cm4gTlVMTDsNCisJfQ0K Kw0KKwltZW1zZXQocnQsIDAsIHNpemVvZigqcnQpKTsNCisJbWVtY3B5KCZydC0+Y2ksIGNpLCBz aXplb2YoKmNpKSk7DQorDQorCXJ0LT5kZXYgPSBkZXY7DQorDQorCXJldHVybiBydDsNCit9DQor DQorc3RhdGljIGlubGluZSBzdHJ1Y3QgY3J5cHRvX3JvdXRlICpjcnlwdG9fcm91dGVfYWxsb2Mo 
c3RydWN0IGNyeXB0b19kZXZpY2UgKmRldiwNCisJCQkJCQkJc3RydWN0IGNyeXB0b19zZXNzaW9u X2luaXRpYWxpemVyICpjaSkNCit7DQorCXN0cnVjdCBjcnlwdG9fcm91dGUgKnJ0Ow0KKw0KKwlp ZiAoIW1hdGNoX2luaXRpYWxpemVyKGRldiwgY2kpKQ0KKwkJcmV0dXJuIE5VTEw7DQorDQorCXJ0 ID0gY3J5cHRvX3JvdXRlX2FsbG9jX2RpcmVjdChkZXYsIGNpKTsNCisNCisJcmV0dXJuIHJ0Ow0K K30NCisNCitzdGF0aWMgaW5saW5lIHZvaWQgY3J5cHRvX3JvdXRlX2ZyZWUoc3RydWN0IGNyeXB0 b19yb3V0ZSAqcnQpDQorew0KKwljcnlwdG9fZGV2aWNlX3B1dChydC0+ZGV2KTsNCisJcnQtPmRl diA9IE5VTEw7DQorCWtmcmVlKHJ0KTsNCit9DQorDQorc3RhdGljIGlubGluZSB2b2lkIF9fY3J5 cHRvX3JvdXRlX2RlbChzdHJ1Y3QgY3J5cHRvX3JvdXRlICpydCwgc3RydWN0IGNyeXB0b19yb3V0 ZV9oZWFkICpsaXN0KQ0KK3sNCisJc3RydWN0IGNyeXB0b19yb3V0ZSAqbmV4dCwgKnByZXY7DQor DQorCWxpc3QtPnFsZW4tLTsNCisJbmV4dCA9IHJ0LT5uZXh0Ow0KKwlwcmV2ID0gcnQtPnByZXY7 DQorCXJ0LT5uZXh0ID0gcnQtPnByZXYgPSBOVUxMOw0KKwlydC0+bGlzdCA9IE5VTEw7DQorCW5l eHQtPnByZXYgPSBwcmV2Ow0KKwlwcmV2LT5uZXh0ID0gbmV4dDsNCit9DQorDQorc3RhdGljIGlu bGluZSB2b2lkIGNyeXB0b19yb3V0ZV9kZWwoc3RydWN0IGNyeXB0b19yb3V0ZSAqcnQpDQorew0K KwlzdHJ1Y3QgY3J5cHRvX3JvdXRlX2hlYWQgKmxpc3QgPSBydC0+bGlzdDsNCisNCisJaWYgKGxp c3QpIHsNCisJCXNwaW5fbG9ja19pcnEoJmxpc3QtPmxvY2spOw0KKwkJaWYgKGxpc3QgPT0gcnQt Pmxpc3QpDQorCQkJX19jcnlwdG9fcm91dGVfZGVsKHJ0LCBydC0+bGlzdCk7DQorCQlzcGluX3Vu bG9ja19pcnEoJmxpc3QtPmxvY2spOw0KKw0KKwkJY3J5cHRvX3JvdXRlX2ZyZWUocnQpOw0KKwl9 DQorfQ0KKw0KK3N0YXRpYyBpbmxpbmUgc3RydWN0IGNyeXB0b19yb3V0ZSAqX19jcnlwdG9fcm91 dGVfZGVxdWV1ZShzdHJ1Y3QgY3J5cHRvX3JvdXRlX2hlYWQgKmxpc3QpDQorew0KKwlzdHJ1Y3Qg Y3J5cHRvX3JvdXRlICpuZXh0LCAqcHJldiwgKnJlc3VsdDsNCisNCisJcHJldiA9IChzdHJ1Y3Qg Y3J5cHRvX3JvdXRlICopbGlzdDsNCisJbmV4dCA9IHByZXYtPm5leHQ7DQorCXJlc3VsdCA9IE5V TEw7DQorCWlmIChuZXh0ICE9IHByZXYpIHsNCisJCXJlc3VsdCA9IG5leHQ7DQorCQluZXh0ID0g bmV4dC0+bmV4dDsNCisJCWxpc3QtPnFsZW4tLTsNCisJCW5leHQtPnByZXYgPSBwcmV2Ow0KKwkJ cHJldi0+bmV4dCA9IG5leHQ7DQorCQlyZXN1bHQtPm5leHQgPSByZXN1bHQtPnByZXYgPSBOVUxM Ow0KKwkJcmVzdWx0LT5saXN0ID0gTlVMTDsNCisJfQ0KKwlyZXR1cm4gcmVzdWx0Ow0KK30NCisN 
CitzdGF0aWMgaW5saW5lIHN0cnVjdCBjcnlwdG9fcm91dGUgKmNyeXB0b19yb3V0ZV9kZXF1ZXVl KHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcykNCit7DQorCXN0cnVjdCBjcnlwdG9fcm91dGUgKnJ0 Ow0KKw0KKwlzcGluX2xvY2tfaXJxKCZzLT5yb3V0ZV9saXN0LmxvY2spOw0KKw0KKwlydCA9IF9f Y3J5cHRvX3JvdXRlX2RlcXVldWUoJnMtPnJvdXRlX2xpc3QpOw0KKw0KKwlzcGluX3VubG9ja19p cnEoJnMtPnJvdXRlX2xpc3QubG9jayk7DQorDQorCXJldHVybiBydDsNCit9DQorDQorc3RhdGlj IGlubGluZSB2b2lkIF9fY3J5cHRvX3JvdXRlX3F1ZXVlKHN0cnVjdCBjcnlwdG9fcm91dGUgKnJ0 LCBzdHJ1Y3QgY3J5cHRvX3JvdXRlX2hlYWQgKmxpc3QpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX3Jv dXRlICpwcmV2LCAqbmV4dDsNCisNCisJcnQtPmxpc3QgPSBsaXN0Ow0KKwlsaXN0LT5xbGVuKys7 DQorCW5leHQgPSAoc3RydWN0IGNyeXB0b19yb3V0ZSAqKWxpc3Q7DQorCXByZXYgPSBuZXh0LT5w cmV2Ow0KKwlydC0+bmV4dCA9IG5leHQ7DQorCXJ0LT5wcmV2ID0gcHJldjsNCisJbmV4dC0+cHJl diA9IHByZXYtPm5leHQgPSBydDsNCit9DQorDQorc3RhdGljIGlubGluZSB2b2lkIGNyeXB0b19y b3V0ZV9xdWV1ZShzdHJ1Y3QgY3J5cHRvX3JvdXRlICpydCwgc3RydWN0IGNyeXB0b19zZXNzaW9u ICpzKQ0KK3sNCisNCisJc3Bpbl9sb2NrX2lycSgmcy0+cm91dGVfbGlzdC5sb2NrKTsNCisNCisJ X19jcnlwdG9fcm91dGVfcXVldWUocnQsICZzLT5yb3V0ZV9saXN0KTsNCisNCisJc3Bpbl91bmxv Y2tfaXJxKCZzLT5yb3V0ZV9saXN0LmxvY2spOw0KK30NCisNCitzdGF0aWMgaW5saW5lIGludCBj cnlwdG9fcm91dGVfYWRkKHN0cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYsIHN0cnVjdCBjcnlwdG9f c2Vzc2lvbiAqcywgDQorCQkJCQkJc3RydWN0IGNyeXB0b19zZXNzaW9uX2luaXRpYWxpemVyICpj aSkNCit7DQorCXN0cnVjdCBjcnlwdG9fcm91dGUgKnJ0Ow0KKw0KKwlydCA9IGNyeXB0b19yb3V0 ZV9hbGxvYyhkZXYsIGNpKTsNCisJaWYgKCFydCkNCisJCXJldHVybiAtRU5PTUVNOw0KKw0KKwlj cnlwdG9fcm91dGVfcXVldWUocnQsIHMpOw0KKw0KKwlyZXR1cm4gMDsNCit9DQorDQorc3RhdGlj IGlubGluZSBpbnQgY3J5cHRvX3JvdXRlX2FkZF9kaXJlY3Qoc3RydWN0IGNyeXB0b19kZXZpY2Ug KmRldiwgc3RydWN0IGNyeXB0b19zZXNzaW9uICpzLA0KKwkJCQkJCXN0cnVjdCBjcnlwdG9fc2Vz c2lvbl9pbml0aWFsaXplciAqY2kpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX3JvdXRlICpydDsNCisN CisJcnQgPSBjcnlwdG9fcm91dGVfYWxsb2NfZGlyZWN0KGRldiwgY2kpOw0KKwlpZiAoIXJ0KQ0K KwkJcmV0dXJuIC1FTk9NRU07DQorDQorCWNyeXB0b19yb3V0ZV9xdWV1ZShydCwgcyk7DQorDQor 
CXJldHVybiAwOw0KK30NCisNCitzdGF0aWMgaW5saW5lIGludCBjcnlwdG9fcm91dGVfcXVldWVf bGVuKHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcykNCit7DQorCXJldHVybiBzLT5yb3V0ZV9saXN0 LnFsZW47DQorfQ0KKw0KK3N0YXRpYyBpbmxpbmUgdm9pZCBjcnlwdG9fcm91dGVfaGVhZF9pbml0 KHN0cnVjdCBjcnlwdG9fcm91dGVfaGVhZCAqbGlzdCkNCit7DQorCXNwaW5fbG9ja19pbml0KCZs aXN0LT5sb2NrKTsNCisJbGlzdC0+cHJldiA9IGxpc3QtPm5leHQgPSAoc3RydWN0IGNyeXB0b19y b3V0ZSAqKWxpc3Q7DQorCWxpc3QtPnFsZW4gPSAwOw0KK30NCisNCitzdGF0aWMgaW5saW5lIHN0 cnVjdCBjcnlwdG9fcm91dGUgKl9fY3J5cHRvX3JvdXRlX2N1cnJlbnQoc3RydWN0IGNyeXB0b19y b3V0ZV9oZWFkICpsaXN0KQ0KK3sNCisJc3RydWN0IGNyeXB0b19yb3V0ZSAqbmV4dCwgKnByZXYs ICpyZXN1bHQ7DQorDQorCXByZXYgPSAoc3RydWN0IGNyeXB0b19yb3V0ZSAqKWxpc3Q7DQorCW5l eHQgPSBwcmV2LT5uZXh0Ow0KKwlyZXN1bHQgPSBOVUxMOw0KKwlpZiAobmV4dCAhPSBwcmV2KQ0K KwkJcmVzdWx0ID0gbmV4dDsNCisNCisJcmV0dXJuIHJlc3VsdDsNCit9DQorDQorc3RhdGljIGlu bGluZSBzdHJ1Y3QgY3J5cHRvX3JvdXRlICpjcnlwdG9fcm91dGVfY3VycmVudChzdHJ1Y3QgY3J5 cHRvX3Nlc3Npb24gKnMpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX3JvdXRlX2hlYWQgKmxpc3Q7DQor CXN0cnVjdCBjcnlwdG9fcm91dGUgKnJ0ID0gTlVMTDsNCisNCisJbGlzdCA9ICZzLT5yb3V0ZV9s aXN0Ow0KKw0KKwlpZiAobGlzdCkgew0KKwkJc3Bpbl9sb2NrX2lycSgmbGlzdC0+bG9jayk7DQor DQorCQlydCA9IF9fY3J5cHRvX3JvdXRlX2N1cnJlbnQobGlzdCk7DQorDQorCQlzcGluX3VubG9j a19pcnEoJmxpc3QtPmxvY2spOw0KKwl9DQorDQorCXJldHVybiBydDsNCit9DQorDQorc3RhdGlj IGlubGluZSBzdHJ1Y3QgY3J5cHRvX2RldmljZSAqY3J5cHRvX3JvdXRlX2dldF9jdXJyZW50X2Rl dmljZShzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKnMpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX3JvdXRl ICpydCA9IE5VTEw7DQorCXN0cnVjdCBjcnlwdG9fZGV2aWNlICpkZXYgPSBOVUxMOw0KKwlzdHJ1 Y3QgY3J5cHRvX3JvdXRlX2hlYWQgKmxpc3QgPSAmcy0+cm91dGVfbGlzdDsNCisNCisJc3Bpbl9s b2NrX2lycSgmbGlzdC0+bG9jayk7DQorDQorCXJ0ID0gX19jcnlwdG9fcm91dGVfY3VycmVudChs aXN0KTsNCisJaWYgKHJ0KSB7DQorCQlkZXYgPSBydC0+ZGV2Ow0KKwkJY3J5cHRvX2RldmljZV9n ZXQoZGV2KTsNCisJfQ0KKw0KKwlzcGluX3VubG9ja19pcnEoJmxpc3QtPmxvY2spOw0KKw0KKwly ZXR1cm4gZGV2Ow0KK30NCisNCisjZW5kaWYJCQkJLyogX19DUllQVE9fUk9VVEVfSCAqLw0KZGlm 
ZiAtTnJ1IC90bXAvZW1wdHkvY3J5cHRvX3N0YXQuYyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRv L2NyeXB0b19zdGF0LmMNCi0tLSAvdG1wL2VtcHR5L2NyeXB0b19zdGF0LmMJMTk3MC0wMS0wMSAw MzowMDowMC4wMDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvL2Ny eXB0b19zdGF0LmMJMjAwNC0xMi0xNCAxODo1MzoxMS4wMDAwMDAwMDAgKzAzMDANCkBAIC0wLDAg KzEsMTAwIEBADQorLyoNCisgKiAJY3J5cHRvX3N0YXQuYw0KKyAqDQorICogQ29weXJpZ2h0IChj KSAyMDA0IEV2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+DQorICogDQorICoN CisgKiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0d2FyZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUg aXQgYW5kL29yIG1vZGlmeQ0KKyAqIGl0IHVuZGVyIHRoZSB0ZXJtcyBvZiB0aGUgR05VIEdlbmVy YWwgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5DQorICogdGhlIEZyZWUgU29mdHdhcmUg Rm91bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiBvZiB0aGUgTGljZW5zZSwgb3INCisgKiAoYXQg eW91ciBvcHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0KKyAqDQorICogVGhpcyBwcm9ncmFtIGlz IGRpc3RyaWJ1dGVkIGluIHRoZSBob3BlIHRoYXQgaXQgd2lsbCBiZSB1c2VmdWwsDQorICogYnV0 IFdJVEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRob3V0IGV2ZW4gdGhlIGltcGxpZWQgd2FycmFudHkg b2YNCisgKiBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVTUyBGT1IgQSBQQVJUSUNVTEFSIFBVUlBP U0UuICBTZWUgdGhlDQorICogR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgZm9yIG1vcmUgZGV0 YWlscy4NCisgKg0KKyAqIFlvdSBzaG91bGQgaGF2ZSByZWNlaXZlZCBhIGNvcHkgb2YgdGhlIEdO VSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlDQorICogYWxvbmcgd2l0aCB0aGlzIHByb2dyYW07IGlm IG5vdCwgd3JpdGUgdG8gdGhlIEZyZWUgU29mdHdhcmUNCisgKiBGb3VuZGF0aW9uLCBJbmMuLCA1 OSBUZW1wbGUgUGxhY2UsIFN1aXRlIDMzMCwgQm9zdG9uLCBNQSAwMjExMS0xMzA3IFVTQQ0KKyAq Lw0KKw0KKyNpbmNsdWRlIDxsaW51eC9rZXJuZWwuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxl Lmg+DQorI2luY2x1ZGUgPGxpbnV4L21vZHVsZXBhcmFtLmg+DQorI2luY2x1ZGUgPGxpbnV4L3R5 cGVzLmg+DQorI2luY2x1ZGUgPGxpbnV4L2xpc3QuaD4NCisjaW5jbHVkZSA8bGludXgvc2xhYi5o Pg0KKyNpbmNsdWRlIDxsaW51eC9pbnRlcnJ1cHQuaD4NCisjaW5jbHVkZSA8bGludXgvc3Bpbmxv Y2suaD4NCisNCisjaW5jbHVkZSAiYWNyeXB0by5oIg0KKyNpbmNsdWRlICJjcnlwdG9fcm91dGUu aCINCisNCit2b2lkIGNyeXB0b19zdGF0X3N0YXJ0X2luYyhzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24g 
KnMpDQorew0KKwlzdHJ1Y3QgY3J5cHRvX2RldmljZSAqZGV2Ow0KKw0KKwlkZXYgPSBjcnlwdG9f cm91dGVfZ2V0X2N1cnJlbnRfZGV2aWNlKHMpOw0KKwlpZiAoZGV2KSB7DQorCQlzcGluX2xvY2tf aXJxKCZkZXYtPnN0YXRfbG9jayk7DQorCQlkZXYtPnN0YXQuc3N0YXJ0ZWQrKzsNCisJCXNwaW5f dW5sb2NrX2lycSgmZGV2LT5zdGF0X2xvY2spOw0KKw0KKwkJY3J5cHRvX2RldmljZV9wdXQoZGV2 KTsNCisJfQ0KK30NCisNCit2b2lkIGNyeXB0b19zdGF0X2ZpbmlzaF9pbmMoc3RydWN0IGNyeXB0 b19zZXNzaW9uICpzKQ0KK3sNCisJc3RydWN0IGNyeXB0b19kZXZpY2UgKmRldjsNCisNCisJZGV2 ID0gY3J5cHRvX3JvdXRlX2dldF9jdXJyZW50X2RldmljZShzKTsNCisJaWYgKGRldikgew0KKwkJ c3Bpbl9sb2NrX2lycSgmZGV2LT5zdGF0X2xvY2spOw0KKwkJZGV2LT5zdGF0LnNmaW5pc2hlZCsr Ow0KKwkJc3Bpbl91bmxvY2tfaXJxKCZkZXYtPnN0YXRfbG9jayk7DQorDQorCQljcnlwdG9fZGV2 aWNlX3B1dChkZXYpOw0KKwl9DQorfQ0KKw0KK3ZvaWQgY3J5cHRvX3N0YXRfY29tcGxldGVfaW5j KHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcykNCit7DQorCXN0cnVjdCBjcnlwdG9fZGV2aWNlICpk ZXY7DQorDQorCWRldiA9IGNyeXB0b19yb3V0ZV9nZXRfY3VycmVudF9kZXZpY2Uocyk7DQorCWlm IChkZXYpIHsNCisJCXNwaW5fbG9ja19pcnEoJmRldi0+c3RhdF9sb2NrKTsNCisJCWRldi0+c3Rh dC5zY29tcGxldGVkKys7DQorCQlzcGluX3VubG9ja19pcnEoJmRldi0+c3RhdF9sb2NrKTsNCisN CisJCWNyeXB0b19kZXZpY2VfcHV0KGRldik7DQorCX0NCit9DQorDQordm9pZCBjcnlwdG9fc3Rh dF9wdGltZV9pbmMoc3RydWN0IGNyeXB0b19zZXNzaW9uICpzKQ0KK3sNCisJc3RydWN0IGNyeXB0 b19kZXZpY2UgKmRldjsNCisNCisJZGV2ID0gY3J5cHRvX3JvdXRlX2dldF9jdXJyZW50X2Rldmlj ZShzKTsNCisJaWYgKGRldikgew0KKwkJaW50IGk7DQorDQorCQlzcGluX2xvY2tfaXJxKCZkZXYt PnN0YXRfbG9jayk7DQorCQlmb3IgKGkgPSAwOyBpIDwgZGV2LT5jYXBfbnVtYmVyOyArK2kpIHsN CisJCQlpZiAoX19tYXRjaF9pbml0aWFsaXplcigmZGV2LT5jYXBbaV0sICZzLT5jaSkpIHsNCisJ CQkJZGV2LT5jYXBbaV0ucHRpbWUgKz0gcy0+Y2kucHRpbWU7DQorCQkJCWRldi0+Y2FwW2ldLnNj b21wKys7DQorCQkJCWJyZWFrOw0KKwkJCX0NCisJCX0NCisJCXNwaW5fdW5sb2NrX2lycSgmZGV2 LT5zdGF0X2xvY2spOw0KKw0KKwkJY3J5cHRvX2RldmljZV9wdXQoZGV2KTsNCisJfQ0KK30NCisN CitFWFBPUlRfU1lNQk9MKGNyeXB0b19zdGF0X3N0YXJ0X2luYyk7DQorRVhQT1JUX1NZTUJPTChj cnlwdG9fc3RhdF9maW5pc2hfaW5jKTsNCitFWFBPUlRfU1lNQk9MKGNyeXB0b19zdGF0X2NvbXBs 
ZXRlX2luYyk7DQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9jcnlwdG9fc3RhdC5oIGxpbnV4LTIuNi9k cml2ZXJzL2FjcnlwdG8vY3J5cHRvX3N0YXQuaA0KLS0tIC90bXAvZW1wdHkvY3J5cHRvX3N0YXQu aAkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2 ZXJzL2FjcnlwdG8vY3J5cHRvX3N0YXQuaAkyMDA0LTEyLTE0IDE4OjUzOjExLjAwMDAwMDAwMCAr MDMwMA0KQEAgLTAsMCArMSwzMiBAQA0KKy8qDQorICogCWNyeXB0b19zdGF0LmgNCisgKg0KKyAq IENvcHlyaWdodCAoYykgMjAwNCBFdmdlbml5IFBvbHlha292IDxqb2hucG9sQDJrYS5taXB0LnJ1 Pg0KKyAqIA0KKyAqDQorICogVGhpcyBwcm9ncmFtIGlzIGZyZWUgc29mdHdhcmU7IHlvdSBjYW4g cmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkNCisgKiBpdCB1bmRlciB0aGUgdGVybXMgb2Yg dGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGFzIHB1Ymxpc2hlZCBieQ0KKyAqIHRoZSBG cmVlIFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhlIExpY2Vuc2Us IG9yDQorICogKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXIgdmVyc2lvbi4NCisgKg0KKyAqIFRo aXMgcHJvZ3JhbSBpcyBkaXN0cmlidXRlZCBpbiB0aGUgaG9wZSB0aGF0IGl0IHdpbGwgYmUgdXNl ZnVsLA0KKyAqIGJ1dCBXSVRIT1VUIEFOWSBXQVJSQU5UWTsgd2l0aG91dCBldmVuIHRoZSBpbXBs aWVkIHdhcnJhbnR5IG9mDQorICogTUVSQ0hBTlRBQklMSVRZIG9yIEZJVE5FU1MgRk9SIEEgUEFS VElDVUxBUiBQVVJQT1NFLiAgU2VlIHRoZQ0KKyAqIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNl IGZvciBtb3JlIGRldGFpbHMuDQorICoNCisgKiBZb3Ugc2hvdWxkIGhhdmUgcmVjZWl2ZWQgYSBj b3B5IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZQ0KKyAqIGFsb25nIHdpdGggdGhp cyBwcm9ncmFtOyBpZiBub3QsIHdyaXRlIHRvIHRoZSBGcmVlIFNvZnR3YXJlDQorICogRm91bmRh dGlvbiwgSW5jLiwgNTkgVGVtcGxlIFBsYWNlLCBTdWl0ZSAzMzAsIEJvc3RvbiwgTUEgMDIxMTEt MTMwNyBVU0ENCisgKi8NCisNCisjaWZuZGVmIF9fQ1JZUFRPX1NUQVRfSA0KKyNkZWZpbmUgX19D UllQVE9fU1RBVF9IDQorDQorI2luY2x1ZGUgImFjcnlwdG8uaCINCisNCit2b2lkIGNyeXB0b19z dGF0X3N0YXJ0X2luYyhzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKnMpOw0KK3ZvaWQgY3J5cHRvX3N0 YXRfZmluaXNoX2luYyhzdHJ1Y3QgY3J5cHRvX3Nlc3Npb24gKnMpOw0KK3ZvaWQgY3J5cHRvX3N0 YXRfY29tcGxldGVfaW5jKHN0cnVjdCBjcnlwdG9fc2Vzc2lvbiAqcyk7DQordm9pZCBjcnlwdG9f c3RhdF9wdGltZV9pbmMoc3RydWN0IGNyeXB0b19zZXNzaW9uICpzKTsNCisNCisjZW5kaWYJCQkJ 
LyogX19DUllQVE9fU1RBVF9IICovDQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9LY29uZmlnIGxpbnV4 LTIuNi9kcml2ZXJzL2FjcnlwdG8vS2NvbmZpZw0KLS0tIC90bXAvZW1wdHkvS2NvbmZpZwkxOTcw LTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2Fj cnlwdG8vS2NvbmZpZwkyMDA0LTEyLTE0IDE4OjA0OjM1LjAwMDAwMDAwMCArMDMwMA0KQEAgLTAs MCArMSw0MCBAQA0KK21lbnUgIkFzeW5jaHJvbm91cyBjcnlwdG8gbGF5ZXIiDQorDQorY29uZmln IEFDUllQVE8NCisJdHJpc3RhdGUgIkFzeW5jcm9ub3VzIGNyeXB0byBsYXllciINCisJZGVwZW5k cyBvbiBDT05ORUNUT1INCisJaGVscA0KKwkJU2F5IFkgaGVyZSBpZiB5b3Ugd2FudCB0byB1c2Ug bmV3IExpbnV4IGFzeW5jaHJvbm91cyBjcnlwdG8gbGF5ZXIuDQorCQlUaGlzIHN1cHBvcnQgaXMg YWxzbyBhdmFpbGFibGUgYXMgYSBtb2R1bGUuICBJZiBzbywgdGhlIG1vZHVsZSANCisJCXdpbGwg YmUgY2FsbGVkIGFjcnlwdG8ua28uDQorDQorY29uZmlnIFNJTVBMRV9MQg0KKwl0cmlzdGF0ZSAi U2ltcGxlIGFzeW5jcm9ub3VzIGNyeXB0byBsb2FkIGJhbGFuY2VyIg0KKwlkZXBlbmRzIG9uIEFD UllQVE8NCisJaGVscA0KKwkJU2F5IFkgaGVyZSBpZiB5b3Ugd2FudCB0byBjb21waWxlIHNpbXBs ZSBsb2FkIGJhbGFuY2VyIHdoaWNoIHRha2VzIGludG8NCisJCWFjY291bnQgb25seSBjcnlwdG8g ZGV2aWNlJ3MgcXVldWUgbGVuZ3RoLg0KKwkJVGhpcyBzdXBwb3J0IGlzIGFsc28gYXZhaWxhYmxl IGFzIGEgbW9kdWxlLiAgSWYgc28sIHRoZSBtb2R1bGUgDQorCQl3aWxsIGJlIGNhbGxlZCBzaW1w bF9sYi5rby4NCisNCitjb25maWcgVEVTVF9DUllQVE9fUFJPVklERVINCisJdHJpc3RhdGUgIlNp bXBsZSBjcnlwdG8gcHJvdmlkZXIiDQorCWRlcGVuZHMgb24gQ1JZUFRPICYmIEFDUllQVE8NCisJ aGVscA0KKwkJU2F5IFkgaGVyZSBpZiB5b3Ugd2FudCB0byBjb21waWxlIHRlc3QgYnJpZGdlIGJl dHdlZW4gDQorCQlzeW5jaHJvbm91cyBjcnlwdG8gbGF5ZXIgYW5kIGFzeW5jcm9ub3VzIG9uZS4N CisJCUl0IHN1cHBvcnRzIG9ubHkgQUVTLTEyOC1FQ0IuDQorCQlUaGlzIHN1cHBvcnQgaXMgYWxz byBhdmFpbGFibGUgYXMgYSBtb2R1bGUuICBJZiBzbywgdGhlIG1vZHVsZSANCisJCXdpbGwgYmUg Y2FsbGVkIGNyeXB0b19wcm92aWRlci5rby4NCisNCitjb25maWcgVEVTVF9DUllQVE9fQ09OU1VN RVINCisJdHJpc3RhdGUgIlNpbXBsZSBjcnlwdG8gY29uc3VtZXIiDQorCWRlcGVuZHMgb24gQUNS WVBUTw0KKwloZWxwDQorCQlTYXkgWSBoZXJlIGlmIHlvdSB3YW50IHRvIGNvbXBpbGUgdGVzdCBj cnlwdG8gY29uc3VtZXINCisJCXdoaWNoIGluamVjdHMgY3J5cHRvIHNlc3Npb24gb25lIHRpbWUg 
cGVyIGppZmZpZS4NCisJCUl0IHN1cHBvcnRzIG9ubHkgQUVTLTEyOC1FQ0IuDQorCQlUaGlzIHN1 cHBvcnQgaXMgYWxzbyBhdmFpbGFibGUgYXMgYSBtb2R1bGUuICBJZiBzbywgdGhlIG1vZHVsZSAN CisJCXdpbGwgYmUgY2FsbGVkIGNvbnN1bWVyLmtvLg0KKw0KK2VuZG1lbnUNCmRpZmYgLU5ydSAv dG1wL2VtcHR5L01ha2VmaWxlIGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vTWFrZWZpbGUNCi0t LSAvdG1wL2VtcHR5L01ha2VmaWxlCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAw DQorKysgbGludXgtMi42L2RyaXZlcnMvYWNyeXB0by9NYWtlZmlsZQkyMDA0LTEyLTE0IDE3OjUy OjI1LjAwMDAwMDAwMCArMDMwMA0KQEAgLTAsMCArMSw2IEBADQorb2JqLSQoQ09ORklHX0FDUllQ VE8pCQkJKz0gYWNyeXB0by5vDQorb2JqLSQoQ09ORklHX1NJTVBMRV9MQikJCQkrPSBzaW1wbGVf bGIubw0KK29iai0kKENPTkZJR19URVNUX0NSWVBUT19QUk9WSURFUikJKz0gY3J5cHRvX3Byb3Zp ZGVyLm8NCitvYmotJChDT05GSUdfVEVTVF9DUllQVE9fQ09OU1VNRVIpCSs9IGNvbnN1bWVyLm8N CisNCithY3J5cHRvLW9ianMJOj0gY3J5cHRvX21haW4ubyBjcnlwdG9fbGIubyBjcnlwdG9fZGV2 Lm8gY3J5cHRvX2Nvbm4ubyBjcnlwdG9fc3RhdC5vDQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9zaW1w bGVfbGIuYyBsaW51eC0yLjYvZHJpdmVycy9hY3J5cHRvL3NpbXBsZV9sYi5jDQotLS0gL3RtcC9l bXB0eS9zaW1wbGVfbGIuYwkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysr IGxpbnV4LTIuNi9kcml2ZXJzL2FjcnlwdG8vc2ltcGxlX2xiLmMJMjAwNC0xMi0xNCAxODo1Mzox MS4wMDAwMDAwMDAgKzAzMDANCkBAIC0wLDAgKzEsOTEgQEANCisvKg0KKyAqIAlzaW1wbGVfbGIu Yw0KKyAqDQorICogQ29weXJpZ2h0IChjKSAyMDA0IEV2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xA MmthLm1pcHQucnU+DQorICogDQorICoNCisgKiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0d2Fy ZTsgeW91IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yIG1vZGlmeQ0KKyAqIGl0IHVuZGVyIHRo ZSB0ZXJtcyBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5 DQorICogdGhlIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiBvZiB0 aGUgTGljZW5zZSwgb3INCisgKiAoYXQgeW91ciBvcHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0K KyAqDQorICogVGhpcyBwcm9ncmFtIGlzIGRpc3RyaWJ1dGVkIGluIHRoZSBob3BlIHRoYXQgaXQg d2lsbCBiZSB1c2VmdWwsDQorICogYnV0IFdJVEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRob3V0IGV2 ZW4gdGhlIGltcGxpZWQgd2FycmFudHkgb2YNCisgKiBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVT 
UyBGT1IgQSBQQVJUSUNVTEFSIFBVUlBPU0UuICBTZWUgdGhlDQorICogR05VIEdlbmVyYWwgUHVi bGljIExpY2Vuc2UgZm9yIG1vcmUgZGV0YWlscy4NCisgKg0KKyAqIFlvdSBzaG91bGQgaGF2ZSBy ZWNlaXZlZCBhIGNvcHkgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlDQorICogYWxv bmcgd2l0aCB0aGlzIHByb2dyYW07IGlmIG5vdCwgd3JpdGUgdG8gdGhlIEZyZWUgU29mdHdhcmUN CisgKiBGb3VuZGF0aW9uLCBJbmMuLCA1OSBUZW1wbGUgUGxhY2UsIFN1aXRlIDMzMCwgQm9zdG9u LCBNQSAwMjExMS0xMzA3IFVTQQ0KKyAqLw0KKw0KKyNpbmNsdWRlIDxsaW51eC9rZXJuZWwuaD4N CisjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQorI2luY2x1ZGUgPGxpbnV4L21vZHVsZXBhcmFt Lmg+DQorI2luY2x1ZGUgPGxpbnV4L3R5cGVzLmg+DQorI2luY2x1ZGUgPGxpbnV4L2xpc3QuaD4N CisjaW5jbHVkZSA8bGludXgvc2xhYi5oPg0KKw0KKyNpbmNsdWRlICJjcnlwdG9fbGIuaCINCisN CitzdGF0aWMgdm9pZCBzaW1wbGVfbGJfcmVoYXNoKHN0cnVjdCBjcnlwdG9fbGIgKik7DQorc3Rh dGljIHN0cnVjdCBjcnlwdG9fZGV2aWNlICpzaW1wbGVfbGJfZmluZF9kZXZpY2Uoc3RydWN0IGNy eXB0b19sYiAqLA0KKwkJCQkJCSAgIHN0cnVjdA0KKwkJCQkJCSAgIGNyeXB0b19zZXNzaW9uX2lu aXRpYWxpemVyICosDQorCQkJCQkJICAgc3RydWN0IGNyeXB0b19kYXRhICopOw0KKw0KK3N0cnVj dCBjcnlwdG9fbGIgc2ltcGxlX2xiID0gew0KKwkubmFtZSA9ICJzaW1wbGVfbGIiLA0KKwkucmVo YXNoID0gc2ltcGxlX2xiX3JlaGFzaCwNCisJLmZpbmRfZGV2aWNlID0gc2ltcGxlX2xiX2ZpbmRf ZGV2aWNlDQorfTsNCisNCitzdGF0aWMgdm9pZCBzaW1wbGVfbGJfcmVoYXNoKHN0cnVjdCBjcnlw dG9fbGIgKmxiKQ0KK3sNCit9DQorDQorc3RhdGljIHN0cnVjdCBjcnlwdG9fZGV2aWNlICpzaW1w bGVfbGJfZmluZF9kZXZpY2Uoc3RydWN0IGNyeXB0b19sYiAqbGIsDQorCQkJCQkJICAgc3RydWN0 DQorCQkJCQkJICAgY3J5cHRvX3Nlc3Npb25faW5pdGlhbGl6ZXINCisJCQkJCQkgICAqY2ksDQor CQkJCQkJICAgc3RydWN0IGNyeXB0b19kYXRhICpkYXRhKQ0KK3sNCisJc3RydWN0IGNyeXB0b19k ZXZpY2UgKmRldiwgKl9fZGV2Ow0KKwlpbnQgbWluID0gMHg3ZmZmZmZmOw0KKw0KKwlfX2RldiA9 IE5VTEw7DQorCWxpc3RfZm9yX2VhY2hfZW50cnkoZGV2LCBsYi0+Y3J5cHRvX2RldmljZV9saXN0 LCBjZGV2X2VudHJ5KSB7DQorCQlpZiAoZGV2aWNlX2Jyb2tlbihkZXYpKQ0KKwkJCWNvbnRpbnVl Ow0KKwkJaWYgKCFtYXRjaF9pbml0aWFsaXplcihkZXYsIGNpKSkNCisJCQljb250aW51ZTsNCisN CisJCWlmIChhdG9taWNfcmVhZCgmZGV2LT5yZWZjbnQpIDwgbWluKSB7DQorCQkJbWluID0gYXRv 
bWljX3JlYWQoJmRldi0+cmVmY250KTsNCisJCQlfX2RldiA9IGRldjsNCisJCX0NCisJfQ0KKw0K KwlyZXR1cm4gX19kZXY7DQorfQ0KKw0KK2ludCBfX2RldmluaXQgc2ltcGxlX2xiX2luaXQodm9p ZCkNCit7DQorCWRwcmludGsoS0VSTl9JTkZPICJSZWdpc3RlcmluZyBzaW1wbGUgY3J5cHRvIGxv YWQgYmFsYW5jZXIuXG4iKTsNCisNCisJcmV0dXJuIGNyeXB0b19sYl9yZWdpc3Rlcigmc2ltcGxl X2xiLCAxLCAxKTsNCit9DQorDQordm9pZCBfX2RldmV4aXQgc2ltcGxlX2xiX2Zpbmkodm9pZCkN Cit7DQorCWRwcmludGsoS0VSTl9JTkZPICJVbnJlZ2lzdGVyaW5nIHNpbXBsZSBjcnlwdG8gbG9h ZCBiYWxhbmNlci5cbiIpOw0KKw0KKwljcnlwdG9fbGJfdW5yZWdpc3Rlcigmc2ltcGxlX2xiKTsN Cit9DQorDQorbW9kdWxlX2luaXQoc2ltcGxlX2xiX2luaXQpOw0KK21vZHVsZV9leGl0KHNpbXBs ZV9sYl9maW5pKTsNCisNCitNT0RVTEVfTElDRU5TRSgiR1BMIik7DQorTU9EVUxFX0FVVEhPUigi RXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2EubWlwdC5ydT4iKTsNCitNT0RVTEVfREVTQ1JJ UFRJT04oIlNpbXBsZSBjcnlwdG8gbG9hZCBiYWxhbmNlci4iKTsNCi== --=-6hXhvAybrmqrLjuzQFoX Content-Disposition: attachment; filename=connector.patch Content-Type: text/x-patch; name=connector.patch; charset=KOI8-R Content-Transfer-Encoding: base64 ZGlmZiAtTnJ1IC90bXAvZW1wdHkvS2NvbmZpZyBsaW51eC0yLjYvZHJpdmVycy9jb25uZWN0b3Iv S2NvbmZpZw0KLS0tIC90bXAvZW1wdHkvS2NvbmZpZwkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAw MDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2Nvbm5lY3Rvci9LY29uZmlnCTIwMDQt MDktMjYgMDA6MTI6NTcuMDAwMDAwMDAwICswNDAwDQpAQCAtMCwwICsxLDEzIEBADQorbWVudSAi Q29ubmVjdG9yIC0gdW5pZmllZCB1c2Vyc3BhY2UgPC0+IGtlcm5lbHNwYWNlIGxpbmtlciINCisN Citjb25maWcgQ09OTkVDVE9SDQorCXRyaXN0YXRlICJDb25uZWN0b3IgLSB1bmlmaWVkIHVzZXJz cGFjZSA8LT4ga2VybmVsc3BhY2UgbGlua2VyIg0KKwlkZXBlbmRzIG9uIE5FVA0KKwktLS1oZWxw LS0tDQorCSAgVGhpcyBpcyB1bmlmaWVkIHVzZXJzcGFjZSA8LT4ga2VybmVsc3BhY2UgY29ubmVj dG9yIHdvcmtpbmcgb24gdG9wDQorCSAgb2YgdGhlIG5ldGxpbmsgc29ja2V0IHByb3RvY29sLg0K Kw0KKwkgIENvbm5lY3RvciBzdXBwb3J0IGNhbiBhbHNvIGJlIGJ1aWx0IGFzIGEgbW9kdWxlLiAg SWYgc28sIHRoZSBtb2R1bGUNCisJICB3aWxsIGJlIGNhbGxlZCBjb25uZWN0b3Iua28uDQorDQor ZW5kbWVudQ0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvTWFrZWZpbGUgbGludXgtMi42L2RyaXZlcnMv 
Y29ubmVjdG9yL01ha2VmaWxlDQotLS0gL3RtcC9lbXB0eS9NYWtlZmlsZQkxOTcwLTAxLTAxIDAz OjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL2Nvbm5lY3Rvci9N YWtlZmlsZQkyMDA0LTA5LTI2IDAwOjEyOjU3LjAwMDAwMDAwMCArMDQwMA0KQEAgLTAsMCArMSwy IEBADQorb2JqLSQoQ09ORklHX0NPTk5FQ1RPUikJCSs9IGNuLm8NCitjbi1vYmpzCQk6PSBjbl9x dWV1ZS5vIGNvbm5lY3Rvci5vDQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9jbl9xdWV1ZS5jIGxpbnV4 LTIuNi9kcml2ZXJzL2Nvbm5lY3Rvci9jbl9xdWV1ZS5jDQotLS0gL3RtcC9lbXB0eS9jbl9xdWV1 ZS5jCTE5NzAtMDEtMDEgMDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGludXgtMi42L2Ry aXZlcnMvY29ubmVjdG9yL2NuX3F1ZXVlLmMJMjAwNC0wOS0yNiAyMTo1MDoxMi4wMDAwMDAwMDAg KzA0MDANCkBAIC0wLDAgKzEsMjE5IEBADQorLyoNCisgKiAJY25fcXVldWUuYw0KKyAqIA0KKyAq IDIwMDQgQ29weXJpZ2h0IChjKSBFdmdlbml5IFBvbHlha292IDxqb2hucG9sQDJrYS5taXB0LnJ1 Pg0KKyAqIEFsbCByaWdodHMgcmVzZXJ2ZWQuDQorICogDQorICogVGhpcyBwcm9ncmFtIGlzIGZy ZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkNCisgKiBp dCB1bmRlciB0aGUgdGVybXMgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGFzIHB1 Ymxpc2hlZCBieQ0KKyAqIHRoZSBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhlciB2ZXJz aW9uIDIgb2YgdGhlIExpY2Vuc2UsIG9yDQorICogKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXIg dmVyc2lvbi4NCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBkaXN0cmlidXRlZCBpbiB0aGUgaG9w ZSB0aGF0IGl0IHdpbGwgYmUgdXNlZnVsLA0KKyAqIGJ1dCBXSVRIT1VUIEFOWSBXQVJSQU5UWTsg d2l0aG91dCBldmVuIHRoZSBpbXBsaWVkIHdhcnJhbnR5IG9mDQorICogTUVSQ0hBTlRBQklMSVRZ IG9yIEZJVE5FU1MgRk9SIEEgUEFSVElDVUxBUiBQVVJQT1NFLiAgU2VlIHRoZQ0KKyAqIEdOVSBH ZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGZvciBtb3JlIGRldGFpbHMuDQorICoNCisgKiBZb3Ugc2hv dWxkIGhhdmUgcmVjZWl2ZWQgYSBjb3B5IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5z ZQ0KKyAqIGFsb25nIHdpdGggdGhpcyBwcm9ncmFtOyBpZiBub3QsIHdyaXRlIHRvIHRoZSBGcmVl IFNvZnR3YXJlDQorICogRm91bmRhdGlvbiwgSW5jLiwgNTkgVGVtcGxlIFBsYWNlLCBTdWl0ZSAz MzAsIEJvc3RvbiwgTUEgIDAyMTExLTEzMDcgIFVTQQ0KKyAqDQorICovDQorDQorI2luY2x1ZGUg PGxpbnV4L2tlcm5lbC5oPg0KKyNpbmNsdWRlIDxsaW51eC9tb2R1bGUuaD4NCisjaW5jbHVkZSA8 
bGludXgvbGlzdC5oPg0KKyNpbmNsdWRlIDxsaW51eC93b3JrcXVldWUuaD4NCisjaW5jbHVkZSA8 bGludXgvc3BpbmxvY2suaD4NCisjaW5jbHVkZSA8bGludXgvc2xhYi5oPg0KKyNpbmNsdWRlIDxs aW51eC9za2J1ZmYuaD4NCisjaW5jbHVkZSA8bGludXgvc3VzcGVuZC5oPg0KKw0KKyNpbmNsdWRl ICJjbl9xdWV1ZS5oIg0KKw0KK3N0YXRpYyB2b2lkIGNuX3F1ZXVlX3dyYXBwZXIodm9pZCAqZGF0 YSkNCit7DQorCXN0cnVjdCBjbl9jYWxsYmFja19lbnRyeSAqY2JxID0gKHN0cnVjdCBjbl9jYWxs YmFja19lbnRyeSAqKWRhdGE7DQorDQorCWF0b21pY19pbmMoJmNicS0+Y2ItPnJlZmNudCk7DQor CWNicS0+Y2ItPmNhbGxiYWNrKGNicS0+Y2ItPnByaXYpOw0KKwlhdG9taWNfZGVjKCZjYnEtPmNi LT5yZWZjbnQpOw0KKw0KKwljYnEtPmRlc3RydWN0X2RhdGEoY2JxLT5kZGF0YSk7DQorfQ0KKw0K K3N0YXRpYyBzdHJ1Y3QgY25fY2FsbGJhY2tfZW50cnkgKmNuX3F1ZXVlX2FsbG9jX2NhbGxiYWNr X2VudHJ5KHN0cnVjdA0KKwkJCQkJCQkgICAgICAgY25fY2FsbGJhY2sgKmNiKQ0KK3sNCisJc3Ry dWN0IGNuX2NhbGxiYWNrX2VudHJ5ICpjYnE7DQorDQorCWNicSA9IGttYWxsb2Moc2l6ZW9mKCpj YnEpLCBHRlBfS0VSTkVMKTsNCisJaWYgKCFjYnEpIHsNCisJCXByaW50ayhLRVJOX0VSUiAiRmFp bGVkIHRvIGNyZWF0ZSBuZXcgY2FsbGJhY2sgcXVldWUuXG4iKTsNCisJCXJldHVybiBOVUxMOw0K Kwl9DQorDQorCW1lbXNldChjYnEsIDAsIHNpemVvZigqY2JxKSk7DQorDQorCWNicS0+Y2IgPSBj YjsNCisNCisJSU5JVF9XT1JLKCZjYnEtPndvcmssICZjbl9xdWV1ZV93cmFwcGVyLCBjYnEpOw0K Kw0KKwlyZXR1cm4gY2JxOw0KK30NCisNCitzdGF0aWMgdm9pZCBjbl9xdWV1ZV9mcmVlX2NhbGxi YWNrKHN0cnVjdCBjbl9jYWxsYmFja19lbnRyeSAqY2JxKQ0KK3sNCisJY2FuY2VsX2RlbGF5ZWRf d29yaygmY2JxLT53b3JrKTsNCisNCisJd2hpbGUgKGF0b21pY19yZWFkKCZjYnEtPmNiLT5yZWZj bnQpKSB7DQorCQlwcmludGsoS0VSTl9JTkZPICJXYWl0aW5nIGZvciAlcyB0byBiZWNvbWUgZnJl ZTogcmVmY250PSVkLlxuIiwNCisJCSAgICAgICBjYnEtPnBkZXYtPm5hbWUsIGF0b21pY19yZWFk KCZjYnEtPmNiLT5yZWZjbnQpKTsNCisJCXNldF9jdXJyZW50X3N0YXRlKFRBU0tfSU5URVJSVVBU SUJMRSk7DQorCQlzY2hlZHVsZV90aW1lb3V0KEhaKTsNCisNCisJCWlmIChjdXJyZW50LT5mbGFn cyAmIFBGX0ZSRUVaRSkNCisJCQlyZWZyaWdlcmF0b3IoUEZfRlJFRVpFKTsNCisNCisJCWlmIChz aWduYWxfcGVuZGluZyhjdXJyZW50KSkNCisJCQlmbHVzaF9zaWduYWxzKGN1cnJlbnQpOw0KKwl9 DQorDQorCWtmcmVlKGNicSk7DQorfQ0KKw0KK2ludCBjbl9jYl9lcXVhbChzdHJ1Y3QgY2JfaWQg 
KmkxLCBzdHJ1Y3QgY2JfaWQgKmkyKQ0KK3sNCisJcmV0dXJuICgoaTEtPmlkeCA9PSBpMi0+aWR4 KSAmJiAoaTEtPnZhbCA9PSBpMi0+dmFsKSk7DQorfQ0KKw0KK2ludCBjbl9xdWV1ZV9hZGRfY2Fs bGJhY2soc3RydWN0IGNuX3F1ZXVlX2RldiAqZGV2LCBzdHJ1Y3QgY25fY2FsbGJhY2sgKmNiKQ0K K3sNCisJc3RydWN0IGNuX2NhbGxiYWNrX2VudHJ5ICpjYnEsICpuLCAqX19jYnE7DQorCWludCBm b3VuZCA9IDA7DQorDQorCWNicSA9IGNuX3F1ZXVlX2FsbG9jX2NhbGxiYWNrX2VudHJ5KGNiKTsN CisJaWYgKCFjYnEpDQorCQlyZXR1cm4gLUVOT01FTTsNCisNCisJYXRvbWljX2luYygmZGV2LT5y ZWZjbnQpOw0KKwljYnEtPnBkZXYgPSBkZXY7DQorDQorCXNwaW5fbG9jaygmZGV2LT5xdWV1ZV9s b2NrKTsNCisJbGlzdF9mb3JfZWFjaF9lbnRyeV9zYWZlKF9fY2JxLCBuLCAmZGV2LT5xdWV1ZV9s aXN0LCBjYWxsYmFja19lbnRyeSkgew0KKwkJaWYgKGNuX2NiX2VxdWFsKCZfX2NicS0+Y2ItPmlk LCAmY2ItPmlkKSkgew0KKwkJCWZvdW5kID0gMTsNCisJCQlicmVhazsNCisJCX0NCisJfQ0KKwlp ZiAoIWZvdW5kKSB7DQorCQlhdG9taWNfc2V0KCZjYnEtPmNiLT5yZWZjbnQsIDEpOw0KKwkJbGlz dF9hZGRfdGFpbCgmY2JxLT5jYWxsYmFja19lbnRyeSwgJmRldi0+cXVldWVfbGlzdCk7DQorCX0N CisJc3Bpbl91bmxvY2soJmRldi0+cXVldWVfbG9jayk7DQorDQorCWlmIChmb3VuZCkgew0KKwkJ YXRvbWljX2RlYygmZGV2LT5yZWZjbnQpOw0KKwkJYXRvbWljX3NldCgmY2JxLT5jYi0+cmVmY250 LCAwKTsNCisJCWNuX3F1ZXVlX2ZyZWVfY2FsbGJhY2soY2JxKTsNCisJCXJldHVybiAtRUlOVkFM Ow0KKwl9DQorDQorCWNicS0+bmxzID0gZGV2LT5ubHM7DQorCWNicS0+c2VxID0gMDsNCisJLy9j YnEtPmdyb3VwID0gKytkZXYtPm5ldGxpbmtfZ3JvdXBzOw0KKwljYnEtPmdyb3VwID0gY2JxLT5j Yi0+aWQuaWR4Ow0KKw0KKwlyZXR1cm4gMDsNCit9DQorDQordm9pZCBjbl9xdWV1ZV9kZWxfY2Fs bGJhY2soc3RydWN0IGNuX3F1ZXVlX2RldiAqZGV2LCBzdHJ1Y3QgY25fY2FsbGJhY2sgKmNiKQ0K K3sNCisJc3RydWN0IGNuX2NhbGxiYWNrX2VudHJ5ICpjYnEgPSBOVUxMLCAqbjsNCisJaW50IGZv dW5kID0gMDsNCisNCisJc3Bpbl9sb2NrKCZkZXYtPnF1ZXVlX2xvY2spOw0KKwlsaXN0X2Zvcl9l YWNoX2VudHJ5X3NhZmUoY2JxLCBuLCAmZGV2LT5xdWV1ZV9saXN0LCBjYWxsYmFja19lbnRyeSkg ew0KKwkJaWYgKGNuX2NiX2VxdWFsKCZjYnEtPmNiLT5pZCwgJmNiLT5pZCkpIHsNCisJCQlsaXN0 X2RlbCgmY2JxLT5jYWxsYmFja19lbnRyeSk7DQorCQkJZm91bmQgPSAxOw0KKwkJCWJyZWFrOw0K KwkJfQ0KKwl9DQorCXNwaW5fdW5sb2NrKCZkZXYtPnF1ZXVlX2xvY2spOw0KKw0KKwlpZiAoZm91 
bmQpIHsNCisJCWF0b21pY19kZWMoJmNicS0+Y2ItPnJlZmNudCk7DQorCQljbl9xdWV1ZV9mcmVl X2NhbGxiYWNrKGNicSk7DQorCQlhdG9taWNfZGVjKCZkZXYtPnJlZmNudCk7DQorCX0NCit9DQor DQorc3RydWN0IGNuX3F1ZXVlX2RldiAqY25fcXVldWVfYWxsb2NfZGV2KGNoYXIgKm5hbWUsIHN0 cnVjdCBzb2NrICpubHMpDQorew0KKwlzdHJ1Y3QgY25fcXVldWVfZGV2ICpkZXY7DQorDQorCWRl diA9IGttYWxsb2Moc2l6ZW9mKCpkZXYpLCBHRlBfS0VSTkVMKTsNCisJaWYgKCFkZXYpIHsNCisJ CXByaW50ayhLRVJOX0VSUiAiJXM6IEZhaWxlZCB0byBhbGxvY3RlIG5ldyBzdHJ1Y3QgY25fcXVl dWVfZGV2LlxuIiwNCisJCSAgICAgICBuYW1lKTsNCisJCXJldHVybiBOVUxMOw0KKwl9DQorDQor CW1lbXNldChkZXYsIDAsIHNpemVvZigqZGV2KSk7DQorDQorCXNucHJpbnRmKGRldi0+bmFtZSwg c2l6ZW9mKGRldi0+bmFtZSksICIlcyIsIG5hbWUpOw0KKw0KKwlhdG9taWNfc2V0KCZkZXYtPnJl ZmNudCwgMCk7DQorCUlOSVRfTElTVF9IRUFEKCZkZXYtPnF1ZXVlX2xpc3QpOw0KKwlzcGluX2xv Y2tfaW5pdCgmZGV2LT5xdWV1ZV9sb2NrKTsNCisNCisJZGV2LT5ubHMgPSBubHM7DQorCWRldi0+ bmV0bGlua19ncm91cHMgPSAwOw0KKw0KKwlkZXYtPmNuX3F1ZXVlID0gY3JlYXRlX3dvcmtxdWV1 ZShkZXYtPm5hbWUpOw0KKwlpZiAoIWRldi0+Y25fcXVldWUpIHsNCisJCXByaW50ayhLRVJOX0VS UiAiRmFpbGVkIHRvIGNyZWF0ZSAlcyBxdWV1ZS5cbiIsIGRldi0+bmFtZSk7DQorCQlrZnJlZShk ZXYpOw0KKwkJcmV0dXJuIE5VTEw7DQorCX0NCisNCisJcmV0dXJuIGRldjsNCit9DQorDQordm9p ZCBjbl9xdWV1ZV9mcmVlX2RldihzdHJ1Y3QgY25fcXVldWVfZGV2ICpkZXYpDQorew0KKwlzdHJ1 Y3QgY25fY2FsbGJhY2tfZW50cnkgKmNicSwgKm47DQorDQorCWZsdXNoX3dvcmtxdWV1ZShkZXYt PmNuX3F1ZXVlKTsNCisJZGVzdHJveV93b3JrcXVldWUoZGV2LT5jbl9xdWV1ZSk7DQorDQorCXNw aW5fbG9jaygmZGV2LT5xdWV1ZV9sb2NrKTsNCisJbGlzdF9mb3JfZWFjaF9lbnRyeV9zYWZlKGNi cSwgbiwgJmRldi0+cXVldWVfbGlzdCwgY2FsbGJhY2tfZW50cnkpIHsNCisJCWxpc3RfZGVsKCZj YnEtPmNhbGxiYWNrX2VudHJ5KTsNCisJCWF0b21pY19kZWMoJmNicS0+Y2ItPnJlZmNudCk7DQor CX0NCisJc3Bpbl91bmxvY2soJmRldi0+cXVldWVfbG9jayk7DQorDQorCXdoaWxlIChhdG9taWNf cmVhZCgmZGV2LT5yZWZjbnQpKSB7DQorCQlwcmludGsoS0VSTl9JTkZPICJXYWl0aW5nIGZvciAl cyB0byBiZWNvbWUgZnJlZTogcmVmY250PSVkLlxuIiwNCisJCSAgICAgICBkZXYtPm5hbWUsIGF0 b21pY19yZWFkKCZkZXYtPnJlZmNudCkpOw0KKwkJc2V0X2N1cnJlbnRfc3RhdGUoVEFTS19JTlRF 
UlJVUFRJQkxFKTsNCisJCXNjaGVkdWxlX3RpbWVvdXQoSFopOw0KKw0KKwkJaWYgKGN1cnJlbnQt PmZsYWdzICYgUEZfRlJFRVpFKQ0KKwkJCXJlZnJpZ2VyYXRvcihQRl9GUkVFWkUpOw0KKw0KKwkJ aWYgKHNpZ25hbF9wZW5kaW5nKGN1cnJlbnQpKQ0KKwkJCWZsdXNoX3NpZ25hbHMoY3VycmVudCk7 DQorCX0NCisNCisJbWVtc2V0KGRldiwgMCwgc2l6ZW9mKCpkZXYpKTsNCisJa2ZyZWUoZGV2KTsN CisJZGV2ID0gTlVMTDsNCit9DQorDQorRVhQT1JUX1NZTUJPTChjbl9xdWV1ZV9hZGRfY2FsbGJh Y2spOw0KK0VYUE9SVF9TWU1CT0woY25fcXVldWVfZGVsX2NhbGxiYWNrKTsNCitFWFBPUlRfU1lN Qk9MKGNuX3F1ZXVlX2FsbG9jX2Rldik7DQorRVhQT1JUX1NZTUJPTChjbl9xdWV1ZV9mcmVlX2Rl dik7DQpkaWZmIC1OcnUgL3RtcC9lbXB0eS9jbl9xdWV1ZS5oIGxpbnV4LTIuNi9kcml2ZXJzL2Nv bm5lY3Rvci9jbl9xdWV1ZS5oDQotLS0gL3RtcC9lbXB0eS9jbl9xdWV1ZS5oCTE5NzAtMDEtMDEg MDM6MDA6MDAuMDAwMDAwMDAwICswMzAwDQorKysgbGludXgtMi42L2RyaXZlcnMvY29ubmVjdG9y L2NuX3F1ZXVlLmgJMjAwNC0wOS0yNiAwMDoxNDoxNi4wMDAwMDAwMDAgKzA0MDANCkBAIC0wLDAg KzEsOTAgQEANCisvKg0KKyAqIAljbl9xdWV1ZS5oDQorICogDQorICogMjAwNCBDb3B5cmlnaHQg KGMpIEV2Z2VuaXkgUG9seWFrb3YgPGpvaG5wb2xAMmthLm1pcHQucnU+DQorICogQWxsIHJpZ2h0 cyByZXNlcnZlZC4NCisgKiANCisgKiBUaGlzIHByb2dyYW0gaXMgZnJlZSBzb2Z0d2FyZTsgeW91 IGNhbiByZWRpc3RyaWJ1dGUgaXQgYW5kL29yIG1vZGlmeQ0KKyAqIGl0IHVuZGVyIHRoZSB0ZXJt cyBvZiB0aGUgR05VIEdlbmVyYWwgUHVibGljIExpY2Vuc2UgYXMgcHVibGlzaGVkIGJ5DQorICog dGhlIEZyZWUgU29mdHdhcmUgRm91bmRhdGlvbjsgZWl0aGVyIHZlcnNpb24gMiBvZiB0aGUgTGlj ZW5zZSwgb3INCisgKiAoYXQgeW91ciBvcHRpb24pIGFueSBsYXRlciB2ZXJzaW9uLg0KKyAqDQor ICogVGhpcyBwcm9ncmFtIGlzIGRpc3RyaWJ1dGVkIGluIHRoZSBob3BlIHRoYXQgaXQgd2lsbCBi ZSB1c2VmdWwsDQorICogYnV0IFdJVEhPVVQgQU5ZIFdBUlJBTlRZOyB3aXRob3V0IGV2ZW4gdGhl IGltcGxpZWQgd2FycmFudHkgb2YNCisgKiBNRVJDSEFOVEFCSUxJVFkgb3IgRklUTkVTUyBGT1Ig QSBQQVJUSUNVTEFSIFBVUlBPU0UuICBTZWUgdGhlDQorICogR05VIEdlbmVyYWwgUHVibGljIExp Y2Vuc2UgZm9yIG1vcmUgZGV0YWlscy4NCisgKg0KKyAqIFlvdSBzaG91bGQgaGF2ZSByZWNlaXZl ZCBhIGNvcHkgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlDQorICogYWxvbmcgd2l0 aCB0aGlzIHByb2dyYW07IGlmIG5vdCwgd3JpdGUgdG8gdGhlIEZyZWUgU29mdHdhcmUNCisgKiBG 
b3VuZGF0aW9uLCBJbmMuLCA1OSBUZW1wbGUgUGxhY2UsIFN1aXRlIDMzMCwgQm9zdG9uLCBNQSAg MDIxMTEtMTMwNyAgVVNBDQorICovDQorDQorI2lmbmRlZiBfX0NOX1FVRVVFX0gNCisjZGVmaW5l IF9fQ05fUVVFVUVfSA0KKw0KKyNpbmNsdWRlIDxhc20vdHlwZXMuaD4NCisNCitzdHJ1Y3QgY2Jf aWQNCit7DQorCV9fdTMyCQkJaWR4Ow0KKwlfX3UzMgkJCXZhbDsNCit9Ow0KKw0KKyNpZmRlZiBf X0tFUk5FTF9fDQorDQorI2luY2x1ZGUgPGFzbS9hdG9taWMuaD4NCisNCisjaW5jbHVkZSA8bGlu dXgvbGlzdC5oPg0KKyNpbmNsdWRlIDxsaW51eC93b3JrcXVldWUuaD4NCisNCisjZGVmaW5lIENO X0NCUV9OQU1FTEVOCQkzMg0KKw0KK3N0cnVjdCBjbl9xdWV1ZV9kZXYNCit7DQorCWF0b21pY190 CQlyZWZjbnQ7DQorCXVuc2lnbmVkIGNoYXIJCW5hbWVbQ05fQ0JRX05BTUVMRU5dOw0KKw0KKwlz dHJ1Y3Qgd29ya3F1ZXVlX3N0cnVjdAkqY25fcXVldWU7DQorCQ0KKwlzdHJ1Y3QgbGlzdF9oZWFk IAlxdWV1ZV9saXN0Ow0KKwlzcGlubG9ja190IAkJcXVldWVfbG9jazsNCisNCisJaW50CQkJbmV0 bGlua19ncm91cHM7DQorCXN0cnVjdCBzb2NrCQkqbmxzOw0KK307DQorDQorc3RydWN0IGNuX2Nh bGxiYWNrDQorew0KKwl1bnNpZ25lZCBjaGFyCQluYW1lW0NOX0NCUV9OQU1FTEVOXTsNCisJDQor CXN0cnVjdCBjYl9pZAkJaWQ7DQorCXZvaWQJCQkoKiBjYWxsYmFjaykodm9pZCAqKTsNCisJdm9p ZAkJCSpwcml2Ow0KKwkNCisJYXRvbWljX3QJCXJlZmNudDsNCit9Ow0KKw0KK3N0cnVjdCBjbl9j YWxsYmFja19lbnRyeQ0KK3sNCisJc3RydWN0IGxpc3RfaGVhZAljYWxsYmFja19lbnRyeTsNCisJ c3RydWN0IGNuX2NhbGxiYWNrCSpjYjsNCisJc3RydWN0IHdvcmtfc3RydWN0CXdvcms7DQorCXN0 cnVjdCBjbl9xdWV1ZV9kZXYJKnBkZXY7DQorCQ0KKwl2b2lkCQkJKCogZGVzdHJ1Y3RfZGF0YSko dm9pZCAqKTsNCisJdm9pZAkJCSpkZGF0YTsNCisNCisJaW50CQkJc2VxLCBncm91cDsNCisJc3Ry dWN0IHNvY2sJCSpubHM7DQorfTsNCisNCitpbnQgY25fcXVldWVfYWRkX2NhbGxiYWNrKHN0cnVj dCBjbl9xdWV1ZV9kZXYgKmRldiwgc3RydWN0IGNuX2NhbGxiYWNrICpjYik7DQordm9pZCBjbl9x dWV1ZV9kZWxfY2FsbGJhY2soc3RydWN0IGNuX3F1ZXVlX2RldiAqZGV2LCBzdHJ1Y3QgY25fY2Fs bGJhY2sgKmNiKTsNCisNCitzdHJ1Y3QgY25fcXVldWVfZGV2ICpjbl9xdWV1ZV9hbGxvY19kZXYo Y2hhciAqbmFtZSwgc3RydWN0IHNvY2sgKik7DQordm9pZCBjbl9xdWV1ZV9mcmVlX2RldihzdHJ1 Y3QgY25fcXVldWVfZGV2ICpkZXYpOw0KKw0KK2ludCBjbl9jYl9lcXVhbChzdHJ1Y3QgY2JfaWQg Kiwgc3RydWN0IGNiX2lkICopOw0KKw0KKyNlbmRpZiAvKiBfX0tFUk5FTF9fICovDQorI2VuZGlm 
IC8qIF9fQ05fUVVFVUVfSCAqLw0KZGlmZiAtTnJ1IC90bXAvZW1wdHkvY29ubmVjdG9yLmMgbGlu dXgtMi42L2RyaXZlcnMvY29ubmVjdG9yL2Nvbm5lY3Rvci5jDQotLS0gL3RtcC9lbXB0eS9jb25u ZWN0b3IuYwkxOTcwLTAxLTAxIDAzOjAwOjAwLjAwMDAwMDAwMCArMDMwMA0KKysrIGxpbnV4LTIu Ni9kcml2ZXJzL2Nvbm5lY3Rvci9jb25uZWN0b3IuYwkyMDA0LTA5LTI2IDIxOjUyOjAyLjAwMDAw MDAwMCArMDQwMA0KQEAgLTAsMCArMSw0OTggQEANCisvKg0KKyAqIAljb25uZWN0b3IuYw0KKyAq IA0KKyAqIDIwMDQgQ29weXJpZ2h0IChjKSBFdmdlbml5IFBvbHlha292IDxqb2hucG9sQDJrYS5t aXB0LnJ1Pg0KKyAqIEFsbCByaWdodHMgcmVzZXJ2ZWQuDQorICogDQorICogVGhpcyBwcm9ncmFt IGlzIGZyZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkN CisgKiBpdCB1bmRlciB0aGUgdGVybXMgb2YgdGhlIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNl IGFzIHB1Ymxpc2hlZCBieQ0KKyAqIHRoZSBGcmVlIFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhl ciB2ZXJzaW9uIDIgb2YgdGhlIExpY2Vuc2UsIG9yDQorICogKGF0IHlvdXIgb3B0aW9uKSBhbnkg bGF0ZXIgdmVyc2lvbi4NCisgKg0KKyAqIFRoaXMgcHJvZ3JhbSBpcyBkaXN0cmlidXRlZCBpbiB0 aGUgaG9wZSB0aGF0IGl0IHdpbGwgYmUgdXNlZnVsLA0KKyAqIGJ1dCBXSVRIT1VUIEFOWSBXQVJS QU5UWTsgd2l0aG91dCBldmVuIHRoZSBpbXBsaWVkIHdhcnJhbnR5IG9mDQorICogTUVSQ0hBTlRB QklMSVRZIG9yIEZJVE5FU1MgRk9SIEEgUEFSVElDVUxBUiBQVVJQT1NFLiAgU2VlIHRoZQ0KKyAq IEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGZvciBtb3JlIGRldGFpbHMuDQorICoNCisgKiBZ b3Ugc2hvdWxkIGhhdmUgcmVjZWl2ZWQgYSBjb3B5IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMg TGljZW5zZQ0KKyAqIGFsb25nIHdpdGggdGhpcyBwcm9ncmFtOyBpZiBub3QsIHdyaXRlIHRvIHRo ZSBGcmVlIFNvZnR3YXJlDQorICogRm91bmRhdGlvbiwgSW5jLiwgNTkgVGVtcGxlIFBsYWNlLCBT dWl0ZSAzMzAsIEJvc3RvbiwgTUEgIDAyMTExLTEzMDcgIFVTQQ0KKyAqLw0KKw0KKyNpbmNsdWRl IDxsaW51eC9rZXJuZWwuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlLmg+DQorI2luY2x1ZGUg PGxpbnV4L2xpc3QuaD4NCisjaW5jbHVkZSA8bGludXgvc2tidWZmLmg+DQorI2luY2x1ZGUgPGxp bnV4L25ldGxpbmsuaD4NCisjaW5jbHVkZSA8bGludXgvbW9kdWxlcGFyYW0uaD4NCisNCisjaW5j bHVkZSA8bmV0L3NvY2suaD4NCisNCisjaW5jbHVkZSAiLi4vY29ubmVjdG9yL2Nvbm5lY3Rvci5o Ig0KKyNpbmNsdWRlICIuLi9jb25uZWN0b3IvY25fcXVldWUuaCINCisNCitNT0RVTEVfTElDRU5T 
RSgiR1BMIik7DQorTU9EVUxFX0FVVEhPUigiRXZnZW5peSBQb2x5YWtvdiA8am9obnBvbEAya2Eu bWlwdC5ydT4iKTsNCitNT0RVTEVfREVTQ1JJUFRJT04oIkdlbmVyaWMgdXNlcnNwYWNlIDwtPiBr ZXJuZWxzcGFjZSBjb25uZWN0b3IuIik7DQorDQorc3RhdGljIGludCB1bml0ID0gTkVUTElOS19O RkxPRzsNCitzdGF0aWMgdTMyIGNuX2lkeCA9IC0xOw0KK3N0YXRpYyB1MzIgY25fdmFsID0gLTE7 DQorDQorbW9kdWxlX3BhcmFtKHVuaXQsIGludCwgMCk7DQorbW9kdWxlX3BhcmFtKGNuX2lkeCwg dWludCwgMCk7DQorbW9kdWxlX3BhcmFtKGNuX3ZhbCwgdWludCwgMCk7DQorDQorc3BpbmxvY2tf dCBub3RpZnlfbG9jayA9IFNQSU5fTE9DS19VTkxPQ0tFRDsNCitzdGF0aWMgTElTVF9IRUFEKG5v dGlmeV9saXN0KTsNCisNCitzdGF0aWMgc3RydWN0IGNuX2RldiBjZGV2Ow0KKw0KKy8qDQorICog bXNnLT5zZXEgYW5kIG1zZy0+YWNrIGFyZSB1c2VkIHRvIGRldGVybWluZSBtZXNzYWdlIGdlbmVh bG9neS4NCisgKiBXaGVuIHNvbWVvbmUgc2VuZHMgbWVzc2FnZSBpdCBwdXRzIHRoZXJlIGxvY2Fs bHkgdW5pcXVlIHNlcXVlbmNlIA0KKyAqIGFuZCByYW5kb20gYWNrbm93bGVkZ2UgbnVtYmVycy4N CisgKiBTZXF1ZW5jZSBudW1iZXIgbWF5IGJlIGNvcGllZCBpbnRvIG5sbXNnaGRyLT5ubG1zZ19z ZXEgdG9vLg0KKyAqDQorICogU2VxdWVuY2UgbnVtYmVyIGlzIGluY3JlbWVudGVkIHdpdGggZWFj aCBtZXNzYWdlIHRvIGJlIHNlbnQuDQorICoNCisgKiBJZiB3ZSBleHBlY3QgcmVwbHkgdG8gb3Vy IG1lc3NhZ2UsIA0KKyAqIHRoZW4gc2VxdWVuY2UgbnVtYmVyIGluIHJlY2VpdmVkIG1lc3NhZ2Ug TVVTVCBiZSB0aGUgc2FtZSBhcyBpbiBvcmlnaW5hbCBtZXNzYWdlLA0KKyAqIGFuZCBhY2tub3ds ZWRnZSBudW1iZXIgTVVTVCBiZSB0aGUgc2FtZSArIDEuDQorICoNCisgKiBJZiB3ZSByZWNlaXZl IG1lc3NhZ2UgYW5kIGl0J3Mgc2VxdWVuY2UgbnVtYmVyIGlzIG5vdCBlcXVhbCB0byBvbmUgd2Ug YXJlIGV4cGVjdGluZywgDQorICogdGhlbiBpdCBpcyBuZXcgbWVzc2FnZS4NCisgKiBJZiB3ZSBy ZWNlaXZlIG1lc3NhZ2UgYW5kIGl0J3Mgc2VxdWVuY2UgbnVtYmVyIGlzIHRoZSBzYW1lIGFzIG9u ZSB3ZSBhcmUgZXhwZWN0aW5nLA0KKyAqIGJ1dCBpdCdzIGFja25vd2xlZGdlIGlzIG5vdCBlcXVh bCBhY2tub3dsZWRnZSBudW1iZXIgaW4gb3JpZ2luYWwgbWVzc2FnZSArIDEsDQorICogdGhlbiBp dCBpcyBuZXcgbWVzc2FnZS4NCisgKg0KKyAqLw0KK3ZvaWQgY25fbmV0bGlua19zZW5kKHN0cnVj dCBjbl9tc2cgKm1zZywgdTMyIF9fZ3JvdXBzKQ0KK3sNCisJc3RydWN0IGNuX2NhbGxiYWNrX2Vu dHJ5ICpuLCAqX19jYnE7DQorCXVuc2lnbmVkIGludCBzaXplOw0KKwlzdHJ1Y3Qgc2tfYnVmZiAq 
c2tiOw0KKwlzdHJ1Y3Qgbmxtc2doZHIgKm5saDsNCisJc3RydWN0IGNuX21zZyAqZGF0YTsNCisJ c3RydWN0IGNuX2RldiAqZGV2ID0gJmNkZXY7DQorCXUzMiBncm91cHMgPSAwOw0KKwlpbnQgZm91 bmQgPSAwOw0KKw0KKwlpZiAoIV9fZ3JvdXBzKQ0KKwl7DQorCQlzcGluX2xvY2soJmRldi0+Y2Jk ZXYtPnF1ZXVlX2xvY2spOw0KKwkJbGlzdF9mb3JfZWFjaF9lbnRyeV9zYWZlKF9fY2JxLCBuLCAm ZGV2LT5jYmRldi0+cXVldWVfbGlzdCwgY2FsbGJhY2tfZW50cnkpIHsNCisJCQlpZiAoY25fY2Jf ZXF1YWwoJl9fY2JxLT5jYi0+aWQsICZtc2ctPmlkKSkgew0KKwkJCQlmb3VuZCA9IDE7DQorCQkJ CWdyb3VwcyA9IF9fY2JxLT5ncm91cDsNCisJCQl9DQorCQl9DQorCQlzcGluX3VubG9jaygmZGV2 LT5jYmRldi0+cXVldWVfbG9jayk7DQorDQorCQlpZiAoIWZvdW5kKSB7DQorCQkJcHJpbnRrKEtF Uk5fRVJSICJGYWlsZWQgdG8gZmluZCBtdWx0aWNhc3QgbmV0bGluayBncm91cCBmb3IgY2FsbGJh Y2tbMHgleC4weCV4XS4gc2VxPSV1XG4iLA0KKwkJCSAgICAgICBtc2ctPmlkLmlkeCwgbXNnLT5p ZC52YWwsIG1zZy0+c2VxKTsNCisJCQlyZXR1cm47DQorCQl9DQorCX0NCisJZWxzZQ0KKwkJZ3Jv dXBzID0gX19ncm91cHM7DQorDQorCXNpemUgPSBOTE1TR19TUEFDRShzaXplb2YoKm1zZykgKyBt c2ctPmxlbik7DQorDQorCXNrYiA9IGFsbG9jX3NrYihzaXplLCBHRlBfQVRPTUlDKTsNCisJaWYg KCFza2IpIHsNCisJCXByaW50ayhLRVJOX0VSUiAiRmFpbGVkIHRvIGFsbG9jYXRlIG5ldyBza2Ig d2l0aCBzaXplPSV1LlxuIiwgc2l6ZSk7DQorCQlyZXR1cm47DQorCX0NCisNCisJbmxoID0gTkxN U0dfUFVUKHNrYiwgMCwgbXNnLT5zZXEsIE5MTVNHX0RPTkUsIHNpemUgLSBzaXplb2YoKm5saCkp Ow0KKw0KKwlkYXRhID0gKHN0cnVjdCBjbl9tc2cgKilOTE1TR19EQVRBKG5saCk7DQorDQorCW1l bWNweShkYXRhLCBtc2csIHNpemVvZigqZGF0YSkgKyBtc2ctPmxlbik7DQorI2lmIDANCisJcHJp bnRrKCIlczogbGVuPSV1LCBzZXE9JXUsIGFjaz0ldSwgZ3JvdXA9JXUuXG4iLA0KKwkgICAgICAg X19mdW5jX18sIG1zZy0+bGVuLCBtc2ctPnNlcSwgbXNnLT5hY2ssIGdyb3Vwcyk7DQorI2VuZGlm DQorCU5FVExJTktfQ0Ioc2tiKS5kc3RfZ3JvdXBzID0gZ3JvdXBzOw0KKwluZXRsaW5rX2Jyb2Fk Y2FzdChkZXYtPm5scywgc2tiLCAwLCBncm91cHMsIEdGUF9BVE9NSUMpOw0KKw0KKwlyZXR1cm47 DQorDQorICAgICAgbmxtc2dfZmFpbHVyZToNCisJcHJpbnRrKEtFUk5fRVJSICJGYWlsZWQgdG8g c2VuZCAldS4ldVxuIiwgbXNnLT5zZXEsIG1zZy0+YWNrKTsNCisJa2ZyZWVfc2tiKHNrYik7DQor CXJldHVybjsNCit9DQorDQorc3RhdGljIGludCBjbl9jYWxsX2NhbGxiYWNrKHN0cnVjdCBjbl9t 
c2cgKm1zZywgdm9pZCAoKmRlc3RydWN0X2RhdGEpICh2b2lkICopLCB2b2lkICpkYXRhKQ0KK3sN CisJc3RydWN0IGNuX2NhbGxiYWNrX2VudHJ5ICpuLCAqX19jYnE7DQorCXN0cnVjdCBjbl9kZXYg KmRldiA9ICZjZGV2Ow0KKwlpbnQgZm91bmQgPSAwOw0KKw0KKwlzcGluX2xvY2soJmRldi0+Y2Jk ZXYtPnF1ZXVlX2xvY2spOw0KKwlsaXN0X2Zvcl9lYWNoX2VudHJ5X3NhZmUoX19jYnEsIG4sICZk ZXYtPmNiZGV2LT5xdWV1ZV9saXN0LCBjYWxsYmFja19lbnRyeSkgew0KKwkJaWYgKGNuX2NiX2Vx dWFsKCZfX2NicS0+Y2ItPmlkLCAmbXNnLT5pZCkpIHsNCisJCQlfX2NicS0+Y2ItPnByaXYgPSBt c2c7DQorDQorCQkJX19jYnEtPmRkYXRhID0gZGF0YTsNCisJCQlfX2NicS0+ZGVzdHJ1Y3RfZGF0 YSA9IGRlc3RydWN0X2RhdGE7DQorDQorCQkJcXVldWVfd29yayhkZXYtPmNiZGV2LT5jbl9xdWV1 ZSwgJl9fY2JxLT53b3JrKTsNCisJCQlmb3VuZCA9IDE7DQorCQkJYnJlYWs7DQorCQl9DQorCX0N CisJc3Bpbl91bmxvY2soJmRldi0+Y2JkZXYtPnF1ZXVlX2xvY2spOw0KKw0KKwlyZXR1cm4gZm91 bmQ7DQorfQ0KKw0KK3N0YXRpYyBpbnQgX19jbl9yeF9za2Ioc3RydWN0IHNrX2J1ZmYgKnNrYiwg c3RydWN0IG5sbXNnaGRyICpubGgpDQorew0KKwl1MzIgcGlkLCB1aWQsIHNlcSwgZ3JvdXA7DQor CXN0cnVjdCBjbl9tc2cgKm1zZzsNCisNCisJcGlkID0gTkVUTElOS19DUkVEUyhza2IpLT5waWQ7 DQorCXVpZCA9IE5FVExJTktfQ1JFRFMoc2tiKS0+dWlkOw0KKwlzZXEgPSBubGgtPm5sbXNnX3Nl cTsNCisJZ3JvdXAgPSBORVRMSU5LX0NCKChza2IpKS5ncm91cHM7DQorCW1zZyA9IChzdHJ1Y3Qg Y25fbXNnICopTkxNU0dfREFUQShubGgpOw0KKw0KKwlpZiAobXNnLT5sZW4gIT0gbmxoLT5ubG1z Z19sZW4gLSBzaXplb2YoKm1zZykgLSBzaXplb2YoKm5saCkpIHsNCisJCXByaW50ayhLRVJOX0VS UiAic2tiIGRvZXMgbm90IGhhdmUgZW5vdWdoIGxlbmd0aDogIg0KKwkJCQkicmVxdWVzdGVkIG1z Zy0+bGVuPSV1WyV1XSwgbmxoLT5ubG1zZ19sZW49JXVbJXVdLCBza2ItPmxlbj0ldVttdXN0IGJl ICV1XS5cbiIsIA0KKwkJCQltc2ctPmxlbiwgTkxNU0dfU1BBQ0UobXNnLT5sZW4pLCANCisJCQkJ bmxoLT5ubG1zZ19sZW4sIG5saC0+bmxtc2dfbGVuIC0gc2l6ZW9mKCpubGgpLA0KKwkJCQlza2It PmxlbiwgbXNnLT5sZW4gKyBzaXplb2YoKm1zZykpOw0KKwkJcmV0dXJuIC1FSU5WQUw7DQorCX0N CisjaWYgMA0KKwlwcmludGsoS0VSTl9JTkZPICJwaWQ9JXUsIHVpZD0ldSwgc2VxPSV1LCBncm91 cD0ldS5cbiIsDQorCSAgICAgICBwaWQsIHVpZCwgc2VxLCBncm91cCk7DQorI2VuZGlmDQorCXJl dHVybiBjbl9jYWxsX2NhbGxiYWNrKG1zZywgKHZvaWQgKCopKHZvaWQgKikpa2ZyZWVfc2tiLCBz 
a2IpOw0KK30NCisNCitzdGF0aWMgdm9pZCBjbl9yeF9za2Ioc3RydWN0IHNrX2J1ZmYgKl9fc2ti KQ0KK3sNCisJc3RydWN0IG5sbXNnaGRyICpubGg7DQorCXUzMiBsZW47DQorCWludCBlcnI7DQor CXN0cnVjdCBza19idWZmICpza2I7DQorDQorCXNrYiA9IHNrYl9nZXQoX19za2IpOw0KKwlpZiAo IXNrYikgew0KKwkJcHJpbnRrKEtFUk5fRVJSICJGYWlsZWQgdG8gcmVmZXJlbmNlIGFuIHNrYi5c biIpOw0KKwkJcmV0dXJuOw0KKwl9DQorI2lmIDANCisJcHJpbnRrKEtFUk5fSU5GTw0KKwkgICAg ICAgInNrYjogbGVuPSV1LCBkYXRhX2xlbj0ldSwgdHJ1ZXNpemU9JXUsIHByb3RvPSV1LCBjbG9u ZWQ9JWQsIHNoYXJlZD0lZC5cbiIsDQorCSAgICAgICBza2ItPmxlbiwgc2tiLT5kYXRhX2xlbiwg c2tiLT50cnVlc2l6ZSwgc2tiLT5wcm90b2NvbCwNCisJICAgICAgIHNrYl9jbG9uZWQoc2tiKSwg c2tiX3NoYXJlZChza2IpKTsNCisjZW5kaWYNCisJd2hpbGUgKHNrYi0+bGVuID49IE5MTVNHX1NQ QUNFKDApKSB7DQorCQlubGggPSAoc3RydWN0IG5sbXNnaGRyICopc2tiLT5kYXRhOw0KKwkJaWYg KG5saC0+bmxtc2dfbGVuIDwgc2l6ZW9mKHN0cnVjdCBjbl9tc2cpIHx8DQorCQkgICAgc2tiLT5s ZW4gPCBubGgtPm5sbXNnX2xlbiB8fA0KKwkJICAgIG5saC0+bmxtc2dfbGVuID4gQ09OTkVDVE9S X01BWF9NU0dfU0laRSkgew0KKwkJCXByaW50ayhLRVJOX0lORk8gIm5sbXNnX2xlbj0ldSwgc2l6 ZW9mKCpubGgpPSV1XG4iLA0KKwkJCSAgICAgICBubGgtPm5sbXNnX2xlbiwgc2l6ZW9mKCpubGgp KTsNCisJCQlicmVhazsNCisJCX0NCisNCisJCWxlbiA9IE5MTVNHX0FMSUdOKG5saC0+bmxtc2df bGVuKTsNCisJCWlmIChsZW4gPiBza2ItPmxlbikNCisJCQlsZW4gPSBza2ItPmxlbjsNCisNCisJ CWVyciA9IF9fY25fcnhfc2tiKHNrYiwgbmxoKTsNCisJCWlmIChlcnIpIHsNCisjaWYgMA0KKwkJ CWlmIChlcnIgPCAwICYmIChubGgtPm5sbXNnX2ZsYWdzICYgTkxNX0ZfQUNLKSkNCisJCQkJbmV0 bGlua19hY2soc2tiLCBubGgsIC1lcnIpOw0KKyNlbmRpZg0KKwkJCWtmcmVlX3NrYihza2IpOw0K KwkJCWJyZWFrOw0KKwkJfSBlbHNlIHsNCisjaWYgMA0KKwkJCWlmIChubGgtPm5sbXNnX2ZsYWdz ICYgTkxNX0ZfQUNLKQ0KKwkJCQluZXRsaW5rX2Fjayhza2IsIG5saCwgMCk7DQorI2VuZGlmDQor CQkJa2ZyZWVfc2tiKHNrYik7DQorCQkJYnJlYWs7DQorCQl9DQorCQlza2JfcHVsbChza2IsIGxl bik7DQorCX0NCit9DQorDQorc3RhdGljIHZvaWQgY25faW5wdXQoc3RydWN0IHNvY2sgKnNrLCBp bnQgbGVuKQ0KK3sNCisJc3RydWN0IHNrX2J1ZmYgKnNrYjsNCisNCisJd2hpbGUgKChza2IgPSBz a2JfZGVxdWV1ZSgmc2stPnNrX3JlY2VpdmVfcXVldWUpKSAhPSBOVUxMKQ0KKwkJY25fcnhfc2ti 
KHNrYik7DQorfQ0KKw0KK3N0YXRpYyB2b2lkIGNuX25vdGlmeShzdHJ1Y3QgY2JfaWQgKmlkLCB1 MzIgbm90aWZ5X2V2ZW50KQ0KK3sNCisJc3RydWN0IGNuX2N0bF9lbnRyeSAqZW50Ow0KKw0KKwlz cGluX2xvY2soJm5vdGlmeV9sb2NrKTsNCisJbGlzdF9mb3JfZWFjaF9lbnRyeShlbnQsICZub3Rp ZnlfbGlzdCwgbm90aWZ5X2VudHJ5KSB7DQorCQlpbnQgaTsNCisJCXN0cnVjdCBjbl9ub3RpZnlf cmVxICpyZXE7DQorCQlzdHJ1Y3QgY25fY3RsX21zZyAqY3RsID0gZW50LT5tc2c7DQorCQlpbnQg YSwgYjsNCisNCisJCWEgPSBiID0gMDsNCisJCQ0KKwkJcmVxID0gKHN0cnVjdCBjbl9ub3RpZnlf cmVxICopY3RsLT5kYXRhOw0KKwkJZm9yIChpPTA7IGk8Y3RsLT5pZHhfbm90aWZ5X251bTsgKytp LCArK3JlcSkgew0KKwkJCWlmIChpZC0+aWR4ID49IHJlcS0+Zmlyc3QgJiYgaWQtPmlkeCA8IHJl cS0+Zmlyc3QgKyByZXEtPnJhbmdlKSB7DQorCQkJCWEgPSAxOw0KKwkJCQlicmVhazsNCisJCQl9 DQorCQl9DQorCQkNCisJCWZvciAoaT0wOyBpPGN0bC0+dmFsX25vdGlmeV9udW07ICsraSwgKyty ZXEpIHsNCisJCQlpZiAoaWQtPnZhbCA+PSByZXEtPmZpcnN0ICYmIGlkLT52YWwgPCByZXEtPmZp cnN0ICsgcmVxLT5yYW5nZSkgew0KKwkJCQliID0gMTsNCisJCQkJYnJlYWs7DQorCQkJfQ0KKwkJ fQ0KKw0KKwkJaWYgKGEgJiYgYikgew0KKwkJCXN0cnVjdCBjbl9tc2cgbTsNCisJCQkNCisJCQlw cmludGsoS0VSTl9JTkZPICJOb3RpZnlpbmcgZ3JvdXAgJXggd2l0aCBldmVudCAldSBhYm91dCAl eC4leC5cbiIsIA0KKwkJCQkJY3RsLT5ncm91cCwgbm90aWZ5X2V2ZW50LCANCisJCQkJCWlkLT5p ZHgsIGlkLT52YWwpOw0KKw0KKwkJCW1lbXNldCgmbSwgMCwgc2l6ZW9mKG0pKTsNCisJCQltLmFj ayA9IG5vdGlmeV9ldmVudDsNCisNCisJCQltZW1jcHkoJm0uaWQsIGlkLCBzaXplb2YobS5pZCkp Ow0KKwkJCWNuX25ldGxpbmtfc2VuZCgmbSwgY3RsLT5ncm91cCk7DQorCQl9DQorCX0NCisJc3Bp bl91bmxvY2soJm5vdGlmeV9sb2NrKTsNCit9DQorDQoraW50IGNuX2FkZF9jYWxsYmFjayhzdHJ1 Y3QgY2JfaWQgKmlkLCBjaGFyICpuYW1lLCB2b2lkICgqY2FsbGJhY2spICh2b2lkICopKQ0KK3sN CisJaW50IGVycjsNCisJc3RydWN0IGNuX2RldiAqZGV2ID0gJmNkZXY7DQorCXN0cnVjdCBjbl9j YWxsYmFjayAqY2I7DQorDQorCWNiID0ga21hbGxvYyhzaXplb2YoKmNiKSwgR0ZQX0tFUk5FTCk7 DQorCWlmICghY2IpIHsNCisJCXByaW50ayhLRVJOX0lORk8gIiVzOiBGYWlsZWQgdG8gYWxsb2Nh dGUgbmV3IHN0cnVjdCBjbl9jYWxsYmFjay5cbiIsDQorCQkgICAgICAgZGV2LT5jYmRldi0+bmFt ZSk7DQorCQlyZXR1cm4gLUVOT01FTTsNCisJfQ0KKw0KKwltZW1zZXQoY2IsIDAsIHNpemVvZigq 
Y2IpKTsNCisNCisJc25wcmludGYoY2ItPm5hbWUsIHNpemVvZihjYi0+bmFtZSksICIlcyIsIG5h bWUpOw0KKw0KKwltZW1jcHkoJmNiLT5pZCwgaWQsIHNpemVvZihjYi0+aWQpKTsNCisJY2ItPmNh bGxiYWNrID0gY2FsbGJhY2s7DQorDQorCWF0b21pY19zZXQoJmNiLT5yZWZjbnQsIDApOw0KKw0K KwllcnIgPSBjbl9xdWV1ZV9hZGRfY2FsbGJhY2soZGV2LT5jYmRldiwgY2IpOw0KKwlpZiAoZXJy KSB7DQorCQlrZnJlZShjYik7DQorCQlyZXR1cm4gZXJyOw0KKwl9DQorCQkJDQorCWNuX25vdGlm eShpZCwgMCk7DQorDQorCXJldHVybiAwOw0KK30NCisNCit2b2lkIGNuX2RlbF9jYWxsYmFjayhz dHJ1Y3QgY2JfaWQgKmlkKQ0KK3sNCisJc3RydWN0IGNuX2RldiAqZGV2ID0gJmNkZXY7DQorCXN0 cnVjdCBjbl9jYWxsYmFja19lbnRyeSAqbiwgKl9fY2JxOw0KKw0KKwlsaXN0X2Zvcl9lYWNoX2Vu dHJ5X3NhZmUoX19jYnEsIG4sICZkZXYtPmNiZGV2LT5xdWV1ZV9saXN0LCBjYWxsYmFja19lbnRy eSkgew0KKwkJaWYgKGNuX2NiX2VxdWFsKCZfX2NicS0+Y2ItPmlkLCBpZCkpIHsNCisJCQljbl9x dWV1ZV9kZWxfY2FsbGJhY2soZGV2LT5jYmRldiwgX19jYnEtPmNiKTsNCisJCQljbl9ub3RpZnko aWQsIDEpOw0KKwkJCWJyZWFrOw0KKwkJfQ0KKwl9DQorfQ0KKw0KK3N0YXRpYyBpbnQgY25fY3Rs X21zZ19lcXVhbHMoc3RydWN0IGNuX2N0bF9tc2cgKm0xLCBzdHJ1Y3QgY25fY3RsX21zZyAqbTIp DQorew0KKwlpbnQgaTsNCisJc3RydWN0IGNuX25vdGlmeV9yZXEgKnJlcTEsICpyZXEyOw0KKw0K KwlpZiAobTEtPmlkeF9ub3RpZnlfbnVtICE9IG0yLT5pZHhfbm90aWZ5X251bSkNCisJCXJldHVy biAwOw0KKwkNCisJaWYgKG0xLT52YWxfbm90aWZ5X251bSAhPSBtMi0+dmFsX25vdGlmeV9udW0p DQorCQlyZXR1cm4gMDsNCisJDQorCWlmIChtMS0+bGVuICE9IG0yLT5sZW4pDQorCQlyZXR1cm4g MDsNCisNCisJaWYgKChtMS0+aWR4X25vdGlmeV9udW0gKyBtMS0+dmFsX25vdGlmeV9udW0pKnNp emVvZigqcmVxMSkgIT0gbTEtPmxlbikgew0KKwkJcHJpbnRrKEtFUk5fRVJSICJOb3RpZnkgZW50 cnlbaWR4X251bT0leCwgdmFsX251bT0leCwgbGVuPSV1XSBjb250YWlucyBnYXJiYWdlLiBSZW1v dmluZy5cbiIsIA0KKwkJCQltMS0+aWR4X25vdGlmeV9udW0sIG0xLT52YWxfbm90aWZ5X251bSwg bTEtPmxlbik7DQorCQlyZXR1cm4gMTsNCisJfQ0KKw0KKwlyZXExID0gKHN0cnVjdCBjbl9ub3Rp ZnlfcmVxICopbTEtPmRhdGE7DQorCXJlcTIgPSAoc3RydWN0IGNuX25vdGlmeV9yZXEgKiltMi0+ ZGF0YTsNCisJDQorCWZvciAoaT0wOyBpPG0xLT5pZHhfbm90aWZ5X251bTsgKytpKSB7DQorCQlp ZiAobWVtY21wKHJlcTEsIHJlcTIsIHNpemVvZigqcmVxMSkpKQ0KKwkJCXJldHVybiAwOw0KKw0K 
KwkJcmVxMSsrOw0KKwkJcmVxMisrOw0KKwl9DQorDQorCWZvciAoaT0wOyBpPG0xLT52YWxfbm90 aWZ5X251bTsgKytpKSB7DQorCQlpZiAobWVtY21wKHJlcTEsIHJlcTIsIHNpemVvZigqcmVxMSkp KQ0KKwkJCXJldHVybiAwOw0KKw0KKwkJcmVxMSsrOw0KKwkJcmVxMisrOw0KKwl9DQorDQorCXJl dHVybiAxOw0KK30NCisNCitzdGF0aWMgdm9pZCBjbl9jYWxsYmFjayh2b2lkICogZGF0YSkNCit7 DQorCXN0cnVjdCBjbl9tc2cgKm1zZyA9IChzdHJ1Y3QgY25fbXNnICopZGF0YTsNCisJc3RydWN0 IGNuX2N0bF9tc2cgKmN0bDsNCisJc3RydWN0IGNuX2N0bF9lbnRyeSAqZW50Ow0KKwl1MzIgc2l6 ZTsNCisgDQorCWlmIChtc2ctPmxlbiA8IHNpemVvZigqY3RsKSkgew0KKwkJcHJpbnRrKEtFUk5f RVJSICJXcm9uZyBjb25uZWN0b3IgcmVxdWVzdCBzaXplICV1LCBtdXN0IGJlID49ICV1LlxuIiwg DQorCQkJCW1zZy0+bGVuLCBzaXplb2YoKmN0bCkpOw0KKwkJcmV0dXJuOw0KKwl9DQorCQ0KKwlj dGwgPSAoc3RydWN0IGNuX2N0bF9tc2cgKiltc2ctPmRhdGE7DQorDQorCXNpemUgPSBzaXplb2Yo KmN0bCkgKyAoY3RsLT5pZHhfbm90aWZ5X251bSArIGN0bC0+dmFsX25vdGlmeV9udW0pKnNpemVv ZihzdHJ1Y3QgY25fbm90aWZ5X3JlcSk7DQorDQorCWlmIChtc2ctPmxlbiAhPSBzaXplKSB7DQor CQlwcmludGsoS0VSTl9FUlIgIldyb25nIGNvbm5lY3RvciByZXF1ZXN0IHNpemUgJXUsIG11c3Qg YmUgPT0gJXUuXG4iLCANCisJCQkJbXNnLT5sZW4sIHNpemUpOw0KKwkJcmV0dXJuOw0KKwl9DQor DQorCWlmIChjdGwtPmxlbiArIHNpemVvZigqY3RsKSAhPSBtc2ctPmxlbikgew0KKwkJcHJpbnRr KEtFUk5fRVJSICJXcm9uZyBtZXNzYWdlOiBtc2ctPmxlbj0ldSBtdXN0IGJlIGVxdWFsIHRvIGlu bmVyX2xlbj0ldSBbKyV1XS5cbiIsIA0KKwkJCQltc2ctPmxlbiwgY3RsLT5sZW4sIHNpemVvZigq Y3RsKSk7DQorCQlyZXR1cm47DQorCX0NCisNCisJLyoNCisJICogUmVtb3ZlIG5vdGlmaWNhdGlv bi4NCisJICovDQorCWlmIChjdGwtPmdyb3VwID09IDApIHsNCisJCXN0cnVjdCBjbl9jdGxfZW50 cnkgKm47DQorCQkNCisJCXNwaW5fbG9jaygmbm90aWZ5X2xvY2spOw0KKwkJbGlzdF9mb3JfZWFj aF9lbnRyeV9zYWZlKGVudCwgbiwgJm5vdGlmeV9saXN0LCBub3RpZnlfZW50cnkpIHsNCisJCQlp ZiAoY25fY3RsX21zZ19lcXVhbHMoZW50LT5tc2csIGN0bCkpIHsNCisJCQkJbGlzdF9kZWwoJmVu dC0+bm90aWZ5X2VudHJ5KTsNCisJCQkJa2ZyZWUoZW50KTsNCisJCQl9DQorCQl9DQorCQlzcGlu X3VubG9jaygmbm90aWZ5X2xvY2spOw0KKw0KKwkJcmV0dXJuOw0KKwl9DQorDQorCXNpemUgKz0g c2l6ZW9mKCplbnQpOw0KKw0KKwllbnQgPSBrbWFsbG9jKHNpemUsIEdGUF9BVE9NSUMpOw0KKwlp 
ZiAoIWVudCkgew0KKwkJcHJpbnRrKEtFUk5fRVJSICJGYWlsZWQgdG8gYWxsb2NhdGUgJWQgYnl0 ZXMgZm9yIG5ldyBub3RpZnkgZW50cnkuXG4iLCBzaXplKTsNCisJCXJldHVybjsNCisJfQ0KKw0K KwltZW1zZXQoZW50LCAwLCBzaXplKTsNCisNCisJZW50LT5tc2cgPSAoc3RydWN0IGNuX2N0bF9t c2cgKikoZW50ICsgMSk7DQorDQorCW1lbWNweShlbnQtPm1zZywgY3RsLCBzaXplIC0gc2l6ZW9m KCplbnQpKTsNCisNCisJc3Bpbl9sb2NrKCZub3RpZnlfbG9jayk7DQorCWxpc3RfYWRkKCZlbnQt Pm5vdGlmeV9lbnRyeSwgJm5vdGlmeV9saXN0KTsNCisJc3Bpbl91bmxvY2soJm5vdGlmeV9sb2Nr KTsNCisNCisJew0KKwkJaW50IGk7DQorCQlzdHJ1Y3QgY25fbm90aWZ5X3JlcSAqcmVxOw0KKwkN CisJCXByaW50aygiTm90aWZ5IGdyb3VwICV4IGZvciBpZHg6ICIsIGN0bC0+Z3JvdXApOw0KKw0K KwkJcmVxID0gKHN0cnVjdCBjbl9ub3RpZnlfcmVxICopY3RsLT5kYXRhOw0KKwkJZm9yIChpPTA7 IGk8Y3RsLT5pZHhfbm90aWZ5X251bTsgKytpLCArK3JlcSkgew0KKwkJCXByaW50aygiJXUtJXUg IiwgcmVxLT5maXJzdCwgcmVxLT5maXJzdCtyZXEtPnJhbmdlLTEpOw0KKwkJfQ0KKwkJDQorCQlw cmludGsoIlxuTm90aWZ5IGdyb3VwICV4IGZvciB2YWw6ICIsIGN0bC0+Z3JvdXApOw0KKw0KKwkJ Zm9yIChpPTA7IGk8Y3RsLT52YWxfbm90aWZ5X251bTsgKytpLCArK3JlcSkgew0KKwkJCXByaW50 aygiJXUtJXUgIiwgcmVxLT5maXJzdCwgcmVxLT5maXJzdCtyZXEtPnJhbmdlLTEpOw0KKwkJfQ0K KwkJcHJpbnRrKCJcbiIpOw0KKwl9DQorfQ0KKw0KK3N0YXRpYyBpbnQgY25faW5pdCh2b2lkKQ0K K3sNCisJc3RydWN0IGNuX2RldiAqZGV2ID0gJmNkZXY7DQorDQorCWRldi0+aW5wdXQgPSBjbl9p bnB1dDsNCisJZGV2LT5pZC5pZHggPSBjbl9pZHg7DQorCWRldi0+aWQudmFsID0gY25fdmFsOw0K Kw0KKwlkZXYtPm5scyA9IG5ldGxpbmtfa2VybmVsX2NyZWF0ZSh1bml0LCBkZXYtPmlucHV0KTsN CisJaWYgKCFkZXYtPm5scykgew0KKwkJcHJpbnRrKEtFUk5fRVJSICJGYWlsZWQgdG8gY3JlYXRl IG5ldyBuZXRsaW5rIHNvY2tldCgldSkuXG4iLA0KKwkJICAgICAgIHVuaXQpOw0KKwkJcmV0dXJu IC1FSU87DQorCX0NCisNCisJZGV2LT5jYmRldiA9IGNuX3F1ZXVlX2FsbG9jX2RldigiY3F1ZXVl IiwgZGV2LT5ubHMpOw0KKwlpZiAoIWRldi0+Y2JkZXYpIHsNCisJCWlmIChkZXYtPm5scy0+c2tf c29ja2V0KQ0KKwkJCXNvY2tfcmVsZWFzZShkZXYtPm5scy0+c2tfc29ja2V0KTsNCisJCXJldHVy biAtRUlOVkFMOw0KKwl9DQorDQorCXJldHVybiBjbl9hZGRfY2FsbGJhY2soJmRldi0+aWQsICJj b25uZWN0b3IiLCAmY25fY2FsbGJhY2spOw0KK30NCisNCitzdGF0aWMgdm9pZCBjbl9maW5pKHZv 
aWQpDQorew0KKwlzdHJ1Y3QgY25fZGV2ICpkZXYgPSAmY2RldjsNCisNCisJY25fZGVsX2NhbGxi YWNrKCZkZXYtPmlkKTsNCisJY25fcXVldWVfZnJlZV9kZXYoZGV2LT5jYmRldik7DQorCWlmIChk ZXYtPm5scy0+c2tfc29ja2V0KQ0KKwkJc29ja19yZWxlYXNlKGRldi0+bmxzLT5za19zb2NrZXQp Ow0KK30NCisNCittb2R1bGVfaW5pdChjbl9pbml0KTsNCittb2R1bGVfZXhpdChjbl9maW5pKTsN CisNCitFWFBPUlRfU1lNQk9MKGNuX2FkZF9jYWxsYmFjayk7DQorRVhQT1JUX1NZTUJPTChjbl9k ZWxfY2FsbGJhY2spOw0KK0VYUE9SVF9TWU1CT0woY25fbmV0bGlua19zZW5kKTsNCmRpZmYgLU5y dSAvdG1wL2VtcHR5L2Nvbm5lY3Rvci5oIGxpbnV4LTIuNi9kcml2ZXJzL2Nvbm5lY3Rvci9jb25u ZWN0b3IuaA0KLS0tIC90bXAvZW1wdHkvY29ubmVjdG9yLmgJMTk3MC0wMS0wMSAwMzowMDowMC4w MDAwMDAwMDAgKzAzMDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9jb25uZWN0b3IvY29ubmVjdG9y LmgJMjAwNC0wOS0yNiAwMDoxNDoxNi4wMDAwMDAwMDAgKzA0MDANCkBAIC0wLDAgKzEsODEgQEAN CisvKg0KKyAqIAljb25uZWN0b3IuaA0KKyAqIA0KKyAqIDIwMDQgQ29weXJpZ2h0IChjKSBFdmdl bml5IFBvbHlha292IDxqb2hucG9sQDJrYS5taXB0LnJ1Pg0KKyAqIEFsbCByaWdodHMgcmVzZXJ2 ZWQuDQorICogDQorICogVGhpcyBwcm9ncmFtIGlzIGZyZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVk aXN0cmlidXRlIGl0IGFuZC9vciBtb2RpZnkNCisgKiBpdCB1bmRlciB0aGUgdGVybXMgb2YgdGhl IEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGFzIHB1Ymxpc2hlZCBieQ0KKyAqIHRoZSBGcmVl IFNvZnR3YXJlIEZvdW5kYXRpb247IGVpdGhlciB2ZXJzaW9uIDIgb2YgdGhlIExpY2Vuc2UsIG9y DQorICogKGF0IHlvdXIgb3B0aW9uKSBhbnkgbGF0ZXIgdmVyc2lvbi4NCisgKg0KKyAqIFRoaXMg cHJvZ3JhbSBpcyBkaXN0cmlidXRlZCBpbiB0aGUgaG9wZSB0aGF0IGl0IHdpbGwgYmUgdXNlZnVs LA0KKyAqIGJ1dCBXSVRIT1VUIEFOWSBXQVJSQU5UWTsgd2l0aG91dCBldmVuIHRoZSBpbXBsaWVk IHdhcnJhbnR5IG9mDQorICogTUVSQ0hBTlRBQklMSVRZIG9yIEZJVE5FU1MgRk9SIEEgUEFSVElD VUxBUiBQVVJQT1NFLiAgU2VlIHRoZQ0KKyAqIEdOVSBHZW5lcmFsIFB1YmxpYyBMaWNlbnNlIGZv ciBtb3JlIGRldGFpbHMuDQorICoNCisgKiBZb3Ugc2hvdWxkIGhhdmUgcmVjZWl2ZWQgYSBjb3B5 IG9mIHRoZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZQ0KKyAqIGFsb25nIHdpdGggdGhpcyBw cm9ncmFtOyBpZiBub3QsIHdyaXRlIHRvIHRoZSBGcmVlIFNvZnR3YXJlDQorICogRm91bmRhdGlv biwgSW5jLiwgNTkgVGVtcGxlIFBsYWNlLCBTdWl0ZSAzMzAsIEJvc3RvbiwgTUEgIDAyMTExLTEz 
MDcgIFVTQQ0KKyAqLw0KKw0KKyNpZm5kZWYgX19DT05ORUNUT1JfSA0KKyNkZWZpbmUgX19DT05O RUNUT1JfSA0KKw0KKyNpbmNsdWRlICIuLi9jb25uZWN0b3IvY25fcXVldWUuaCINCisNCisjZGVm aW5lIENPTk5FQ1RPUl9NQVhfTVNHX1NJWkUgCTEwMjQNCisNCitzdHJ1Y3QgY25fbXNnDQorew0K KwlzdHJ1Y3QgY2JfaWQgCQlpZDsNCisNCisJX191MzIJCQlzZXE7DQorCV9fdTMyCQkJYWNrOw0K Kw0KKwlfX3UzMgkJCWxlbjsJCS8qIExlbmd0aCBvZiB0aGUgZm9sbG93aW5nIGRhdGEgKi8NCisJ X191OAkJCWRhdGFbMF07DQorfTsNCisNCitzdHJ1Y3QgY25fbm90aWZ5X3JlcQ0KK3sNCisJX191 MzIJCQlmaXJzdDsNCisJX191MzIJCQlyYW5nZTsNCit9Ow0KKw0KK3N0cnVjdCBjbl9jdGxfbXNn DQorew0KKwlfX3UzMgkJCWlkeF9ub3RpZnlfbnVtOw0KKwlfX3UzMgkJCXZhbF9ub3RpZnlfbnVt Ow0KKwlfX3UzMgkJCWdyb3VwOw0KKwlfX3UzMgkJCWxlbjsNCisJX191OAkJCWRhdGFbMF07DQor fTsNCisNCisjaWZkZWYgX19LRVJORUxfXw0KKw0KKyNpbmNsdWRlIDxuZXQvc29jay5oPg0KKw0K K3N0cnVjdCBjbl9jdGxfZW50cnkNCit7DQorCXN0cnVjdCBsaXN0X2hlYWQJbm90aWZ5X2VudHJ5 Ow0KKwlzdHJ1Y3QgY25fY3RsX21zZwkqbXNnOw0KK307DQorDQorc3RydWN0IGNuX2Rldg0KK3sN CisJc3RydWN0IGNiX2lkIAkJaWQ7DQorDQorCXUzMgkJCXNlcSwgZ3JvdXBzOw0KKwlzdHJ1Y3Qg c29jayAJCSpubHM7DQorCXZvaWQgCQkJKCppbnB1dCkoc3RydWN0IHNvY2sgKnNrLCBpbnQgbGVu KTsNCisNCisJc3RydWN0IGNuX3F1ZXVlX2RldgkqY2JkZXY7DQorfTsNCisNCitpbnQgY25fYWRk X2NhbGxiYWNrKHN0cnVjdCBjYl9pZCAqLCBjaGFyICosIHZvaWQgKCogY2FsbGJhY2spKHZvaWQg KikpOw0KK3ZvaWQgY25fZGVsX2NhbGxiYWNrKHN0cnVjdCBjYl9pZCAqKTsNCit2b2lkIGNuX25l dGxpbmtfc2VuZChzdHJ1Y3QgY25fbXNnICosIHUzMik7DQorDQorI2VuZGlmIC8qIF9fS0VSTkVM X18gKi8NCisjZW5kaWYgLyogX19DT05ORUNUT1JfSCAqLw0K --=-6hXhvAybrmqrLjuzQFoX Content-Disposition: attachment; filename=Kconfig.connector.patch Content-Type: text/x-patch; name=Kconfig.connector.patch; charset=KOI8-R Content-Transfer-Encoding: base64 LS0tIGxpbnV4LTIuNi9kcml2ZXJzL0tjb25maWcub3JpZwkyMDA0LTA5LTI2IDEzOjM0OjQ4LjAw MDAwMDAwMCArMDQwMA0KKysrIGxpbnV4LTIuNi9kcml2ZXJzL0tjb25maWcJMjAwNC0wOS0yNiAx MzozNDo1Ny4wMDAwMDAwMDAgKzA0MDANCkBAIC00NCw2ICs0NCw4IEBADQogDQogc291cmNlICJk cml2ZXJzL3cxL0tjb25maWciDQogDQorc291cmNlICJkcml2ZXJzL2Nvbm5lY3Rvci9LY29uZmln 
Ig0KKw0KIHNvdXJjZSAiZHJpdmVycy9taXNjL0tjb25maWciDQogDQogc291cmNlICJkcml2ZXJz L21lZGlhL0tjb25maWciDQoNCg== --=-6hXhvAybrmqrLjuzQFoX Content-Disposition: attachment; filename=Kconfig.crypto.patch Content-Type: text/x-patch; name=Kconfig.crypto.patch; charset=KOI8-R Content-Transfer-Encoding: base64 LS0tIGxpbnV4LTIuNi9kcml2ZXJzL0tjb25maWcubm9jcnlwdG8JMjAwNC0xMC0zMCAwOTowNTo1 Mi4wMDAwMDAwMDAgKzA0MDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9LY29uZmlnCTIwMDQtMTAt MzAgMDk6MDY6MTEuMDAwMDAwMDAwICswNDAwDQpAQCAtNDIsNiArNDIsOCBAQA0KIA0KIHNvdXJj ZSAiZHJpdmVycy9pMmMvS2NvbmZpZyINCiANCitzb3VyY2UgImRyaXZlcnMvYWNyeXB0by9LY29u ZmlnIg0KKw0KIHNvdXJjZSAiZHJpdmVycy93MS9LY29uZmlnIg0KIA0KIHNvdXJjZSAiZHJpdmVy cy9jb25uZWN0b3IvS2NvbmZpZyINCg== --=-6hXhvAybrmqrLjuzQFoX Content-Disposition: attachment; filename=Makefile.connector.patch Content-Type: text/x-patch; name=Makefile.connector.patch; charset=KOI8-R Content-Transfer-Encoding: base64 LS0tIGxpbnV4LTIuNi9kcml2ZXJzL01ha2VmaWxlLm9yaWcJMjAwNC0wOS0yNSAyMzo0NzowOC4w MDAwMDAwMDAgKzA0MDANCisrKyBsaW51eC0yLjYvZHJpdmVycy9NYWtlZmlsZQkyMDA0LTA5LTI2 IDEzOjM0OjI1LjAwMDAwMDAwMCArMDQwMA0KQEAgLTQ0LDYgKzQ0LDcgQEANCiBvYmotJChDT05G SUdfSTJPKQkJKz0gbWVzc2FnZS8NCiBvYmotJChDT05GSUdfSTJDKQkJKz0gaTJjLw0KIG9iai0k KENPTkZJR19XMSkJCSs9IHcxLw0KK29iai0kKENPTkZJR19DT05ORUNUT1IpCSs9IGNvbm5lY3Rv ci8NCiBvYmotJChDT05GSUdfUEhPTkUpCQkrPSB0ZWxlcGhvbnkvDQogb2JqLSQoQ09ORklHX01E KQkJKz0gbWQvDQogb2JqLSQoQ09ORklHX0JUKQkJKz0gYmx1ZXRvb3RoLw0KDQo= --=-6hXhvAybrmqrLjuzQFoX Content-Disposition: attachment; filename=Makefile.crypto.patch Content-Type: text/x-patch; name=Makefile.crypto.patch; charset=KOI8-R Content-Transfer-Encoding: base64 LS0tIGxpbnV4LTIuNi9kcml2ZXJzL01ha2VmaWxlLm5vY3J5cHRvCTIwMDQtMTAtMzAgMDk6MDU6 MDIuMDAwMDAwMDAwICswNDAwDQorKysgbGludXgtMi42L2RyaXZlcnMvTWFrZWZpbGUJMjAwNC0x MC0zMCAwOTowNTozMS4wMDAwMDAwMDAgKzA0MDANCkBAIC00OSw1ICs0OSw2IEBADQogb2JqLSQo Q09ORklHX0dBTUVQT1JUKQkJKz0gaW5wdXQvZ2FtZXBvcnQvDQogb2JqLSQoQ09ORklHX0kyTykJ 
CSs9IG1lc3NhZ2UvDQogb2JqLSQoQ09ORklHX0kyQykJCSs9IGkyYy8NCitvYmotJChDT05GSUdf QUNSWVBUTykJCSs9IGFjcnlwdG8vDQogb2JqLSQoQ09ORklHX1cxKQkJKz0gdzEvDQogb2JqLSQo Q09ORklHX0NPTk5FQ1RPUikJCSs9IGNvbm5lY3Rvci8NCg== --=-6hXhvAybrmqrLjuzQFoX-- --=-84FmUi6+bypRF6VRBLEL Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) iD8DBQBBvw7sIKTPhE+8wY0RAh0kAJ9siQOXijN5/uV4OgWtbeJbG2BrJgCeMjSQ JG07FP0FWSKjoWDu0GosujM= =NptR -----END PGP SIGNATURE----- --=-84FmUi6+bypRF6VRBLEL-- From davem@davemloft.net Tue Dec 14 11:37:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 11:37:48 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBEJbKN0022737 for ; Tue, 14 Dec 2004 11:37:40 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CeIPh-0004u2-00; Tue, 14 Dec 2004 11:32:49 -0800 Date: Tue, 14 Dec 2004 11:32:49 -0800 From: "David S. 
Miller" To: Patrick McHardy Cc: shemminger@osdl.org, netem@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] netem: restart device after inserting packets Message-Id: <20041214113249.0725a655.davem@davemloft.net> In-Reply-To: <41B91901.3070304@trash.net> References: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> <20041208210031.63f0963f.davem@davemloft.net> <41B91901.3070304@trash.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12743 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 04:33:21 +0100 Patrick McHardy wrote: > The patch is incomplete, netem may dequeue multiple packets from > the delayed queue at once and feed them to the inner queue, but > qdisc_restart will only dequeue one packet from the inner queue. > This patch moves qdisc_run back to include/net/pkt_sched.h and > replaces qdisc_restart by qdisc_run in netem_watchdog. Applied, thanks Patrick. Don't we need 2.4.x versions of these two fixes? From shemminger@osdl.org Tue Dec 14 13:12:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 13:12:18 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBELBo5q029545 for ; Tue, 14 Dec 2004 13:12:11 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iBELBD929163; Tue, 14 Dec 2004 13:11:13 -0800 Date: Tue, 14 Dec 2004 13:11:13 -0800 From: Stephen Hemminger To: "David S. 
Miller" Cc: Patrick McHardy , netem@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] netem: restart device after inserting packets Message-Id: <20041214131113.32d080fb@dxpl.pdx.osdl.net> In-Reply-To: <20041214113249.0725a655.davem@davemloft.net> References: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> <20041208210031.63f0963f.davem@davemloft.net> <41B91901.3070304@trash.net> <20041214113249.0725a655.davem@davemloft.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12744 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev 2.4 version of the netem wakeup patch. Also fixes the qlen in a couple of places. 
This makes the code basically the same as 2.6. Signed-off-by: Stephen Hemminger diff -Nru a/net/sched/sch_netem.c b/net/sched/sch_netem.c --- a/net/sched/sch_netem.c 2004-12-14 13:10:18 -08:00 +++ b/net/sched/sch_netem.c 2004-12-14 13:10:18 -08:00 @@ -259,12 +259,13 @@ { struct Qdisc *sch = (struct Qdisc *)arg; struct netem_sched_data *q = qdisc_priv(sch); + struct net_device *dev = sch->dev; struct sk_buff *skb; psched_time_t now; pr_debug("netem_watchdog: fired @%lu\n", jiffies); - spin_lock_bh(&sch->dev->queue_lock); + spin_lock_bh(&dev->queue_lock); PSCHED_GET_TIME(now); while ((skb = skb_peek(&q->delayed)) != NULL) { @@ -284,8 +285,11 @@ if (q->qdisc->enqueue(skb, q->qdisc)) sch->stats.drops++; + else + sch->q.qlen++; } - spin_unlock_bh(&sch->dev->queue_lock); + qdisc_run(dev); + spin_unlock_bh(&dev->queue_lock); } static void netem_reset(struct Qdisc *sch) @@ -505,7 +509,7 @@ sch_tree_lock(sch); *old = xchg(&q->qdisc, new); qdisc_reset(*old); - sch->q.qlen = q->delayed.qlen; + sch->q.qlen = 0; sch_tree_unlock(sch); return 0; From bunk@stusta.de Tue Dec 14 16:43:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 16:43:52 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0hOSD009589 for ; Tue, 14 Dec 2004 16:43:46 -0800 Received: (qmail 32350 invoked from network); 15 Dec 2004 00:42:57 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 00:42:57 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id A13EBAF651; Wed, 15 Dec 2004 01:42:54 +0100 (CET) Date: Wed, 15 Dec 2004 01:42:54 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/key/af_key.c: make pfkey_table static Message-ID: <20041215004254.GG23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12745 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes the needlessly global pfkey_table static. Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/key/af_key.c.old 2004-12-14 20:22:24.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/key/af_key.c 2004-12-14 20:22:31.000000000 +0100 @@ -35,7 +35,7 @@ /* List of all pfkey sockets. */ -HLIST_HEAD(pfkey_table); +static HLIST_HEAD(pfkey_table); static DECLARE_WAIT_QUEUE_HEAD(pfkey_table_wait); static rwlock_t pfkey_table_lock = RW_LOCK_UNLOCKED; static atomic_t pfkey_table_users = ATOMIC_INIT(0); From bunk@stusta.de Tue Dec 14 16:46:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 16:47:02 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0kYkL010512 for ; Tue, 14 Dec 2004 16:46:54 -0800 Received: (qmail 32453 invoked from network); 15 Dec 2004 00:46:06 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 00:46:06 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 4B6E2AF651; Wed, 15 Dec 2004 01:46:04 +0100 (CET) Date: Wed, 15 Dec 2004 01:46:04 +0100 From: Adrian Bunk To: Alan Cox , Alexey Kuznetsov Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/netlink/af_netlink.c: possible cleanups Message-ID: <20041215004604.GH23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12746 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make the needlessly global function netlink_getsockbypid static - remove the EXPORT_SYMBOL'ed but unused functions netlink_attach and netlink_detach Please review whether these changes are correct or whether they conflict with pending patches. diffstat output: include/linux/netlink.h | 3 --- net/netlink/af_netlink.c | 28 +--------------------------- 2 files changed, 1 insertion(+), 30 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/linux/netlink.h.old 2004-12-14 21:43:16.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netlink.h 2004-12-14 21:44:27.000000000 +0100 @@ -116,8 +116,6 @@ #define NETLINK_CREDS(skb) (&NETLINK_CB((skb)).creds) -extern int netlink_attach(int unit, int (*function)(int,struct sk_buff *skb)); -extern void netlink_detach(int unit); extern int netlink_post(int unit, struct sk_buff *skb); extern struct sock *netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len)); extern void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err); @@ -129,7 +127,6 @@ extern int netlink_unregister_notifier(struct notifier_block *nb); /* finegrained unicast helpers: */ -struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid); struct sock *netlink_getsockbyfilp(struct file *filp); int netlink_attachskb(struct sock *sk, struct sk_buff *skb, int nonblock, long timeo); void netlink_detachskb(struct sock *sk, struct sk_buff *skb); --- linux-2.6.10-rc3-mm1-full/net/netlink/af_netlink.c.old 2004-12-14 21:43:31.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/netlink/af_netlink.c 2004-12-14 21:44:34.000000000 +0100 @@ -546,7 +546,7 @@ } } -struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid) +static struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid) { int protocol = 
ssk->sk_protocol; struct sock *sock; @@ -1210,30 +1210,6 @@ * Backward compatibility. */ -int netlink_attach(int unit, int (*function)(int, struct sk_buff *skb)) -{ - struct sock *sk = netlink_kernel_create(unit, NULL); - if (sk == NULL) - return -ENOBUFS; - nlk_sk(sk)->handler = function; - write_lock_bh(&nl_emu_lock); - netlink_kernel[unit] = sk->sk_socket; - write_unlock_bh(&nl_emu_lock); - return 0; -} - -void netlink_detach(int unit) -{ - struct socket *sock; - - write_lock_bh(&nl_emu_lock); - sock = netlink_kernel[unit]; - netlink_kernel[unit] = NULL; - write_unlock_bh(&nl_emu_lock); - - sock_release(sock); -} - int netlink_post(int unit, struct sk_buff *skb) { struct socket *sock; @@ -1522,7 +1498,5 @@ EXPORT_SYMBOL(netlink_unregister_notifier); #if defined(CONFIG_NETLINK_DEV) || defined(CONFIG_NETLINK_DEV_MODULE) -EXPORT_SYMBOL(netlink_attach); -EXPORT_SYMBOL(netlink_detach); EXPORT_SYMBOL(netlink_post); #endif From bunk@stusta.de Tue Dec 14 16:48:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 16:48:43 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0mFHX010908 for ; Tue, 14 Dec 2004 16:48:36 -0800 Received: (qmail 32514 invoked from network); 15 Dec 2004 00:47:47 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 00:47:47 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id B7A93AF651; Wed, 15 Dec 2004 01:47:45 +0100 (CET) Date: Wed, 15 Dec 2004 01:47:45 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/packet/af_packet.c: make some code static Message-ID: <20041215004745.GI23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 
X-Virus-Status: Clean X-archive-position: 12747 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. diffstat output: net/packet/af_packet.c | 21 +++++++++++---------- 1 files changed, 11 insertions(+), 10 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/packet/af_packet.c.old 2004-12-14 21:47:49.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/packet/af_packet.c 2004-12-14 21:50:25.000000000 +0100 @@ -145,10 +145,10 @@ */ /* List of all packet sockets. */ -HLIST_HEAD(packet_sklist); +static HLIST_HEAD(packet_sklist); static rwlock_t packet_sklist_lock = RW_LOCK_UNLOCKED; -atomic_t packet_socks_nr; +static atomic_t packet_socks_nr; /* Private packet socket structures. */ @@ -215,7 +215,7 @@ #define pkt_sk(__sk) ((struct packet_opt *)(__sk)->sk_protinfo) -void packet_sock_destruct(struct sock *sk) +static void packet_sock_destruct(struct sock *sk) { BUG_TRAP(!atomic_read(&sk->sk_rmem_alloc)); BUG_TRAP(!atomic_read(&sk->sk_wmem_alloc)); @@ -234,10 +234,10 @@ } -extern struct proto_ops packet_ops; +static struct proto_ops packet_ops; #ifdef CONFIG_SOCK_PACKET -extern struct proto_ops packet_ops_spkt; +static struct proto_ops packet_ops_spkt; static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt) { @@ -1350,8 +1350,8 @@ } } -int packet_getsockopt(struct socket *sock, int level, int optname, - char __user *optval, int __user *optlen) +static int packet_getsockopt(struct socket *sock, int level, int optname, + char __user *optval, int __user *optlen) { int len; struct sock *sk = sock->sk; @@ -1500,7 +1500,8 @@ #define packet_poll datagram_poll #else -unsigned int packet_poll(struct file * file, struct socket *sock, poll_table *wait) +static unsigned int packet_poll(struct file * file, struct socket *sock, + poll_table *wait) { 
struct sock *sk = sock->sk; struct packet_opt *po = pkt_sk(sk); @@ -1747,7 +1748,7 @@ #ifdef CONFIG_SOCK_PACKET -struct proto_ops packet_ops_spkt = { +static struct proto_ops packet_ops_spkt = { .family = PF_PACKET, .owner = THIS_MODULE, .release = packet_release, @@ -1769,7 +1770,7 @@ }; #endif -struct proto_ops packet_ops = { +static struct proto_ops packet_ops = { .family = PF_PACKET, .owner = THIS_MODULE, .release = packet_release, From bunk@stusta.de Tue Dec 14 16:52:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 16:52:39 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0q9Uv011634 for ; Tue, 14 Dec 2004 16:52:30 -0800 Received: (qmail 32693 invoked from network); 15 Dec 2004 00:51:42 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 00:51:42 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id DCB41AF651; Wed, 15 Dec 2004 01:51:39 +0100 (CET) Date: Wed, 15 Dec 2004 01:51:39 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/ipv4/: misc possible cleanups Message-ID: <20041215005139.GJ23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12748 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make some needlessly global code static - remove the following unused global functions: - fib_rules.c: fib_rules_map_destination - xfrm4_policy.c: xfrm4_fini - remove the following unneeded EXPORT_SYMBOL: - tcp_timer.c:
tcp_timer_bug_msg Please review which of these changes are correct and which might conflict with pending patches. diffstat output: include/net/ip.h | 2 -- include/net/ip_fib.h | 2 -- include/net/ipconfig.h | 11 ----------- include/net/tcp.h | 16 ---------------- include/net/xfrm.h | 1 - net/ipv4/af_inet.c | 8 ++++---- net/ipv4/arp.c | 8 ++++---- net/ipv4/devinet.c | 4 ++-- net/ipv4/fib_frontend.c | 6 +++--- net/ipv4/fib_rules.c | 8 +------- net/ipv4/icmp.c | 4 ++-- net/ipv4/igmp.c | 16 ++++++++-------- net/ipv4/ip_gre.c | 6 +++--- net/ipv4/ip_sockglue.c | 2 +- net/ipv4/ipconfig.c | 12 ++++++------ net/ipv4/raw.c | 4 ++-- net/ipv4/route.c | 32 ++++++++++++++++---------------- net/ipv4/tcp_input.c | 16 ++++++++++++++-- net/ipv4/tcp_minisocks.c | 4 +++- net/ipv4/tcp_timer.c | 3 --- net/ipv4/udp.c | 13 ++++++++----- net/ipv4/xfrm4_policy.c | 7 ------- 22 files changed, 77 insertions(+), 108 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/ip.h.old 2004-12-14 05:20:46.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ip.h 2004-12-14 05:20:53.000000000 +0100 @@ -295,8 +295,6 @@ extern void ip_local_error(struct sock *sk, int err, u32 daddr, u16 dport, u32 info); -extern int ipv4_proc_init(void); - /* sysctl helpers - any sysctl which holds a value that ends up being * fed into the routing cache should use these handlers. 
*/ --- linux-2.6.10-rc3-mm1-full/net/ipv4/af_inet.c.old 2004-12-14 05:20:00.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/af_inet.c 2004-12-14 05:20:33.000000000 +0100 @@ -659,7 +659,7 @@ } -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) +static ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) { struct sock *sk = sock->sk; @@ -1011,7 +1011,7 @@ return 0; } -int ipv4_proc_init(void); +static int ipv4_proc_init(void); extern void ipfrag_init(void); static int __init inet_init(void) @@ -1136,7 +1136,7 @@ extern int udp4_proc_init(void); extern void udp4_proc_exit(void); -int __init ipv4_proc_init(void) +static int __init ipv4_proc_init(void) { int rc = 0; @@ -1166,7 +1166,7 @@ } #else /* CONFIG_PROC_FS */ -int __init ipv4_proc_init(void) +static int __init ipv4_proc_init(void) { return 0; } --- linux-2.6.10-rc3-mm1-full/net/ipv4/arp.c.old 2004-12-14 05:21:08.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/arp.c 2004-12-14 05:21:53.000000000 +0100 @@ -704,7 +704,7 @@ * Process an arp request. */ -int arp_process(struct sk_buff *skb) +static int arp_process(struct sk_buff *skb) { struct net_device *dev = skb->dev; struct in_device *in_dev = in_dev_get(dev); @@ -961,7 +961,7 @@ * Set (create) an ARP cache entry. 
*/ -int arp_req_set(struct arpreq *r, struct net_device * dev) +static int arp_req_set(struct arpreq *r, struct net_device * dev) { u32 ip = ((struct sockaddr_in *) &r->arp_pa)->sin_addr.s_addr; struct neighbour *neigh; @@ -1075,7 +1075,7 @@ return err; } -int arp_req_delete(struct arpreq *r, struct net_device * dev) +static int arp_req_delete(struct arpreq *r, struct net_device * dev) { int err; u32 ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; @@ -1207,7 +1207,7 @@ return NOTIFY_DONE; } -struct notifier_block arp_netdev_notifier = { +static struct notifier_block arp_netdev_notifier = { .notifier_call = arp_netdev_event, }; --- linux-2.6.10-rc3-mm1-full/net/ipv4/devinet.c.old 2004-12-14 05:22:07.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/devinet.c 2004-12-14 05:22:26.000000000 +0100 @@ -380,7 +380,7 @@ return NULL; } -int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { struct rtattr **rta = arg; struct in_device *in_dev; @@ -412,7 +412,7 @@ return -EADDRNOTAVAIL; } -int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { struct rtattr **rta = arg; struct net_device *dev; --- linux-2.6.10-rc3-mm1-full/include/net/ip_fib.h.old 2004-12-14 05:22:51.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ip_fib.h 2004-12-14 05:23:53.000000000 +0100 @@ -200,7 +200,6 @@ /* Exported by fib_frontend.c */ extern void ip_fib_init(void); -extern void fib_flush(void); extern int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); @@ -226,7 +225,6 @@ extern int inet_rtm_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int 
inet_rtm_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_dump_rules(struct sk_buff *skb, struct netlink_callback *cb); -extern u32 fib_rules_map_destination(u32 daddr, struct fib_result *res); #ifdef CONFIG_NET_CLS_ROUTE extern u32 fib_rules_tclass(struct fib_result *res); #endif --- linux-2.6.10-rc3-mm1-full/net/ipv4/fib_frontend.c.old 2004-12-14 05:23:09.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/fib_frontend.c 2004-12-14 05:23:33.000000000 +0100 @@ -75,7 +75,7 @@ #endif /* CONFIG_IP_MULTIPLE_TABLES */ -void fib_flush(void) +static void fib_flush(void) { int flushed = 0; #ifdef CONFIG_IP_MULTIPLE_TABLES @@ -585,11 +585,11 @@ return NOTIFY_DONE; } -struct notifier_block fib_inetaddr_notifier = { +static struct notifier_block fib_inetaddr_notifier = { .notifier_call =fib_inetaddr_event, }; -struct notifier_block fib_netdev_notifier = { +static struct notifier_block fib_netdev_notifier = { .notifier_call =fib_netdev_event, }; --- linux-2.6.10-rc3-mm1-full/net/ipv4/fib_rules.c.old 2004-12-14 05:24:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/fib_rules.c 2004-12-14 05:24:22.000000000 +0100 @@ -245,12 +245,6 @@ return 0; } -u32 fib_rules_map_destination(u32 daddr, struct fib_result *res) -{ - u32 mask = inet_make_mask(res->prefixlen); - return (daddr&~mask)|res->fi->fib_nh->nh_gw; -} - #ifdef CONFIG_NET_CLS_ROUTE u32 fib_rules_tclass(struct fib_result *res) { @@ -368,7 +362,7 @@ } -struct notifier_block fib_rules_notifier = { +static struct notifier_block fib_rules_notifier = { .notifier_call =fib_rules_event, }; --- linux-2.6.10-rc3-mm1-full/net/ipv4/icmp.c.old 2004-12-14 05:24:38.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/icmp.c 2004-12-14 05:25:02.000000000 +0100 @@ -327,8 +327,8 @@ * Checksum each fragment, and on the first include the headers and final * checksum. 
*/ -int icmp_glue_bits(void *from, char *to, int offset, int len, int odd, - struct sk_buff *skb) +static int icmp_glue_bits(void *from, char *to, int offset, int len, int odd, + struct sk_buff *skb) { struct icmp_bxm *icmp_param = (struct icmp_bxm *)from; unsigned int csum; --- linux-2.6.10-rc3-mm1-full/net/ipv4/igmp.c.old 2004-12-14 05:25:26.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/igmp.c 2004-12-14 05:26:44.000000000 +0100 @@ -143,8 +143,8 @@ static void sf_markstate(struct ip_mc_list *pmc); #endif static void ip_mc_clear_src(struct ip_mc_list *pmc); -int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, - int sfcount, __u32 *psfsrc, int delta); +static int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, + int sfcount, __u32 *psfsrc, int delta); static void ip_ma_put(struct ip_mc_list *im) { @@ -1384,8 +1384,8 @@ #define igmp_ifc_event(x) do { } while (0) #endif -int ip_mc_del_src(struct in_device *in_dev, __u32 *pmca, int sfmode, - int sfcount, __u32 *psfsrc, int delta) +static int ip_mc_del_src(struct in_device *in_dev, __u32 *pmca, int sfmode, + int sfcount, __u32 *psfsrc, int delta) { struct ip_mc_list *pmc; int changerec = 0; @@ -1520,8 +1520,8 @@ /* * Add multicast source filter list to the interface list */ -int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, - int sfcount, __u32 *psfsrc, int delta) +static int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, + int sfcount, __u32 *psfsrc, int delta) { struct ip_mc_list *pmc; int isexclude; @@ -1667,8 +1667,8 @@ return err; } -int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml, - struct in_device *in_dev) +static int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml, + struct in_device *in_dev) { int err; --- linux-2.6.10-rc3-mm1-full/net/ipv4/ip_gre.c.old 2004-12-14 05:29:23.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ip_gre.c 2004-12-14 05:29:54.000000000 +0100 @@ -304,7 +304,7 @@ } -void 
ipgre_err(struct sk_buff *skb, u32 info) +static void ipgre_err(struct sk_buff *skb, u32 info) { #ifndef I_WISH_WORLD_WERE_PERFECT @@ -552,7 +552,7 @@ return INET_ECN_encapsulate(tos, inner); } -int ipgre_rcv(struct sk_buff *skb) +static int ipgre_rcv(struct sk_buff *skb) { struct iphdr *iph; u8 *h; @@ -1279,7 +1279,7 @@ goto out; } -void ipgre_fini(void) +static void ipgre_fini(void) { if (inet_del_protocol(&ipgre_protocol, IPPROTO_GRE) < 0) printk(KERN_INFO "ipgre close: can't remove protocol\n"); --- linux-2.6.10-rc3-mm1-full/net/ipv4/ip_sockglue.c.old 2004-12-14 05:30:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ip_sockglue.c 2004-12-14 05:30:17.000000000 +0100 @@ -92,7 +92,7 @@ } -void ip_cmsg_recv_retopts(struct msghdr *msg, struct sk_buff *skb) +static void ip_cmsg_recv_retopts(struct msghdr *msg, struct sk_buff *skb) { unsigned char optbuf[sizeof(struct ip_options) + 40]; struct ip_options * opt = (struct ip_options*)optbuf; --- linux-2.6.10-rc3-mm1-full/include/net/ipconfig.h.old 2004-12-14 05:30:42.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ipconfig.h 2004-12-14 05:35:02.000000000 +0100 @@ -8,14 +8,10 @@ /* The following are initdata: */ -extern int ic_enable; /* Enable or disable the whole shebang */ - extern int ic_proto_enabled; /* Protocols enabled (see IC_xxx) */ -extern int ic_host_name_set; /* Host name set by ipconfig? 
*/ extern int ic_set_manually; /* IPconfig parameters set manually */ extern u32 ic_myaddr; /* My IP address */ -extern u32 ic_netmask; /* Netmask for local subnet */ extern u32 ic_gateway; /* Gateway IP address */ extern u32 ic_servaddr; /* Boot server IP address */ @@ -24,13 +20,6 @@ extern u8 root_server_path[]; /* Path to mount as root */ - -/* The following are persistent (not initdata): */ - -extern int ic_proto_used; /* Protocol used, if any */ -extern u32 ic_nameserver; /* DNS server IP address */ -extern u8 ic_domain[]; /* DNS (not NIS) domain name */ - /* bits in ic_proto_{enabled,used} */ #define IC_PROTO 0xFF /* Protocols mask: */ #define IC_BOOTP 0x01 /* BOOTP (or DHCP, see below) */ --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipconfig.c.old 2004-12-14 05:30:57.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipconfig.c 2004-12-14 05:35:10.000000000 +0100 @@ -109,7 +109,7 @@ */ int ic_set_manually __initdata = 0; /* IPconfig parameters set manually */ -int ic_enable __initdata = 0; /* IP config enabled? */ +static int ic_enable __initdata = 0; /* IP config enabled? */ /* Protocol choice */ int ic_proto_enabled __initdata = 0 @@ -124,10 +124,10 @@ #endif ; -int ic_host_name_set __initdata = 0; /* Host name set by us? */ +static int ic_host_name_set __initdata = 0; /* Host name set by us? 
*/ u32 ic_myaddr = INADDR_NONE; /* My IP address */ -u32 ic_netmask = INADDR_NONE; /* Netmask for local subnet */ +static u32 ic_netmask = INADDR_NONE; /* Netmask for local subnet */ u32 ic_gateway = INADDR_NONE; /* Gateway IP address */ u32 ic_servaddr = INADDR_NONE; /* Boot server IP address */ @@ -137,9 +137,9 @@ /* Persistent data: */ -int ic_proto_used; /* Protocol used, if any */ -u32 ic_nameservers[CONF_NAMESERVERS_MAX]; /* DNS Server IP addresses */ -u8 ic_domain[64]; /* DNS (not NIS) domain name */ +static int ic_proto_used; /* Protocol used, if any */ +static u32 ic_nameservers[CONF_NAMESERVERS_MAX]; /* DNS Server IP addresses */ +static u8 ic_domain[64]; /* DNS (not NIS) domain name */ /* * Private state. --- linux-2.6.10-rc3-mm1-full/net/ipv4/raw.c.old 2004-12-14 05:36:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/raw.c 2004-12-14 05:36:22.000000000 +0100 @@ -562,8 +562,8 @@ * we return it, otherwise we block. */ -int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - size_t len, int noblock, int flags, int *addr_len) +static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, + size_t len, int noblock, int flags, int *addr_len) { struct inet_opt *inet = inet_sk(sk); size_t copied = 0; --- linux-2.6.10-rc3-mm1-full/net/ipv4/route.c.old 2004-12-14 05:36:47.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/route.c 2004-12-14 05:59:55.000000000 +0100 @@ -108,22 +108,22 @@ #define RT_GC_TIMEOUT (300*HZ) -int ip_rt_min_delay = 2 * HZ; -int ip_rt_max_delay = 10 * HZ; -int ip_rt_max_size; -int ip_rt_gc_timeout = RT_GC_TIMEOUT; -int ip_rt_gc_interval = 60 * HZ; -int ip_rt_gc_min_interval = HZ / 2; -int ip_rt_redirect_number = 9; -int ip_rt_redirect_load = HZ / 50; -int ip_rt_redirect_silence = ((HZ / 50) << (9 + 1)); -int ip_rt_error_cost = HZ; -int ip_rt_error_burst = 5 * HZ; -int ip_rt_gc_elasticity = 8; -int ip_rt_mtu_expires = 10 * 60 * HZ; -int ip_rt_min_pmtu = 512 + 20 + 20; -int ip_rt_min_advmss = 
256; -int ip_rt_secret_interval = 10 * 60 * HZ; +static int ip_rt_min_delay = 2 * HZ; +static int ip_rt_max_delay = 10 * HZ; +static int ip_rt_max_size; +static int ip_rt_gc_timeout = RT_GC_TIMEOUT; +static int ip_rt_gc_interval = 60 * HZ; +static int ip_rt_gc_min_interval = HZ / 2; +static int ip_rt_redirect_number = 9; +static int ip_rt_redirect_load = HZ / 50; +static int ip_rt_redirect_silence = ((HZ / 50) << (9 + 1)); +static int ip_rt_error_cost = HZ; +static int ip_rt_error_burst = 5 * HZ; +static int ip_rt_gc_elasticity = 8; +static int ip_rt_mtu_expires = 10 * 60 * HZ; +static int ip_rt_min_pmtu = 512 + 20 + 20; +static int ip_rt_min_advmss = 256; +static int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; #define RTprint(a...) printk(KERN_DEBUG a) --- linux-2.6.10-rc3-mm1-full/include/net/tcp.h.old 2004-12-14 05:43:25.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/tcp.h 2004-12-14 05:45:41.000000000 +0100 @@ -315,7 +315,6 @@ extern atomic_t tcp_orphan_count; extern int tcp_tw_count; extern void tcp_time_wait(struct sock *sk, int state, int timeo); -extern void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo); extern void tcp_tw_deschedule(struct tcp_tw_bucket *tw); @@ -2020,21 +2019,6 @@ tp->westwood.rtt = rtt_seq; } -void __tcp_westwood_fast_bw(struct sock *, struct sk_buff *); -void __tcp_westwood_slow_bw(struct sock *, struct sk_buff *); - -static inline void tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) -{ - if (tcp_is_westwood(tcp_sk(sk))) - __tcp_westwood_fast_bw(sk, skb); -} - -static inline void tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) -{ - if (tcp_is_westwood(tcp_sk(sk))) - __tcp_westwood_slow_bw(sk, skb); -} - static inline __u32 __tcp_westwood_bw_rttmin(const struct tcp_opt *tp) { return max((tp->westwood.bw_est) * (tp->westwood.rtt_min) / --- linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_input.c.old 2004-12-14 05:43:43.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_input.c 2004-12-14 05:44:53.000000000 +0100 @@ -2786,7 +2786,7 @@ * straight forward and doesn't need any particular care. */ -void __tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) +static void __tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) { struct tcp_opt *tp = tcp_sk(sk); @@ -2797,6 +2797,12 @@ tp->westwood.rtt_min = westwood_update_rttmin(sk); } +static inline void tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) +{ + if (tcp_is_westwood(tcp_sk(sk))) + __tcp_westwood_fast_bw(sk, skb); +} + /* * @westwood_dupack_update @@ -2867,7 +2873,7 @@ * dupack. But we need to be careful in such case. */ -void __tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) +static void __tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) { struct tcp_opt *tp = tcp_sk(sk); @@ -2877,6 +2883,12 @@ tp->westwood.rtt_min = westwood_update_rttmin(sk); } +static inline void tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) +{ + if (tcp_is_westwood(tcp_sk(sk))) + __tcp_westwood_slow_bw(sk, skb); +} + /* This routine deals with incoming acks, but not outgoing ones. 
*/ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag) { --- linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_minisocks.c.old 2004-12-14 05:45:57.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_minisocks.c 2004-12-14 05:46:17.000000000 +0100 @@ -41,6 +41,8 @@ int sysctl_tcp_syncookies = SYNC_INIT; int sysctl_tcp_abort_on_overflow; +static void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo); + static __inline__ int tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win) { if (seq == s_win) @@ -551,7 +553,7 @@ TIMER_INITIALIZER(tcp_twcal_tick, 0, 0); static struct hlist_head tcp_twcal_row[TCP_TW_RECYCLE_SLOTS]; -void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo) +static void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo) { struct hlist_head *list; int slot; --- linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_timer.c.old 2004-12-14 05:48:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_timer.c 2004-12-14 13:57:40.000000000 +0100 @@ -653,6 +653,3 @@ EXPORT_SYMBOL(tcp_delete_keepalive_timer); EXPORT_SYMBOL(tcp_init_xmit_timers); EXPORT_SYMBOL(tcp_reset_keepalive_timer); -#ifdef TCP_DEBUG -EXPORT_SYMBOL(tcp_timer_bug_msg); -#endif --- linux-2.6.10-rc3-mm1-full/net/ipv4/udp.c.old 2004-12-14 05:50:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/udp.c 2004-12-14 05:51:36.000000000 +0100 @@ -219,7 +219,8 @@ /* UDP is nearly always wildcards out the wazoo, it makes no sense to try * harder than this. 
-DaveM */ -struct sock *udp_v4_lookup_longway(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif) +static struct sock *udp_v4_lookup_longway(u32 saddr, u16 sport, + u32 daddr, u16 dport, int dif) { struct sock *sk, *result = NULL; struct hlist_node *node; @@ -263,7 +264,8 @@ return result; } -__inline__ struct sock *udp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif) +static __inline__ struct sock *udp_v4_lookup(u32 saddr, u16 sport, + u32 daddr, u16 dport, int dif) { struct sock *sk; @@ -667,7 +669,8 @@ goto out; } -int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) +static int udp_sendpage(struct sock *sk, struct page *page, int offset, + size_t size, int flags) { struct udp_opt *up = udp_sk(sk); int ret; @@ -770,8 +773,8 @@ * return it, otherwise we block. */ -int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - size_t len, int noblock, int flags, int *addr_len) +static int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, + size_t len, int noblock, int flags, int *addr_len) { struct inet_opt *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; --- linux-2.6.10-rc3-mm1-full/include/net/xfrm.h.old 2004-12-14 05:51:56.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/xfrm.h 2004-12-14 05:52:03.000000000 +0100 @@ -782,7 +782,6 @@ extern void xfrm_init(void); extern void xfrm4_init(void); -extern void xfrm4_fini(void); extern void xfrm6_init(void); extern void xfrm6_fini(void); extern void xfrm_state_init(void); --- linux-2.6.10-rc3-mm1-full/net/ipv4/xfrm4_policy.c.old 2004-12-14 05:52:13.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/xfrm4_policy.c 2004-12-14 05:52:22.000000000 +0100 @@ -279,10 +279,3 @@ xfrm4_policy_init(); } -void __exit xfrm4_fini(void) -{ - //xfrm4_input_fini(); - xfrm4_policy_fini(); - xfrm4_state_fini(); -} - From bunk@stusta.de Tue Dec 14 16:56:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 
14 Dec 2004 16:56:46 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0uGa9013010 for ; Tue, 14 Dec 2004 16:56:37 -0800 Received: (qmail 467 invoked from network); 15 Dec 2004 00:55:48 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 00:55:48 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 78952AF651; Wed, 15 Dec 2004 01:55:46 +0100 (CET) Date: Wed, 15 Dec 2004 01:55:46 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/ipv6/: misc possible cleanups Message-ID: <20041215005546.GA11972@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12749 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make some needlessly global code static - remove the following unused functions: - exthdrs.c: ipv6_build_rthdr - exthdrs.c: ipv6_build_exthdr - exthdrs.c: ipv6_build_nfrag_opts - exthdrs.c: ipv6_build_frag_opts - remove the following unused global variables: - addrconf.c: in6addr_any - remove the following EXPORT_SYMBOL's: - ipv6_syms.c: addrconf_lock - ipv6_syms.c: in6addr_any - ipv6_syms.c: in6addr_loopback Please comment on which of these changes are correct and which conflict with pending patches. 
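For readers unfamiliar with the first item in the list above, here is a minimal userspace sketch of what "make some needlessly global code static" amounts to. It is not taken from the patch: the names demo_enable, demo_schedule and demo_entry are hypothetical stand-ins, and the logic is invented purely to have something to compile.

```c
#include <assert.h>

/* Hypothetical file-local state, standing in for a variable such as
 * ic_enable: "static" gives it internal linkage, so other translation
 * units can neither reference it nor collide with its name. */
static int demo_enable;

/* A static function used before its definition needs a static forward
 * declaration in the same file instead of an extern in a header. */
static int demo_schedule(int timeo);

/* The only symbol this file still exports (hypothetical entry point). */
int demo_entry(int timeo)
{
	assert(timeo >= 0);	/* illustration only: reject negative timeouts */
	demo_enable = 1;
	return demo_schedule(timeo);
}

/* Definition matching the forward declaration above. */
static int demo_schedule(int timeo)
{
	return demo_enable ? timeo * 2 : -1;
}
```

Built as one object file of a larger program, only demo_entry remains visible to other object files at link time; the two static names stay private to this file, which is why the matching extern declarations can be dropped from the headers.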
diffstat output: include/linux/in6.h | 5 -- include/net/addrconf.h | 1 include/net/ipv6.h | 10 ---- net/ipv6/addrconf.c | 9 +--- net/ipv6/anycast.c | 4 + net/ipv6/exthdrs.c | 78 ------------------------------------- net/ipv6/icmp.c | 2 net/ipv6/ip6_output.c | 2 net/ipv6/ipv6_syms.c | 3 - net/ipv6/mcast.c | 16 +++---- net/ipv6/route.c | 4 - net/ipv6/sysctl_net_ipv6.c | 2 12 files changed, 21 insertions(+), 115 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/ip.h.old 2004-12-14 05:20:46.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ip.h 2004-12-14 05:20:53.000000000 +0100 @@ -295,8 +295,6 @@ extern void ip_local_error(struct sock *sk, int err, u32 daddr, u16 dport, u32 info); -extern int ipv4_proc_init(void); - /* sysctl helpers - any sysctl which holds a value that ends up being * fed into the routing cache should use these handlers. */ --- linux-2.6.10-rc3-mm1-full/net/ipv4/af_inet.c.old 2004-12-14 05:20:00.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/af_inet.c 2004-12-14 05:20:33.000000000 +0100 @@ -659,7 +659,7 @@ } -ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) +static ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags) { struct sock *sk = sock->sk; @@ -1011,7 +1011,7 @@ return 0; } -int ipv4_proc_init(void); +static int ipv4_proc_init(void); extern void ipfrag_init(void); static int __init inet_init(void) @@ -1136,7 +1136,7 @@ extern int udp4_proc_init(void); extern void udp4_proc_exit(void); -int __init ipv4_proc_init(void) +static int __init ipv4_proc_init(void) { int rc = 0; @@ -1166,7 +1166,7 @@ } #else /* CONFIG_PROC_FS */ -int __init ipv4_proc_init(void) +static int __init ipv4_proc_init(void) { return 0; } --- linux-2.6.10-rc3-mm1-full/net/ipv4/arp.c.old 2004-12-14 05:21:08.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/arp.c 2004-12-14 05:21:53.000000000 +0100 @@ -704,7 +704,7 @@ * Process an 
arp request. */ -int arp_process(struct sk_buff *skb) +static int arp_process(struct sk_buff *skb) { struct net_device *dev = skb->dev; struct in_device *in_dev = in_dev_get(dev); @@ -961,7 +961,7 @@ * Set (create) an ARP cache entry. */ -int arp_req_set(struct arpreq *r, struct net_device * dev) +static int arp_req_set(struct arpreq *r, struct net_device * dev) { u32 ip = ((struct sockaddr_in *) &r->arp_pa)->sin_addr.s_addr; struct neighbour *neigh; @@ -1075,7 +1075,7 @@ return err; } -int arp_req_delete(struct arpreq *r, struct net_device * dev) +static int arp_req_delete(struct arpreq *r, struct net_device * dev) { int err; u32 ip = ((struct sockaddr_in *)&r->arp_pa)->sin_addr.s_addr; @@ -1207,7 +1207,7 @@ return NOTIFY_DONE; } -struct notifier_block arp_netdev_notifier = { +static struct notifier_block arp_netdev_notifier = { .notifier_call = arp_netdev_event, }; --- linux-2.6.10-rc3-mm1-full/net/ipv4/devinet.c.old 2004-12-14 05:22:07.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/devinet.c 2004-12-14 05:22:26.000000000 +0100 @@ -380,7 +380,7 @@ return NULL; } -int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { struct rtattr **rta = arg; struct in_device *in_dev; @@ -412,7 +412,7 @@ return -EADDRNOTAVAIL; } -int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { struct rtattr **rta = arg; struct net_device *dev; --- linux-2.6.10-rc3-mm1-full/include/net/ip_fib.h.old 2004-12-14 05:22:51.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ip_fib.h 2004-12-14 05:23:53.000000000 +0100 @@ -200,7 +200,6 @@ /* Exported by fib_frontend.c */ extern void ip_fib_init(void); -extern void fib_flush(void); extern int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_rtm_newroute(struct sk_buff *skb, struct 
nlmsghdr* nlh, void *arg); extern int inet_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); @@ -226,7 +225,6 @@ extern int inet_rtm_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_rtm_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_dump_rules(struct sk_buff *skb, struct netlink_callback *cb); -extern u32 fib_rules_map_destination(u32 daddr, struct fib_result *res); #ifdef CONFIG_NET_CLS_ROUTE extern u32 fib_rules_tclass(struct fib_result *res); #endif --- linux-2.6.10-rc3-mm1-full/net/ipv4/fib_frontend.c.old 2004-12-14 05:23:09.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/fib_frontend.c 2004-12-14 05:23:33.000000000 +0100 @@ -75,7 +75,7 @@ #endif /* CONFIG_IP_MULTIPLE_TABLES */ -void fib_flush(void) +static void fib_flush(void) { int flushed = 0; #ifdef CONFIG_IP_MULTIPLE_TABLES @@ -585,11 +585,11 @@ return NOTIFY_DONE; } -struct notifier_block fib_inetaddr_notifier = { +static struct notifier_block fib_inetaddr_notifier = { .notifier_call =fib_inetaddr_event, }; -struct notifier_block fib_netdev_notifier = { +static struct notifier_block fib_netdev_notifier = { .notifier_call =fib_netdev_event, }; --- linux-2.6.10-rc3-mm1-full/net/ipv4/fib_rules.c.old 2004-12-14 05:24:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/fib_rules.c 2004-12-14 05:24:22.000000000 +0100 @@ -245,12 +245,6 @@ return 0; } -u32 fib_rules_map_destination(u32 daddr, struct fib_result *res) -{ - u32 mask = inet_make_mask(res->prefixlen); - return (daddr&~mask)|res->fi->fib_nh->nh_gw; -} - #ifdef CONFIG_NET_CLS_ROUTE u32 fib_rules_tclass(struct fib_result *res) { @@ -368,7 +362,7 @@ } -struct notifier_block fib_rules_notifier = { +static struct notifier_block fib_rules_notifier = { .notifier_call =fib_rules_event, }; --- linux-2.6.10-rc3-mm1-full/net/ipv4/icmp.c.old 2004-12-14 05:24:38.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/icmp.c 2004-12-14 05:25:02.000000000 +0100 @@ -327,8 
+327,8 @@ * Checksum each fragment, and on the first include the headers and final * checksum. */ -int icmp_glue_bits(void *from, char *to, int offset, int len, int odd, - struct sk_buff *skb) +static int icmp_glue_bits(void *from, char *to, int offset, int len, int odd, + struct sk_buff *skb) { struct icmp_bxm *icmp_param = (struct icmp_bxm *)from; unsigned int csum; --- linux-2.6.10-rc3-mm1-full/net/ipv4/igmp.c.old 2004-12-14 05:25:26.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/igmp.c 2004-12-14 05:26:44.000000000 +0100 @@ -143,8 +143,8 @@ static void sf_markstate(struct ip_mc_list *pmc); #endif static void ip_mc_clear_src(struct ip_mc_list *pmc); -int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, - int sfcount, __u32 *psfsrc, int delta); +static int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, + int sfcount, __u32 *psfsrc, int delta); static void ip_ma_put(struct ip_mc_list *im) { @@ -1384,8 +1384,8 @@ #define igmp_ifc_event(x) do { } while (0) #endif -int ip_mc_del_src(struct in_device *in_dev, __u32 *pmca, int sfmode, - int sfcount, __u32 *psfsrc, int delta) +static int ip_mc_del_src(struct in_device *in_dev, __u32 *pmca, int sfmode, + int sfcount, __u32 *psfsrc, int delta) { struct ip_mc_list *pmc; int changerec = 0; @@ -1520,8 +1520,8 @@ /* * Add multicast source filter list to the interface list */ -int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, - int sfcount, __u32 *psfsrc, int delta) +static int ip_mc_add_src(struct in_device *in_dev, __u32 *pmca, int sfmode, + int sfcount, __u32 *psfsrc, int delta) { struct ip_mc_list *pmc; int isexclude; @@ -1667,8 +1667,8 @@ return err; } -int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml, - struct in_device *in_dev) +static int ip_mc_leave_src(struct sock *sk, struct ip_mc_socklist *iml, + struct in_device *in_dev) { int err; --- linux-2.6.10-rc3-mm1-full/net/ipv4/ip_gre.c.old 2004-12-14 05:29:23.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/ipv4/ip_gre.c 2004-12-14 05:29:54.000000000 +0100 @@ -304,7 +304,7 @@ } -void ipgre_err(struct sk_buff *skb, u32 info) +static void ipgre_err(struct sk_buff *skb, u32 info) { #ifndef I_WISH_WORLD_WERE_PERFECT @@ -552,7 +552,7 @@ return INET_ECN_encapsulate(tos, inner); } -int ipgre_rcv(struct sk_buff *skb) +static int ipgre_rcv(struct sk_buff *skb) { struct iphdr *iph; u8 *h; @@ -1279,7 +1279,7 @@ goto out; } -void ipgre_fini(void) +static void ipgre_fini(void) { if (inet_del_protocol(&ipgre_protocol, IPPROTO_GRE) < 0) printk(KERN_INFO "ipgre close: can't remove protocol\n"); --- linux-2.6.10-rc3-mm1-full/net/ipv4/ip_sockglue.c.old 2004-12-14 05:30:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ip_sockglue.c 2004-12-14 05:30:17.000000000 +0100 @@ -92,7 +92,7 @@ } -void ip_cmsg_recv_retopts(struct msghdr *msg, struct sk_buff *skb) +static void ip_cmsg_recv_retopts(struct msghdr *msg, struct sk_buff *skb) { unsigned char optbuf[sizeof(struct ip_options) + 40]; struct ip_options * opt = (struct ip_options*)optbuf; --- linux-2.6.10-rc3-mm1-full/include/net/ipconfig.h.old 2004-12-14 05:30:42.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ipconfig.h 2004-12-14 05:35:02.000000000 +0100 @@ -8,14 +8,10 @@ /* The following are initdata: */ -extern int ic_enable; /* Enable or disable the whole shebang */ - extern int ic_proto_enabled; /* Protocols enabled (see IC_xxx) */ -extern int ic_host_name_set; /* Host name set by ipconfig? 
*/ extern int ic_set_manually; /* IPconfig parameters set manually */ extern u32 ic_myaddr; /* My IP address */ -extern u32 ic_netmask; /* Netmask for local subnet */ extern u32 ic_gateway; /* Gateway IP address */ extern u32 ic_servaddr; /* Boot server IP address */ @@ -24,13 +20,6 @@ extern u8 root_server_path[]; /* Path to mount as root */ - -/* The following are persistent (not initdata): */ - -extern int ic_proto_used; /* Protocol used, if any */ -extern u32 ic_nameserver; /* DNS server IP address */ -extern u8 ic_domain[]; /* DNS (not NIS) domain name */ - /* bits in ic_proto_{enabled,used} */ #define IC_PROTO 0xFF /* Protocols mask: */ #define IC_BOOTP 0x01 /* BOOTP (or DHCP, see below) */ --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipconfig.c.old 2004-12-14 05:30:57.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipconfig.c 2004-12-14 05:35:10.000000000 +0100 @@ -109,7 +109,7 @@ */ int ic_set_manually __initdata = 0; /* IPconfig parameters set manually */ -int ic_enable __initdata = 0; /* IP config enabled? */ +static int ic_enable __initdata = 0; /* IP config enabled? */ /* Protocol choice */ int ic_proto_enabled __initdata = 0 @@ -124,10 +124,10 @@ #endif ; -int ic_host_name_set __initdata = 0; /* Host name set by us? */ +static int ic_host_name_set __initdata = 0; /* Host name set by us? 
*/ u32 ic_myaddr = INADDR_NONE; /* My IP address */ -u32 ic_netmask = INADDR_NONE; /* Netmask for local subnet */ +static u32 ic_netmask = INADDR_NONE; /* Netmask for local subnet */ u32 ic_gateway = INADDR_NONE; /* Gateway IP address */ u32 ic_servaddr = INADDR_NONE; /* Boot server IP address */ @@ -137,9 +137,9 @@ /* Persistent data: */ -int ic_proto_used; /* Protocol used, if any */ -u32 ic_nameservers[CONF_NAMESERVERS_MAX]; /* DNS Server IP addresses */ -u8 ic_domain[64]; /* DNS (not NIS) domain name */ +static int ic_proto_used; /* Protocol used, if any */ +static u32 ic_nameservers[CONF_NAMESERVERS_MAX]; /* DNS Server IP addresses */ +static u8 ic_domain[64]; /* DNS (not NIS) domain name */ /* * Private state. --- linux-2.6.10-rc3-mm1-full/net/ipv4/raw.c.old 2004-12-14 05:36:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/raw.c 2004-12-14 05:36:22.000000000 +0100 @@ -562,8 +562,8 @@ * we return it, otherwise we block. */ -int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - size_t len, int noblock, int flags, int *addr_len) +static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, + size_t len, int noblock, int flags, int *addr_len) { struct inet_opt *inet = inet_sk(sk); size_t copied = 0; --- linux-2.6.10-rc3-mm1-full/net/ipv4/route.c.old 2004-12-14 05:36:47.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/route.c 2004-12-14 05:59:55.000000000 +0100 @@ -108,22 +108,22 @@ #define RT_GC_TIMEOUT (300*HZ) -int ip_rt_min_delay = 2 * HZ; -int ip_rt_max_delay = 10 * HZ; -int ip_rt_max_size; -int ip_rt_gc_timeout = RT_GC_TIMEOUT; -int ip_rt_gc_interval = 60 * HZ; -int ip_rt_gc_min_interval = HZ / 2; -int ip_rt_redirect_number = 9; -int ip_rt_redirect_load = HZ / 50; -int ip_rt_redirect_silence = ((HZ / 50) << (9 + 1)); -int ip_rt_error_cost = HZ; -int ip_rt_error_burst = 5 * HZ; -int ip_rt_gc_elasticity = 8; -int ip_rt_mtu_expires = 10 * 60 * HZ; -int ip_rt_min_pmtu = 512 + 20 + 20; -int ip_rt_min_advmss = 
256; -int ip_rt_secret_interval = 10 * 60 * HZ; +static int ip_rt_min_delay = 2 * HZ; +static int ip_rt_max_delay = 10 * HZ; +static int ip_rt_max_size; +static int ip_rt_gc_timeout = RT_GC_TIMEOUT; +static int ip_rt_gc_interval = 60 * HZ; +static int ip_rt_gc_min_interval = HZ / 2; +static int ip_rt_redirect_number = 9; +static int ip_rt_redirect_load = HZ / 50; +static int ip_rt_redirect_silence = ((HZ / 50) << (9 + 1)); +static int ip_rt_error_cost = HZ; +static int ip_rt_error_burst = 5 * HZ; +static int ip_rt_gc_elasticity = 8; +static int ip_rt_mtu_expires = 10 * 60 * HZ; +static int ip_rt_min_pmtu = 512 + 20 + 20; +static int ip_rt_min_advmss = 256; +static int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; #define RTprint(a...) printk(KERN_DEBUG a) --- linux-2.6.10-rc3-mm1-full/include/net/tcp.h.old 2004-12-14 05:43:25.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/tcp.h 2004-12-14 05:45:41.000000000 +0100 @@ -315,7 +315,6 @@ extern atomic_t tcp_orphan_count; extern int tcp_tw_count; extern void tcp_time_wait(struct sock *sk, int state, int timeo); -extern void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo); extern void tcp_tw_deschedule(struct tcp_tw_bucket *tw); @@ -2020,21 +2019,6 @@ tp->westwood.rtt = rtt_seq; } -void __tcp_westwood_fast_bw(struct sock *, struct sk_buff *); -void __tcp_westwood_slow_bw(struct sock *, struct sk_buff *); - -static inline void tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) -{ - if (tcp_is_westwood(tcp_sk(sk))) - __tcp_westwood_fast_bw(sk, skb); -} - -static inline void tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) -{ - if (tcp_is_westwood(tcp_sk(sk))) - __tcp_westwood_slow_bw(sk, skb); -} - static inline __u32 __tcp_westwood_bw_rttmin(const struct tcp_opt *tp) { return max((tp->westwood.bw_est) * (tp->westwood.rtt_min) / --- linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_input.c.old 2004-12-14 05:43:43.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_input.c 2004-12-14 05:44:53.000000000 +0100 @@ -2786,7 +2786,7 @@ * straight forward and doesn't need any particular care. */ -void __tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) +static void __tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) { struct tcp_opt *tp = tcp_sk(sk); @@ -2797,6 +2797,12 @@ tp->westwood.rtt_min = westwood_update_rttmin(sk); } +static inline void tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) +{ + if (tcp_is_westwood(tcp_sk(sk))) + __tcp_westwood_fast_bw(sk, skb); +} + /* * @westwood_dupack_update @@ -2867,7 +2873,7 @@ * dupack. But we need to be careful in such case. */ -void __tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) +static void __tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) { struct tcp_opt *tp = tcp_sk(sk); @@ -2877,6 +2883,12 @@ tp->westwood.rtt_min = westwood_update_rttmin(sk); } +static inline void tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) +{ + if (tcp_is_westwood(tcp_sk(sk))) + __tcp_westwood_slow_bw(sk, skb); +} + /* This routine deals with incoming acks, but not outgoing ones. 
*/ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag) { --- linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_minisocks.c.old 2004-12-14 05:45:57.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_minisocks.c 2004-12-14 05:46:17.000000000 +0100 @@ -41,6 +41,8 @@ int sysctl_tcp_syncookies = SYNC_INIT; int sysctl_tcp_abort_on_overflow; +static void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo); + static __inline__ int tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win) { if (seq == s_win) @@ -551,7 +553,7 @@ TIMER_INITIALIZER(tcp_twcal_tick, 0, 0); static struct hlist_head tcp_twcal_row[TCP_TW_RECYCLE_SLOTS]; -void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo) +static void tcp_tw_schedule(struct tcp_tw_bucket *tw, int timeo) { struct hlist_head *list; int slot; --- linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_timer.c.old 2004-12-14 05:48:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/tcp_timer.c 2004-12-14 13:57:40.000000000 +0100 @@ -653,6 +653,3 @@ EXPORT_SYMBOL(tcp_delete_keepalive_timer); EXPORT_SYMBOL(tcp_init_xmit_timers); EXPORT_SYMBOL(tcp_reset_keepalive_timer); -#ifdef TCP_DEBUG -EXPORT_SYMBOL(tcp_timer_bug_msg); -#endif --- linux-2.6.10-rc3-mm1-full/net/ipv4/udp.c.old 2004-12-14 05:50:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/udp.c 2004-12-14 05:51:36.000000000 +0100 @@ -219,7 +219,8 @@ /* UDP is nearly always wildcards out the wazoo, it makes no sense to try * harder than this. 
-DaveM */ -struct sock *udp_v4_lookup_longway(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif) +static struct sock *udp_v4_lookup_longway(u32 saddr, u16 sport, + u32 daddr, u16 dport, int dif) { struct sock *sk, *result = NULL; struct hlist_node *node; @@ -263,7 +264,8 @@ return result; } -__inline__ struct sock *udp_v4_lookup(u32 saddr, u16 sport, u32 daddr, u16 dport, int dif) +static __inline__ struct sock *udp_v4_lookup(u32 saddr, u16 sport, + u32 daddr, u16 dport, int dif) { struct sock *sk; @@ -667,7 +669,8 @@ goto out; } -int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size, int flags) +static int udp_sendpage(struct sock *sk, struct page *page, int offset, + size_t size, int flags) { struct udp_opt *up = udp_sk(sk); int ret; @@ -770,8 +773,8 @@ * return it, otherwise we block. */ -int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, - size_t len, int noblock, int flags, int *addr_len) +static int udp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, + size_t len, int noblock, int flags, int *addr_len) { struct inet_opt *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; --- linux-2.6.10-rc3-mm1-full/include/net/xfrm.h.old 2004-12-14 05:51:56.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/xfrm.h 2004-12-14 05:52:03.000000000 +0100 @@ -782,7 +782,6 @@ extern void xfrm_init(void); extern void xfrm4_init(void); -extern void xfrm4_fini(void); extern void xfrm6_init(void); extern void xfrm6_fini(void); extern void xfrm_state_init(void); --- linux-2.6.10-rc3-mm1-full/net/ipv4/xfrm4_policy.c.old 2004-12-14 05:52:13.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/xfrm4_policy.c 2004-12-14 05:52:22.000000000 +0100 @@ -279,10 +279,3 @@ xfrm4_policy_init(); } -void __exit xfrm4_fini(void) -{ - //xfrm4_input_fini(); - xfrm4_policy_fini(); - xfrm4_state_fini(); -} - From bunk@stusta.de Tue Dec 14 16:58:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 
14 Dec 2004 16:59:04 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0wVNY013451 for ; Tue, 14 Dec 2004 16:58:52 -0800 Received: (qmail 592 invoked from network); 15 Dec 2004 00:58:04 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 00:58:04 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id CC960AF651; Wed, 15 Dec 2004 01:58:01 +0100 (CET) Date: Wed, 15 Dec 2004 01:58:01 +0100 From: Adrian Bunk To: wensong@linux-vs.org, ja@ssi.bg Cc: lvs-users@linuxvirtualserver.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/ipvs/: make some code static Message-ID: <20041215005801.GB11972@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12750 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. diffstat output: net/ipv4/ipvs/ip_vs_app.c | 2 +- net/ipv4/ipvs/ip_vs_conn.c | 2 +- net/ipv4/ipvs/ip_vs_ctl.c | 2 +- net/ipv4/ipvs/ip_vs_proto.c | 4 ++-- net/ipv4/ipvs/ip_vs_proto_icmp.c | 4 ++-- 5 files changed, 7 insertions(+), 7 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_app.c.old 2004-12-14 05:15:21.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_app.c 2004-12-14 05:15:28.000000000 +0100 @@ -65,7 +65,7 @@ /* * Allocate/initialize app incarnation and register it in proto apps. 
*/ -int +static int ip_vs_app_inc_new(struct ip_vs_app *app, __u16 proto, __u16 port) { struct ip_vs_protocol *pp; --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_conn.c.old 2004-12-14 05:15:44.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_conn.c 2004-12-14 05:15:51.000000000 +0100 @@ -64,7 +64,7 @@ } __attribute__((__aligned__(SMP_CACHE_BYTES))); /* lock array for conn table */ -struct ip_vs_aligned_lock +static struct ip_vs_aligned_lock __ip_vs_conntbl_lock_array[CT_LOCKARRAY_SIZE] __cacheline_aligned; static inline void ct_read_lock(unsigned key) --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_ctl.c.old 2004-12-14 05:17:03.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_ctl.c 2004-12-14 05:17:14.000000000 +0100 @@ -62,7 +62,7 @@ /* 1/rate drop and drop-entry variables */ int ip_vs_drop_rate = 0; int ip_vs_drop_counter = 0; -atomic_t ip_vs_dropentry = ATOMIC_INIT(0); +static atomic_t ip_vs_dropentry = ATOMIC_INIT(0); /* number of virtual services */ static int ip_vs_num_services = 0; --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto.c.old 2004-12-14 05:17:33.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto.c 2004-12-14 05:17:47.000000000 +0100 @@ -45,7 +45,7 @@ /* * register an ipvs protocol */ -int register_ip_vs_protocol(struct ip_vs_protocol *pp) +static int register_ip_vs_protocol(struct ip_vs_protocol *pp) { unsigned hash = IP_VS_PROTO_HASH(pp->protocol); @@ -62,7 +62,7 @@ /* * unregister an ipvs protocol */ -int unregister_ip_vs_protocol(struct ip_vs_protocol *pp) +static int unregister_ip_vs_protocol(struct ip_vs_protocol *pp) { struct ip_vs_protocol **pp_p; unsigned hash = IP_VS_PROTO_HASH(pp->protocol); --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto_icmp.c.old 2004-12-14 05:18:02.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto_icmp.c 2004-12-14 05:18:37.000000000 +0100 @@ -22,7 +22,7 @@ static char * icmp_state_name_table[1] = { "ICMP" }; -struct 
ip_vs_conn * +static struct ip_vs_conn * icmp_conn_in_get(const struct sk_buff *skb, struct ip_vs_protocol *pp, const struct iphdr *iph, @@ -49,7 +49,7 @@ #endif } -struct ip_vs_conn * +static struct ip_vs_conn * icmp_conn_out_get(const struct sk_buff *skb, struct ip_vs_protocol *pp, const struct iphdr *iph, From bunk@stusta.de Tue Dec 14 17:00:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:00:20 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF0xthS013580 for ; Tue, 14 Dec 2004 17:00:12 -0800 Received: (qmail 681 invoked from network); 15 Dec 2004 00:59:27 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 00:59:27 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 78DD5AF651; Wed, 15 Dec 2004 01:59:25 +0100 (CET) Date: Wed, 15 Dec 2004 01:59:25 +0100 From: Adrian Bunk To: acme@conectiva.com.br Cc: linux-net@vger.kernel.or, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/ipx/: make some code static Message-ID: <20041215005925.GC11972@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12751 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. 
diffstat output: include/net/ipx.h | 8 -------- net/ipx/af_ipx.c | 10 ++++++++-- net/ipx/ipx_proc.c | 6 +++--- 3 files changed, 11 insertions(+), 13 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/ipx.h.old 2004-12-14 14:55:42.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/ipx.h 2004-12-14 14:57:03.000000000 +0100 @@ -139,14 +139,6 @@ ipxitf_down(intrfc); } -extern void __ipxitf_down(struct ipx_interface *intrfc); - -static __inline__ void __ipxitf_put(struct ipx_interface *intrfc) -{ - if (atomic_dec_and_test(&intrfc->refcnt)) - __ipxitf_down(intrfc); -} - static __inline__ void ipxrtr_hold(struct ipx_route *rt) { atomic_inc(&rt->refcnt); --- linux-2.6.10-rc3-mm1-full/net/ipx/af_ipx.c.old 2004-12-14 14:56:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipx/af_ipx.c 2004-12-14 14:57:28.000000000 +0100 @@ -291,7 +291,7 @@ } #endif -void __ipxitf_down(struct ipx_interface *intrfc) +static void __ipxitf_down(struct ipx_interface *intrfc) { struct sock *s; struct hlist_node *node, *t; @@ -335,6 +335,12 @@ spin_unlock_bh(&ipx_interfaces_lock); } +static __inline__ void __ipxitf_put(struct ipx_interface *intrfc) +{ + if (atomic_dec_and_test(&intrfc->refcnt)) + __ipxitf_down(intrfc); +} + static int ipxitf_device_event(struct notifier_block *notifier, unsigned long event, void *ptr) { @@ -1629,7 +1635,7 @@ return rc; } -int ipx_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt) +static int ipx_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt) { /* NULL here for pt means the packet was looped back */ struct ipx_interface *intrfc; --- linux-2.6.10-rc3-mm1-full/net/ipx/ipx_proc.c.old 2004-12-14 14:57:40.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipx/ipx_proc.c 2004-12-14 14:57:56.000000000 +0100 @@ -287,21 +287,21 @@ return 0; } -struct seq_operations ipx_seq_interface_ops = { +static struct seq_operations ipx_seq_interface_ops = { .start = ipx_seq_interface_start, .next = 
ipx_seq_interface_next, .stop = ipx_seq_interface_stop, .show = ipx_seq_interface_show, }; -struct seq_operations ipx_seq_route_ops = { +static struct seq_operations ipx_seq_route_ops = { .start = ipx_seq_route_start, .next = ipx_seq_route_next, .stop = ipx_seq_route_stop, .show = ipx_seq_route_show, }; -struct seq_operations ipx_seq_socket_ops = { +static struct seq_operations ipx_seq_socket_ops = { .start = ipx_seq_socket_start, .next = ipx_seq_socket_next, .stop = ipx_seq_interface_stop, From acme@conectiva.com.br Tue Dec 14 17:02:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:02:38 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBF128e5014466 for ; Tue, 14 Dec 2004 17:02:29 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CeNbi-0007pz-00; Tue, 14 Dec 2004 23:05:34 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 0EDC614640; Tue, 14 Dec 2004 23:01:34 -0200 (BRST) Message-ID: <41BF7F55.4090906@conectiva.com.br> Date: Tue, 14 Dec 2004 22:03:33 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. 
User-Agent: Mozilla Thunderbird 0.9 (X11/20041103)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Adrian Bunk
Cc: linux-net@vger.kernel.or, netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: Re: [2.6 patch] net/ipx/: make some code static
References: <20041215005925.GC11972@stusta.de>
In-Reply-To: <20041215005925.GC11972@stusta.de>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12752
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: acme@conectiva.com.br
Precedence: bulk
X-list: netdev

ACKed, this was because a long time ago I planned to resurrect the SPX code.

- Arnaldo

Adrian Bunk wrote:
> The patch below makes some needlessly global code static.
>
>
> diffstat output:
> include/net/ipx.h | 8 --------
> net/ipx/af_ipx.c | 10 ++++++++--
> net/ipx/ipx_proc.c | 6 +++---
> 3 files changed, 11 insertions(+), 13 deletions(-)
>
>
> Signed-off-by: Adrian Bunk
>
> --- linux-2.6.10-rc3-mm1-full/include/net/ipx.h.old 2004-12-14 14:55:42.000000000 +0100
> +++ linux-2.6.10-rc3-mm1-full/include/net/ipx.h 2004-12-14 14:57:03.000000000 +0100
> @@ -139,14 +139,6 @@
> ipxitf_down(intrfc);
> }
>
> -extern void __ipxitf_down(struct ipx_interface *intrfc);
> -
> -static __inline__ void __ipxitf_put(struct ipx_interface *intrfc)
> -{
> - if (atomic_dec_and_test(&intrfc->refcnt))
> - __ipxitf_down(intrfc);
> -}
> -
> static __inline__ void ipxrtr_hold(struct ipx_route *rt)
> {
> atomic_inc(&rt->refcnt);
> --- linux-2.6.10-rc3-mm1-full/net/ipx/af_ipx.c.old 2004-12-14 14:56:12.000000000 +0100
> +++ linux-2.6.10-rc3-mm1-full/net/ipx/af_ipx.c 2004-12-14 14:57:28.000000000 +0100
> @@ -291,7 +291,7 @@
> }
> #endif
>
> -void __ipxitf_down(struct ipx_interface *intrfc)
> +static void __ipxitf_down(struct ipx_interface *intrfc)
>
{ > struct sock *s; > struct hlist_node *node, *t; > @@ -335,6 +335,12 @@ > spin_unlock_bh(&ipx_interfaces_lock); > } > > +static __inline__ void __ipxitf_put(struct ipx_interface *intrfc) > +{ > + if (atomic_dec_and_test(&intrfc->refcnt)) > + __ipxitf_down(intrfc); > +} > + > static int ipxitf_device_event(struct notifier_block *notifier, > unsigned long event, void *ptr) > { > @@ -1629,7 +1635,7 @@ > return rc; > } > > -int ipx_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt) > +static int ipx_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt) > { > /* NULL here for pt means the packet was looped back */ > struct ipx_interface *intrfc; > --- linux-2.6.10-rc3-mm1-full/net/ipx/ipx_proc.c.old 2004-12-14 14:57:40.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/ipx/ipx_proc.c 2004-12-14 14:57:56.000000000 +0100 > @@ -287,21 +287,21 @@ > return 0; > } > > -struct seq_operations ipx_seq_interface_ops = { > +static struct seq_operations ipx_seq_interface_ops = { > .start = ipx_seq_interface_start, > .next = ipx_seq_interface_next, > .stop = ipx_seq_interface_stop, > .show = ipx_seq_interface_show, > }; > > -struct seq_operations ipx_seq_route_ops = { > +static struct seq_operations ipx_seq_route_ops = { > .start = ipx_seq_route_start, > .next = ipx_seq_route_next, > .stop = ipx_seq_route_stop, > .show = ipx_seq_route_show, > }; > > -struct seq_operations ipx_seq_socket_ops = { > +static struct seq_operations ipx_seq_socket_ops = { > .start = ipx_seq_socket_start, > .next = ipx_seq_socket_next, > .stop = ipx_seq_interface_stop, > > From bunk@stusta.de Tue Dec 14 17:06:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:06:29 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF15wrU015046 for ; Tue, 14 Dec 2004 17:06:18 -0800 Received: (qmail 1093 invoked from network); 15 Dec 2004 01:05:30 -0000 Received: from 
r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 01:05:30 -0000
Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 372BDAF651; Wed, 15 Dec 2004 02:05:28 +0100 (CET)
Date: Wed, 15 Dec 2004 02:05:28 +0100
From: Adrian Bunk
To: irda-users@lists.sourceforge.net
Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: [2.6 patch] net/irda/: misc possible cleanups
Message-ID: <20041215010528.GA12937@stusta.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.6+20040907i
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12753
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: bunk@stusta.de
Precedence: bulk
X-list: netdev

The patch below contains the following possible cleanups:
- make some needlessly global code static
- remove the following unused global functions:
  - discovery.c: irlmp_find_device
  - ircomm/ircomm_param.c: ircomm_param_flush
  - irda_device.c: irda_device_set_dtr_rts
  - irda_device.c: irda_device_change_speed
  - irda_device.c: irda_device_set_mode
  - iriap.c: iriap_getinfobasedetails_request
  - iriap.c: iriap_getinfobasedetails_confirm
  - iriap.c: iriap_getobjects_request
  - iriap.c: iriap_getobjects_confirm
  - iriap.c: iriap_getvalue
  - irlan_client.c: irlan_client_reconnect_data_channel
  - irlap_frame.c: irlap_send_frmr_frame
  - irlmp.c: irlmp_status_request
- remove the following unused global variables:
  - irnet/irnet_ppp.c: irnet_ppp_ops
  - irsysctl.c: sysctl_compression
  - qos.c: #ifndef CONFIG_IRDA_DYNAMIC_WINDOW irlap_requested_line_capacity

Please comment on which of these changes are correct and which conflict with pending patches.

diffstat output: include/net/irda/ircomm_event.h | 1 include/net/irda/ircomm_lmp.h | 27 ----- include/net/irda/ircomm_param.h | 1 include/net/irda/ircomm_ttp.h | 31 ------ include/net/irda/ircomm_tty.h | 2 include/net/irda/ircomm_tty_attach.h | 1 include/net/irda/irda_device.h | 2 include/net/irda/iriap.h | 10 -- include/net/irda/irlan_client.h | 3 include/net/irda/irlan_common.h | 3 include/net/irda/irlap.h | 2 include/net/irda/irlap_frame.h | 1 include/net/irda/irlmp.h | 3 include/net/irda/irttp.h | 3 include/net/irda/parameters.h | 2 include/net/irda/qos.h | 1 net/irda/af_irda.c | 4 net/irda/discovery.c | 35 ------- net/irda/ircomm/ircomm_core.c | 4 net/irda/ircomm/ircomm_event.c | 2 net/irda/ircomm/ircomm_lmp.c | 128 +++++++++++++-------------- net/irda/ircomm/ircomm_param.c | 17 --- net/irda/ircomm/ircomm_tty.c | 7 - net/irda/ircomm/ircomm_tty_attach.c | 12 +- net/irda/ircomm/ircomm_tty_ioctl.c | 2 net/irda/irda_device.c | 70 -------------- net/irda/iriap.c | 51 ++++------ net/irda/irias_object.c | 6 - net/irda/irlan/irlan_client.c | 41 -------- net/irda/irlan/irlan_common.c | 32 ++++-- net/irda/irlan/irlan_provider.c | 6 - net/irda/irlap.c | 8 + net/irda/irlap_event.c | 2 net/irda/irlap_frame.c | 35 +------ net/irda/irlmp.c | 12 +- net/irda/irmod.c | 4 net/irda/irnet/irnet.h | 2 net/irda/irnet/irnet_ppp.c | 9 + net/irda/irnet/irnet_ppp.h | 7 - net/irda/irsysctl.c | 1 net/irda/irttp.c | 12 +- net/irda/parameters.c | 11 +- net/irda/qos.c | 10 +- 43 files changed, 179 insertions(+), 444 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/irda/af_irda.c.old 2004-12-14 14:58:49.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/af_irda.c 2004-12-14 14:59:09.000000000 +0100 @@ -298,7 +298,7 @@ * Accept incoming connection * */ -void irda_connect_response(struct irda_sock *self) +static void irda_connect_response(struct irda_sock *self) { struct sk_buff *skb; @@ -1155,7 +1155,7 @@ * Destroy socket * */ -void irda_destroy_socket(struct 
irda_sock *self) +static void irda_destroy_socket(struct irda_sock *self) { IRDA_DEBUG(2, "%s(%p)\n", __FUNCTION__, self); --- linux-2.6.10-rc3-mm1-full/net/irda/discovery.c.old 2004-12-14 14:59:25.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/discovery.c 2004-12-14 14:59:38.000000000 +0100 @@ -315,41 +315,6 @@ return(buffer); } -/* - * Function irlmp_find_device (name, saddr) - * - * Look through the discovery log at each of the links and try to find - * the device with the given name. Return daddr and saddr. If saddr is - * specified, that look at that particular link only (not impl). - */ -__u32 irlmp_find_device(hashbin_t *cachelog, char *name, __u32 *saddr) -{ - unsigned long flags; - discovery_t *d; - - spin_lock_irqsave(&cachelog->hb_spinlock, flags); - - /* Look at all discoveries for that link */ - d = (discovery_t *) hashbin_get_first(cachelog); - while (d != NULL) { - IRDA_DEBUG(1, "Discovery:\n"); - IRDA_DEBUG(1, " daddr=%08x\n", d->data.daddr); - IRDA_DEBUG(1, " nickname=%s\n", d->data.info); - - if (strcmp(name, d->data.info) == 0) { - *saddr = d->data.saddr; - - spin_unlock_irqrestore(&cachelog->hb_spinlock, flags); - return d->data.daddr; - } - d = (discovery_t *) hashbin_get_next(cachelog); - } - - spin_unlock_irqrestore(&cachelog->hb_spinlock, flags); - - return 0; -} - #ifdef CONFIG_PROC_FS static inline discovery_t *discovery_seq_idx(loff_t pos) --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_core.c.old 2004-12-14 14:59:51.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_core.c 2004-12-14 15:00:19.000000000 +0100 @@ -68,7 +68,7 @@ hashbin_t *ircomm = NULL; -int __init ircomm_init(void) +static int __init ircomm_init(void) { ircomm = hashbin_new(HB_LOCK); if (ircomm == NULL) { @@ -89,7 +89,7 @@ return 0; } -void __exit ircomm_cleanup(void) +static void __exit ircomm_cleanup(void) { IRDA_DEBUG(2, "%s()\n", __FUNCTION__ ); --- linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_tty_attach.h.old 2004-12-14 
15:00:52.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_tty_attach.h 2004-12-14 15:01:01.000000000 +0100 @@ -67,7 +67,6 @@ }; extern char *ircomm_state[]; -extern char *ircomm_event[]; extern char *ircomm_tty_state[]; int ircomm_tty_do_event(struct ircomm_tty_cb *self, IRCOMM_TTY_EVENT event, --- linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_event.h.old 2004-12-14 15:01:08.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_event.h 2004-12-14 15:01:13.000000000 +0100 @@ -75,7 +75,6 @@ }; extern char *ircomm_state[]; -extern char *ircomm_event[]; struct ircomm_cb; /* Forward decl. */ --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_event.c.old 2004-12-14 15:01:21.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_event.c 2004-12-14 15:01:31.000000000 +0100 @@ -57,7 +57,7 @@ "IRCOMM_CONN", }; -char *ircomm_event[] = { +static char *ircomm_event[] = { "IRCOMM_CONNECT_REQUEST", "IRCOMM_CONNECT_RESPONSE", "IRCOMM_TTP_CONNECT_INDICATION", --- linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_lmp.h.old 2004-12-14 15:01:50.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_lmp.h 2004-12-14 15:08:50.000000000 +0100 @@ -32,34 +32,7 @@ #define IRCOMM_LMP_H #include -#include int ircomm_open_lsap(struct ircomm_cb *self); -int ircomm_lmp_connect_request(struct ircomm_cb *self, - struct sk_buff *userdata, - struct ircomm_info *info); -int ircomm_lmp_connect_response(struct ircomm_cb *self, struct sk_buff *skb); -int ircomm_lmp_disconnect_request(struct ircomm_cb *self, - struct sk_buff *userdata, - struct ircomm_info *info); -int ircomm_lmp_data_request(struct ircomm_cb *self, struct sk_buff *skb, - int clen); -int ircomm_lmp_control_request(struct ircomm_cb *self, - struct sk_buff *userdata); -int ircomm_lmp_data_indication(void *instance, void *sap, - struct sk_buff *skb); -void ircomm_lmp_connect_confirm(void *instance, void *sap, - struct qos_info *qos, - __u32 max_sdu_size, - __u8 
max_header_size, - struct sk_buff *skb); -void ircomm_lmp_connect_indication(void *instance, void *sap, - struct qos_info *qos, - __u32 max_sdu_size, - __u8 max_header_size, - struct sk_buff *skb); -void ircomm_lmp_disconnect_indication(void *instance, void *sap, - LM_REASON reason, - struct sk_buff *skb); #endif --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_lmp.c.old 2004-12-14 15:02:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_lmp.c 2004-12-14 15:07:37.000000000 +0100 @@ -41,44 +41,6 @@ #include #include -/* - * Function ircomm_open_lsap (self) - * - * Open LSAP. This function will only be used when using "raw" services - * - */ -int ircomm_open_lsap(struct ircomm_cb *self) -{ - notify_t notify; - - IRDA_DEBUG(0, "%s()\n", __FUNCTION__ ); - - /* Register callbacks */ - irda_notify_init(&notify); - notify.data_indication = ircomm_lmp_data_indication; - notify.connect_confirm = ircomm_lmp_connect_confirm; - notify.connect_indication = ircomm_lmp_connect_indication; - notify.disconnect_indication = ircomm_lmp_disconnect_indication; - notify.instance = self; - strlcpy(notify.name, "IrCOMM", sizeof(notify.name)); - - self->lsap = irlmp_open_lsap(LSAP_ANY, &notify, 0); - if (!self->lsap) { - IRDA_DEBUG(0,"%sfailed to allocate tsap\n", __FUNCTION__ ); - return -1; - } - self->slsap_sel = self->lsap->slsap_sel; - - /* - * Initialize the call-table for issuing commands - */ - self->issue.data_request = ircomm_lmp_data_request; - self->issue.connect_request = ircomm_lmp_connect_request; - self->issue.connect_response = ircomm_lmp_connect_response; - self->issue.disconnect_request = ircomm_lmp_disconnect_request; - - return 0; -} /* * Function ircomm_lmp_connect_request (self, userdata) @@ -86,9 +48,9 @@ * * */ -int ircomm_lmp_connect_request(struct ircomm_cb *self, - struct sk_buff *userdata, - struct ircomm_info *info) +static int ircomm_lmp_connect_request(struct ircomm_cb *self, + struct sk_buff *userdata, + struct ircomm_info *info) { int 
ret = 0; @@ -109,7 +71,8 @@ * * */ -int ircomm_lmp_connect_response(struct ircomm_cb *self, struct sk_buff *userdata) +static int ircomm_lmp_connect_response(struct ircomm_cb *self, + struct sk_buff *userdata) { struct sk_buff *tx_skb; int ret; @@ -141,9 +104,9 @@ return 0; } -int ircomm_lmp_disconnect_request(struct ircomm_cb *self, - struct sk_buff *userdata, - struct ircomm_info *info) +static int ircomm_lmp_disconnect_request(struct ircomm_cb *self, + struct sk_buff *userdata, + struct ircomm_info *info) { struct sk_buff *tx_skb; int ret; @@ -175,7 +138,7 @@ * been deallocated. We do this to make sure we don't flood IrLAP with * frames, since we are not using the IrTTP flow control mechanism */ -void ircomm_lmp_flow_control(struct sk_buff *skb) +static void ircomm_lmp_flow_control(struct sk_buff *skb) { struct irda_skb_cb *cb; struct ircomm_cb *self; @@ -215,8 +178,9 @@ * Send data frame to peer device * */ -int ircomm_lmp_data_request(struct ircomm_cb *self, struct sk_buff *skb, - int not_used) +static int ircomm_lmp_data_request(struct ircomm_cb *self, + struct sk_buff *skb, + int not_used) { struct irda_skb_cb *cb; int ret; @@ -256,8 +220,8 @@ * Incoming data which we must deliver to the state machine, to check * we are still connected. 
*/ -int ircomm_lmp_data_indication(void *instance, void *sap, - struct sk_buff *skb) +static int ircomm_lmp_data_indication(void *instance, void *sap, + struct sk_buff *skb) { struct ircomm_cb *self = (struct ircomm_cb *) instance; @@ -282,11 +246,11 @@ * Connection has been confirmed by peer device * */ -void ircomm_lmp_connect_confirm(void *instance, void *sap, - struct qos_info *qos, - __u32 max_seg_size, - __u8 max_header_size, - struct sk_buff *skb) +static void ircomm_lmp_connect_confirm(void *instance, void *sap, + struct qos_info *qos, + __u32 max_seg_size, + __u8 max_header_size, + struct sk_buff *skb) { struct ircomm_cb *self = (struct ircomm_cb *) instance; struct ircomm_info info; @@ -315,11 +279,11 @@ * Peer device wants to make a connection with us * */ -void ircomm_lmp_connect_indication(void *instance, void *sap, - struct qos_info *qos, - __u32 max_seg_size, - __u8 max_header_size, - struct sk_buff *skb) +static void ircomm_lmp_connect_indication(void *instance, void *sap, + struct qos_info *qos, + __u32 max_seg_size, + __u8 max_header_size, + struct sk_buff *skb) { struct ircomm_cb *self = (struct ircomm_cb *)instance; struct ircomm_info info; @@ -347,9 +311,9 @@ * Peer device has closed the connection, or the link went down for some * other reason */ -void ircomm_lmp_disconnect_indication(void *instance, void *sap, - LM_REASON reason, - struct sk_buff *skb) +static void ircomm_lmp_disconnect_indication(void *instance, void *sap, + LM_REASON reason, + struct sk_buff *skb) { struct ircomm_cb *self = (struct ircomm_cb *) instance; struct ircomm_info info; @@ -367,3 +331,41 @@ if(skb) dev_kfree_skb(skb); } +/* + * Function ircomm_open_lsap (self) + * + * Open LSAP. 
This function will only be used when using "raw" services + * + */ +int ircomm_open_lsap(struct ircomm_cb *self) +{ + notify_t notify; + + IRDA_DEBUG(0, "%s()\n", __FUNCTION__ ); + + /* Register callbacks */ + irda_notify_init(&notify); + notify.data_indication = ircomm_lmp_data_indication; + notify.connect_confirm = ircomm_lmp_connect_confirm; + notify.connect_indication = ircomm_lmp_connect_indication; + notify.disconnect_indication = ircomm_lmp_disconnect_indication; + notify.instance = self; + strlcpy(notify.name, "IrCOMM", sizeof(notify.name)); + + self->lsap = irlmp_open_lsap(LSAP_ANY, &notify, 0); + if (!self->lsap) { + IRDA_DEBUG(0,"%sfailed to allocate tsap\n", __FUNCTION__ ); + return -1; + } + self->slsap_sel = self->lsap->slsap_sel; + + /* + * Initialize the call-table for issuing commands + */ + self->issue.data_request = ircomm_lmp_data_request; + self->issue.connect_request = ircomm_lmp_connect_request; + self->issue.connect_response = ircomm_lmp_connect_response; + self->issue.disconnect_request = ircomm_lmp_disconnect_request; + + return 0; +} --- linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_param.h.old 2004-12-14 15:09:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_param.h 2004-12-14 15:09:17.000000000 +0100 @@ -141,7 +141,6 @@ struct ircomm_tty_cb; /* Forward decl.
*/ -int ircomm_param_flush(struct ircomm_tty_cb *self); int ircomm_param_request(struct ircomm_tty_cb *self, __u8 pi, int flush); extern pi_param_info_t ircomm_param_info; --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_param.c.old 2004-12-14 15:09:26.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_param.c 2004-12-14 15:09:35.000000000 +0100 @@ -92,23 +92,6 @@ pi_param_info_t ircomm_param_info = { pi_major_call_table, 3, 0x0f, 4 }; /* - * Function ircomm_param_flush (self) - * - * Flush (send) out all queued parameters - * - */ -int ircomm_param_flush(struct ircomm_tty_cb *self) -{ - /* we should lock here, but I guess this function is unused... - * Jean II */ - if (self->ctrl_skb) { - ircomm_control_request(self->ircomm, self->ctrl_skb); - self->ctrl_skb = NULL; - } - return 0; -} - -/* * Function ircomm_param_request (self, pi, flush) * * Queue a parameter for the control channel --- linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_ttp.h.old 2004-12-14 15:48:48.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_ttp.h 2004-12-14 15:53:48.000000000 +0100 @@ -32,39 +32,8 @@ #define IRCOMM_TTP_H #include -#include int ircomm_open_tsap(struct ircomm_cb *self); -int ircomm_ttp_connect_request(struct ircomm_cb *self, - struct sk_buff *userdata, - struct ircomm_info *info); -int ircomm_ttp_connect_response(struct ircomm_cb *self, struct sk_buff *skb); -int ircomm_ttp_disconnect_request(struct ircomm_cb *self, - struct sk_buff *userdata, - struct ircomm_info *info); -int ircomm_ttp_data_request(struct ircomm_cb *self, struct sk_buff *skb, - int clen); -int ircomm_ttp_control_request(struct ircomm_cb *self, - struct sk_buff *userdata); -int ircomm_ttp_data_indication(void *instance, void *sap, - struct sk_buff *skb); -void ircomm_ttp_connect_confirm(void *instance, void *sap, - struct qos_info *qos, - __u32 max_sdu_size, - __u8 max_header_size, - struct sk_buff *skb); -void ircomm_ttp_connect_indication(void *instance, void 
*sap, - struct qos_info *qos, - __u32 max_sdu_size, - __u8 max_header_size, - struct sk_buff *skb); -void ircomm_ttp_disconnect_indication(void *instance, void *sap, - LM_REASON reason, - struct sk_buff *skb); -void ircomm_ttp_flow_indication(void *instance, void *sap, LOCAL_FLOW cmd); #endif - - - --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_tty.c.old 2004-12-14 15:54:14.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_tty.c 2004-12-14 15:55:16.000000000 +0100 @@ -64,6 +64,7 @@ static void ircomm_tty_hangup(struct tty_struct *tty); static void ircomm_tty_do_softint(void *private_); static void ircomm_tty_shutdown(struct ircomm_tty_cb *self); +static void ircomm_tty_stop(struct tty_struct *tty); static int ircomm_tty_data_indication(void *instance, void *sap, struct sk_buff *skb); @@ -108,7 +109,7 @@ * Init IrCOMM TTY layer/driver * */ -int __init ircomm_tty_init(void) +static int __init ircomm_tty_init(void) { driver = alloc_tty_driver(IRCOMM_TTY_PORTS); if (!driver) @@ -159,7 +160,7 @@ * Remove IrCOMM TTY layer/driver * */ -void __exit ircomm_tty_cleanup(void) +static void __exit ircomm_tty_cleanup(void) { int ret; @@ -1064,7 +1065,7 @@ * This routine notifies the tty driver that it should stop outputting * characters to the tty device. 
*/ -void ircomm_tty_stop(struct tty_struct *tty) +static void ircomm_tty_stop(struct tty_struct *tty) { struct ircomm_tty_cb *self = (struct ircomm_tty_cb *) tty->driver_data; --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_tty_attach.c.old 2004-12-14 15:55:32.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_tty_attach.c 2004-12-14 15:56:39.000000000 +0100 @@ -52,8 +52,9 @@ void *priv); static void ircomm_tty_getvalue_confirm(int result, __u16 obj_id, struct ias_value *value, void *priv); -void ircomm_tty_start_watchdog_timer(struct ircomm_tty_cb *self, int timeout); -void ircomm_tty_watchdog_timer_expired(void *data); +static void ircomm_tty_start_watchdog_timer(struct ircomm_tty_cb *self, + int timeout); +static void ircomm_tty_watchdog_timer_expired(void *data); static int ircomm_tty_state_idle(struct ircomm_tty_cb *self, IRCOMM_TTY_EVENT event, @@ -90,7 +91,7 @@ "*** ERROR *** ", }; -char *ircomm_tty_event[] = { +static char *ircomm_tty_event[] = { "IRCOMM_TTY_ATTACH_CABLE", "IRCOMM_TTY_DETACH_CABLE", "IRCOMM_TTY_DATA_REQUEST", @@ -594,7 +595,8 @@ * connection attempt is successful, and if not, we will retry after * the timeout */ -void ircomm_tty_start_watchdog_timer(struct ircomm_tty_cb *self, int timeout) +static void ircomm_tty_start_watchdog_timer(struct ircomm_tty_cb *self, + int timeout) { ASSERT(self != NULL, return;); ASSERT(self->magic == IRCOMM_TTY_MAGIC, return;); @@ -609,7 +611,7 @@ * Called when the connect procedure have taken to much time. 
* */ -void ircomm_tty_watchdog_timer_expired(void *data) +static void ircomm_tty_watchdog_timer_expired(void *data) { struct ircomm_tty_cb *self = (struct ircomm_tty_cb *) data; --- linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_tty.h.old 2004-12-14 15:56:54.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/ircomm_tty.h 2004-12-14 21:18:10.000000000 +0100 @@ -118,10 +118,8 @@ }; void ircomm_tty_start(struct tty_struct *tty); -void ircomm_tty_stop(struct tty_struct *tty); void ircomm_tty_check_modem_status(struct ircomm_tty_cb *self); -extern void ircomm_tty_change_speed(struct ircomm_tty_cb *self); extern int ircomm_tty_tiocmget(struct tty_struct *tty, struct file *file); extern int ircomm_tty_tiocmset(struct tty_struct *tty, struct file *file, unsigned int set, unsigned int clear); --- linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_tty_ioctl.c.old 2004-12-14 15:57:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/ircomm/ircomm_tty_ioctl.c 2004-12-14 15:57:22.000000000 +0100 @@ -53,7 +53,7 @@ * Change speed of the driver. 
If the remote device is a DCE, then this * should make it change the speed of its serial port */ -void ircomm_tty_change_speed(struct ircomm_tty_cb *self) +static void ircomm_tty_change_speed(struct ircomm_tty_cb *self) { unsigned cflag, cval; int baud; --- linux-2.6.10-rc3-mm1-full/include/net/irda/irda_device.h.old 2004-12-14 15:57:38.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irda_device.h 2004-12-14 15:58:11.000000000 +0100 @@ -227,8 +227,6 @@ return (skb_queue_len(&dev->qdisc->q) == 0); } int irda_device_set_raw_mode(struct net_device* self, int status); -int irda_device_set_dtr_rts(struct net_device *dev, int dtr, int rts); -int irda_device_change_speed(struct net_device *dev, __u32 speed); struct net_device *alloc_irdadev(int sizeof_priv); /* Dongle interface */ --- linux-2.6.10-rc3-mm1-full/net/irda/irda_device.c.old 2004-12-14 15:57:51.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irda_device.c 2004-12-14 15:58:50.000000000 +0100 @@ -143,47 +143,6 @@ EXPORT_SYMBOL(irda_device_set_media_busy); -int irda_device_set_dtr_rts(struct net_device *dev, int dtr, int rts) -{ - struct if_irda_req req; - int ret; - - IRDA_DEBUG(2, "%s()\n", __FUNCTION__); - - if (!dev->do_ioctl) { - ERROR("%s: do_ioctl not impl. by device driver\n", - __FUNCTION__); - return -1; - } - - req.ifr_dtr = dtr; - req.ifr_rts = rts; - - ret = dev->do_ioctl(dev, (struct ifreq *) &req, SIOCSDTRRTS); - - return ret; -} - -int irda_device_change_speed(struct net_device *dev, __u32 speed) -{ - struct if_irda_req req; - int ret; - - IRDA_DEBUG(2, "%s()\n", __FUNCTION__); - - if (!dev->do_ioctl) { - ERROR("%s: do_ioctl not impl. 
by device driver\n", - __FUNCTION__); - return -1; - } - - req.ifr_baudrate = speed; - - ret = dev->do_ioctl(dev, (struct ifreq *) &req, SIOCSBANDWIDTH); - - return ret; -} - /* * Function irda_device_is_receiving (dev) * @@ -372,7 +331,7 @@ * This function should be used by low level device drivers in a similar way * as ether_setup() is used by normal network device drivers */ -void irda_device_setup(struct net_device *dev) +static void irda_device_setup(struct net_device *dev) { dev->hard_header_len = 0; dev->addr_len = 0; @@ -502,33 +461,6 @@ } EXPORT_SYMBOL(irda_device_unregister_dongle); -/* - * Function irda_device_set_mode (self, mode) - * - * Set the Infrared device driver into mode where it sends and receives - * data without using IrLAP framing. Check out the particular device - * driver to find out which modes it support. - */ -int irda_device_set_mode(struct net_device* dev, int mode) -{ - struct if_irda_req req; - int ret; - - IRDA_DEBUG(0, "%s()\n", __FUNCTION__); - - if (!dev->do_ioctl) { - ERROR("%s: set_raw_mode not impl. 
by device driver\n", - __FUNCTION__); - return -1; - } - - req.ifr_mode = mode; - - ret = dev->do_ioctl(dev, (struct ifreq *) &req, SIOCSMODE); - - return ret; -} - #ifdef CONFIG_ISA /* * Function setup_dma (idev, buffer, count, mode) --- linux-2.6.10-rc3-mm1-full/include/net/irda/iriap.h.old 2004-12-14 16:00:41.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/iriap.h 2004-12-14 16:03:37.000000000 +0100 @@ -97,22 +97,12 @@ int iriap_getvaluebyclass_request(struct iriap_cb *self, __u32 saddr, __u32 daddr, char *name, char *attr); -void iriap_getvaluebyclass_confirm(struct iriap_cb *self, struct sk_buff *skb); void iriap_connect_request(struct iriap_cb *self); void iriap_send_ack( struct iriap_cb *self); void iriap_call_indication(struct iriap_cb *self, struct sk_buff *skb); void iriap_register_server(void); -void iriap_watchdog_timer_expired(void *data); - -static inline void iriap_start_watchdog_timer(struct iriap_cb *self, - int timeout) -{ - irda_start_timer(&self->watchdog_timer, timeout, self, - iriap_watchdog_timer_expired); -} - #endif --- linux-2.6.10-rc3-mm1-full/net/irda/iriap.c.old 2004-12-14 15:59:03.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/iriap.c 2004-12-14 20:27:40.000000000 +0100 @@ -77,6 +77,15 @@ static int iriap_data_indication(void *instance, void *sap, struct sk_buff *skb); +static void iriap_watchdog_timer_expired(void *data); + +static inline void iriap_start_watchdog_timer(struct iriap_cb *self, + int timeout) +{ + irda_start_timer(&self->watchdog_timer, timeout, self, + iriap_watchdog_timer_expired); +} + /* * Function iriap_init (void) * @@ -328,7 +337,7 @@ /* * Function iriap_disconnect_request (handle) */ -void iriap_disconnect_request(struct iriap_cb *self) +static void iriap_disconnect_request(struct iriap_cb *self) { struct sk_buff *tx_skb; @@ -352,31 +361,6 @@ irlmp_disconnect_request(self->lsap, tx_skb); } -void iriap_getinfobasedetails_request(void) -{ - IRDA_DEBUG(0, "%s(), Not implemented!\n", 
__FUNCTION__); -} - -void iriap_getinfobasedetails_confirm(void) -{ - IRDA_DEBUG(0, "%s(), Not implemented!\n", __FUNCTION__); -} - -void iriap_getobjects_request(void) -{ - IRDA_DEBUG(0, "%s(), Not implemented!\n", __FUNCTION__); -} - -void iriap_getobjects_confirm(void) -{ - IRDA_DEBUG(0, "%s(), Not implemented!\n", __FUNCTION__); -} - -void iriap_getvalue(void) -{ - IRDA_DEBUG(0, "%s(), Not implemented!\n", __FUNCTION__); -} - /* * Function iriap_getvaluebyclass (addr, name, attr) * @@ -445,7 +429,8 @@ * to service user. * */ -void iriap_getvaluebyclass_confirm(struct iriap_cb *self, struct sk_buff *skb) +static void iriap_getvaluebyclass_confirm(struct iriap_cb *self, + struct sk_buff *skb) { struct ias_value *value; int charset; @@ -552,8 +537,10 @@ * Send answer back to remote LM-IAS * */ -void iriap_getvaluebyclass_response(struct iriap_cb *self, __u16 obj_id, - __u8 ret_code, struct ias_value *value) +static void iriap_getvaluebyclass_response(struct iriap_cb *self, + __u16 obj_id, + __u8 ret_code, + struct ias_value *value) { struct sk_buff *tx_skb; int n; @@ -641,8 +628,8 @@ * getvaluebyclass is requested from peer LM-IAS * */ -void iriap_getvaluebyclass_indication(struct iriap_cb *self, - struct sk_buff *skb) +static void iriap_getvaluebyclass_indication(struct iriap_cb *self, + struct sk_buff *skb) { struct ias_object *obj; struct ias_attrib *attrib; @@ -962,7 +949,7 @@ * Query has taken too long time, so abort * */ -void iriap_watchdog_timer_expired(void *data) +static void iriap_watchdog_timer_expired(void *data) { struct iriap_cb *self = (struct iriap_cb *) data; --- linux-2.6.10-rc3-mm1-full/net/irda/irias_object.c.old 2004-12-14 16:03:56.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irias_object.c 2004-12-14 16:04:32.000000000 +0100 @@ -116,7 +116,7 @@ * Delete given attribute and deallocate all its memory * */ -void __irias_delete_attrib(struct ias_attrib *attrib) +static void __irias_delete_attrib(struct ias_attrib *attrib) { 
ASSERT(attrib != NULL, return;); ASSERT(attrib->magic == IAS_ATTRIB_MAGIC, return;); @@ -267,8 +267,8 @@ * Add attribute to object * */ -void irias_add_attrib( struct ias_object *obj, struct ias_attrib *attrib, - int owner) +static void irias_add_attrib(struct ias_object *obj, struct ias_attrib *attrib, + int owner) { ASSERT(obj != NULL, return;); ASSERT(obj->magic == IAS_OBJECT_MAGIC, return;); --- linux-2.6.10-rc3-mm1-full/include/net/irda/irlan_client.h.old 2004-12-14 19:55:15.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irlan_client.h 2004-12-14 19:57:02.000000000 +0100 @@ -33,12 +33,9 @@ #include #include -void irlan_client_start_kick_timer(struct irlan_cb *self, int timeout); void irlan_client_discovery_indication(discinfo_t *, DISCOVERY_MODE, void *); void irlan_client_wakeup(struct irlan_cb *self, __u32 saddr, __u32 daddr); -void irlan_client_open_ctrl_tsap( struct irlan_cb *self); - void irlan_client_parse_response(struct irlan_cb *self, struct sk_buff *skb); void irlan_client_get_value_confirm(int result, __u16 obj_id, struct ias_value *value, void *priv); --- linux-2.6.10-rc3-mm1-full/net/irda/irlan/irlan_client.c.old 2004-12-14 19:55:33.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlan/irlan_client.c 2004-12-14 19:57:08.000000000 +0100 @@ -66,6 +66,7 @@ struct sk_buff *); static void irlan_check_response_param(struct irlan_cb *self, char *param, char *value, int val_len); +static void irlan_client_open_ctrl_tsap(struct irlan_cb *self); static void irlan_client_kick_timer_expired(void *data) { @@ -88,7 +89,7 @@ } } -void irlan_client_start_kick_timer(struct irlan_cb *self, int timeout) +static void irlan_client_start_kick_timer(struct irlan_cb *self, int timeout) { IRDA_DEBUG(4, "%s()\n", __FUNCTION__ ); @@ -248,7 +249,7 @@ * Initialize callbacks and open IrTTP TSAPs * */ -void irlan_client_open_ctrl_tsap(struct irlan_cb *self) +static void irlan_client_open_ctrl_tsap(struct irlan_cb *self) { struct tsap_cb *tsap; notify_t 
notify; @@ -309,42 +310,6 @@ } /* - * Function irlan_client_reconnect_data_channel (self) - * - * Try to reconnect data channel (currently not used) - * - */ -void irlan_client_reconnect_data_channel(struct irlan_cb *self) -{ - struct sk_buff *skb; - __u8 *frame; - - IRDA_DEBUG(4, "%s()\n", __FUNCTION__ ); - - ASSERT(self != NULL, return;); - ASSERT(self->magic == IRLAN_MAGIC, return;); - - skb = dev_alloc_skb(128); - if (!skb) - return; - - /* Reserve space for TTP, LMP, and LAP header */ - skb_reserve(skb, self->max_header_size); - skb_put(skb, 2); - - frame = skb->data; - - frame[0] = CMD_RECONNECT_DATA_CHAN; - frame[1] = 0x01; - irlan_insert_array_param(skb, "RECONNECT_KEY", - self->client.reconnect_key, - self->client.key_len); - - irttp_data_request(self->client.tsap_ctrl, skb); -} - - -/* * Function print_ret_code (code) * * Print return code of request to peer IrLAN layer. --- linux-2.6.10-rc3-mm1-full/include/net/irda/irlan_common.h.old 2004-12-14 20:00:16.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irlan_common.h 2004-12-14 20:01:40.000000000 +0100 @@ -190,7 +190,6 @@ struct timer_list watchdog_timer; }; -struct irlan_cb *irlan_open(__u32 saddr, __u32 daddr); void irlan_close(struct irlan_cb *self); void irlan_close_tsaps(struct irlan_cb *self); @@ -204,13 +203,11 @@ struct irlan_cb *irlan_get_any(void); void irlan_get_provider_info(struct irlan_cb *self); -void irlan_get_unicast_addr(struct irlan_cb *self); void irlan_get_media_char(struct irlan_cb *self); void irlan_open_data_channel(struct irlan_cb *self); void irlan_close_data_channel(struct irlan_cb *self); void irlan_set_multicast_filter(struct irlan_cb *self, int status); void irlan_set_broadcast_filter(struct irlan_cb *self, int status); -void irlan_open_unicast_addr(struct irlan_cb *self); int irlan_insert_byte_param(struct sk_buff *skb, char *param, __u8 value); int irlan_insert_short_param(struct sk_buff *skb, char *param, __u16 value); --- 
linux-2.6.10-rc3-mm1-full/net/irda/irlan/irlan_common.c.old 2004-12-14 19:57:23.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlan/irlan_common.c 2004-12-14 20:02:09.000000000 +0100 @@ -105,10 +105,13 @@ extern struct proc_dir_entry *proc_irda; #endif /* CONFIG_PROC_FS */ +static struct irlan_cb *irlan_open(__u32 saddr, __u32 daddr); static void __irlan_close(struct irlan_cb *self); static int __irlan_insert_param(struct sk_buff *skb, char *param, int type, __u8 value_byte, __u16 value_short, __u8 *value_array, __u16 value_len); +static void irlan_open_unicast_addr(struct irlan_cb *self); +static void irlan_get_unicast_addr(struct irlan_cb *self); void irlan_close_tsaps(struct irlan_cb *self); /* @@ -185,7 +188,7 @@ * Open new instance of a client/provider, we should only register the * network device if this instance is ment for a particular client/provider */ -struct irlan_cb *irlan_open(__u32 saddr, __u32 daddr) +static struct irlan_cb *irlan_open(__u32 saddr, __u32 daddr) { struct net_device *dev; struct irlan_cb *self; @@ -294,9 +297,11 @@ * Here we receive the connect indication for the data channel * */ -void irlan_connect_indication(void *instance, void *sap, struct qos_info *qos, - __u32 max_sdu_size, __u8 max_header_size, - struct sk_buff *skb) +static void irlan_connect_indication(void *instance, void *sap, + struct qos_info *qos, + __u32 max_sdu_size, + __u8 max_header_size, + struct sk_buff *skb) { struct irlan_cb *self; struct tsap_cb *tsap; @@ -339,9 +344,11 @@ netif_start_queue(self->dev); /* Clear reason */ } -void irlan_connect_confirm(void *instance, void *sap, struct qos_info *qos, - __u32 max_sdu_size, __u8 max_header_size, - struct sk_buff *skb) +static void irlan_connect_confirm(void *instance, void *sap, + struct qos_info *qos, + __u32 max_sdu_size, + __u8 max_header_size, + struct sk_buff *skb) { struct irlan_cb *self; @@ -384,8 +391,9 @@ * Callback function for the IrTTP layer. 
Indicates a disconnection of * the specified connection (handle) */ -void irlan_disconnect_indication(void *instance, void *sap, LM_REASON reason, - struct sk_buff *userdata) +static void irlan_disconnect_indication(void *instance, + void *sap, LM_REASON reason, + struct sk_buff *userdata) { struct irlan_cb *self; struct tsap_cb *tsap; @@ -602,7 +610,7 @@ * This function makes sure that commands on the control channel is being * sent in a command/response fashion */ -void irlan_ctrl_data_request(struct irlan_cb *self, struct sk_buff *skb) +static void irlan_ctrl_data_request(struct irlan_cb *self, struct sk_buff *skb) { IRDA_DEBUG(2, "%s()\n", __FUNCTION__ ); @@ -722,7 +730,7 @@ * address. * */ -void irlan_open_unicast_addr(struct irlan_cb *self) +static void irlan_open_unicast_addr(struct irlan_cb *self) { struct sk_buff *skb; __u8 *frame; @@ -839,7 +847,7 @@ * can construct its packets. * */ -void irlan_get_unicast_addr(struct irlan_cb *self) +static void irlan_get_unicast_addr(struct irlan_cb *self) { struct sk_buff *skb; __u8 *frame; --- linux-2.6.10-rc3-mm1-full/net/irda/irlan/irlan_provider.c.old 2004-12-14 20:02:39.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlan/irlan_provider.c 2004-12-14 20:02:59.000000000 +0100 @@ -175,9 +175,9 @@ irttp_connect_response(tsap, IRLAN_MTU, NULL); } -void irlan_provider_disconnect_indication(void *instance, void *sap, - LM_REASON reason, - struct sk_buff *userdata) +static void irlan_provider_disconnect_indication(void *instance, void *sap, + LM_REASON reason, + struct sk_buff *userdata) { struct irlan_cb *self; struct tsap_cb *tsap; --- linux-2.6.10-rc3-mm1-full/include/net/irda/irlap.h.old 2004-12-14 20:03:18.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irlap.h 2004-12-14 20:03:59.000000000 +0100 @@ -253,10 +253,8 @@ int irlap_generate_rand_time_slot(int S, int s); void irlap_initiate_connection_state(struct irlap_cb *); void irlap_flush_all_queues(struct irlap_cb *); -void 
irlap_change_speed(struct irlap_cb *self, __u32 speed, int now); void irlap_wait_min_turn_around(struct irlap_cb *, struct qos_info *); -void irlap_init_qos_capabilities(struct irlap_cb *, struct qos_info *); void irlap_apply_default_connection_parameters(struct irlap_cb *self); void irlap_apply_connection_parameters(struct irlap_cb *self, int now); --- linux-2.6.10-rc3-mm1-full/net/irda/irlap.c.old 2004-12-14 20:03:35.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlap.c 2004-12-14 20:04:24.000000000 +0100 @@ -60,6 +60,8 @@ extern void irlap_queue_xmit(struct irlap_cb *self, struct sk_buff *skb); static void __irlap_close(struct irlap_cb *self); +static void irlap_init_qos_capabilities(struct irlap_cb *self, + struct qos_info *qos_user); #ifdef CONFIG_IRDA_DEBUG static char *lap_reasons[] = { @@ -867,7 +869,7 @@ * Change the speed of the IrDA port * */ -void irlap_change_speed(struct irlap_cb *self, __u32 speed, int now) +static void irlap_change_speed(struct irlap_cb *self, __u32 speed, int now) { struct sk_buff *skb; @@ -894,8 +896,8 @@ * IrLAP itself. Normally, IrLAP will not specify any values, but it can * be used to restrict certain values. */ -void irlap_init_qos_capabilities(struct irlap_cb *self, - struct qos_info *qos_user) +static void irlap_init_qos_capabilities(struct irlap_cb *self, + struct qos_info *qos_user) { ASSERT(self != NULL, return;); ASSERT(self->magic == LAP_MAGIC, return;); --- linux-2.6.10-rc3-mm1-full/net/irda/irlap_event.c.old 2004-12-14 20:04:40.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlap_event.c 2004-12-14 20:04:50.000000000 +0100 @@ -181,7 +181,7 @@ * Make sure that state is XMIT_P/XMIT_S when calling this function * (and that nobody messed up with the state). 
- Jean II */ -void irlap_start_poll_timer(struct irlap_cb *self, int timeout) +static void irlap_start_poll_timer(struct irlap_cb *self, int timeout) { ASSERT(self != NULL, return;); ASSERT(self->magic == LAP_MAGIC, return;); --- linux-2.6.10-rc3-mm1-full/include/net/irda/irlap_frame.h.old 2004-12-14 20:05:31.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irlap_frame.h 2004-12-14 20:05:39.000000000 +0100 @@ -133,7 +133,6 @@ void irlap_resend_rejected_frames(struct irlap_cb *, int command); void irlap_resend_rejected_frame(struct irlap_cb *self, int command); -void irlap_send_i_frame(struct irlap_cb *, struct sk_buff *, int command); void irlap_send_ui_frame(struct irlap_cb *self, struct sk_buff *skb, __u8 caddr, int command); --- linux-2.6.10-rc3-mm1-full/net/irda/irlap_frame.c.old 2004-12-14 20:05:04.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlap_frame.c 2004-12-14 20:06:11.000000000 +0100 @@ -43,6 +43,9 @@ #include #include +static void irlap_send_i_frame(struct irlap_cb *self, struct sk_buff *skb, + int command); + /* * Function irlap_insert_info (self, skb) * @@ -629,34 +632,6 @@ irlap_do_event(self, RECV_RR_RSP, skb, info); } -void irlap_send_frmr_frame( struct irlap_cb *self, int command) -{ - struct sk_buff *tx_skb = NULL; - __u8 *frame; - - ASSERT( self != NULL, return;); - ASSERT( self->magic == LAP_MAGIC, return;); - - tx_skb = dev_alloc_skb( 32); - if (!tx_skb) - return; - - frame = skb_put(tx_skb, 2); - - frame[0] = self->caddr; - frame[0] |= (command) ? 
CMD_FRAME : 0; - - frame[1] = (self->vs << 1); - frame[1] |= PF_BIT; - frame[1] |= (self->vr << 5); - - frame[2] = 0; - - IRDA_DEBUG(4, "%s(), vr=%d, %ld\n", __FUNCTION__, self->vr, jiffies); - - irlap_queue_xmit(self, tx_skb); -} - /* * Function irlap_recv_rnr_frame (self, skb, info) * @@ -1129,8 +1104,8 @@ * * Contruct and transmit Information (I) frame */ -void irlap_send_i_frame(struct irlap_cb *self, struct sk_buff *skb, - int command) +static void irlap_send_i_frame(struct irlap_cb *self, struct sk_buff *skb, + int command) { /* Insert connection address */ skb->data[0] = self->caddr; --- linux-2.6.10-rc3-mm1-full/include/net/irda/irlmp.h.old 2004-12-14 20:06:30.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irlmp.h 2004-12-14 20:08:40.000000000 +0100 @@ -242,12 +242,9 @@ void irlmp_connless_data_indication(struct lsap_cb *, struct sk_buff *); #endif /* CONFIG_IRDA_ULTRA */ -void irlmp_status_request(void); void irlmp_status_indication(struct lap_cb *, LINK_STATUS link, LOCK_STATUS lock); void irlmp_flow_indication(struct lap_cb *self, LOCAL_FLOW flow); -int irlmp_slsap_inuse(__u8 slsap); -__u8 irlmp_find_free_slsap(void); LM_REASON irlmp_convert_lap_reason(LAP_REASON); static inline __u32 irlmp_get_saddr(const struct lsap_cb *self) --- linux-2.6.10-rc3-mm1-full/net/irda/irlmp.c.old 2004-12-14 20:06:48.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irlmp.c 2004-12-14 20:08:45.000000000 +0100 @@ -44,6 +44,9 @@ #include #include +static __u8 irlmp_find_free_slsap(void); +static int irlmp_slsap_inuse(__u8 slsap_sel); + /* Master structure */ struct irlmp_cb *irlmp = NULL; @@ -1278,11 +1281,6 @@ } #endif /* CONFIG_IRDA_ULTRA */ -void irlmp_status_request(void) -{ - IRDA_DEBUG(0, "%s(), Not implemented\n", __FUNCTION__); -} - /* * Propagate status indication from LAP to LSAPs (via LMP) * This don't trigger any change of state in lap_cb, lmp_cb or lsap_cb, @@ -1656,7 +1654,7 @@ * of the allocated LSAP, but I'm not sure the complexity is 
worth it. * Jean II */ -int irlmp_slsap_inuse(__u8 slsap_sel) +static int irlmp_slsap_inuse(__u8 slsap_sel) { struct lsap_cb *self; struct lap_cb *lap; @@ -1756,7 +1754,7 @@ * Find a free source LSAP to use. This function is called if the service * user has requested a source LSAP equal to LM_ANY */ -__u8 irlmp_find_free_slsap(void) +static __u8 irlmp_find_free_slsap(void) { __u8 lsap_sel; int wrapped = 0; --- linux-2.6.10-rc3-mm1-full/net/irda/irmod.c.old 2004-12-14 20:09:01.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irmod.c 2004-12-14 20:09:41.000000000 +0100 @@ -100,7 +100,7 @@ * Protocol stack initialisation entry point. * Initialise the various components of the IrDA stack */ -int __init irda_init(void) +static int __init irda_init(void) { IRDA_DEBUG(0, "%s()\n", __FUNCTION__); @@ -136,7 +136,7 @@ * Protocol stack cleanup/removal entry point. * Cleanup the various components of the IrDA stack */ -void __exit irda_cleanup(void) +static void __exit irda_cleanup(void) { /* Remove External APIs */ #ifdef CONFIG_SYSCTL --- linux-2.6.10-rc3-mm1-full/net/irda/irnet/irnet.h.old 2004-12-14 20:11:00.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irnet/irnet.h 2004-12-14 21:13:58.000000000 +0100 @@ -520,8 +520,6 @@ /* ---------------------------- MODULE ---------------------------- */ extern int irnet_init(void); /* Initialise IrNET module */ -extern void - irnet_cleanup(void); /* Teardown IrNET module */ /**************************** VARIABLES ****************************/ --- linux-2.6.10-rc3-mm1-full/net/irda/irnet/irnet_ppp.h.old 2004-12-14 20:12:15.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irnet/irnet_ppp.h 2004-12-14 20:12:54.000000000 +0100 @@ -116,11 +116,4 @@ &irnet_device_fops }; -/* Generic PPP callbacks (to call us) */ -struct ppp_channel_ops irnet_ppp_ops = -{ - ppp_irnet_send, - ppp_irnet_ioctl -}; - #endif /* IRNET_PPP_H */ --- linux-2.6.10-rc3-mm1-full/net/irda/irnet/irnet_ppp.c.old 2004-12-14 20:10:30.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/irda/irnet/irnet_ppp.c 2004-12-14 20:12:50.000000000 +0100 @@ -16,6 +16,13 @@ #include "irnet_ppp.h" /* Private header */ /* Please put other headers in irnet.h - Thanks */ +/* Generic PPP callbacks (to call us) */ +static struct ppp_channel_ops irnet_ppp_ops = +{ + ppp_irnet_send, + ppp_irnet_ioctl +}; + /************************* CONTROL CHANNEL *************************/ /* * When a pppd instance is not active on /dev/irnet, it acts as a control @@ -1095,7 +1102,7 @@ /* * Module exit */ -void __exit +static void __exit irnet_cleanup(void) { irda_irnet_cleanup(); --- linux-2.6.10-rc3-mm1-full/net/irda/irsysctl.c.old 2004-12-14 20:13:09.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irsysctl.c 2004-12-14 20:13:17.000000000 +0100 @@ -43,7 +43,6 @@ extern int sysctl_discovery_timeout; extern int sysctl_slot_timeout; extern int sysctl_fast_poll_increase; -int sysctl_compression = 0; extern char sysctl_devname[]; extern int sysctl_max_baud_rate; extern int sysctl_min_tx_turn_time; --- linux-2.6.10-rc3-mm1-full/include/net/irda/irttp.h.old 2004-12-14 20:14:11.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/irttp.h 2004-12-14 20:14:53.000000000 +0100 @@ -167,9 +167,6 @@ int irttp_disconnect_request(struct tsap_cb *self, struct sk_buff *skb, int priority); void irttp_flow_request(struct tsap_cb *self, LOCAL_FLOW flow); -void irttp_status_indication(void *instance, - LINK_STATUS link, LOCK_STATUS lock); -void irttp_flow_indication(void *instance, void *sap, LOCAL_FLOW flow); struct tsap_cb *irttp_dup(struct tsap_cb *self, void *instance); static __inline __u32 irttp_get_saddr(struct tsap_cb *self) --- linux-2.6.10-rc3-mm1-full/net/irda/irttp.c.old 2004-12-14 20:13:41.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/irttp.c 2004-12-14 20:15:23.000000000 +0100 @@ -64,6 +64,10 @@ static int irttp_param_max_sdu_size(void *instance, irda_param_t *param, int get); +static void irttp_flow_indication(void *instance, void 
*sap, LOCAL_FLOW flow); +static void irttp_status_indication(void *instance, + LINK_STATUS link, LOCK_STATUS lock); + /* Information for parsing parameters in IrTTP */ static pi_minor_info_t pi_minor_call_table[] = { { NULL, 0 }, /* 0x00 */ @@ -961,8 +965,8 @@ * Status_indication, just pass to the higher layer... * */ -void irttp_status_indication(void *instance, - LINK_STATUS link, LOCK_STATUS lock) +static void irttp_status_indication(void *instance, + LINK_STATUS link, LOCK_STATUS lock) { struct tsap_cb *self; @@ -993,7 +997,7 @@ * Flow_indication : IrLAP tells us to send more data. * */ -void irttp_flow_indication(void *instance, void *sap, LOCAL_FLOW flow) +static void irttp_flow_indication(void *instance, void *sap, LOCAL_FLOW flow) { struct tsap_cb *self; @@ -1613,7 +1617,7 @@ * for some reason should fail. We mark rx sdu as busy to apply back * pressure is necessary. */ -void irttp_do_data_indication(struct tsap_cb *self, struct sk_buff *skb) +static void irttp_do_data_indication(struct tsap_cb *self, struct sk_buff *skb) { int err; --- linux-2.6.10-rc3-mm1-full/include/net/irda/parameters.h.old 2004-12-14 20:15:45.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/parameters.h 2004-12-14 20:17:01.000000000 +0100 @@ -90,11 +90,9 @@ } pi_param_info_t; int irda_param_pack(__u8 *buf, char *fmt, ...); -int irda_param_unpack(__u8 *buf, char *fmt, ...); int irda_param_insert(void *self, __u8 pi, __u8 *buf, int len, pi_param_info_t *info); -int irda_param_extract(void *self, __u8 *buf, int len, pi_param_info_t *info); int irda_param_extract_all(void *self, __u8 *buf, int len, pi_param_info_t *info); --- linux-2.6.10-rc3-mm1-full/net/irda/parameters.c.old 2004-12-14 20:16:01.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/parameters.c 2004-12-14 20:17:28.000000000 +0100 @@ -51,6 +51,8 @@ static int irda_insert_no_value(void *self, __u8 *buf, int len, __u8 pi, PV_TYPE type, PI_HANDLER func); +static int irda_param_unpack(__u8 *buf, char *fmt, 
...); + /* Parameter value call table. Must match PV_TYPE */ static PV_HANDLER pv_extract_table[] = { irda_extract_integer, /* Handler for any length integers */ @@ -400,7 +402,7 @@ /* * Function irda_param_unpack (skb, fmt, ...) */ -int irda_param_unpack(__u8 *buf, char *fmt, ...) +static int irda_param_unpack(__u8 *buf, char *fmt, ...) { irda_pv_t arg; va_list args; @@ -440,7 +442,6 @@ return 0; } -EXPORT_SYMBOL(irda_param_unpack); /* * Function irda_param_insert (self, pi, buf, len, info) @@ -496,13 +497,14 @@ EXPORT_SYMBOL(irda_param_insert); /* - * Function irda_param_extract_all (self, buf, len, info) + * Function irda_param_extract (self, buf, len, info) * * Parse all parameters. If len is correct, then everything should be * safe. Returns the number of bytes that was parsed * */ -int irda_param_extract(void *self, __u8 *buf, int len, pi_param_info_t *info) +static int irda_param_extract(void *self, __u8 *buf, int len, + pi_param_info_t *info) { pi_minor_info_t *pi_minor_info; __u8 pi_minor; @@ -549,7 +551,6 @@ type, pi_minor_info->func); return ret; } -EXPORT_SYMBOL(irda_param_extract); /* * Function irda_param_extract_all (self, buf, len, info) --- linux-2.6.10-rc3-mm1-full/include/net/irda/qos.h.old 2004-12-14 20:18:19.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/irda/qos.h 2004-12-14 20:18:30.000000000 +0100 @@ -87,7 +87,6 @@ void irda_qos_compute_intersection(struct qos_info *, struct qos_info *); __u32 irlap_max_line_capacity(__u32 speed, __u32 max_turn_time); -__u32 irlap_requested_line_capacity(struct qos_info *qos); void irda_qos_bits_to_value(struct qos_info *qos); --- linux-2.6.10-rc3-mm1-full/net/irda/qos.c.old 2004-12-14 20:17:41.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/irda/qos.c 2004-12-14 21:17:36.000000000 +0100 @@ -96,6 +96,10 @@ static int irlap_param_min_turn_time(void *instance, irda_param_t *param, int get); +#ifndef CONFIG_IRDA_DYNAMIC_WINDOW +static __u32 irlap_requested_line_capacity(struct qos_info *qos); 
+#endif + static __u32 min_turn_times[] = { 10000, 5000, 1000, 500, 100, 50, 10, 0 }; /* us */ static __u32 baud_rates[] = { 2400, 9600, 19200, 38400, 57600, 115200, 576000, 1152000, 4000000, 16000000 }; /* bps */ @@ -333,7 +337,7 @@ * Adjust QoS settings in case some values are not possible to use because * of other settings */ -void irlap_adjust_qos_settings(struct qos_info *qos) +static void irlap_adjust_qos_settings(struct qos_info *qos) { __u32 line_capacity; int index; @@ -723,7 +727,8 @@ return line_capacity; } -__u32 irlap_requested_line_capacity(struct qos_info *qos) +#ifndef CONFIG_IRDA_DYNAMIC_WINDOW +static __u32 irlap_requested_line_capacity(struct qos_info *qos) { __u32 line_capacity; line_capacity = qos->window_size.value * @@ -736,6 +741,7 @@ return line_capacity; } +#endif void irda_qos_bits_to_value(struct qos_info *qos) { From bunk@stusta.de Tue Dec 14 17:08:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:08:49 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF18Jn0015543 for ; Tue, 14 Dec 2004 17:08:40 -0800 Received: (qmail 1219 invoked from network); 15 Dec 2004 01:07:52 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 01:07:52 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 6FEF6AF651; Wed, 15 Dec 2004 02:07:45 +0100 (CET) Date: Wed, 15 Dec 2004 02:07:45 +0100 From: Adrian Bunk To: Jonathan Naylor Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/lapb/lapb_iface.c: remove unused code Message-ID: <20041215010745.GB12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12754 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below removes the following unused code: - EXPORT_SYMBOL'ed functions lapb_getparms and lapb_setparms - struct lapb_parms_struct Please review whether it's correct or conflicts with pending changes. diffstat output: Documentation/networking/lapb-module.txt | 45 --------------- include/linux/lapb.h | 14 ---- net/lapb/lapb_iface.c | 67 ----------------------- 3 files changed, 126 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/linux/lapb.h.old 2004-12-14 20:23:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/lapb.h 2004-12-14 20:24:38.000000000 +0100 @@ -32,22 +32,8 @@ void (*data_transmit)(struct net_device *dev, struct sk_buff *skb); }; -struct lapb_parms_struct { - unsigned int t1; - unsigned int t1timer; - unsigned int t2; - unsigned int t2timer; - unsigned int n2; - unsigned int n2count; - unsigned int window; - unsigned int state; - unsigned int mode; -}; - extern int lapb_register(struct net_device *dev, struct lapb_register_struct *callbacks); extern int lapb_unregister(struct net_device *dev); -extern int lapb_getparms(struct net_device *dev, struct lapb_parms_struct *parms); -extern int lapb_setparms(struct net_device *dev, struct lapb_parms_struct *parms); extern int lapb_connect_request(struct net_device *dev); extern int lapb_disconnect_request(struct net_device *dev); extern int lapb_data_request(struct net_device *dev, struct sk_buff *skb); --- linux-2.6.10-rc3-mm1-full/net/lapb/lapb_iface.c.old 2004-12-14 20:23:27.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/lapb/lapb_iface.c 2004-12-15 00:12:18.000000000 +0100 @@ -195,71 +195,6 @@ return rc; } -int lapb_getparms(struct net_device *dev, struct lapb_parms_struct *parms) -{ - int rc = LAPB_BADTOKEN; - struct lapb_cb *lapb = lapb_devtostruct(dev); - - if (!lapb) - goto out; - - parms->t1 = 
lapb->t1 / HZ; - parms->t2 = lapb->t2 / HZ; - parms->n2 = lapb->n2; - parms->n2count = lapb->n2count; - parms->state = lapb->state; - parms->window = lapb->window; - parms->mode = lapb->mode; - - if (!timer_pending(&lapb->t1timer)) - parms->t1timer = 0; - else - parms->t1timer = (lapb->t1timer.expires - jiffies) / HZ; - - if (!timer_pending(&lapb->t2timer)) - parms->t2timer = 0; - else - parms->t2timer = (lapb->t2timer.expires - jiffies) / HZ; - - lapb_put(lapb); - rc = LAPB_OK; -out: - return rc; -} - -int lapb_setparms(struct net_device *dev, struct lapb_parms_struct *parms) -{ - int rc = LAPB_BADTOKEN; - struct lapb_cb *lapb = lapb_devtostruct(dev); - - if (!lapb) - goto out; - - rc = LAPB_INVALUE; - if (parms->t1 < 1 || parms->t2 < 1 || parms->n2 < 1) - goto out_put; - - if (lapb->state == LAPB_STATE_0) { - if (((parms->mode & LAPB_EXTENDED) && - (parms->window < 1 || parms->window > 127)) || - (parms->window < 1 || parms->window > 7)) - goto out_put; - - lapb->mode = parms->mode; - lapb->window = parms->window; - } - - lapb->t1 = parms->t1 * HZ; - lapb->t2 = parms->t2 * HZ; - lapb->n2 = parms->n2; - - rc = LAPB_OK; -out_put: - lapb_put(lapb); -out: - return rc; -} - int lapb_connect_request(struct net_device *dev) { struct lapb_cb *lapb = lapb_devtostruct(dev); @@ -424,8 +359,6 @@ EXPORT_SYMBOL(lapb_register); EXPORT_SYMBOL(lapb_unregister); -EXPORT_SYMBOL(lapb_getparms); -EXPORT_SYMBOL(lapb_setparms); EXPORT_SYMBOL(lapb_connect_request); EXPORT_SYMBOL(lapb_disconnect_request); EXPORT_SYMBOL(lapb_data_request); --- linux-2.6.10-rc3-mm1-full/Documentation/networking/lapb-module.txt.old 2004-12-14 20:24:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/Documentation/networking/lapb-module.txt 2004-12-14 20:24:24.000000000 +0100 @@ -49,51 +49,6 @@ may be substituted. -LAPB Parameter Structure ------------------------- - -This structure is used with the lapb_getparms and lapb_setparms functions -(see below). 
They are used to allow the device driver to get and set the -operational parameters of the LAPB implementation for a given connection. - -struct lapb_parms_struct { - unsigned int t1; - unsigned int t1timer; - unsigned int t2; - unsigned int t2timer; - unsigned int n2; - unsigned int n2count; - unsigned int window; - unsigned int state; - unsigned int mode; -}; - -T1 and T2 are protocol timing parameters and are given in units of 100ms. N2 -is the maximum number of tries on the link before it is declared a failure. -The window size is the maximum number of outstanding data packets allowed to -be unacknowledged by the remote end, the value of the window is between 1 -and 7 for a standard LAPB link, and between 1 and 127 for an extended LAPB -link. - -The mode variable is a bit field used for setting (at present) three values. -The bit fields have the following meanings: - -Bit Meaning -0 LAPB operation (0=LAPB_STANDARD 1=LAPB_EXTENDED). -1 [SM]LP operation (0=LAPB_SLP 1=LAPB=MLP). -2 DTE/DCE operation (0=LAPB_DTE 1=LAPB_DCE) -3-31 Reserved, must be 0. - -Extended LAPB operation indicates the use of extended sequence numbers and -consequently larger window sizes, the default is standard LAPB operation. -MLP operation is the same as SLP operation except that the addresses used by -LAPB are different to indicate the mode of operation, the default is Single -Link Procedure. The difference between DCE and DTE operation is (i) the -addresses used for commands and responses, and (ii) when the DCE is not -connected, it sends DM without polls set, every T1. The upper case constant -names will be defined in the public LAPB header file. 
- - Functions --------- From bunk@stusta.de Tue Dec 14 17:13:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:13:10 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1Cfkl016174 for ; Tue, 14 Dec 2004 17:13:01 -0800 Received: (qmail 1435 invoked from network); 15 Dec 2004 01:12:13 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 01:12:13 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 3FAC2AF651; Wed, 15 Dec 2004 02:12:11 +0100 (CET) Date: Wed, 15 Dec 2004 02:12:11 +0100 From: Adrian Bunk To: acme@conectiva.com.br Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/llc/: misc possible cleanups Message-ID: <20041215011211.GC12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12755 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make some needlessly global code static - remove the following unused global functions: - llc_c_ac.c: llc_conn_ac_report_status - llc_c_ac.c: llc_conn_ac_send_dm_rsp_f_set_f_flag - llc_c_ac.c: llc_conn_ac_resend_i_cmd_p_set_1 - llc_c_ac.c: llc_conn_ac_resend_i_cmd_p_set_1_or_send_rr - llc_c_ac.c: llc_conn_ac_send_ack_cmd_p_set_1 - llc_c_ac.c: llc_conn_ac_send_ua_rsp_f_set_f_flag - llc_c_ac.c: llc_conn_ac_set_f_flag_p - llc_c_ev.c: llc_conn_ev_conn_resp - llc_c_ev.c: llc_conn_ev_rst_resp - llc_c_ev.c: llc_conn_ev_rx_xxx_cmd_pbit_set_0 - llc_c_ev.c: llc_conn_ev_rx_xxx_yyy - llc_c_ev.c: llc_conn_ev_any_tmr_exp - llc_c_ev.c: 
llc_conn_ev_qlfy_init_p_f_cycle - llc_c_ev.c: llc_conn_ev_qlfy_set_status_impossible - llc_c_ev.c: llc_conn_ev_qlfy_set_status_received - llc_if.c: llc_build_and_send_reset_pkt - llc_pdu.c: llc_pdu_decode_cr_bit - remove the following unneeded EXPORT_SYMBOL: - llc_core.c: llc_sap_list_lock Please comment on which of these changes are correct and which conflict with pending patches. diffstat output: include/net/llc_c_ac.h | 19 ------ include/net/llc_c_ev.h | 12 --- include/net/llc_conn.h | 3 include/net/llc_pdu.h | 1 include/net/llc_sap.h | 1 net/llc/llc_c_ac.c | 127 ++++------------------------------------- net/llc/llc_c_ev.c | 88 ---------------------------- net/llc/llc_conn.c | 7 +- net/llc/llc_core.c | 7 -- net/llc/llc_if.c | 24 ------- net/llc/llc_pdu.c | 13 ---- net/llc/llc_proc.c | 4 - net/llc/llc_sap.c | 5 - net/llc/llc_station.c | 2 14 files changed, 27 insertions(+), 286 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/llc_c_ac.h.old 2004-12-14 21:20:52.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/llc_c_ac.h 2004-12-14 21:25:39.000000000 +0100 @@ -96,7 +96,6 @@ extern int llc_conn_ac_disc_ind(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_rst_ind(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_rst_confirm(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_report_status(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_clear_remote_busy_if_f_eq_1(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_stop_rej_tmr_if_data_flag_eq_2(struct sock* sk, @@ -107,8 +106,6 @@ struct sk_buff *skb); extern int llc_conn_ac_send_dm_rsp_f_set_1(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_send_dm_rsp_f_set_f_flag(struct sock* sk, - struct sk_buff *skb); extern int llc_conn_ac_send_frmr_rsp_f_set_x(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_resend_frmr_rsp_f_set_0(struct sock* sk, @@ -116,11 +113,6 @@ extern int 
llc_conn_ac_resend_frmr_rsp_f_set_p(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_send_i_cmd_p_set_1(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_send_i_cmd_p_set_0(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_resend_i_cmd_p_set_1(struct sock* sk, - struct sk_buff *skb); -extern int llc_conn_ac_resend_i_cmd_p_set_1_or_send_rr(struct sock* sk, - struct sk_buff *skb); extern int llc_conn_ac_send_i_xxx_x_set_0(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_resend_i_xxx_x_set_0(struct sock* sk, struct sk_buff *skb); @@ -145,8 +137,6 @@ struct sk_buff *skb); extern int llc_conn_ac_send_rr_cmd_p_set_1(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_send_ack_cmd_p_set_1(struct sock* sk, - struct sk_buff *skb); extern int llc_conn_ac_send_rr_rsp_f_set_1(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_send_ack_rsp_f_set_1(struct sock* sk, @@ -157,8 +147,6 @@ struct sk_buff *skb); extern int llc_conn_ac_send_sabme_cmd_p_set_x(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_send_ua_rsp_f_set_f_flag(struct sock* sk, - struct sk_buff *skb); extern int llc_conn_ac_send_ua_rsp_f_set_p(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_set_s_flag_0(struct sock* sk, struct sk_buff *skb); @@ -183,7 +171,6 @@ extern int llc_conn_ac_set_data_flag_1_if_data_flag_eq_0(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_set_p_flag_0(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_set_p_flag_1(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_set_remote_busy_0(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_set_retry_cnt_0(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_set_cause_flag_0(struct sock* sk, struct sk_buff *skb); @@ -195,20 +182,14 @@ extern int llc_conn_ac_set_vs_nr(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_rst_vs(struct sock* sk, struct sk_buff *skb); extern int 
llc_conn_ac_upd_vs(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_set_f_flag_p(struct sock* sk, struct sk_buff *skb); extern int llc_conn_disc(struct sock* sk, struct sk_buff *skb); extern int llc_conn_reset(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_disc_confirm(struct sock* sk, struct sk_buff *skb); extern u8 llc_circular_between(u8 a, u8 b, u8 c); extern int llc_conn_ac_send_ack_if_needed(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_inc_npta_value(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_adjust_npta_by_rr(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_adjust_npta_by_rnr(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_rst_sendack_flag(struct sock* sk, struct sk_buff *skb); -extern int llc_conn_ac_send_rr_rsp_f_set_ackpf(struct sock* sk, - struct sk_buff *skb); -extern int llc_conn_ac_send_i_rsp_f_set_ackpf(struct sock* sk, - struct sk_buff *skb); extern int llc_conn_ac_send_i_rsp_as_ack(struct sock* sk, struct sk_buff *skb); extern int llc_conn_ac_send_i_as_ack(struct sock* sk, struct sk_buff *skb); --- linux-2.6.10-rc3-mm1-full/net/llc/llc_c_ac.c.old 2004-12-14 21:21:07.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_c_ac.c 2004-12-14 21:26:07.000000000 +0100 @@ -33,6 +33,13 @@ static void llc_process_tmr_ev(struct sock *sk, struct sk_buff *skb); static int llc_conn_ac_data_confirm(struct sock *sk, struct sk_buff *ev); +static int llc_conn_ac_inc_npta_value(struct sock *sk, struct sk_buff *skb); + +static int llc_conn_ac_send_rr_rsp_f_set_ackpf(struct sock *sk, + struct sk_buff *skb); + +static int llc_conn_ac_set_p_flag_1(struct sock *sk, struct sk_buff *skb); + #define INCORRECT 0 int llc_conn_ac_clear_remote_busy(struct sock *sk, struct sk_buff *skb) @@ -185,11 +192,6 @@ return 0; } -int llc_conn_ac_report_status(struct sock *sk, struct sk_buff *skb) -{ - return 0; -} - int llc_conn_ac_clear_remote_busy_if_f_eq_1(struct sock *sk, struct sk_buff 
*skb) { @@ -291,32 +293,6 @@ goto out; } -int llc_conn_ac_send_dm_rsp_f_set_f_flag(struct sock *sk, struct sk_buff *skb) -{ - int rc = -ENOBUFS; - struct sk_buff *nskb = llc_alloc_frame(); - - if (nskb) { - struct llc_opt *llc = llc_sk(sk); - struct llc_sap *sap = llc->sap; - u8 f_bit = llc->f_flag; - - nskb->dev = llc->dev; - llc_pdu_header_init(nskb, LLC_PDU_TYPE_U, sap->laddr.lsap, - llc->daddr.lsap, LLC_PDU_RSP); - llc_pdu_init_as_dm_rsp(nskb, f_bit); - rc = llc_mac_hdr_init(nskb, llc->dev->dev_addr, llc->daddr.mac); - if (rc) - goto free; - llc_conn_send_pdu(sk, nskb); - } -out: - return rc; -free: - kfree_skb(nskb); - goto out; -} - int llc_conn_ac_send_frmr_rsp_f_set_x(struct sock *sk, struct sk_buff *skb) { u8 f_bit; @@ -426,7 +402,7 @@ return rc; } -int llc_conn_ac_send_i_cmd_p_set_0(struct sock *sk, struct sk_buff *skb) +static int llc_conn_ac_send_i_cmd_p_set_0(struct sock *sk, struct sk_buff *skb) { int rc; struct llc_opt *llc = llc_sk(sk); @@ -443,27 +419,6 @@ return rc; } -int llc_conn_ac_resend_i_cmd_p_set_1(struct sock *sk, struct sk_buff *skb) -{ - struct llc_pdu_sn *pdu = llc_pdu_sn_hdr(skb); - u8 nr = LLC_I_GET_NR(pdu); - - llc_conn_resend_i_pdu_as_cmd(sk, nr, 1); - return 0; -} - -int llc_conn_ac_resend_i_cmd_p_set_1_or_send_rr(struct sock *sk, - struct sk_buff *skb) -{ - struct llc_pdu_sn *pdu = llc_pdu_sn_hdr(skb); - u8 nr = LLC_I_GET_NR(pdu); - int rc = llc_conn_ac_send_rr_cmd_p_set_1(sk, skb); - - if (!rc) - llc_conn_resend_i_pdu_as_cmd(sk, nr, 0); - return rc; -} - int llc_conn_ac_send_i_xxx_x_set_0(struct sock *sk, struct sk_buff *skb) { int rc; @@ -745,31 +700,6 @@ goto out; } -int llc_conn_ac_send_ack_cmd_p_set_1(struct sock *sk, struct sk_buff *skb) -{ - int rc = -ENOBUFS; - struct sk_buff *nskb = llc_alloc_frame(); - - if (nskb) { - struct llc_opt *llc = llc_sk(sk); - struct llc_sap *sap = llc->sap; - - nskb->dev = llc->dev; - llc_pdu_header_init(nskb, LLC_PDU_TYPE_S, sap->laddr.lsap, - llc->daddr.lsap, LLC_PDU_CMD); - 
llc_pdu_init_as_rr_cmd(nskb, 1, llc->vR); - rc = llc_mac_hdr_init(nskb, llc->dev->dev_addr, llc->daddr.mac); - if (rc) - goto free; - llc_conn_send_pdu(sk, nskb); - } -out: - return rc; -free: - kfree_skb(nskb); - goto out; -} - int llc_conn_ac_send_rr_rsp_f_set_1(struct sock *sk, struct sk_buff *skb) { int rc = -ENOBUFS; @@ -911,31 +841,6 @@ goto out; } -int llc_conn_ac_send_ua_rsp_f_set_f_flag(struct sock *sk, struct sk_buff *skb) -{ - int rc = -ENOBUFS; - struct sk_buff *nskb = llc_alloc_frame(); - - if (nskb) { - struct llc_opt *llc = llc_sk(sk); - struct llc_sap *sap = llc->sap; - - nskb->dev = llc->dev; - llc_pdu_header_init(nskb, LLC_PDU_TYPE_U, sap->laddr.lsap, - llc->daddr.lsap, LLC_PDU_RSP); - llc_pdu_init_as_ua_rsp(nskb, llc->f_flag); - rc = llc_mac_hdr_init(nskb, llc->dev->dev_addr, llc->daddr.mac); - if (rc) - goto free; - llc_conn_send_pdu(sk, nskb); - } -out: - return rc; -free: - kfree_skb(nskb); - goto out; -} - int llc_conn_ac_send_ua_rsp_f_set_p(struct sock *sk, struct sk_buff *skb) { u8 f_bit; @@ -1041,7 +946,8 @@ * set to one if one PDU with p-bit set to one is received. Returns 0 for * success, 1 otherwise. */ -int llc_conn_ac_send_i_rsp_f_set_ackpf(struct sock *sk, struct sk_buff *skb) +static int llc_conn_ac_send_i_rsp_f_set_ackpf(struct sock *sk, + struct sk_buff *skb) { int rc; struct llc_opt *llc = llc_sk(sk); @@ -1091,7 +997,8 @@ * if there is any. ack_pf flag indicates if a PDU has been received with * p-bit set to one. Returns 0 for success, 1 otherwise. */ -int llc_conn_ac_send_rr_rsp_f_set_ackpf(struct sock *sk, struct sk_buff *skb) +static int llc_conn_ac_send_rr_rsp_f_set_ackpf(struct sock *sk, + struct sk_buff *skb) { int rc = -ENOBUFS; struct sk_buff *nskb = llc_alloc_frame(); @@ -1126,7 +1033,7 @@ * acknowledgements decreases by increasing of "npta". Returns 0 for * success, 1 otherwise. 
*/ -int llc_conn_ac_inc_npta_value(struct sock *sk, struct sk_buff *skb) +static int llc_conn_ac_inc_npta_value(struct sock *sk, struct sk_buff *skb) { struct llc_opt *llc = llc_sk(sk); @@ -1387,7 +1294,7 @@ return 0; } -int llc_conn_ac_set_p_flag_1(struct sock *sk, struct sk_buff *skb) +static int llc_conn_ac_set_p_flag_1(struct sock *sk, struct sk_buff *skb) { llc_conn_set_p_flag(sk, 1); return 0; @@ -1453,12 +1360,6 @@ return 0; } -int llc_conn_ac_set_f_flag_p(struct sock *sk, struct sk_buff *skb) -{ - llc_pdu_decode_pf_bit(skb, &llc_sk(sk)->f_flag); - return 0; -} - void llc_conn_pf_cycle_tmr_cb(unsigned long timeout_data) { struct sock *sk = (struct sock *)timeout_data; --- linux-2.6.10-rc3-mm1-full/include/net/llc_c_ev.h.old 2004-12-14 21:26:24.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/llc_c_ev.h 2004-12-14 21:40:58.000000000 +0100 @@ -129,11 +129,9 @@ typedef int (*llc_conn_ev_qfyr_t)(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_conn_req(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_conn_resp(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_data_req(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_disc_req(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_rst_req(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_rst_resp(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_local_busy_detected(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_local_busy_cleared(struct sock *sk, struct sk_buff *skb); @@ -162,7 +160,6 @@ struct sk_buff *skb); extern int llc_conn_ev_rx_xxx_rsp_fbit_set_x(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_rx_xxx_yyy(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_rx_zzz_cmd_pbit_set_x_inval_nr(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_rx_zzz_rsp_fbit_set_x_inval_nr(struct sock *sk, @@ -171,13 +168,10 @@ extern int llc_conn_ev_ack_tmr_exp(struct sock *sk, struct 
sk_buff *skb); extern int llc_conn_ev_rej_tmr_exp(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_busy_tmr_exp(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_any_tmr_exp(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_sendack_tmr_exp(struct sock *sk, struct sk_buff *skb); /* NOT_USED functions and their variations */ extern int llc_conn_ev_rx_xxx_cmd_pbit_set_1(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_rx_xxx_cmd_pbit_set_0(struct sock *sk, - struct sk_buff *skb); extern int llc_conn_ev_rx_xxx_rsp_fbit_set_1(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_rx_i_cmd_pbit_set_0_unexpd_ns(struct sock *sk, @@ -252,20 +246,14 @@ struct sk_buff *skb); extern int llc_conn_ev_qlfy_cause_flag_eq_0(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_qlfy_init_p_f_cycle(struct sock *sk, - struct sk_buff *skb); extern int llc_conn_ev_qlfy_set_status_conn(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_qlfy_set_status_disc(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_qlfy_set_status_failed(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_qlfy_set_status_impossible(struct sock *sk, - struct sk_buff *skb); extern int llc_conn_ev_qlfy_set_status_remote_busy(struct sock *sk, struct sk_buff *skb); -extern int llc_conn_ev_qlfy_set_status_received(struct sock *sk, - struct sk_buff *skb); extern int llc_conn_ev_qlfy_set_status_refuse(struct sock *sk, struct sk_buff *skb); extern int llc_conn_ev_qlfy_set_status_conflict(struct sock *sk, --- linux-2.6.10-rc3-mm1-full/net/llc/llc_c_ev.c.old 2004-12-14 21:26:38.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_c_ev.c 2004-12-14 21:39:03.000000000 +0100 @@ -105,14 +105,6 @@ ev->prim_type == LLC_PRIM_TYPE_REQ ? 
0 : 1; } -int llc_conn_ev_conn_resp(struct sock *sk, struct sk_buff *skb) -{ - struct llc_conn_state_ev *ev = llc_conn_ev(skb); - - return ev->prim == LLC_CONN_PRIM && - ev->prim_type == LLC_PRIM_TYPE_RESP ? 0 : 1; -} - int llc_conn_ev_data_req(struct sock *sk, struct sk_buff *skb) { struct llc_conn_state_ev *ev = llc_conn_ev(skb); @@ -137,14 +129,6 @@ ev->prim_type == LLC_PRIM_TYPE_REQ ? 0 : 1; } -int llc_conn_ev_rst_resp(struct sock *sk, struct sk_buff *skb) -{ - struct llc_conn_state_ev *ev = llc_conn_ev(skb); - - return ev->prim == LLC_RESET_PRIM && - ev->prim_type == LLC_PRIM_TYPE_RESP ? 0 : 1; -} - int llc_conn_ev_local_busy_detected(struct sock *sk, struct sk_buff *skb) { struct llc_conn_state_ev *ev = llc_conn_ev(skb); @@ -474,27 +458,6 @@ return rc; } -int llc_conn_ev_rx_xxx_cmd_pbit_set_0(struct sock *sk, struct sk_buff *skb) -{ - u16 rc = 1; - struct llc_pdu_sn *pdu = llc_pdu_sn_hdr(skb); - - if (LLC_PDU_IS_CMD(pdu)) { - if (LLC_PDU_TYPE_IS_I(pdu) || LLC_PDU_TYPE_IS_S(pdu)) { - if (LLC_I_PF_IS_0(pdu)) - rc = 0; - } else if (LLC_PDU_TYPE_IS_U(pdu)) - switch (LLC_U_PDU_CMD(pdu)) { - case LLC_2_PDU_CMD_SABME: - case LLC_2_PDU_CMD_DISC: - if (LLC_U_PF_IS_0(pdu)) - rc = 0; - break; - } - } - return rc; -} - int llc_conn_ev_rx_xxx_cmd_pbit_set_x(struct sock *sk, struct sk_buff *skb) { u16 rc = 1; @@ -557,26 +520,6 @@ return rc; } -int llc_conn_ev_rx_xxx_yyy(struct sock *sk, struct sk_buff *skb) -{ - u16 rc = 1; - struct llc_pdu_un *pdu = llc_pdu_un_hdr(skb); - - if (LLC_PDU_TYPE_IS_I(pdu) || LLC_PDU_TYPE_IS_S(pdu)) - rc = 0; - else if (LLC_PDU_TYPE_IS_U(pdu)) - switch (LLC_U_PDU_CMD(pdu)) { - case LLC_2_PDU_CMD_SABME: - case LLC_2_PDU_CMD_DISC: - case LLC_2_PDU_RSP_UA: - case LLC_2_PDU_RSP_DM: - case LLC_2_PDU_RSP_FRMR: - rc = 0; - break; - } - return rc; -} - int llc_conn_ev_rx_zzz_cmd_pbit_set_x_inval_nr(struct sock *sk, struct sk_buff *skb) { @@ -646,16 +589,6 @@ return ev->type != LLC_CONN_EV_TYPE_BUSY_TMR; } -int llc_conn_ev_any_tmr_exp(struct sock *sk, 
struct sk_buff *skb) -{ - struct llc_conn_state_ev *ev = llc_conn_ev(skb); - - return ev->type == LLC_CONN_EV_TYPE_P_TMR || - ev->type == LLC_CONN_EV_TYPE_ACK_TMR || - ev->type == LLC_CONN_EV_TYPE_REJ_TMR || - ev->type == LLC_CONN_EV_TYPE_BUSY_TMR ? 0 : 1; -} - int llc_conn_ev_init_p_f_cycle(struct sock *sk, struct sk_buff *skb) { return 1; @@ -778,11 +711,6 @@ return llc_sk(sk)->cause_flag; } -int llc_conn_ev_qlfy_init_p_f_cycle(struct sock *sk, struct sk_buff *skb) -{ - return 0; -} - int llc_conn_ev_qlfy_set_status_conn(struct sock *sk, struct sk_buff *skb) { struct llc_conn_state_ev *ev = llc_conn_ev(skb); @@ -799,14 +727,6 @@ return 0; } -int llc_conn_ev_qlfy_set_status_impossible(struct sock *sk, struct sk_buff *skb) -{ - struct llc_conn_state_ev *ev = llc_conn_ev(skb); - - ev->status = LLC_STATUS_IMPOSSIBLE; - return 0; -} - int llc_conn_ev_qlfy_set_status_failed(struct sock *sk, struct sk_buff *skb) { struct llc_conn_state_ev *ev = llc_conn_ev(skb); @@ -824,14 +744,6 @@ return 0; } -int llc_conn_ev_qlfy_set_status_received(struct sock *sk, struct sk_buff *skb) -{ - struct llc_conn_state_ev *ev = llc_conn_ev(skb); - - ev->status = LLC_STATUS_RECEIVED; - return 0; -} - int llc_conn_ev_qlfy_set_status_refuse(struct sock *sk, struct sk_buff *skb) { struct llc_conn_state_ev *ev = llc_conn_ev(skb); --- linux-2.6.10-rc3-mm1-full/include/net/llc_conn.h.old 2004-12-14 21:29:57.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/llc_conn.h 2004-12-14 21:30:20.000000000 +0100 @@ -91,7 +91,6 @@ extern void llc_sk_free(struct sock *sk); extern void llc_sk_reset(struct sock *sk); -extern int llc_sk_init(struct sock *sk); /* Access to a connection */ extern int llc_conn_state_process(struct sock *sk, struct sk_buff *skb); @@ -106,8 +105,6 @@ extern struct sock *llc_lookup_established(struct llc_sap *sap, struct llc_addr *daddr, struct llc_addr *laddr); -extern struct sock *llc_lookup_listener(struct llc_sap *sap, - struct llc_addr *laddr); extern void 
llc_sap_add_socket(struct llc_sap *sap, struct sock *sk); extern void llc_sap_remove_socket(struct llc_sap *sap, struct sock *sk); --- linux-2.6.10-rc3-mm1-full/net/llc/llc_conn.c.old 2004-12-14 21:29:09.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_conn.c 2004-12-14 21:30:31.000000000 +0100 @@ -503,7 +503,8 @@ * local mac, and local sap. Returns pointer for parent socket found, * %NULL otherwise. */ -struct sock *llc_lookup_listener(struct llc_sap *sap, struct llc_addr *laddr) +static struct sock *llc_lookup_listener(struct llc_sap *sap, + struct llc_addr *laddr) { struct sock *rc; struct hlist_node *node; @@ -546,7 +547,7 @@ * Finds offset of next category of transitions in transition table. * Returns the start index of next category. */ -u16 find_next_offset(struct llc_conn_state *state, u16 offset) +static u16 find_next_offset(struct llc_conn_state *state, u16 offset) { u16 cnt = 0; struct llc_conn_state_trans **next_trans; @@ -785,7 +786,7 @@ * * Initializes a socket with default llc values. */ -int llc_sk_init(struct sock* sk) +static int llc_sk_init(struct sock* sk) { struct llc_opt *llc = kmalloc(sizeof(*llc), GFP_ATOMIC); int rc = -ENOMEM; --- linux-2.6.10-rc3-mm1-full/net/llc/llc_core.c.old 2004-12-14 21:30:45.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_core.c 2004-12-14 21:31:57.000000000 +0100 @@ -31,7 +31,7 @@ * * Allocates and initializes sap. */ -struct llc_sap *llc_sap_alloc(void) +static struct llc_sap *llc_sap_alloc(void) { struct llc_sap *sap = kmalloc(sizeof(*sap), GFP_ATOMIC); @@ -50,7 +50,7 @@ * * Adds a sap to the LLC's station sap list. */ -void llc_add_sap(struct llc_sap *sap) +static void llc_add_sap(struct llc_sap *sap) { write_lock_bh(&llc_sap_list_lock); list_add_tail(&sap->node, &llc_sap_list); @@ -63,7 +63,7 @@ * * Removes a sap to the LLC's station sap list. 
*/ -void llc_del_sap(struct llc_sap *sap) +static void llc_del_sap(struct llc_sap *sap) { write_lock_bh(&llc_sap_list_lock); list_del(&sap->node); @@ -169,7 +169,6 @@ EXPORT_SYMBOL(llc_station_mac_sa); EXPORT_SYMBOL(llc_sap_list); -EXPORT_SYMBOL(llc_sap_list_lock); EXPORT_SYMBOL(llc_sap_find); EXPORT_SYMBOL(llc_sap_open); EXPORT_SYMBOL(llc_sap_close); --- linux-2.6.10-rc3-mm1-full/net/llc/llc_if.c.old 2004-12-14 21:32:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_if.c 2004-12-14 21:32:20.000000000 +0100 @@ -155,27 +155,3 @@ return rc; } -/** - * llc_build_and_send_reset_pkt - Resets an established LLC connection - * @prim: pointer to structure that contains service parameters. - * - * Called when upper layer wants to reset an established LLC connection - * with a remote machine. This function packages a proper event and sends - * it to connection component state machine. Returns 0 for success, 1 - * otherwise. - */ -int llc_build_and_send_reset_pkt(struct sock *sk) -{ - int rc = 1; - struct sk_buff *skb = alloc_skb(0, GFP_ATOMIC); - - if (skb) { - struct llc_conn_state_ev *ev = llc_conn_ev(skb); - - ev->type = LLC_CONN_EV_TYPE_PRIM; - ev->prim = LLC_RESET_PRIM; - ev->prim_type = LLC_PRIM_TYPE_REQ; - rc = llc_conn_state_process(sk, skb); - } - return rc; -} --- linux-2.6.10-rc3-mm1-full/include/net/llc_pdu.h.old 2004-12-14 21:32:38.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/llc_pdu.h 2004-12-14 21:32:43.000000000 +0100 @@ -419,7 +419,6 @@ extern void llc_pdu_set_cmd_rsp(struct sk_buff *skb, u8 type); extern void llc_pdu_set_pf_bit(struct sk_buff *skb, u8 bit_value); extern void llc_pdu_decode_pf_bit(struct sk_buff *skb, u8 *pf_bit); -extern void llc_pdu_decode_cr_bit(struct sk_buff *skb, u8 *cr_bit); extern void llc_pdu_init_as_disc_cmd(struct sk_buff *skb, u8 p_bit); extern void llc_pdu_init_as_i_cmd(struct sk_buff *skb, u8 p_bit, u8 ns, u8 nr); extern void llc_pdu_init_as_rej_cmd(struct sk_buff *skb, u8 p_bit, u8 nr); --- 
linux-2.6.10-rc3-mm1-full/net/llc/llc_pdu.c.old 2004-12-14 21:32:53.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_pdu.c 2004-12-14 21:33:00.000000000 +0100 @@ -80,19 +80,6 @@ } /** - * llc_pdu_decode_cr_bit - extracts command response bit from LLC header - * @skb: input skb that c/r bit must be extracted from it. - * @cr_bit: command/response bit (0 or 1). - * - * This function extracts command/response bit from LLC header. this bit - * is right bit of source SAP. - */ -void llc_pdu_decode_cr_bit(struct sk_buff *skb, u8 *cr_bit) -{ - *cr_bit = llc_pdu_un_hdr(skb)->ssap & LLC_PDU_CMD_RSP_MASK; -} - -/** * llc_pdu_init_as_disc_cmd - Builds DISC PDU * @skb: Address of the skb to build * @p_bit: The P bit to set in the PDU --- linux-2.6.10-rc3-mm1-full/net/llc/llc_proc.c.old 2004-12-14 21:33:18.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_proc.c 2004-12-14 21:33:44.000000000 +0100 @@ -185,14 +185,14 @@ return 0; } -struct seq_operations llc_seq_socket_ops = { +static struct seq_operations llc_seq_socket_ops = { .start = llc_seq_start, .next = llc_seq_next, .stop = llc_seq_stop, .show = llc_seq_socket_show, }; -struct seq_operations llc_seq_core_ops = { +static struct seq_operations llc_seq_core_ops = { .start = llc_seq_start, .next = llc_seq_next, .stop = llc_seq_stop, --- linux-2.6.10-rc3-mm1-full/include/net/llc_sap.h.old 2004-12-14 21:34:59.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/llc_sap.h 2004-12-14 21:35:05.000000000 +0100 @@ -14,7 +14,6 @@ struct llc_sap; struct sk_buff; -extern void llc_sap_state_process(struct llc_sap *sap, struct sk_buff *skb); extern void llc_sap_rtn_pdu(struct llc_sap *sap, struct sk_buff *skb); extern void llc_save_primitive(struct sk_buff* skb, unsigned char prim); extern struct sk_buff *llc_alloc_frame(void); --- linux-2.6.10-rc3-mm1-full/net/llc/llc_sap.c.old 2004-12-14 21:34:03.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_sap.c 2004-12-14 21:34:48.000000000 +0100 @@ -173,7 +173,7 
@@ * if needed(on receiving an UI frame). sk can be null for the * datalink_proto case. */ -void llc_sap_state_process(struct llc_sap *sap, struct sk_buff *skb) +static void llc_sap_state_process(struct llc_sap *sap, struct sk_buff *skb) { struct llc_sap_state_ev *ev = llc_sap_ev(skb); @@ -275,7 +275,8 @@ * Search socket list of the SAP and finds connection using the local * mac, and local sap. Returns pointer for socket found, %NULL otherwise. */ -struct sock *llc_lookup_dgram(struct llc_sap *sap, struct llc_addr *laddr) +static struct sock *llc_lookup_dgram(struct llc_sap *sap, + struct llc_addr *laddr) { struct sock *rc; struct hlist_node *node; --- linux-2.6.10-rc3-mm1-full/net/llc/llc_station.c.old 2004-12-14 21:35:20.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/llc/llc_station.c 2004-12-14 21:35:29.000000000 +0100 @@ -642,7 +642,7 @@ * Queues an event (on the station event queue) for handling by the * station state machine and attempts to process any queued-up events. */ -void llc_station_state_process(struct sk_buff *skb) +static void llc_station_state_process(struct sk_buff *skb) { spin_lock_bh(&llc_main_station.ev_q.lock); skb_queue_tail(&llc_main_station.ev_q.list, skb); From bunk@stusta.de Tue Dec 14 17:20:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:20:30 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1K1nM016836 for ; Tue, 14 Dec 2004 17:20:22 -0800 Received: (qmail 1763 invoked from network); 15 Dec 2004 01:19:33 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 15 Dec 2004 01:19:33 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 76809AF651; Wed, 15 Dec 2004 02:19:31 +0100 (CET) Date: Wed, 15 Dec 2004 02:19:31 +0100 From: Adrian Bunk To: coreteam@netfilter.org Cc: netfilter-devel@lists.netfilter.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: 
[2.6 patch] net/ipv4/netfilter/: misc possible cleanups Message-ID: <20041215011931.GD12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12756 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make some needlessly global code static - remove the following unused global functions: - ip_conntrack_core.c: ip_conntrack_expect_find_get - ip_conntrack_core.c: ip_conntrack_unexpect_related - ip_nat_standalone.c: ip_nat_protocol_register - ip_nat_standalone.c: ip_nat_protocol_unregister - ip_nat_helper.c: ip_nat_find_helper - ipfwadm_core.c: ip_acct_ctl - remove the following variables that never change their values: - ip_conntrack_ftp.c: ip_conntrack_ftp - ip_conntrack_irc.c: ip_conntrack_irc - remove the following unneeded EXPORT_SYMBOL's: - ip_conntrack_standalone.c: ip_ct_find_helper - ip_conntrack_standalone.c: ip_conntrack_unexpect_related - ip_conntrack_standalone.c: ip_conntrack_expect_list - ip_conntrack_standalone.c: ip_conntrack_put - ip_nat_standalone.c: ip_nat_protocol_register - ip_nat_standalone.c: ip_nat_protocol_unregister - ip_nat_standalone.c: ip_nat_find_helper - remove the following unneeded EXPORT_SYMBOL_GPL: - ip_conntrack_standalone.c: ip_conntrack_expect_find_get Please comment on which of these changes are correct and which conflict with pending patches. 
diffstat output: include/linux/netfilter_ipv4/ip_conntrack.h | 7 include/linux/netfilter_ipv4/ip_conntrack_helper.h | 4 include/linux/netfilter_ipv4/ip_nat_core.h | 4 include/linux/netfilter_ipv4/ip_nat_helper.h | 3 include/linux/netfilter_ipv4/ip_nat_protocol.h | 4 include/linux/netfilter_ipv4/ipfwadm_core.h | 9 - net/ipv4/netfilter/ip_conntrack_core.c | 28 --- net/ipv4/netfilter/ip_conntrack_ftp.c | 3 net/ipv4/netfilter/ip_conntrack_irc.c | 8 net/ipv4/netfilter/ip_conntrack_proto_sctp.c | 20 +- net/ipv4/netfilter/ip_conntrack_standalone.c | 5 net/ipv4/netfilter/ip_nat_core.c | 94 +++++------ net/ipv4/netfilter/ip_nat_helper.c | 14 - net/ipv4/netfilter/ip_nat_standalone.c | 30 --- net/ipv4/netfilter/ipchains_core.c | 22 +- net/ipv4/netfilter/ipfwadm_core.c | 108 +++---------- net/ipv4/netfilter/ipt_CLUSTERIP.c | 2 net/ipv4/netfilter/ipt_ULOG.c | 4 net/ipv4/netfilter/ipt_hashlimit.c | 2 net/ipv4/netfilter/ipt_recent.c | 2 20 files changed, 111 insertions(+), 262 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_conntrack.h.old 2004-12-14 03:53:07.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_conntrack.h 2004-12-14 03:55:53.000000000 +0100 @@ -244,13 +244,6 @@ return (struct ip_conntrack *)skb->nfct; } -/* decrement reference count on a conntrack */ -extern inline void ip_conntrack_put(struct ip_conntrack *ct); - -/* find unconfirmed expectation based on tuple */ -struct ip_conntrack_expect * -ip_conntrack_expect_find_get(const struct ip_conntrack_tuple *tuple); - /* decrement reference count on an expectation */ void ip_conntrack_expect_put(struct ip_conntrack_expect *exp); --- linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_conntrack_helper.h.old 2004-12-14 03:56:52.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_conntrack_helper.h 2004-12-14 03:57:33.000000000 +0100 @@ -33,9 +33,6 @@ extern int ip_conntrack_helper_register(struct 
ip_conntrack_helper *); extern void ip_conntrack_helper_unregister(struct ip_conntrack_helper *); -extern struct ip_conntrack_helper *ip_ct_find_helper(const struct ip_conntrack_tuple *tuple); - - /* Allocate space for an expectation: this is mandatory before calling ip_conntrack_expect_related. */ extern struct ip_conntrack_expect *ip_conntrack_expect_alloc(void); @@ -44,6 +41,5 @@ struct ip_conntrack *related_to); extern int ip_conntrack_change_expect(struct ip_conntrack_expect *expect, struct ip_conntrack_tuple *newtuple); -extern void ip_conntrack_unexpect_related(struct ip_conntrack_expect *exp); #endif /*_IP_CONNTRACK_HELPER_H*/ --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_standalone.c.old 2004-12-14 03:53:25.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_standalone.c 2004-12-14 03:57:37.000000000 +0100 @@ -892,22 +892,17 @@ EXPORT_SYMBOL(ip_ct_refresh_acct); EXPORT_SYMBOL(ip_ct_protos); EXPORT_SYMBOL(ip_ct_find_proto); -EXPORT_SYMBOL(ip_ct_find_helper); EXPORT_SYMBOL(ip_conntrack_expect_alloc); EXPORT_SYMBOL(ip_conntrack_expect_related); EXPORT_SYMBOL(ip_conntrack_change_expect); -EXPORT_SYMBOL(ip_conntrack_unexpect_related); -EXPORT_SYMBOL_GPL(ip_conntrack_expect_find_get); EXPORT_SYMBOL_GPL(ip_conntrack_expect_put); EXPORT_SYMBOL(ip_conntrack_tuple_taken); EXPORT_SYMBOL(ip_ct_gather_frags); EXPORT_SYMBOL(ip_conntrack_htable_size); -EXPORT_SYMBOL(ip_conntrack_expect_list); EXPORT_SYMBOL(ip_conntrack_lock); EXPORT_SYMBOL(ip_conntrack_hash); EXPORT_SYMBOL(ip_conntrack_untracked); EXPORT_SYMBOL_GPL(ip_conntrack_find_get); -EXPORT_SYMBOL_GPL(ip_conntrack_put); #ifdef CONFIG_IP_NF_NAT_NEEDED EXPORT_SYMBOL(ip_conntrack_tcp_update); #endif --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_core.c.old 2004-12-14 03:53:36.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_core.c 2004-12-14 03:57:45.000000000 +0100 @@ -77,7 +77,7 @@ DEFINE_PER_CPU(struct ip_conntrack_stat, 
ip_conntrack_stat); -inline void +static inline void ip_conntrack_put(struct ip_conntrack *ct) { IP_NF_ASSERT(ct); @@ -173,23 +173,6 @@ struct ip_conntrack_expect *, tuple); } -/* Find a expectation corresponding to a tuple. */ -struct ip_conntrack_expect * -ip_conntrack_expect_find_get(const struct ip_conntrack_tuple *tuple) -{ - struct ip_conntrack_expect *exp; - - READ_LOCK(&ip_conntrack_lock); - READ_LOCK(&ip_conntrack_expect_tuple_lock); - exp = __ip_ct_expect_find(tuple); - if (exp) - atomic_inc(&exp->use); - READ_UNLOCK(&ip_conntrack_expect_tuple_lock); - READ_UNLOCK(&ip_conntrack_lock); - - return exp; -} - /* remove one specific expectation from all lists and drop refcount, * does _NOT_ delete the timer. */ static void __unexpect_related(struct ip_conntrack_expect *expect) @@ -497,7 +480,7 @@ return ip_ct_tuple_mask_cmp(rtuple, &i->tuple, &i->mask); } -struct ip_conntrack_helper *ip_ct_find_helper(const struct ip_conntrack_tuple *tuple) +static struct ip_conntrack_helper *ip_ct_find_helper(const struct ip_conntrack_tuple *tuple) { return LIST_FIND(&helpers, helper_cmp, struct ip_conntrack_helper *, @@ -812,13 +795,6 @@ return ip_ct_tuple_mask_cmp(&i->tuple, tuple, &intersect_mask); } -inline void ip_conntrack_unexpect_related(struct ip_conntrack_expect *expect) -{ - WRITE_LOCK(&ip_conntrack_lock); - unexpect_related(expect); - WRITE_UNLOCK(&ip_conntrack_lock); -} - static void expectation_timed_out(unsigned long ul_expect) { struct ip_conntrack_expect *expect = (void *) ul_expect; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_ftp.c.old 2004-12-14 03:58:12.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_ftp.c 2004-12-14 03:58:47.000000000 +0100 @@ -29,7 +29,6 @@ static char ftp_buffer[65536]; static DECLARE_LOCK(ip_ftp_lock); -struct module *ip_conntrack_ftp = THIS_MODULE; #define MAX_PORTS 8 static int ports[MAX_PORTS]; @@ -438,7 +437,7 @@ ftp[i].max_expected = 1; ftp[i].timeout = 0; ftp[i].flags = 
IP_CT_HELPER_F_REUSE_EXPECT; - ftp[i].me = ip_conntrack_ftp; + ftp[i].me = THIS_MODULE; ftp[i].help = help; tmpname = &ftp_names[i][0]; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_irc.c.old 2004-12-14 03:59:08.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_irc.c 2004-12-14 04:00:11.000000000 +0100 @@ -56,8 +56,6 @@ static char *dccprotos[] = { "SEND ", "CHAT ", "MOVE ", "TSEND ", "SCHAT " }; #define MINMATCHLEN 5 -struct module *ip_conntrack_irc = THIS_MODULE; - #if 0 #define DEBUGP(format, args...) printk(KERN_DEBUG "%s:%s:" format, \ __FILE__, __FUNCTION__ , ## args) @@ -65,8 +63,8 @@ #define DEBUGP(format, args...) #endif -int parse_dcc(char *data, char *data_end, u_int32_t * ip, u_int16_t * port, - char **ad_beg_p, char **ad_end_p) +static int parse_dcc(char *data, char *data_end, u_int32_t * ip, + u_int16_t * port, char **ad_beg_p, char **ad_end_p) /* tries to get the ip_addr and port out of a dcc command return value: -1 on failure, 0 on success data pointer to first byte of DCC command data @@ -269,7 +267,7 @@ hlpr->max_expected = max_dcc_channels; hlpr->timeout = dcc_timeout; hlpr->flags = IP_CT_HELPER_F_REUSE_EXPECT; - hlpr->me = ip_conntrack_irc; + hlpr->me = THIS_MODULE; hlpr->help = help; tmpname = &irc_names[i][0]; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_proto_sctp.c.old 2004-12-14 04:00:28.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_conntrack_proto_sctp.c 2004-12-14 04:01:47.000000000 +0100 @@ -58,13 +58,13 @@ #define HOURS * 60 MINS #define DAYS * 24 HOURS -unsigned long ip_ct_sctp_timeout_closed = 10 SECS; -unsigned long ip_ct_sctp_timeout_cookie_wait = 3 SECS; -unsigned long ip_ct_sctp_timeout_cookie_echoed = 3 SECS; -unsigned long ip_ct_sctp_timeout_established = 5 DAYS; -unsigned long ip_ct_sctp_timeout_shutdown_sent = 300 SECS / 1000; -unsigned long ip_ct_sctp_timeout_shutdown_recd = 300 SECS / 1000; -unsigned long ip_ct_sctp_timeout_shutdown_ack_sent 
= 3 SECS; +static unsigned long ip_ct_sctp_timeout_closed = 10 SECS; +static unsigned long ip_ct_sctp_timeout_cookie_wait = 3 SECS; +static unsigned long ip_ct_sctp_timeout_cookie_echoed = 3 SECS; +static unsigned long ip_ct_sctp_timeout_established = 5 DAYS; +static unsigned long ip_ct_sctp_timeout_shutdown_sent = 300 SECS / 1000; +static unsigned long ip_ct_sctp_timeout_shutdown_recd = 300 SECS / 1000; +static unsigned long ip_ct_sctp_timeout_shutdown_ack_sent = 3 SECS; static unsigned long * sctp_timeouts[] = { NULL, /* SCTP_CONNTRACK_NONE */ @@ -501,7 +501,7 @@ return 0; } -struct ip_conntrack_protocol ip_conntrack_protocol_sctp = { +static struct ip_conntrack_protocol ip_conntrack_protocol_sctp = { .proto = IPPROTO_SCTP, .name = "sctp", .pkt_to_tuple = sctp_pkt_to_tuple, @@ -609,7 +609,7 @@ static struct ctl_table_header *ip_ct_sysctl_header; #endif -int __init init(void) +static int __init init(void) { int ret; @@ -639,7 +639,7 @@ return ret; } -void __exit fini(void) +static void __exit fini(void) { ip_conntrack_protocol_unregister(&ip_conntrack_protocol_sctp); #ifdef CONFIG_SYSCTL --- linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_nat_core.h.old 2004-12-14 04:04:27.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_nat_core.h 2004-12-14 04:04:36.000000000 +0100 @@ -19,9 +19,5 @@ unsigned int hooknum, int dir); -extern void replace_in_hashes(struct ip_conntrack *conntrack, - struct ip_nat_info *info); -extern void place_in_hashes(struct ip_conntrack *conntrack, - struct ip_nat_info *info); #endif /* _IP_NAT_CORE_H */ --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_nat_core.c.old 2004-12-14 04:04:43.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_nat_core.c 2004-12-14 04:05:37.000000000 +0100 @@ -479,6 +479,53 @@ #endif }; +static void replace_in_hashes(struct ip_conntrack *conntrack, + struct ip_nat_info *info) +{ + /* Source has changed, so replace in hashes. 
*/ + unsigned int srchash + = hash_by_src(&conntrack->tuplehash[IP_CT_DIR_ORIGINAL] + .tuple.src, + conntrack->tuplehash[IP_CT_DIR_ORIGINAL] + .tuple.dst.protonum); + /* We place packet as seen OUTGOUNG in byips_proto hash + (ie. reverse dst and src of reply packet. */ + unsigned int ipsprotohash + = hash_by_ipsproto(conntrack->tuplehash[IP_CT_DIR_REPLY] + .tuple.dst.ip, + conntrack->tuplehash[IP_CT_DIR_REPLY] + .tuple.src.ip, + conntrack->tuplehash[IP_CT_DIR_REPLY] + .tuple.dst.protonum); + + MUST_BE_WRITE_LOCKED(&ip_nat_lock); + list_move(&info->bysource, &bysource[srchash]); + list_move(&info->byipsproto, &byipsproto[ipsprotohash]); +} + +static void place_in_hashes(struct ip_conntrack *conntrack, + struct ip_nat_info *info) +{ + unsigned int srchash + = hash_by_src(&conntrack->tuplehash[IP_CT_DIR_ORIGINAL] + .tuple.src, + conntrack->tuplehash[IP_CT_DIR_ORIGINAL] + .tuple.dst.protonum); + /* We place packet as seen OUTGOUNG in byips_proto hash + (ie. reverse dst and src of reply packet. */ + unsigned int ipsprotohash + = hash_by_ipsproto(conntrack->tuplehash[IP_CT_DIR_REPLY] + .tuple.dst.ip, + conntrack->tuplehash[IP_CT_DIR_REPLY] + .tuple.src.ip, + conntrack->tuplehash[IP_CT_DIR_REPLY] + .tuple.dst.protonum); + + MUST_BE_WRITE_LOCKED(&ip_nat_lock); + list_add(&info->bysource, &bysource[srchash]); + list_add(&info->byipsproto, &byipsproto[ipsprotohash]); +} + unsigned int ip_nat_setup_info(struct ip_conntrack *conntrack, const struct ip_nat_multi_range *mr, @@ -620,53 +667,6 @@ return NF_ACCEPT; } -void replace_in_hashes(struct ip_conntrack *conntrack, - struct ip_nat_info *info) -{ - /* Source has changed, so replace in hashes. */ - unsigned int srchash - = hash_by_src(&conntrack->tuplehash[IP_CT_DIR_ORIGINAL] - .tuple.src, - conntrack->tuplehash[IP_CT_DIR_ORIGINAL] - .tuple.dst.protonum); - /* We place packet as seen OUTGOUNG in byips_proto hash - (ie. reverse dst and src of reply packet. 
*/ - unsigned int ipsprotohash - = hash_by_ipsproto(conntrack->tuplehash[IP_CT_DIR_REPLY] - .tuple.dst.ip, - conntrack->tuplehash[IP_CT_DIR_REPLY] - .tuple.src.ip, - conntrack->tuplehash[IP_CT_DIR_REPLY] - .tuple.dst.protonum); - - MUST_BE_WRITE_LOCKED(&ip_nat_lock); - list_move(&info->bysource, &bysource[srchash]); - list_move(&info->byipsproto, &byipsproto[ipsprotohash]); -} - -void place_in_hashes(struct ip_conntrack *conntrack, - struct ip_nat_info *info) -{ - unsigned int srchash - = hash_by_src(&conntrack->tuplehash[IP_CT_DIR_ORIGINAL] - .tuple.src, - conntrack->tuplehash[IP_CT_DIR_ORIGINAL] - .tuple.dst.protonum); - /* We place packet as seen OUTGOUNG in byips_proto hash - (ie. reverse dst and src of reply packet. */ - unsigned int ipsprotohash - = hash_by_ipsproto(conntrack->tuplehash[IP_CT_DIR_REPLY] - .tuple.dst.ip, - conntrack->tuplehash[IP_CT_DIR_REPLY] - .tuple.src.ip, - conntrack->tuplehash[IP_CT_DIR_REPLY] - .tuple.dst.protonum); - - MUST_BE_WRITE_LOCKED(&ip_nat_lock); - list_add(&info->bysource, &bysource[srchash]); - list_add(&info->byipsproto, &byipsproto[ipsprotohash]); -} - /* Returns true if succeeded. */ static int manip_pkt(u_int16_t proto, --- linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_nat_helper.h.old 2004-12-14 04:05:58.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_nat_helper.h 2004-12-14 04:06:08.000000000 +0100 @@ -42,9 +42,6 @@ extern void ip_nat_helper_unregister(struct ip_nat_helper *me); extern struct ip_nat_helper * -ip_nat_find_helper(const struct ip_conntrack_tuple *tuple); - -extern struct ip_nat_helper * __ip_nat_find_helper(const struct ip_conntrack_tuple *tuple); /* These return true or false. */ --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_nat_standalone.c.old 2004-12-14 04:06:19.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_nat_standalone.c 2004-12-14 04:08:05.000000000 +0100 @@ -279,33 +279,6 @@ }; #endif -/* Protocol registration. 
*/ -int ip_nat_protocol_register(struct ip_nat_protocol *proto) -{ - int ret = 0; - - WRITE_LOCK(&ip_nat_lock); - if (ip_nat_protos[proto->protonum] != &ip_nat_unknown_protocol) { - ret = -EBUSY; - goto out; - } - ip_nat_protos[proto->protonum] = proto; - out: - WRITE_UNLOCK(&ip_nat_lock); - return ret; -} - -/* Noone stores the protocol anywhere; simply delete it. */ -void ip_nat_protocol_unregister(struct ip_nat_protocol *proto) -{ - WRITE_LOCK(&ip_nat_lock); - ip_nat_protos[proto->protonum] = &ip_nat_unknown_protocol; - WRITE_UNLOCK(&ip_nat_lock); - - /* Someone could be still looking at the proto in a bh. */ - synchronize_net(); -} - static int init_or_cleanup(int init) { int ret = 0; @@ -381,14 +354,11 @@ module_exit(fini); EXPORT_SYMBOL(ip_nat_setup_info); -EXPORT_SYMBOL(ip_nat_protocol_register); -EXPORT_SYMBOL(ip_nat_protocol_unregister); EXPORT_SYMBOL(ip_nat_helper_register); EXPORT_SYMBOL(ip_nat_helper_unregister); EXPORT_SYMBOL(ip_nat_cheat_check); EXPORT_SYMBOL(ip_nat_mangle_tcp_packet); EXPORT_SYMBOL(ip_nat_mangle_udp_packet); EXPORT_SYMBOL(ip_nat_used_tuple); -EXPORT_SYMBOL(ip_nat_find_helper); EXPORT_SYMBOL(__ip_nat_find_helper); MODULE_LICENSE("GPL"); --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_nat_helper.c.old 2004-12-14 04:06:33.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ip_nat_helper.c 2004-12-14 04:07:06.000000000 +0100 @@ -48,7 +48,7 @@ #endif static LIST_HEAD(helpers); -DECLARE_LOCK(ip_nat_seqofs_lock); +static DECLARE_LOCK(ip_nat_seqofs_lock); /* Setup TCP sequence correction given this change at this sequence */ static inline void @@ -431,18 +431,6 @@ return LIST_FIND(&helpers, helper_cmp, struct ip_nat_helper *, tuple); } -struct ip_nat_helper * -ip_nat_find_helper(const struct ip_conntrack_tuple *tuple) -{ - struct ip_nat_helper *h; - - READ_LOCK(&ip_nat_lock); - h = __ip_nat_find_helper(tuple); - READ_UNLOCK(&ip_nat_lock); - - return h; -} - static int kill_helper(const struct ip_conntrack *i, void *helper) { 
--- linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_nat_protocol.h.old 2004-12-14 04:07:23.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ip_nat_protocol.h 2004-12-14 04:07:42.000000000 +0100 @@ -48,10 +48,6 @@ #define MAX_IP_NAT_PROTO 256 extern struct ip_nat_protocol *ip_nat_protos[MAX_IP_NAT_PROTO]; -/* Protocol registration. */ -extern int ip_nat_protocol_register(struct ip_nat_protocol *proto); -extern void ip_nat_protocol_unregister(struct ip_nat_protocol *proto); - static inline struct ip_nat_protocol *ip_nat_find_proto(u_int8_t protocol) { return ip_nat_protos[protocol]; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipchains_core.c.old 2004-12-14 04:08:35.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipchains_core.c 2004-12-14 04:10:24.000000000 +0100 @@ -266,7 +266,7 @@ #endif /* Lock around ip_fw_chains linked list structure */ -rwlock_t ip_fw_lock = RW_LOCK_UNLOCKED; +static rwlock_t ip_fw_lock = RW_LOCK_UNLOCKED; /* Head of linked list of fw rules */ static struct ip_chain *ip_fw_chains; @@ -1758,17 +1758,17 @@ /* * Interface to the generic firewall chains. */ -int ipfw_input_check(struct firewall_ops *this, int pf, - struct net_device *dev, void *arg, - struct sk_buff **pskb) +static int ipfw_input_check(struct firewall_ops *this, int pf, + struct net_device *dev, void *arg, + struct sk_buff **pskb) { return ip_fw_check(dev->name, arg, IP_FW_INPUT_CHAIN, pskb, SLOT_NUMBER(), 0); } -int ipfw_output_check(struct firewall_ops *this, int pf, - struct net_device *dev, void *arg, - struct sk_buff **pskb) +static int ipfw_output_check(struct firewall_ops *this, int pf, + struct net_device *dev, void *arg, + struct sk_buff **pskb) { /* Locally generated bogus packets by root. . 
*/ if ((*pskb)->len < sizeof(struct iphdr) || @@ -1778,15 +1778,15 @@ arg, IP_FW_OUTPUT_CHAIN, pskb, SLOT_NUMBER(), 0); } -int ipfw_forward_check(struct firewall_ops *this, int pf, - struct net_device *dev, void *arg, - struct sk_buff **pskb) +static int ipfw_forward_check(struct firewall_ops *this, int pf, + struct net_device *dev, void *arg, + struct sk_buff **pskb) { return ip_fw_check(dev->name, arg, IP_FW_FORWARD_CHAIN, pskb, SLOT_NUMBER(), 0); } -struct firewall_ops ipfw_ops = { +static struct firewall_ops ipfw_ops = { .fw_forward = ipfw_forward_check, .fw_input = ipfw_input_check, .fw_output = ipfw_output_check, --- linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ipfwadm_core.h.old 2004-12-14 04:11:30.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/linux/netfilter_ipv4/ipfwadm_core.h 2004-12-14 04:17:32.000000000 +0100 @@ -229,17 +229,10 @@ #include #ifdef CONFIG_IP_FIREWALL -extern struct ip_fw *ip_fw_in_chain; -extern struct ip_fw *ip_fw_out_chain; -extern struct ip_fw *ip_fw_fwd_chain; -extern int ip_fw_in_policy; -extern int ip_fw_out_policy; -extern int ip_fw_fwd_policy; extern int ip_fw_ctl(int, void *, int); #endif #ifdef CONFIG_IP_ACCT extern struct ip_fw *ip_acct_chain; -extern int ip_acct_ctl(int, void *, int); #endif #ifdef CONFIG_IP_MASQUERADE extern int ip_masq_ctl(int, void *, int); @@ -250,7 +243,5 @@ extern int ip_fw_masq_timeouts(void *user, int len); -extern int ip_fw_chk(struct sk_buff **, struct net_device *, __u16 *, - struct ip_fw *, int, int); #endif /* KERNEL */ #endif /* _IP_FW_H */ --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipfwadm_core.c.old 2004-12-14 04:10:42.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipfwadm_core.c 2004-12-15 00:13:46.000000000 +0100 @@ -165,11 +165,11 @@ #if defined(CONFIG_IP_ACCT) || defined(CONFIG_IP_FIREWALL) -struct ip_fw *ip_fw_fwd_chain; -struct ip_fw *ip_fw_in_chain; -struct ip_fw *ip_fw_out_chain; -struct ip_fw *ip_acct_chain; -struct ip_fw *ip_masq_chain; 
+static struct ip_fw *ip_fw_fwd_chain; +static struct ip_fw *ip_fw_in_chain; +static struct ip_fw *ip_fw_out_chain; +static struct ip_fw *ip_acct_chain; +static struct ip_fw *ip_masq_chain; static struct ip_fw **chains[] = {&ip_fw_fwd_chain, &ip_fw_in_chain, &ip_fw_out_chain, &ip_acct_chain, @@ -178,9 +178,9 @@ #endif /* CONFIG_IP_ACCT || CONFIG_IP_FIREWALL */ #ifdef CONFIG_IP_FIREWALL -int ip_fw_fwd_policy=IP_FW_F_ACCEPT; -int ip_fw_in_policy=IP_FW_F_ACCEPT; -int ip_fw_out_policy=IP_FW_F_ACCEPT; +static int ip_fw_fwd_policy=IP_FW_F_ACCEPT; +static int ip_fw_in_policy=IP_FW_F_ACCEPT; +static int ip_fw_out_policy=IP_FW_F_ACCEPT; static int *policies[] = {&ip_fw_fwd_policy, &ip_fw_in_policy, &ip_fw_out_policy}; @@ -188,7 +188,7 @@ #endif #ifdef CONFIG_IP_FIREWALL_NETLINK -struct sock *ipfwsk; +static struct sock *ipfwsk; #endif /* @@ -323,9 +323,9 @@ */ -int ip_fw_chk(struct sk_buff **pskb, - struct net_device *rif, __u16 *redirport, - struct ip_fw *chain, int policy, int mode) +static int ip_fw_chk(struct sk_buff **pskb, + struct net_device *rif, __u16 *redirport, + struct ip_fw *chain, int policy, int mode) { struct ip_fw *f; __u32 src, dst; @@ -939,7 +939,7 @@ #endif /* CONFIG_IP_ACCT || CONFIG_IP_FIREWALL */ -struct ip_fw *check_ipfw_struct(struct ip_fw *frwl, int len) +static struct ip_fw *check_ipfw_struct(struct ip_fw *frwl, int len) { if ( len != sizeof(struct ip_fw) ) @@ -1008,55 +1008,6 @@ } - - -#ifdef CONFIG_IP_ACCT - -int ip_acct_ctl(int stage, void *m, int len) -{ - if ( stage == IP_ACCT_FLUSH ) - { - free_fw_chain(&ip_acct_chain); - return(0); - } - if ( stage == IP_ACCT_ZERO ) - { - zero_fw_chain(ip_acct_chain); - return(0); - } - if ( stage == IP_ACCT_INSERT || stage == IP_ACCT_APPEND || - stage == IP_ACCT_DELETE ) - { - struct ip_fw *frwl; - - if (!(frwl=check_ipfw_struct(m,len))) - return (EINVAL); - - switch (stage) - { - case IP_ACCT_INSERT: - return( insert_in_chain(&ip_acct_chain,frwl,len)); - case IP_ACCT_APPEND: - return( 
append_to_chain(&ip_acct_chain,frwl,len)); - case IP_ACCT_DELETE: - return( del_from_chain(&ip_acct_chain,frwl)); - default: - /* - * Should be panic but... (Why ??? - AC) - */ -#ifdef DEBUG_IP_FIREWALL - printk("ip_acct_ctl: unknown request %d\n",stage); -#endif - return(EINVAL); - } - } -#ifdef DEBUG_IP_FIREWALL - printk("ip_acct_ctl: unknown request %d\n",stage); -#endif - return(EINVAL); -} -#endif - #ifdef CONFIG_IP_FIREWALL int ip_fw_ctl(int stage, void *m, int len) { @@ -1321,45 +1272,47 @@ * Interface to the generic firewall chains. */ -int ipfw_input_check(struct firewall_ops *this, int pf, - struct net_device *dev, void *arg, - struct sk_buff **pskb) +static int ipfw_input_check(struct firewall_ops *this, int pf, + struct net_device *dev, void *arg, + struct sk_buff **pskb) { return ip_fw_chk(pskb, dev, arg, ip_fw_in_chain, ip_fw_in_policy, IP_FW_MODE_FW); } -int ipfw_output_check(struct firewall_ops *this, int pf, - struct net_device *dev, void *arg, - struct sk_buff **pskb) +static int ipfw_output_check(struct firewall_ops *this, int pf, + struct net_device *dev, void *arg, + struct sk_buff **pskb) { return ip_fw_chk(pskb, dev, arg, ip_fw_out_chain, ip_fw_out_policy, IP_FW_MODE_FW); } -int ipfw_forward_check(struct firewall_ops *this, int pf, - struct net_device *dev, void *arg, - struct sk_buff **pskb) +static int ipfw_forward_check(struct firewall_ops *this, int pf, + struct net_device *dev, void *arg, + struct sk_buff **pskb) { return ip_fw_chk(pskb, dev, arg, ip_fw_fwd_chain, ip_fw_fwd_policy, IP_FW_MODE_FW); } #ifdef CONFIG_IP_ACCT -int ipfw_acct_in(struct firewall_ops *this, int pf, struct net_device *dev, - void *arg, struct sk_buff **pskb) +static int ipfw_acct_in(struct firewall_ops *this, int pf, + struct net_device *dev, + void *arg, struct sk_buff **pskb) { return ip_fw_chk(pskb,dev,NULL,ip_acct_chain,0,IP_FW_MODE_ACCT_IN); } -int ipfw_acct_out(struct firewall_ops *this, int pf, struct net_device *dev, - void *arg, struct sk_buff **pskb) 
+static int ipfw_acct_out(struct firewall_ops *this, int pf, + struct net_device *dev, + void *arg, struct sk_buff **pskb) { return ip_fw_chk(pskb,dev,NULL,ip_acct_chain,0,IP_FW_MODE_ACCT_OUT); } #endif -struct firewall_ops ipfw_ops = { +static struct firewall_ops ipfw_ops = { .fw_forward = ipfw_forward_check, .fw_input = ipfw_input_check, .fw_output = ipfw_output_check, @@ -1373,7 +1326,8 @@ #if defined(CONFIG_IP_ACCT) || defined(CONFIG_IP_FIREWALL) -int ipfw_device_event(struct notifier_block *this, unsigned long event, void *ptr) +static int ipfw_device_event(struct notifier_block *this, unsigned long event, + void *ptr) { struct net_device *dev=ptr; char *devname = dev->name; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_CLUSTERIP.c.old 2004-12-14 04:17:48.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_CLUSTERIP.c 2004-12-14 04:17:57.000000000 +0100 @@ -66,7 +66,7 @@ /* clusterip_lock protects the clusterip_configs list _AND_ the configurable * data within all structurses (num_local_nodes, local_nodes[]) */ -DECLARE_RWLOCK(clusterip_lock); +static DECLARE_RWLOCK(clusterip_lock); #ifdef CONFIG_PROC_FS static struct file_operations clusterip_proc_fops; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_ULOG.c.old 2004-12-14 04:18:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_ULOG.c 2004-12-14 04:18:30.000000000 +0100 @@ -100,7 +100,7 @@ static ulog_buff_t ulog_buffers[ULOG_MAXNLGROUPS]; /* array of buffers */ static struct sock *nflognl; /* our socket */ -DECLARE_LOCK(ulog_lock); /* spinlock */ +static DECLARE_LOCK(ulog_lock); /* spinlock */ /* send one ulog_buff_t to userspace */ static void ulog_send(unsigned int nlgroupnum) @@ -140,7 +140,7 @@ UNLOCK_BH(&ulog_lock); } -struct sk_buff *ulog_alloc_skb(unsigned int size) +static struct sk_buff *ulog_alloc_skb(unsigned int size) { struct sk_buff *skb; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_hashlimit.c.old 2004-12-14 04:18:42.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_hashlimit.c 2004-12-14 04:18:53.000000000 +0100 @@ -97,7 +97,7 @@ struct list_head hash[0]; /* hashtable itself */ }; -DECLARE_RWLOCK(hashlimit_lock); /* protects htables list */ +static DECLARE_RWLOCK(hashlimit_lock); /* protects htables list */ static LIST_HEAD(hashlimit_htables); static kmem_cache_t *hashlimit_cachep; --- linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_recent.c.old 2004-12-14 04:19:11.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/ipv4/netfilter/ipt_recent.c 2004-12-14 04:19:18.000000000 +0100 @@ -107,7 +107,7 @@ int *hotdrop); /* Function to hash a given address into the hash table of table_size size */ -int hash_func(unsigned int addr, int table_size) +static int hash_func(unsigned int addr, int table_size) { int result = 0; unsigned int value = addr; From bunk@stusta.de Tue Dec 14 17:21:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:22:05 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1Lbcq017027 for ; Tue, 14 Dec 2004 17:21:57 -0800 Received: (qmail 1883 invoked from network); 15 Dec 2004 01:21:09 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 01:21:09 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 6BBBAAF651; Wed, 15 Dec 2004 02:21:07 +0100 (CET) Date: Wed, 15 Dec 2004 02:21:07 +0100 From: Adrian Bunk To: ralf@linux-mips.org Cc: linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/netrom/: make some code static Message-ID: <20041215012107.GE12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12757 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. diffstat output: net/netrom/af_netrom.c | 2 +- net/netrom/nr_route.c | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/netrom/af_netrom.c.old 2004-12-14 21:45:43.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/netrom/af_netrom.c 2004-12-14 21:45:51.000000000 +0100 @@ -43,7 +43,7 @@ #include #include -int nr_ndevs = 4; +static int nr_ndevs = 4; int sysctl_netrom_default_path_quality = NR_DEFAULT_QUAL; int sysctl_netrom_obsolescence_count_initialiser = NR_DEFAULT_OBS; --- linux-2.6.10-rc3-mm1-full/net/netrom/nr_route.c.old 2004-12-14 21:46:05.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/netrom/nr_route.c 2004-12-14 21:46:40.000000000 +0100 @@ -45,7 +45,7 @@ static HLIST_HEAD(nr_neigh_list); static spinlock_t nr_neigh_list_lock = SPIN_LOCK_UNLOCKED; -struct nr_node *nr_node_get(ax25_address *callsign) +static struct nr_node *nr_node_get(ax25_address *callsign) { struct nr_node *found = NULL; struct nr_node *nr_node; @@ -62,7 +62,8 @@ return found; } -struct nr_neigh *nr_neigh_get_dev(ax25_address *callsign, struct net_device *dev) +static struct nr_neigh *nr_neigh_get_dev(ax25_address *callsign, + struct net_device *dev) { struct nr_neigh *found = NULL; struct nr_neigh *nr_neigh; From bunk@stusta.de Tue Dec 14 17:24:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:24:47 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1OHEo017839 for ; Tue, 14 Dec 2004 17:24:38 -0800 Received: (qmail 2035 invoked from network); 15 Dec 2004 01:23:49 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 01:23:49 -0000 
Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id AEA80AF651; Wed, 15 Dec 2004 02:23:47 +0100 (CET) Date: Wed, 15 Dec 2004 02:23:47 +0100 From: Adrian Bunk To: ralf@linux-mips.org Cc: linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] net/rose/: misc possible cleanups Message-ID: <20041215012347.GF12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12758 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev

The patch below contains the following possible cleanups:
- make some needlessly global code static
- remove the following unused global functions:
  - rose_dev.c: rose_rx_ip
  - rose_link.c: rose_transmit_diagnostic

Please comment on which of these changes are correct and which conflict
with pending patches.
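As an illustration of the pattern this patch applies, here is a minimal sketch. All names in it (struct neigh, link_rx_restart, transmit_restart_confirmation) are hypothetical stand-ins, not code from net/rose/. It assumes a helper that is only called from within its own file: the helper is marked static, and because it is called before its definition, a static forward declaration is added near the top of the file, much like the declarations the patch adds to rose_link.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-neighbour structure. */
struct neigh {
	int restarts_sent;
};

/*
 * Forward declaration of a file-local helper. 'static' gives it internal
 * linkage, so it no longer needs an 'extern' prototype in a shared header
 * and cannot clash with symbols in other files.
 */
static void transmit_restart_confirmation(struct neigh *n);

/*
 * The one function that stays global in this sketch; in real code its
 * prototype would remain in a header.
 */
int link_rx_restart(struct neigh *n)
{
	if (n == NULL)
		return -1;

	/* Call site precedes the definition, hence the declaration above. */
	transmit_restart_confirmation(n);
	return n->restarts_sent;
}

/* Definition carries 'static' as well, matching the forward declaration. */
static void transmit_restart_confirmation(struct neigh *n)
{
	n->restarts_sent++;
}
```

With this layout, a caller in another file can still use link_rx_restart(), while any outside reference to transmit_restart_confirmation() fails at link time, which is how such cleanups surface users that were missed.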
diffstat output: include/net/rose.h | 8 -------- net/rose/af_rose.c | 2 +- net/rose/rose_dev.c | 32 -------------------------------- net/rose/rose_link.c | 39 +++++++-------------------------------- net/rose/rose_route.c | 4 ++-- net/rose/rose_subr.c | 4 +++- 6 files changed, 13 insertions(+), 76 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/rose.h.old 2004-12-14 21:51:28.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/rose.h 2004-12-14 21:55:21.000000000 +0100 @@ -162,7 +162,6 @@ extern void rose_destroy_socket(struct sock *); /* rose_dev.c */ -extern int rose_rx_ip(struct sk_buff *, struct net_device *); extern void rose_setup(struct net_device *); /* rose_in.c */ @@ -170,15 +169,10 @@ /* rose_link.c */ extern void rose_start_ftimer(struct rose_neigh *); -extern void rose_start_t0timer(struct rose_neigh *); extern void rose_stop_ftimer(struct rose_neigh *); extern void rose_stop_t0timer(struct rose_neigh *); extern int rose_ftimer_running(struct rose_neigh *); -extern int rose_t0timer_running(struct rose_neigh *); extern void rose_link_rx_restart(struct sk_buff *, struct rose_neigh *, unsigned short); -extern void rose_transmit_restart_request(struct rose_neigh *); -extern void rose_transmit_restart_confirmation(struct rose_neigh *); -extern void rose_transmit_diagnostic(struct rose_neigh *, unsigned char); extern void rose_transmit_clear_request(struct rose_neigh *, unsigned int, unsigned char, unsigned char); extern void rose_transmit_link(struct sk_buff *, struct rose_neigh *); @@ -205,7 +199,6 @@ extern struct net_device *rose_dev_first(void); extern struct net_device *rose_dev_get(rose_address *); extern struct rose_route *rose_route_free_lci(unsigned int, struct rose_neigh *); -extern struct net_device *rose_ax25_dev_get(char *); extern struct rose_neigh *rose_get_neigh(rose_address *, unsigned char *, unsigned char *); extern int rose_rt_ioctl(unsigned int, void __user *); extern void rose_link_failed(ax25_cb 
*, int); @@ -220,7 +213,6 @@ extern void rose_write_internal(struct sock *, int); extern int rose_decode(struct sk_buff *, int *, int *, int *, int *, int *); extern int rose_parse_facilities(unsigned char *, struct rose_facilities_struct *); -extern int rose_create_facilities(unsigned char *, rose_cb *); extern void rose_disconnect(struct sock *, int, int, int); /* rose_timer.c */ --- linux-2.6.10-rc3-mm1-full/net/rose/af_rose.c.old 2004-12-14 21:51:01.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rose/af_rose.c 2004-12-14 21:51:09.000000000 +0100 @@ -59,7 +59,7 @@ int sysctl_rose_window_size = ROSE_DEFAULT_WINDOW_SIZE; static HLIST_HEAD(rose_list); -spinlock_t rose_list_lock = SPIN_LOCK_UNLOCKED; +static spinlock_t rose_list_lock = SPIN_LOCK_UNLOCKED; static struct proto_ops rose_proto_ops; --- linux-2.6.10-rc3-mm1-full/net/rose/rose_dev.c.old 2004-12-14 21:51:42.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rose/rose_dev.c 2004-12-14 21:52:08.000000000 +0100 @@ -37,38 +37,6 @@ #include #include -/* - * Only allow IP over ROSE frames through if the netrom device is up. 
- */ - -int rose_rx_ip(struct sk_buff *skb, struct net_device *dev) -{ - struct net_device_stats *stats = (struct net_device_stats *)dev->priv; - -#ifdef CONFIG_INET - if (!netif_running(dev)) { - stats->rx_errors++; - return 0; - } - - stats->rx_packets++; - stats->rx_bytes += skb->len; - - skb->protocol = htons(ETH_P_IP); - - /* Spoof incoming device */ - skb->dev = dev; - skb->h.raw = skb->data; - skb->nh.raw = skb->data; - skb->pkt_type = PACKET_HOST; - - ip_rcv(skb, skb->dev, NULL); -#else - kfree_skb(skb); -#endif - return 1; -} - static int rose_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, void *daddr, void *saddr, unsigned len) { --- linux-2.6.10-rc3-mm1-full/net/rose/rose_link.c.old 2004-12-14 21:52:37.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rose/rose_link.c 2004-12-14 21:54:23.000000000 +0100 @@ -31,6 +31,9 @@ static void rose_ftimer_expiry(unsigned long); static void rose_t0timer_expiry(unsigned long); +static void rose_transmit_restart_confirmation(struct rose_neigh *neigh); +static void rose_transmit_restart_request(struct rose_neigh *neigh); + void rose_start_ftimer(struct rose_neigh *neigh) { del_timer(&neigh->ftimer); @@ -42,7 +45,7 @@ add_timer(&neigh->ftimer); } -void rose_start_t0timer(struct rose_neigh *neigh) +static void rose_start_t0timer(struct rose_neigh *neigh) { del_timer(&neigh->t0timer); @@ -68,7 +71,7 @@ return timer_pending(&neigh->ftimer); } -int rose_t0timer_running(struct rose_neigh *neigh) +static int rose_t0timer_running(struct rose_neigh *neigh) { return timer_pending(&neigh->t0timer); } @@ -165,7 +168,7 @@ /* * This routine is called when a Restart Request is needed */ -void rose_transmit_restart_request(struct rose_neigh *neigh) +static void rose_transmit_restart_request(struct rose_neigh *neigh) { struct sk_buff *skb; unsigned char *dptr; @@ -194,7 +197,7 @@ /* * This routine is called when a Restart Confirmation is needed */ -void rose_transmit_restart_confirmation(struct rose_neigh 
*neigh) +static void rose_transmit_restart_confirmation(struct rose_neigh *neigh) { struct sk_buff *skb; unsigned char *dptr; @@ -219,34 +222,6 @@ } /* - * This routine is called when a Diagnostic is required. - */ -void rose_transmit_diagnostic(struct rose_neigh *neigh, unsigned char diag) -{ - struct sk_buff *skb; - unsigned char *dptr; - int len; - - len = AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN + ROSE_MIN_LEN + 2; - - if ((skb = alloc_skb(len, GFP_ATOMIC)) == NULL) - return; - - skb_reserve(skb, AX25_BPQ_HEADER_LEN + AX25_MAX_HEADER_LEN); - - dptr = skb_put(skb, ROSE_MIN_LEN + 2); - - *dptr++ = AX25_P_ROSE; - *dptr++ = ROSE_GFI; - *dptr++ = 0x00; - *dptr++ = ROSE_DIAGNOSTIC; - *dptr++ = diag; - - if (!rose_send_frame(skb, neigh)) - kfree_skb(skb); -} - -/* * This routine is called when a Clear Request is needed outside of the context * of a connected socket. */ --- linux-2.6.10-rc3-mm1-full/net/rose/rose_route.c.old 2004-12-14 21:54:45.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rose/rose_route.c 2004-12-14 21:55:09.000000000 +0100 @@ -41,7 +41,7 @@ static struct rose_node *rose_node_list; static spinlock_t rose_node_list_lock = SPIN_LOCK_UNLOCKED; -struct rose_neigh *rose_neigh_list; +static struct rose_neigh *rose_neigh_list; static spinlock_t rose_neigh_list_lock = SPIN_LOCK_UNLOCKED; static struct rose_route *rose_route_list; static spinlock_t rose_route_list_lock = SPIN_LOCK_UNLOCKED; @@ -587,7 +587,7 @@ /* * Check that the device given is a valid AX.25 interface that is "up". 
*/ -struct net_device *rose_ax25_dev_get(char *devname) +static struct net_device *rose_ax25_dev_get(char *devname) { struct net_device *dev; --- linux-2.6.10-rc3-mm1-full/net/rose/rose_subr.c.old 2004-12-14 21:55:28.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rose/rose_subr.c 2004-12-14 21:55:42.000000000 +0100 @@ -28,6 +28,8 @@ #include #include +static int rose_create_facilities(unsigned char *buffer, rose_cb *rose); + /* * This routine purges all of the queues of frames. */ @@ -394,7 +396,7 @@ return 1; } -int rose_create_facilities(unsigned char *buffer, rose_cb *rose) +static int rose_create_facilities(unsigned char *buffer, rose_cb *rose) { unsigned char *p = buffer + 1; char *callsign; From bunk@stusta.de Tue Dec 14 17:27:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:27:11 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1Qgck018339 for ; Tue, 14 Dec 2004 17:27:03 -0800 Received: (qmail 2185 invoked from network); 15 Dec 2004 01:26:15 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 01:26:15 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 033F2AF651; Wed, 15 Dec 2004 02:26:13 +0100 (CET) Date: Wed, 15 Dec 2004 02:26:12 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/rxrpc/: misc possible cleanups Message-ID: <20041215012612.GG12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12759 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below 
contains the following possible cleanups: - make some needlessly global code static - remove the following unused global function: - transport.c: rxrpc_clear_transport - remove the following unneeded EXPORT_SYMBOL: - rxrpc_syms.c: rxrpc_call_flush Please comment on which of these changes are correct and which conflict with pending patches. diffstat output: include/rxrpc/call.h | 5 ----- include/rxrpc/packet.h | 2 -- include/rxrpc/transport.h | 2 -- net/rxrpc/call.c | 15 +++++++++------ net/rxrpc/connection.c | 4 +++- net/rxrpc/internal.h | 3 --- net/rxrpc/peer.c | 4 +++- net/rxrpc/rxrpc_syms.c | 1 - net/rxrpc/transport.c | 10 ---------- 9 files changed, 15 insertions(+), 31 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/rxrpc/packet.h.old 2004-12-14 22:24:02.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/rxrpc/packet.h 2004-12-14 22:24:10.000000000 +0100 @@ -124,6 +124,4 @@ } __attribute__((packed)); -extern const char *rxrpc_acks[]; - #endif /* _LINUX_RXRPC_PACKET_H */ --- linux-2.6.10-rc3-mm1-full/include/rxrpc/call.h.old 2004-12-14 22:25:08.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/rxrpc/call.h 2004-12-14 22:28:41.000000000 +0100 @@ -20,9 +20,6 @@ #define RXRPC_CALL_ACK_WINDOW_SIZE 16 extern unsigned rxrpc_call_rcv_timeout; /* receive activity timeout (secs) */ -extern unsigned rxrpc_call_acks_timeout; /* pending ACK (retransmit) timeout (secs) */ -extern unsigned rxrpc_call_dfr_ack_timeout; /* deferred ACK timeout (secs) */ -extern unsigned short rxrpc_call_max_resend; /* maximum consecutive resend count */ /* application call state * - only state 0 and ffff are reserved, the state is set to 1 after an opid is received @@ -210,8 +207,6 @@ int dup_data, size_t *size_sent); -extern int rxrpc_call_flush(struct rxrpc_call *call); - extern void rxrpc_call_handle_error(struct rxrpc_call *conn, int local, int errno); #endif /* _LINUX_RXRPC_CALL_H */ --- linux-2.6.10-rc3-mm1-full/net/rxrpc/call.c.old 2004-12-14 
22:24:20.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rxrpc/call.c 2004-12-14 22:28:53.000000000 +0100 @@ -26,10 +26,10 @@ LIST_HEAD(rxrpc_calls); DECLARE_RWSEM(rxrpc_calls_sem); -unsigned rxrpc_call_rcv_timeout = HZ/3; -unsigned rxrpc_call_acks_timeout = HZ/3; -unsigned rxrpc_call_dfr_ack_timeout = HZ/20; -unsigned short rxrpc_call_max_resend = HZ/10; +unsigned rxrpc_call_rcv_timeout = HZ/3; +static unsigned rxrpc_call_acks_timeout = HZ/3; +static unsigned rxrpc_call_dfr_ack_timeout = HZ/20; +static unsigned short rxrpc_call_max_resend = HZ/10; const char *rxrpc_call_states[] = { "COMPLETE", @@ -58,7 +58,7 @@ "?09", "?10", "?11", "?12", "?13", "?14", "?15" }; -const char *rxrpc_acks[] = { +static const char *rxrpc_acks[] = { "---", "REQ", "DUP", "SEQ", "WIN", "MEM", "PNG", "PNR", "DLY", "IDL", "-?-" }; @@ -79,6 +79,9 @@ struct rxrpc_message *msg, rxrpc_seq_t seq, size_t count); + +static int rxrpc_call_flush(struct rxrpc_call *call); + #define _state(call) \ _debug("[[[ state %s ]]]", rxrpc_call_states[call->app_call_state]); @@ -2079,7 +2082,7 @@ /* * flush outstanding packets to the network */ -int rxrpc_call_flush(struct rxrpc_call *call) +static int rxrpc_call_flush(struct rxrpc_call *call) { struct rxrpc_message *msg; int ret = 0; --- linux-2.6.10-rc3-mm1-full/net/rxrpc/rxrpc_syms.c.old 2004-12-14 22:28:26.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rxrpc/rxrpc_syms.c 2004-12-14 22:28:29.000000000 +0100 @@ -23,7 +23,6 @@ EXPORT_SYMBOL(rxrpc_call_abort); EXPORT_SYMBOL(rxrpc_call_read_data); EXPORT_SYMBOL(rxrpc_call_write_data); -EXPORT_SYMBOL(rxrpc_call_flush); /* connection.c */ EXPORT_SYMBOL(rxrpc_create_connection); --- linux-2.6.10-rc3-mm1-full/net/rxrpc/internal.h.old 2004-12-14 22:29:18.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rxrpc/internal.h 2004-12-14 22:30:16.000000000 +0100 @@ -73,7 +73,6 @@ extern struct rw_semaphore rxrpc_conns_sem; extern unsigned long rxrpc_conn_timeout; -extern void rxrpc_conn_do_timeout(struct 
rxrpc_connection *conn); extern void rxrpc_conn_clearall(struct rxrpc_peer *peer); /* @@ -89,8 +88,6 @@ extern void rxrpc_peer_clearall(struct rxrpc_transport *trans); -extern void rxrpc_peer_do_timeout(struct rxrpc_peer *peer); - /* * proc.c --- linux-2.6.10-rc3-mm1-full/net/rxrpc/connection.c.old 2004-12-14 22:29:36.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rxrpc/connection.c 2004-12-14 22:30:00.000000000 +0100 @@ -30,6 +30,8 @@ DECLARE_RWSEM(rxrpc_conns_sem); unsigned long rxrpc_conn_timeout = 60 * 60; +static void rxrpc_conn_do_timeout(struct rxrpc_connection *conn); + static void __rxrpc_conn_timeout(rxrpc_timer_t *timer) { struct rxrpc_connection *conn = @@ -415,7 +417,7 @@ /* * free a connection record */ -void rxrpc_conn_do_timeout(struct rxrpc_connection *conn) +static void rxrpc_conn_do_timeout(struct rxrpc_connection *conn) { struct rxrpc_peer *peer; --- linux-2.6.10-rc3-mm1-full/net/rxrpc/peer.c.old 2004-12-14 22:30:23.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/rxrpc/peer.c 2004-12-14 22:30:41.000000000 +0100 @@ -30,6 +30,8 @@ DECLARE_RWSEM(rxrpc_peers_sem); unsigned long rxrpc_peer_timeout = 12 * 60 * 60; +static void rxrpc_peer_do_timeout(struct rxrpc_peer *peer); + static void __rxrpc_peer_timeout(rxrpc_timer_t *timer) { struct rxrpc_peer *peer = @@ -259,7 +261,7 @@ * handle a peer timing out in the graveyard * - called from krxtimod */ -void rxrpc_peer_do_timeout(struct rxrpc_peer *peer) +static void rxrpc_peer_do_timeout(struct rxrpc_peer *peer) { struct rxrpc_transport *trans = peer->trans; --- linux-2.6.10-rc3-mm1-full/include/rxrpc/transport.h.old 2004-12-14 22:30:56.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/rxrpc/transport.h 2004-12-14 22:31:03.000000000 +0100 @@ -103,6 +103,4 @@ struct rxrpc_message *msg, int error); -extern void rxrpc_clear_transport(struct rxrpc_transport *trans); - #endif /* _LINUX_RXRPC_TRANSPORT_H */ --- linux-2.6.10-rc3-mm1-full/net/rxrpc/transport.c.old 2004-12-14 22:31:11.000000000 +0100 +++ 
linux-2.6.10-rc3-mm1-full/net/rxrpc/transport.c 2004-12-14 22:31:20.000000000 +0100 @@ -150,16 +150,6 @@ /*****************************************************************************/ /* - * clear the connections on a transport endpoint - */ -void rxrpc_clear_transport(struct rxrpc_transport *trans) -{ - //struct rxrpc_connection *conn; - -} /* end rxrpc_clear_transport() */ - -/*****************************************************************************/ -/* * destroy a transport endpoint */ void rxrpc_put_transport(struct rxrpc_transport *trans) From bunk@stusta.de Tue Dec 14 17:28:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:28:53 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1SOld018587 for ; Tue, 14 Dec 2004 17:28:44 -0800 Received: (qmail 2256 invoked from network); 15 Dec 2004 01:27:56 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 01:27:56 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 52C8AAF651; Wed, 15 Dec 2004 02:27:54 +0100 (CET) Date: Wed, 15 Dec 2004 02:27:54 +0100 From: Adrian Bunk To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org Subject: [2.6 patch] net/sched/: possible cleanups Message-ID: <20041215012754.GH12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12760 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below contains the following possible cleanups: - make some needlessly global code static - sch_htb.c: #undef HTB_DEBUG diffstat output: include/net/act_api.h | 3 --- 
net/sched/gact.c | 2 +- net/sched/police.c | 8 ++++---- net/sched/sch_api.c | 11 ++++++----- net/sched/sch_dsmark.c | 2 +- net/sched/sch_generic.c | 4 ++-- net/sched/sch_htb.c | 2 +- net/sched/sch_ingress.c | 2 +- net/sched/sch_prio.c | 3 ++- 9 files changed, 18 insertions(+), 19 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/sched/gact.c.old 2004-12-14 22:32:34.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/gact.c 2004-12-14 22:32:42.000000000 +0100 @@ -68,7 +68,7 @@ } -g_rand gact_rand[MAX_RAND]= { NULL,gact_net_rand, gact_determ}; +static g_rand gact_rand[MAX_RAND]= { NULL,gact_net_rand, gact_determ}; #endif static int --- linux-2.6.10-rc3-mm1-full/include/net/act_api.h.old 2004-12-14 22:33:02.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/act_api.h 2004-12-14 22:33:14.000000000 +0100 @@ -82,9 +82,6 @@ extern int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int); extern int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int); extern int tcf_action_copy_stats (struct sk_buff *,struct tc_action *); -extern int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est,struct tc_action *,int , int ); -extern int tcf_act_police_dump(struct sk_buff *, struct tc_action *, int, int); -extern int tcf_act_police(struct sk_buff **skb, struct tc_action *a); #endif /* CONFIG_NET_CLS_ACT */ extern int tcf_police(struct sk_buff *skb, struct tcf_police *p); --- linux-2.6.10-rc3-mm1-full/net/sched/police.c.old 2004-12-14 22:33:22.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/police.c 2004-12-14 22:33:39.000000000 +0100 @@ -163,7 +163,7 @@ } #ifdef CONFIG_NET_CLS_ACT -int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est,struct tc_action *a, int ovr, int bind) +static int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est,struct tc_action *a, int ovr, int bind) { unsigned h; int ret = 0; @@ -265,7 +265,7 @@ return -1; } -int tcf_act_police_cleanup(struct 
tc_action *a, int bind) +static int tcf_act_police_cleanup(struct tc_action *a, int bind) { struct tcf_police *p; p = PRIV(a); @@ -275,7 +275,7 @@ return 0; } -int tcf_act_police(struct sk_buff **pskb, struct tc_action *a) +static int tcf_act_police(struct sk_buff **pskb, struct tc_action *a) { psched_time_t now; struct sk_buff *skb = *pskb; @@ -338,7 +338,7 @@ return p->action; } -int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +static int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { unsigned char *b = skb->tail; struct tc_police opt; --- linux-2.6.10-rc3-mm1-full/net/sched/sch_api.c.old 2004-12-14 22:36:33.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/sch_api.c 2004-12-14 22:39:03.000000000 +0100 @@ -207,7 +207,7 @@ return NULL; } -struct Qdisc *qdisc_leaf(struct Qdisc *p, u32 classid) +static struct Qdisc *qdisc_leaf(struct Qdisc *p, u32 classid) { unsigned long cl; struct Qdisc *leaf; @@ -226,7 +226,7 @@ /* Find queueing discipline by name */ -struct Qdisc_ops *qdisc_lookup_ops(struct rtattr *kind) +static struct Qdisc_ops *qdisc_lookup_ops(struct rtattr *kind) { struct Qdisc_ops *q = NULL; @@ -290,7 +290,7 @@ /* Allocate an unique handle from space managed by kernel */ -u32 qdisc_alloc_handle(struct net_device *dev) +static u32 qdisc_alloc_handle(struct net_device *dev) { int i = 0x10000; static u32 autohandle = TC_H_MAKE(0x80000000U, 0); @@ -356,8 +356,9 @@ Old qdisc is not destroyed but returned in *old. 
*/ -int qdisc_graft(struct net_device *dev, struct Qdisc *parent, u32 classid, - struct Qdisc *new, struct Qdisc **old) +static int qdisc_graft(struct net_device *dev, struct Qdisc *parent, + u32 classid, + struct Qdisc *new, struct Qdisc **old) { int err = 0; struct Qdisc *q = *old; --- linux-2.6.10-rc3-mm1-full/net/sched/sch_dsmark.c.old 2004-12-14 22:39:16.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/sch_dsmark.c 2004-12-14 22:39:24.000000000 +0100 @@ -320,7 +320,7 @@ } -int dsmark_init(struct Qdisc *sch,struct rtattr *opt) +static int dsmark_init(struct Qdisc *sch,struct rtattr *opt) { struct dsmark_qdisc_data *p = PRIV(sch); struct rtattr *tb[TCA_DSMARK_MAX]; --- linux-2.6.10-rc3-mm1-full/net/sched/sch_generic.c.old 2004-12-14 22:39:41.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/sch_generic.c 2004-12-14 22:40:00.000000000 +0100 @@ -283,7 +283,7 @@ .list = LIST_HEAD_INIT(noop_qdisc.list), }; -struct Qdisc_ops noqueue_qdisc_ops = { +static struct Qdisc_ops noqueue_qdisc_ops = { .next = NULL, .cl_ops = NULL, .id = "noqueue", @@ -294,7 +294,7 @@ .owner = THIS_MODULE, }; -struct Qdisc noqueue_qdisc = { +static struct Qdisc noqueue_qdisc = { .enqueue = NULL, .dequeue = noop_dequeue, .flags = TCQ_F_BUILTIN, --- linux-2.6.10-rc3-mm1-full/net/sched/sch_ingress.c.old 2004-12-14 22:40:25.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/sch_ingress.c 2004-12-14 22:40:34.000000000 +0100 @@ -274,7 +274,7 @@ #endif #endif -int ingress_init(struct Qdisc *sch,struct rtattr *opt) +static int ingress_init(struct Qdisc *sch,struct rtattr *opt) { struct ingress_qdisc_data *p = PRIV(sch); --- linux-2.6.10-rc3-mm1-full/net/sched/sch_prio.c.old 2004-12-14 22:40:49.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/sch_prio.c 2004-12-14 22:41:03.000000000 +0100 @@ -47,7 +47,8 @@ }; -struct Qdisc *prio_classify(struct sk_buff *skb, struct Qdisc *sch,int *r) +static struct Qdisc *prio_classify(struct sk_buff *skb, + struct Qdisc *sch, int *r) { struct 
prio_sched_data *q = qdisc_priv(sch); u32 band = skb->priority; --- linux-2.6.10-rc3-mm1-full/net/sched/sch_htb.c.old 2004-12-14 22:41:56.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/sched/sch_htb.c 2004-12-14 23:46:12.000000000 +0100 @@ -71,7 +71,7 @@ #define HTB_HSIZE 16 /* classid hash size */ #define HTB_EWMAC 2 /* rate average over HTB_EWMAC*HTB_HSIZE sec */ -#define HTB_DEBUG 1 /* compile debugging support (activated by tc tool) */ +#undef HTB_DEBUG /* compile debugging support (activated by tc tool) */ #define HTB_RATECM 1 /* whether to use rate computer */ #define HTB_HYSTERESIS 1/* whether to use mode hysteresis for speedup */ #define HTB_QLOCK(S) spin_lock_bh(&(S)->dev->queue_lock) From bunk@stusta.de Tue Dec 14 17:31:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:31:40 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBF1VDaw019352 for ; Tue, 14 Dec 2004 17:31:34 -0800 Received: (qmail 2391 invoked from network); 15 Dec 2004 01:30:45 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 15 Dec 2004 01:30:45 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 6413BAF651; Wed, 15 Dec 2004 02:30:43 +0100 (CET) Date: Wed, 15 Dec 2004 02:30:43 +0100 From: Adrian Bunk To: wensong@linux-vs.org, ja@ssi.bg Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [2.6 patch] remove subscribers-only ipvs mailing list Message-ID: <20041215013043.GI12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12761 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk 
X-list: netdev It's generally agreed that MAINTAINERS mustn't contain subscribers-only lists. Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/MAINTAINERS.old 2004-12-15 02:28:33.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/MAINTAINERS 2004-12-15 02:28:43.000000000 +0100 @@ -1595,11 +1595,10 @@ IPVS P: Wensong Zhang M: wensong@linux-vs.org P: Julian Anastasov M: ja@ssi.bg -L: lvs-users@linuxvirtualserver.org S: Maintained NFS CLIENT P: Trond Myklebust M: trond.myklebust@fys.uio.no From yoshfuji@linux-ipv6.org Tue Dec 14 17:58:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 17:58:10 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBF1vgU9020673 for ; Tue, 14 Dec 2004 17:58:03 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id EA41733CE5; Wed, 15 Dec 2004 10:59:01 +0900 (JST) Date: Wed, 15 Dec 2004 10:59:00 +0900 (JST) Message-Id: <20041215.105900.27736391.yoshfuji@linux-ipv6.org> To: bunk@stusta.de Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, yoshfuji@linux-ipv6.org Subject: Re: [2.6 patch] net/ipv6/: misc possible cleanups From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20041215005546.GA11972@stusta.de> References: <20041215005546.GA11972@stusta.de> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean 
X-archive-position: 12762 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev

In article <20041215005546.GA11972@stusta.de> (at Wed, 15 Dec 2004 01:55:46 +0100), Adrian Bunk says:

> The patch below contains the following possible cleanups:
> - make some needlessly global code static
> - remove the following unused functions:
>   - exthdrs.c: ipv6_build_rthdr
>   - exthdrs.c: ipv6_build_exthdr
>   - exthdrs.c: ipv6_build_nfrag_opts
>   - exthdrs.c: ipv6_build_frag_opts
> - remove the following unused global variables:
>   - addrconf.c: in6addr_any
> - remove the following EXPORT_SYMBOL's:
>   - ipv6_syms.c: addrconf_lock
>   - ipv6_syms.c: in6addr_any
>   - ipv6_syms.c: in6addr_loopback
>
> Please comment on which of these changes are correct and which conflict
> with pending patches.

Please keep addrconf_lock (for SCTP).
Please keep in6addr_any in addrconf.c (or enclose it in #if 0 ... #endif).

> --- linux-2.6.10-rc3-mm1-full/include/net/ip.h.old 2004-12-14 05:20:46.000000000 +0100
> +++ linux-2.6.10-rc3-mm1-full/include/net/ip.h 2004-12-14 05:20:53.000000000 +0100
:

I think you attached the incorrect patch file.
--yoshfuji From acme@conectiva.com.br Tue Dec 14 18:58:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 18:58:44 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBF2wFTO022826 for ; Tue, 14 Dec 2004 18:58:36 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CePQD-0008AY-00; Wed, 15 Dec 2004 01:01:49 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 2B94D14640; Wed, 15 Dec 2004 00:57:50 -0200 (BRST) Message-ID: <41BF9A95.5050902@conectiva.com.br> Date: Tue, 14 Dec 2004 23:59:49 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Adrian Bunk Cc: netdev@oss.sgi.com Subject: Re: [2.6 patch] net/sched/: possible cleanups References: <20041215012754.GH12937@stusta.de> In-Reply-To: <20041215012754.GH12937@stusta.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12763 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Adrian Bunk wrote: > The patch below contans the following possible cleanups: > - make some needlessly global code static > - sch_htb.c: #undef HTB_DEBUG > > > diffstat output: > include/net/act_api.h | 3 --- Adrian, may I suggest that you post the networking related patches only to netdev? 
- Arnaldo From random@72616e646f6d20323030342d30342d31360a.nosense.org Tue Dec 14 19:14:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 14 Dec 2004 19:14:43 -0800 (PST) Received: from ubu.nosense.org (137.cust6.sa.dsl.ozemail.com.au [210.84.229.137]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBF3EETN024509 for ; Tue, 14 Dec 2004 19:14:35 -0800 Received: from ubu.nosense.org (ubu.nosense.org [127.0.0.1]) by ubu.nosense.org (Postfix) with SMTP id 292B462A9F for ; Wed, 15 Dec 2004 13:43:48 +1030 (CST) Date: Wed, 15 Dec 2004 13:43:47 +1030 From: Mark Smith To: netdev@oss.sgi.com Subject: Re: [2.6.9] Networking crash, slightly exotic setup, bridged tap/tun interfaces Message-Id: <20041215134347.58e366c9.random@72616e646f6d20323030342d30342d31360a.nosense.org> In-Reply-To: <20041215000213.77cc0fa0.random@72616e646f6d20323030342d30342d31360a.nosense.org> References: <20041214105245.32c9b1a6.random@72616e646f6d20323030342d30342d31360a.nosense.org> <41BE2B20.5080602@conectiva.com.br> <20041214113036.3ccb3480.random@72616e646f6d20323030342d30342d31360a.nosense.org> <41BE320A.8090201@conectiva.com.br> <20041215000213.77cc0fa0.random@72616e646f6d20323030342d30342d31360a.nosense.org> Organization: The No Sense Organisation X-Mailer: Sylpheed version 1.0.0beta1 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Location: Adelaide, Australia Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12764 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: random@72616e646f6d20323030342d30342d31360a.nosense.org Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 00:02:13 +1030 Mark Smith wrote: > > I'll leave it running overnight in this configuration to see how it > goes. 
> As a follow-up, it has now been running for at least twelve hours, happily trading OSPF hellos and periodic LSA flooded updates. No problems so far, so whatever was causing the problems I was having seems to have been fixed between 2.6.9 and 2.6.10-rc3. I've put together a tarball containing notes on how I set it all up and the config files I used. If anybody is interested in it, send me an email privately, and I'll send it to you. It's only 19KB; I was tempted to attach it to this email, but figured it's probably bad netiquette to do so :-) Thanks, Mark. From laforge@netfilter.org Wed Dec 15 01:04:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 01:04:17 -0800 (PST) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBF93l9l010445 for ; Wed, 15 Dec 2004 01:04:08 -0800 Received: from dsl-082-082-101-099.arcor-ip.net ([82.82.101.99] helo=sunbeam.gnumonks.org) by ganesha.gnumonks.org with asmtp (TLS-1.0:RSA_ARCFOUR_SHA:16) (Exim 4.34) id 1CeV47-0008Ek-Rx; Wed, 15 Dec 2004 10:03:24 +0100 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1CeV46-0000mO-4q; Wed, 15 Dec 2004 10:03:22 +0100 Date: Wed, 15 Dec 2004 10:03:22 +0100 From: Harald Welte To: Adrian Bunk Cc: coreteam@netfilter.org, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org, Rusty Russell Subject: Re: [netfilter-core] [2.6 patch] net/ipv4/netfilter/: misc possible cleanups Message-ID: <20041215090322.GA2862@sunbeam.de.gnumonks.org> Mail-Followup-To: Harald Welte , Adrian Bunk , coreteam@netfilter.org, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org, Rusty Russell References: <20041215011931.GD12937@stusta.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="u3/rZRmxL6MmkK24" Content-Disposition: inline In-Reply-To: 
<20041215011931.GD12937@stusta.de> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12765 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@netfilter.org Precedence: bulk X-list: netdev --u3/rZRmxL6MmkK24 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Dec 15, 2004 at 02:19:31AM +0100, Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make some needlessly global code static > - remove the following unused global functions: > - ip_conntrack_core.c: ip_conntrack_expect_find_get > - ip_conntrack_core.c: ip_conntrack_unexpect_related > - ip_nat_standalone.c: ip_nat_protocol_register > - ip_nat_standalone.c: ip_nat_protocol_unregister > - ip_nat_helper.c: ip_nat_find_helper > - remove the following unneeded EXPORT_SYMBOL's: > - ip_conntrack_standalone.c: ip_ct_find_helper > - ip_conntrack_standalone.c: ip_conntrack_unexpect_related > - ip_conntrack_standalone.c: ip_conntrack_expect_list > - ip_conntrack_standalone.c: ip_conntrack_put > - ip_nat_standalone.c: ip_nat_protocol_register > - ip_nat_standalone.c: ip_nat_protocol_unregister > - ip_nat_standalone.c: ip_nat_find_helper > - remove the following unneeded EXPORT_SYMBOL_GPL: > - ip_conntrack_standalone.c: ip_conntrack_expect_find_get As you might be aware, netfilter/iptables has an enormously large codebase (I'd say even larger than what is in the tree) in the so-called patch-o-matic subsystem. The above-mentioned exports facilitate those modules, and a certain number of those new modules (especially the ones requiring the functions above) are scheduled for mainline inclusion over the next couple of months. > - ipfwadm_core.c: ip_acct_ctl This can be removed, yes. 
Please be aware that ipfwadm and ipchains will be removed soon (after 2.6.10 is out), so that won't be a problem any more :) > - remove the following variables that never change their values: > - ip_conntrack_ftp.c: ip_conntrack_ftp > - ip_conntrack_irc.c: ip_conntrack_irc Mh, I don't really understand why this change was introduced. My original _irc code didn't have this global variable... I'll try to track the change and get back to you. --=20 - Harald Welte http://www.netfilter.org/ =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D "Fragmentation is like classful addressing -- an interesting early architectural error that shows how much experimentation was going on while IP was being designed." -- Paul Vixie --u3/rZRmxL6MmkK24 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBv/3ZXaXGVTD0i/8RAoTLAKCMebmX0BaH+KMxoq2A+peqvkylDwCglJKO cZtj+5CjN6pU60yX43ERto0= =t3FD -----END PGP SIGNATURE----- --u3/rZRmxL6MmkK24-- From hanemade@gmail.com Wed Dec 15 01:11:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 01:11:34 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.195]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBF9B3ik011114 for ; Wed, 15 Dec 2004 01:11:24 -0800 Received: by rproxy.gmail.com with SMTP id b11so1120934rne for ; Wed, 15 Dec 2004 01:10:41 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; 
b=cw8DsdbKbdQ5ZHwW1Yr7N02rOM62WxQ53xOPFsYOYbNK0yuUm6FeS/722ZHSrHWrPwNOZQ4x8WBsVkutsIfm16C48TdeAREnADYtOnZk5Fa4NpDKi8CnNcyECIBUXZFAbe+aPgf+fPsIz05n0NM5J9DKxN2uMfc/sFAmOPe+DzU= Received: by 10.39.1.5 with SMTP id d5mr537334rni; Wed, 15 Dec 2004 01:08:54 -0800 (PST) Received: by 10.38.161.18 with HTTP; Wed, 15 Dec 2004 01:08:54 -0800 (PST) Message-ID: Date: Wed, 15 Dec 2004 14:38:54 +0530 From: Harsh Reply-To: Harsh To: MANJUNATH Subject: Re: loopback packet processing. Cc: netdev@oss.sgi.com In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12766 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hanemade@gmail.com Precedence: bulk X-list: netdev Hi MANJUNATH, I actually did a lot of work on kernel packet processing and found that ping packet creation enters ethernet.c. Also, can you tell me what happens in ip_finish_output2 when a ping ICMP packet is created by the kernel? What header is added there? Also, at the receiving end, control enters ethernet.c to check the eth->h_proto field, which is ETH_P_IP. How and where is it added by the kernel in the case of a PING packet? Regards, Harsh. On Tue, 14 Dec 2004 16:53:33 +0530 (IST), MANJUNATH wrote: > > > Hi, > > When you send data to the loopback address, control never > enters the "ethernet" modules (data link layer). > > Since ping in turn uses the program interface to ICMP, which is a part > of the IPv4 software, for a loopback ping the IP modules themselves reply. Ethernet > does not come into the picture. 
> > cheers, > Manjunath > > From jchapman@katalix.com Wed Dec 15 02:03:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 02:03:17 -0800 (PST) Received: from s14.s14avahost.net (s14.s14avahost.net [66.98.146.55]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFA2hDs014595 for ; Wed, 15 Dec 2004 02:03:04 -0800 Received: from cpanel by s14.s14avahost.net with local (Exim 4.43) id 1CeVyn-0004ys-C1; Wed, 15 Dec 2004 04:01:57 -0600 Received: from 193.130.116.242 ([193.130.116.242]) by www.katalix.com (IMP) with HTTP for ; Wed, 15 Dec 2004 04:01:57 -0600 Message-ID: <1103104917.41c00b9533464@www.katalix.com> Date: Wed, 15 Dec 2004 04:01:57 -0600 From: jchapman@katalix.com To: gandalf@wlug.westbo.se, robert.olsson@data.slu.se Cc: netdev@oss.sgi.com Subject: Re: Re: [PATCH] e1000 poll behavior MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit User-Agent: Internet Messaging Program (IMP) 3.2.2 X-Originating-IP: 193.130.116.242 X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - s14.s14avahost.net X-AntiAbuse: Original Domain - oss.sgi.com X-AntiAbuse: Originator/Caller UID/GID - [32001 960] / [47 12] X-AntiAbuse: Sender Address Domain - katalix.com X-Source: /usr/local/cpanel/3rdparty/bin/php X-Source-Args: /usr/local/cpanel/3rdparty/bin/php /usr/local/cpanel/base/horde/imp/compose.php X-Source-Dir: :/base/horde/imp X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12767 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jchapman@katalix.com Precedence: bulk X-list: netdev Is there any advantage in exiting NAPI polled state as soon as possible with the work_done < work_to_do test? Why not stay in polled mode until no rx or tx work is done, i.e. 
only exit polled state if work_done is zero? When the system is congested, it is common for work_done < work_to_do to be true, which keeps interrupts enabled and just makes the situation worse. When working on e100 a while ago, I found better and more consistent pps numbers by staying in polled mode until the interface went idle. See current e100 code. Unfortunately I don't have e1000 hardware to test if the same is true for e1000. /james Robert Olsson writes: > Martin Josefsson writes: > > > What about the case where tx_cleaned is true but (work_done < > > > work_to_do) is true. Then the statement is false and we continue, > > later we return (work_done >= work_to_do) which equals 0 (not > > seen in patch). We are not allowed to return 0 when still on > > poll_list. That will sort of continue polling in a degraded way > > (no increase in quota) but will screw up device refcounting > > badly. > > > > That final return must be changed into "return 1;" > > Ohoh thanks yes... > > --- drivers/net/e1000/e1000_main.c.orig 2004-12-09 17:49:56.000000000 +0100 > +++ drivers/net/e1000/e1000_main.c 2004-12-10 20:13:57.000000000 +0100 > @@ -2179,15 +2179,15 @@ > *budget -= work_done; > netdev->quota -= work_done; > > - /* if no Rx and Tx cleanup work was done, exit the polling mode */ > - if(!tx_cleaned || (work_done < work_to_do) || > + /* if no Tx and not enough Rx work done, exit the polling mode */ > + if((!tx_cleaned && (work_done < work_to_do)) || > !netif_running(netdev)) { > netif_rx_complete(netdev); > e1000_irq_enable(adapter); > return 0; > } > > - return (work_done >= work_to_do); > + return 1; > } > > #endif From patrick@tykepenguin.com Wed Dec 15 02:57:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 02:57:47 -0800 (PST) Received: from fentible.pjc.net (spc1-leed3-6-0-cust18.seac.broadband.ntl.com [80.7.68.18]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFAvLXI018556 for ; Wed, 15 Dec 2004 02:57:41 -0800 Received: from patrick by 
fentible.pjc.net with local (Exim 4.34) id 1CeWpq-0007f7-N0; Wed, 15 Dec 2004 10:56:46 +0000 Date: Wed, 15 Dec 2004 10:56:46 +0000 From: Patrick Caulfield To: Adrian Bunk Cc: Steve Whitehouse , linux-decnet-user@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/decnet/: misc possible cleanups Message-ID: <20041215105646.GB27420@tykepenguin.com> Mail-Followup-To: Adrian Bunk , Steve Whitehouse , linux-decnet-user@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org References: <20041214125838.GC23151@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20041214125838.GC23151@stusta.de> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12768 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: patrick@tykepenguin.com Precedence: bulk X-list: netdev On Tue, Dec 14, 2004 at 01:58:38PM +0100, Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make needlessly global code static > - dn_fib.c: remove the write-only global variable dn_fib_info_cnt > - dn_fib.c: remove the unused global function dn_fib_rt_message > - dn_neigh.c: remove the unused global function dn_neigh_pointopoint_notify > - dn_timer.c: remove the fast timer code that isn't used > > Please review and comment on this patch. > Looks fine to me. I'm quite happy to lose the fast ack code - unused code is only a confusion to those reading it IMHO. If we do the delayed-ack code in future then it's easy enough to reinstate. Thanks. 
-- patrick From tgraf@suug.ch Wed Dec 15 03:28:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 03:28:14 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFBRjcl020081 for ; Wed, 15 Dec 2004 03:28:06 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E539EF; Wed, 15 Dec 2004 12:26:58 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 370721C0EA; Wed, 15 Dec 2004 12:27:41 +0100 (CET) Date: Wed, 15 Dec 2004 12:27:41 +0100 From: Thomas Graf To: Adrian Bunk Cc: Jamal Hadi Salim , netdev@oss.sgi.com Subject: Re: [2.6 patch] net/sched/: possible cleanups Message-ID: <20041215112741.GJ8493@postel.suug.ch> References: <20041215012754.GH12937@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041215012754.GH12937@stusta.de> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12769 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Adrian Bunk <20041215012754.GH12937@stusta.de> 2004-12-15 02:27 > -extern int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est,struct tc_action *,int , int ); > -extern int tcf_act_police_dump(struct sk_buff *, struct tc_action *, int, int); > -extern int tcf_act_police(struct sk_buff **skb, struct tc_action *a); This looks like a police compat API for action policers but I think it's fine, Jamal? The other changes all make sense to me. 
From tgraf@suug.ch Wed Dec 15 05:01:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 05:02:00 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFD1Wou027442 for ; Wed, 15 Dec 2004 05:01:53 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 2CE6DF; Wed, 15 Dec 2004 14:00:47 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 005B31C0EA; Wed, 15 Dec 2004 14:01:28 +0100 (CET) Date: Wed, 15 Dec 2004 14:01:28 +0100 From: Thomas Graf To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: Provide compat policer stats in action policer Message-ID: <20041215130128.GK8493@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12770 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Dave, This should go in before 2.6.10. It fixes a forgotten case to provide police backward compatibility statistics for old iproute2 versions running on a new kernel with actions enabled. Should make distributions happy with older iproute2 versions and all-included kernel configs since they probably favour actions over plain policer. 
Testing results: iproute2-2.4.7 on 2.6.10-rc3-bk8: cls-police: police creation succeeded cls-police: Sending 10 ICMP echo requests cls-police: police dumping succeeded with output: filter protocol ip pref 10 u32 filter protocol ip pref 10 u32 fh 800: ht divisor 1 filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:12 police 3 action drop rate 2Kbit burst 10Kb mtu 2Kb match 00010000/00ff0000 at 8 Sent 420 bytes 10 pkts (dropped 0, overlimits 0) <-- This would have been missing cls-police: police deletion succeeded iproute2-2.6.9 on 2.6.10-rc3-bk8: ... filter protocol ip pref 10 u32 filter protocol ip pref 10 u32 fh 800: ht divisor 1 filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:12 (rule hit 10 success 10) match 00010000/00ff0000 at 8 (success 10 ) police 0x4 rate 2000bit burst 10Kb mtu 2Kb action drop ref 1 bind 1 Sent 420 bytes 10 pkts (dropped 0, overlimits 0) ... (Same results for fw classifier) Signed-off-by: Thomas Graf --- linux-2.6.10-rc3-bk7.orig/net/sched/act_api.c 2004-12-14 14:24:34.000000000 +0100 +++ linux-2.6.10-rc3-bk7/net/sched/act_api.c 2004-12-14 16:15:13.000000000 +0100 @@ -418,6 +418,7 @@ int tcf_action_copy_stats (struct sk_buff *skb,struct tc_action *a) { + int err; struct gnet_dump d; struct tcf_act_hdr *h = a->priv; @@ -428,7 +429,14 @@ if (NULL == h) goto errout; - if (gnet_stats_start_copy(skb, TCA_ACT_STATS, h->stats_lock, &d) < 0) + if (a->type == TCA_OLD_COMPAT) + err = gnet_stats_start_copy_compat(skb, TCA_ACT_STATS, + TCA_STATS, TCA_XSTATS, h->stats_lock, &d); + else + err = gnet_stats_start_copy(skb, TCA_ACT_STATS, + h->stats_lock, &d); + + if (err < 0) goto errout; if (NULL != a->ops && NULL != a->ops->get_stats) From hadi@cyberus.ca Wed Dec 15 05:51:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 05:51:35 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFDp68p029791 
for ; Wed, 15 Dec 2004 05:51:27 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CeZY4-0004Vh-0C for netdev@oss.sgi.com; Wed, 15 Dec 2004 08:50:36 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CeZXx-0004rS-6p; Wed, 15 Dec 2004 08:50:29 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Spatzier Cc: Paul Jakma , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, "David S. Miller" In-Reply-To: References: Content-Type: text/plain Organization: jamalopolous Message-Id: <1103118626.1076.53.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 15 Dec 2004 08:50:27 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12771 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-14 at 02:40, Thomas Spatzier wrote: > > > Paul Jakma wrote on 10.12.2004 16:37:15: > > Thomas' original patch was to address this problem. I wonder could he > > recap the kernel side of this problem? > > Here is why we submitted the original patch: We got reports from > several customers that their dynamic routing daemons got hung when > one network interface lost its physical connection. Some debugging > showed that the write queues of sockets went full and got blocked. > This was because we issued a netif_stop_queue when we detect a > cable pull or something. I did some more thinking in the background and I wish to change my opinion. What you see is very odd. I think there may be a bug upstream at the socket layer or even before that, but it doesn't sound like a device-level bug. 
Wasn't someone supposed to send a small proggie to Herbert? When you netif_stop_queue you should never receive packets anymore at the device level. If you receive any, it's a bug and you should drop them and bitch violently. In other words, I think what you have at the moment is a band-aid, not the solution. > As a solution, we removed the netif_stop_queue calls and just dropped > the packets + we increment the respective error counts in the > net_device_stats and call netif_carrier_off. > This solved the customer problems and seems to be right thing for > zebra etc. We need to fix this issue. Either your driver is doing something wrong or something is broken further up the stack. Can you describe how your driver uses the netif_start/stop/wake interfaces? Whoever promised to send that program to Herbert - please do. cheers, jamal From hadi@cyberus.ca Wed Dec 15 06:04:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 06:04:23 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFE3rFZ030743 for ; Wed, 15 Dec 2004 06:04:16 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CeZkW-0008UK-O8 for netdev@oss.sgi.com; Wed, 15 Dec 2004 09:03:28 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CeZkI-0007d2-Ld; Wed, 15 Dec 2004 09:03:15 -0500 Subject: Re: [2.6 patch] net/sched/: possible cleanups From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Adrian Bunk , netdev@oss.sgi.com In-Reply-To: <20041215112741.GJ8493@postel.suug.ch> References: <20041215012754.GH12937@stusta.de> <20041215112741.GJ8493@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103119392.1078.67.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 15 Dec 2004 09:03:12 -0500 
Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12772 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev This is fine. The new architecture doesn't require classifiers to explicitly call those functions. The rest also looks fine to me. Good effort Adrian. cheers, jamal On Wed, 2004-12-15 at 06:27, Thomas Graf wrote: > * Adrian Bunk <20041215012754.GH12937@stusta.de> 2004-12-15 02:27 > > > -extern int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est,struct tc_action *,int , int ); > > -extern int tcf_act_police_dump(struct sk_buff *, struct tc_action *, int, int); > > -extern int tcf_act_police(struct sk_buff **skb, struct tc_action *a); > > This looks like a police compat API for action policers but I think it's > fine, Jamal? > > The other changes all make sense to me. 
> From hadi@cyberus.ca Wed Dec 15 06:08:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 06:08:19 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFE7oSn031356 for ; Wed, 15 Dec 2004 06:08:10 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CeZoK-00005k-3X for netdev@oss.sgi.com; Wed, 15 Dec 2004 09:07:24 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CeZo1-0008Px-Mm; Wed, 15 Dec 2004 09:07:05 -0500 Subject: Re: [2.6 patch] net/netlink/af_netlink.c: possible cleanups From: jamal Reply-To: hadi@cyberus.ca To: Adrian Bunk Cc: Alan Cox , Alexey Kuznetsov , netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041215004604.GH23151@stusta.de> References: <20041215004604.GH23151@stusta.de> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103119623.1077.71.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 15 Dec 2004 09:07:03 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12773 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev I think this should be left alone for now; what we need to do is deprecate NETLINK_DEV in case someone is still using it. Otherwise we could get rid of it totally, including what Adrian is deleting below. Any users of NETLINK_DEV? Maybe deleting the feature will get someone whining? 
;-> cheers, jamal On Tue, 2004-12-14 at 19:46, Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make the needlessly global function netlink_getsockbypid static > - remove the EXPORT_SYMBOL'ed but unused functions netlink_attach and > netlink_detach > > Please review whether these changes are correct or whether they conflict > with pending patches. > > > diffstat output: > include/linux/netlink.h | 3 --- > net/netlink/af_netlink.c | 28 +--------------------------- > 2 files changed, 1 insertion(+), 30 deletions(-) > > > Signed-off-by: Adrian Bunk > > --- linux-2.6.10-rc3-mm1-full/include/linux/netlink.h.old 2004-12-14 21:43:16.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/include/linux/netlink.h 2004-12-14 21:44:27.000000000 +0100 > @@ -116,8 +116,6 @@ > #define NETLINK_CREDS(skb) (&NETLINK_CB((skb)).creds) > > > -extern int netlink_attach(int unit, int (*function)(int,struct sk_buff *skb)); > -extern void netlink_detach(int unit); > extern int netlink_post(int unit, struct sk_buff *skb); > extern struct sock *netlink_kernel_create(int unit, void (*input)(struct sock *sk, int len)); > extern void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err); > @@ -129,7 +127,6 @@ > extern int netlink_unregister_notifier(struct notifier_block *nb); > > /* finegrained unicast helpers: */ > -struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid); > struct sock *netlink_getsockbyfilp(struct file *filp); > int netlink_attachskb(struct sock *sk, struct sk_buff *skb, int nonblock, long timeo); > void netlink_detachskb(struct sock *sk, struct sk_buff *skb); > --- linux-2.6.10-rc3-mm1-full/net/netlink/af_netlink.c.old 2004-12-14 21:43:31.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/netlink/af_netlink.c 2004-12-14 21:44:34.000000000 +0100 > @@ -546,7 +546,7 @@ > } > } > > -struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid) > +static struct sock *netlink_getsockbypid(struct sock *ssk, u32 pid) > { > int protocol = 
ssk->sk_protocol; > struct sock *sock; > @@ -1210,30 +1210,6 @@ > * Backward compatibility. > */ > > -int netlink_attach(int unit, int (*function)(int, struct sk_buff *skb)) > -{ > - struct sock *sk = netlink_kernel_create(unit, NULL); > - if (sk == NULL) > - return -ENOBUFS; > - nlk_sk(sk)->handler = function; > - write_lock_bh(&nl_emu_lock); > - netlink_kernel[unit] = sk->sk_socket; > - write_unlock_bh(&nl_emu_lock); > - return 0; > -} > - > -void netlink_detach(int unit) > -{ > - struct socket *sock; > - > - write_lock_bh(&nl_emu_lock); > - sock = netlink_kernel[unit]; > - netlink_kernel[unit] = NULL; > - write_unlock_bh(&nl_emu_lock); > - > - sock_release(sock); > -} > - > int netlink_post(int unit, struct sk_buff *skb) > { > struct socket *sock; > @@ -1522,7 +1498,5 @@ > EXPORT_SYMBOL(netlink_unregister_notifier); > > #if defined(CONFIG_NETLINK_DEV) || defined(CONFIG_NETLINK_DEV_MODULE) > -EXPORT_SYMBOL(netlink_attach); > -EXPORT_SYMBOL(netlink_detach); > EXPORT_SYMBOL(netlink_post); > #endif > > > From hadi@cyberus.ca Wed Dec 15 06:10:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 06:10:32 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFEA5Om031798 for ; Wed, 15 Dec 2004 06:10:25 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CeZqW-0002Ea-03 for netdev@oss.sgi.com; Wed, 15 Dec 2004 09:09:40 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CeZqS-0000VD-NY; Wed, 15 Dec 2004 09:09:37 -0500 Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf , Patrick McHardy Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041215130128.GK8493@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103119774.1077.74.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 15 Dec 2004 09:09:34 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12774 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Hopefully this makes my point to Patrick earlier about regression testing? ;-> Good effort Thomas. cheers, jamal On Wed, 2004-12-15 at 08:01, Thomas Graf wrote: > Dave, > > This should go in before 2.6.10. It fixes a forgotten case to provide > police backward compatibility statistics for old iproute2 versions > running on a new kernel with actions enabled. Should make distributions > happy with older iproute2 versions and all-included kernel configs > since they probably favour actions over plain policer. > > Testing results: > iproute2-2.4.7 on 2.6.10-rc3-bk8: > cls-police: police creation succeeded > cls-police: Sending 10 ICMP echo requests > cls-police: police dumping succeeded with output: > filter protocol ip pref 10 u32 > filter protocol ip pref 10 u32 fh 800: ht divisor 1 > filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:12 > police 3 action drop rate 2Kbit burst 10Kb mtu 2Kb > match 00010000/00ff0000 at 8 > Sent 420 bytes 10 pkts (dropped 0, overlimits 0) <-- This would have been missing > cls-police: police deletion succeeded > > iproute2-2.6.9 on 2.6.10-rc3-bk8: > ... 
> filter protocol ip pref 10 u32 > filter protocol ip pref 10 u32 fh 800: ht divisor 1 > filter protocol ip pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 10:12 (rule hit 10 success 10) > match 00010000/00ff0000 at 8 (success 10 ) > police 0x4 rate 2000bit burst 10Kb mtu 2Kb action drop > ref 1 bind 1 > Sent 420 bytes 10 pkts (dropped 0, overlimits 0) > ... > > (Same results for fw classifier) > > Signed-off-by: Thomas Graf > > --- linux-2.6.10-rc3-bk7.orig/net/sched/act_api.c 2004-12-14 14:24:34.000000000 +0100 > +++ linux-2.6.10-rc3-bk7/net/sched/act_api.c 2004-12-14 16:15:13.000000000 +0100 > @@ -418,6 +418,7 @@ > > int tcf_action_copy_stats (struct sk_buff *skb,struct tc_action *a) > { > + int err; > struct gnet_dump d; > struct tcf_act_hdr *h = a->priv; > > @@ -428,7 +429,14 @@ > if (NULL == h) > goto errout; > > - if (gnet_stats_start_copy(skb, TCA_ACT_STATS, h->stats_lock, &d) < 0) > + if (a->type == TCA_OLD_COMPAT) > + err = gnet_stats_start_copy_compat(skb, TCA_ACT_STATS, > + TCA_STATS, TCA_XSTATS, h->stats_lock, &d); > + else > + err = gnet_stats_start_copy(skb, TCA_ACT_STATS, > + h->stats_lock, &d); > + > + if (err < 0) > goto errout; > > if (NULL != a->ops && NULL != a->ops->get_stats) > > From thomas.spatzier@de.ibm.com Wed Dec 15 07:05:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 07:05:29 -0800 (PST) Received: from mtagate4.de.ibm.com (mtagate4.de.ibm.com [195.212.29.153]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFF50QC002082 for ; Wed, 15 Dec 2004 07:05:21 -0800 Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49]) by mtagate4.de.ibm.com (8.12.10/8.12.10) with ESMTP id iBFF4WvU181512 for ; Wed, 15 Dec 2004 15:04:32 GMT Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228]) by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBFF5CqS066160 for ; Wed, 15 Dec 2004 16:05:12 +0100 Received: from 
d12av02.megacenter.de.ibm.com (loopback [127.0.0.1]) by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id iBFF4VnU010547 for ; Wed, 15 Dec 2004 16:04:32 +0100 Received: from d12ml061.megacenter.de.ibm.com ([9.149.165.51]) by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id iBFF4VtI010538; Wed, 15 Dec 2004 16:04:31 +0100 In-Reply-To: <1103118626.1076.53.camel@jzny.localdomain> Subject: Re: [patch 4/10] s390: network driver. To: hadi@cyberus.ca Cc: "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, Paul Jakma X-Mailer: Lotus Notes Release 6.0.2CF1 June 9, 2003 Message-ID: From: Thomas Spatzier Date: Wed, 15 Dec 2004 16:03:08 +0100 X-MIMETrack: Serialize by Router on D12ML061/12/M/IBM(Release 6.51HF338 | June 21, 2004) at 15/12/2004 16:05:03 MIME-Version: 1.0 Content-type: text/plain; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12775 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: thomas.spatzier@de.ibm.com Precedence: bulk X-list: netdev jamal wrote on 15.12.2004 14:50:27: > When you netif_stop_queue you should never receive packets anymore > at the device level. If you receive any its a bug and you should drop > them and bitch violently. In other words i think what you have at the > moment is bandaid not the solution. When we do a netif_stop_queue, we do not get any more packets. So this behaviour is ok. The problem is that the socket write queues fill up then and get blocked as soon as they are full. > Can you describe how your driver uses the netif_start/stop/wake > interfaces? Before the patch we are talking about: When we detect a cable pull (or something like this) we call netif_stop_queue and set switch off the IFF_RUNNING flag. 
Then when we detect that the cable is plugged in again, we call netif_wake_queue and switch the IFF_RUNNING flag on. And with the patch: On cable pull we switch off IFF_RUNNING and call netif_carrier_off. We still get packets but drop them. When the cable is plugged in we switch on IFF_RUNNING and call netif_carrier_on. Regard, Thomas. From tgraf@suug.ch Wed Dec 15 07:23:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 07:23:23 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFFMtWN003210 for ; Wed, 15 Dec 2004 07:23:15 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id D1716F; Wed, 15 Dec 2004 16:22:09 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 794CF1C0EA; Wed, 15 Dec 2004 16:22:52 +0100 (CET) Date: Wed, 15 Dec 2004 16:22:52 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer Message-ID: <20041215152252.GM8493@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103119774.1077.74.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12776 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103119774.1077.74.camel@jzny.localdomain> 2004-12-15 09:09 > > Hopefully this makes my point to Patrick earlier about regression > testing? ;-> Good effort Thomas. 
I generally agree with you but I also understand Patrick's point. The
testsuite doesn't replace manual testing for explicit changes but rather
helps to catch the easily forgotten cases.

I made some significant changes to the testsuite and I'm still adding
lots of tests. I already found a few minor bugs, such as unhandled
policer results and unhandled error cases in action configuration, which
I will push patches for once 2.6.10 is out.

The architecture currently looks like this:

qdisc specific and generic tests
foreach classful-qdisc
    classifier specific and generic tests
    foreach classifier
        action/policer/indev tests

generic tests include:
 - add,dump,delete,add,delete basic tests
 - stress tests (adding/dumping/deleting a few hundred objects)
 - statistics tests

So for example indev is tested K*I*N*M times, where K is the number of
kernels available in my testing environment, I is the number of iproute2
versions, N is the number of classful qdiscs and M is the number of
classifiers supporting indev. This helps to catch bugs specific to a
certain module and can be used to check consistency between various
iproute2 modules.

/proc/config.gz gets split up and is provided to the test scripts as
environment variables so they can easily check whether the kernel is
capable of, for example, indev and, if not, check that it properly
returns an error code.

I will publish the latest testsuite as soon as I've finished writing the
most urgent tests.
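[Editorial sketch, not part of the original message.] The K*I*N*M count and
the environment-variable capability check described above can be modelled
in a few lines of C. This is a minimal sketch under stated assumptions: the
helper names (matrix_runs, option_enabled, kernel_has, expected_result) are
hypothetical, and the idea that each kernel option is exported under its
CONFIG_ name with the value "y" or "m" is an assumption about how the
split-up kernel config reaches the test scripts, not something the
testsuite is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Number of times one feature test runs: kernels x iproute2 versions x
 * classful qdiscs x classifiers supporting the feature (K*I*N*M). */
static size_t matrix_runs(size_t k, size_t i, size_t n, size_t m)
{
    return k * i * n * m;
}

/* Hypothetical: a config value counts as enabled when it is "y" (built in)
 * or "m" (module). NULL means the option was not exported at all. */
static int option_enabled(const char *value)
{
    return value != NULL &&
           (strcmp(value, "y") == 0 || strcmp(value, "m") == 0);
}

/* Hypothetical: look the option up in the environment, assuming the test
 * harness exported the kernel config as environment variables. */
static int kernel_has(const char *option)
{
    assert(option != NULL);
    return option_enabled(getenv(option));
}

/* What a test should expect from the kernel for a given feature. */
enum expectation { EXPECT_SUCCESS, EXPECT_ERROR };

/* A supported feature must work; an unsupported one must fail cleanly
 * with an error code rather than being silently accepted. */
static enum expectation expected_result(const char *option)
{
    return kernel_has(option) ? EXPECT_SUCCESS : EXPECT_ERROR;
}
```

A test for indev would call expected_result() with whatever option name the
harness exports for it and then compare the outcome of the tc command
against that expectation, so the same script is meaningful on kernels built
with and without the feature.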
From wensong@linux-vs.org Wed Dec 15 07:30:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 07:30:17 -0800 (PST) Received: from lb1.ctrip.com ([218.244.111.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFFTm8w003845 for ; Wed, 15 Dec 2004 07:30:09 -0800 Received: from penguin.linux-vs.org ([61.149.156.92]) by lb1.ctrip.com (8.12.10/8.12.10) with ESMTP id iBFFSBMh023877 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 15 Dec 2004 23:28:17 +0800 Received: from penguin.linux-vs.org (localhost.localdomain [127.0.0.1]) by penguin.linux-vs.org (8.12.8/8.12.8) with ESMTP id iBFFQop2001933; Wed, 15 Dec 2004 23:26:50 +0800 Received: from localhost (wensong@localhost) by penguin.linux-vs.org (8.12.8/8.12.8/Submit) with ESMTP id iBFFQaSj001929; Wed, 15 Dec 2004 23:26:38 +0800 X-Authentication-Warning: penguin.linux-vs.org: wensong owned process doing -bs Date: Wed, 15 Dec 2004 23:26:36 +0800 (CST) From: Wensong Zhang To: "David S. Miller" cc: Adrian Bunk , Julian Anastasov , lvs-users@linuxvirtualserver.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ipvs/: make some code static In-Reply-To: <20041215005801.GB11972@stusta.de> Message-ID: References: <20041215005801.GB11972@stusta.de> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12777 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wensong@linux-vs.org Precedence: bulk X-list: netdev Hi Dave, Please apply Adrian's patch. Thanks, Wensong On Wed, 15 Dec 2004, Adrian Bunk wrote: > The patch below makes some needlessly global code static. 
> > > diffstat output: > net/ipv4/ipvs/ip_vs_app.c | 2 +- > net/ipv4/ipvs/ip_vs_conn.c | 2 +- > net/ipv4/ipvs/ip_vs_ctl.c | 2 +- > net/ipv4/ipvs/ip_vs_proto.c | 4 ++-- > net/ipv4/ipvs/ip_vs_proto_icmp.c | 4 ++-- > 5 files changed, 7 insertions(+), 7 deletions(-) > > > Signed-off-by: Adrian Bunk > > --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_app.c.old 2004-12-14 05:15:21.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_app.c 2004-12-14 05:15:28.000000000 +0100 > @@ -65,7 +65,7 @@ > /* > * Allocate/initialize app incarnation and register it in proto apps. > */ > -int > +static int > ip_vs_app_inc_new(struct ip_vs_app *app, __u16 proto, __u16 port) > { > struct ip_vs_protocol *pp; > --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_conn.c.old 2004-12-14 05:15:44.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_conn.c 2004-12-14 05:15:51.000000000 +0100 > @@ -64,7 +64,7 @@ > } __attribute__((__aligned__(SMP_CACHE_BYTES))); > > /* lock array for conn table */ > -struct ip_vs_aligned_lock > +static struct ip_vs_aligned_lock > __ip_vs_conntbl_lock_array[CT_LOCKARRAY_SIZE] __cacheline_aligned; > > static inline void ct_read_lock(unsigned key) > --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_ctl.c.old 2004-12-14 05:17:03.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_ctl.c 2004-12-14 05:17:14.000000000 +0100 > @@ -62,7 +62,7 @@ > /* 1/rate drop and drop-entry variables */ > int ip_vs_drop_rate = 0; > int ip_vs_drop_counter = 0; > -atomic_t ip_vs_dropentry = ATOMIC_INIT(0); > +static atomic_t ip_vs_dropentry = ATOMIC_INIT(0); > > /* number of virtual services */ > static int ip_vs_num_services = 0; > --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto.c.old 2004-12-14 05:17:33.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto.c 2004-12-14 05:17:47.000000000 +0100 > @@ -45,7 +45,7 @@ > /* > * register an ipvs protocol > */ > -int register_ip_vs_protocol(struct ip_vs_protocol *pp) 
> +static int register_ip_vs_protocol(struct ip_vs_protocol *pp) > { > unsigned hash = IP_VS_PROTO_HASH(pp->protocol); > > @@ -62,7 +62,7 @@ > /* > * unregister an ipvs protocol > */ > -int unregister_ip_vs_protocol(struct ip_vs_protocol *pp) > +static int unregister_ip_vs_protocol(struct ip_vs_protocol *pp) > { > struct ip_vs_protocol **pp_p; > unsigned hash = IP_VS_PROTO_HASH(pp->protocol); > --- linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto_icmp.c.old 2004-12-14 05:18:02.000000000 +0100 > +++ linux-2.6.10-rc3-mm1-full/net/ipv4/ipvs/ip_vs_proto_icmp.c 2004-12-14 05:18:37.000000000 +0100 > @@ -22,7 +22,7 @@ > > static char * icmp_state_name_table[1] = { "ICMP" }; > > -struct ip_vs_conn * > +static struct ip_vs_conn * > icmp_conn_in_get(const struct sk_buff *skb, > struct ip_vs_protocol *pp, > const struct iphdr *iph, > @@ -49,7 +49,7 @@ > #endif > } > > -struct ip_vs_conn * > +static struct ip_vs_conn * > icmp_conn_out_get(const struct sk_buff *skb, > struct ip_vs_protocol *pp, > const struct iphdr *iph, > From kaber@trash.net Wed Dec 15 07:43:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 07:43:49 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFFhNYp004856 for ; Wed, 15 Dec 2004 07:43:43 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CebIG-000736-SU; Wed, 15 Dec 2004 16:42:24 +0100 Message-ID: <41C05B60.6030504@trash.net> Date: Wed, 15 Dec 2004 16:42:24 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Thomas Graf , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> In-Reply-To: <1103119774.1077.74.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12778 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: >Hopefully this makes my point to Patrick earlier about regression >testing? ;-> Good effort Thomas. > > I value regression testing, but I prefer to use my eyes first. Since this problem is not related to the policer oops fix it doesn't convince me that my time would have been well invested doing the tests you described. 
Regards Patrick From ilya.pashkovsky@gmail.com Wed Dec 15 08:10:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 08:10:20 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.199]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFG9qRr006235 for ; Wed, 15 Dec 2004 08:10:12 -0800 Received: by rproxy.gmail.com with SMTP id b11so1157439rne for ; Wed, 15 Dec 2004 08:09:24 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:return-path:date:to:subject:from:content-type:mime-version:message-id:user-agent; b=CTBvoC4PgIfYk1w8pf561N8PHOmB7r+xArfQwDjdS1JD1WEaPRlF8gJlTTfuYZbrwiqkKTJJW94fKOAegH3eiko1YL5D/1YQIq7IIv4mEXuDLuX4SieXEWhFWNY3ft3g83U3n38C4ISWC7qYeoqKKGFNBf8Lk8XY8piWxT/XvK8= Received: by 10.39.1.5 with SMTP id d5mr681794rni; Wed, 15 Dec 2004 08:08:22 -0800 (PST) Received: from sibox ([82.166.114.51]) by smtp.gmail.com with ESMTP id 58sm56011rnc.2004.12.15.08.08.18; Wed, 15 Dec 2004 08:08:22 -0800 (PST) Date: Wed, 15 Dec 2004 18:08:13 +0200 To: netdev@oss.sgi.com Subject: [Patch] SO_REUSEADDR fix in ipv4/ipv6 (n connects + 1 listen) From: "Ilya Pashkovsky" Content-Type: multipart/mixed; boundary=----------4cQblArTdRJef3s7HFoYxt MIME-Version: 1.0 Message-ID: User-Agent: Opera M2/7.60 (Win32, build 7364) X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12779 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ilya.pashkovsky@gmail.com Precedence: bulk X-list: netdev ------------4cQblArTdRJef3s7HFoYxt Content-Type: text/plain; format=flowed; delsp=yes; charset=iso-8859-15 Content-Transfer-Encoding: 8bit Hello, fellow developers. This is a proper documentation for the reuse patch, along with its adoption for 2.6.10rc3-mm1 patchset. 
The current implementation of SO_REUSEADDR in the kernel disallows binding
to a port if a listening socket is already bound to it. This limits the
options for firewall/NAT piercing for applications, and is non-standard.

The current limit was introduced because of the security risk of allowing
free binds to the same socket. Free binds allowed listen call preemption
and, since no unambiguous and secure fix was found, the option to reuse a
port after listen-bind was removed. (The problem is that of multiple
listeners bound to the same port. This patch does not fix it, since the
behaviour is ambiguous and not clearly defined, though a userid check
along with another flag telling the kernel to allow listen preemption
(SO_REUSEPORT could be reused for this purpose) may solve that.)

Also, the SO_REUSEADDR option value is boolean, but the code checks whether
it is greater than 1. This check was removed.

Multiple listeners were prevented by disallowing any reuse of a port with
a bound listener. This is implemented by a check in tcp_ipv4.c (and
tcp_ipv6.c) in function tcp_bind_conflict() (tcp_v6_bind_conflict() in
tcp_ipv6.c). There's a check for any existing sockets matching the source
port and having a TCP_LISTEN state. This check was changed to allow
binding unless the new socket is also in TCP_LISTEN state. This still
disallows multiple listeners but allows reuse of the port for outgoing
connections. This check is modified in both the ipv4 and ipv6 code.

Testing was done using normal workloads on 2 i386 Linux installations and
also the netcat test.

Test with netcat (uses SO_REUSEADDR by default):
host A: nc -v -l -p 9999
host B: nc -v -l -p 9000
host A: nc -v -p 9999 host.B.ip.addr 9000
host B: nc -v host.A.ip.addr 9999
Hosts A and B can be the same host.

Testing did not reveal any problems and networking software worked fine
in both ipv4 and ipv6 networks.
Signed-off-by: Ilya Pashkovsky -- -- ilya ------------4cQblArTdRJef3s7HFoYxt Content-Disposition: attachment; filename=patch-rc3-mm1-reuse Content-Type: application/octet-stream; name=patch-rc3-mm1-reuse Content-Transfer-Encoding: Base64 LS0tIGxpbnV4L25ldC9pcHY0L3RjcF9pcHY0LmMub3JpZwkyMDA0LTEyLTE1IDEy OjMwOjExLjcyMzEzMzAxNiArMDIwMAorKysgbGludXgvbmV0L2lwdjQvdGNwX2lw djQuYwkyMDA0LTEyLTE1IDEyOjMwOjMxLjIxMjE3MDIzMiArMDIwMApAQCAtNTAs NiArNTAsOCBAQAogICoJWU9TSElGVUpJIEhpZGVha2kgQFVTQUdJIGFuZDoJU3Vw cG9ydCBJUFY2X1Y2T05MWSBzb2NrZXQgb3B0aW9uLCB3aGljaAogICoJQWxleGV5 IEt1em5ldHNvdgkJYWxsb3cgYm90aCBJUHY0IGFuZCBJUHY2IHNvY2tldHMgdG8g YmluZAogICoJCQkJCWEgc2luZ2xlIHBvcnQgYXQgdGhlIHNhbWUgdGltZS4KKyAq CUlseWEgUGFzaGtvdnNreQkJOglhbGxvdyByZXVzZSBvZiBwb3J0IHdpdGggc2lu Z2xlIGxpc3RlbmVyCisgKgkJCQkJcmVtb3ZlIGJvb2xlYW4gc2tfcmV1c2UgPiAx IGNoZWNrCiAgKi8KIAogI2luY2x1ZGUgPGxpbnV4L2NvbmZpZy5oPgpAQCAtMTkz LDcgKzE5NSw4IEBAIHN0YXRpYyBpbmxpbmUgaW50IHRjcF9iaW5kX2NvbmZsaWN0 KHN0cnUKIAkJICAgICAhc2syLT5za19ib3VuZF9kZXZfaWYgfHwKIAkJICAgICBz ay0+c2tfYm91bmRfZGV2X2lmID09IHNrMi0+c2tfYm91bmRfZGV2X2lmKSkgewog CQkJaWYgKCFyZXVzZSB8fCAhc2syLT5za19yZXVzZSB8fAotCQkJICAgIHNrMi0+ c2tfc3RhdGUgPT0gVENQX0xJU1RFTikgeworCQkJICAgIChzazItPnNrX3N0YXRl ID09IFRDUF9MSVNURU4gJiYKKwkJCQlzay0+c2tfc3RhdGUgPT0gVENQX0xJU1RF TikpIHsKIAkJCQljb25zdCB1MzIgc2syX3Jjdl9zYWRkciA9IHRjcF92NF9yY3Zf c2FkZHIoc2syKTsKIAkJCQlpZiAoIXNrMl9yY3Zfc2FkZHIgfHwgIXNrX3Jjdl9z YWRkciB8fAogCQkJCSAgICBzazJfcmN2X3NhZGRyID09IHNrX3Jjdl9zYWRkcikK QEAgLTI1OSw4ICsyNjIsNiBAQCBzdGF0aWMgaW50IHRjcF92NF9nZXRfcG9ydChz dHJ1Y3Qgc29jayAqCiAJZ290byB0Yl9ub3RfZm91bmQ7CiB0Yl9mb3VuZDoKIAlp ZiAoIWhsaXN0X2VtcHR5KCZ0Yi0+b3duZXJzKSkgewotCQlpZiAoc2stPnNrX3Jl dXNlID4gMSkKLQkJCWdvdG8gc3VjY2VzczsKIAkJaWYgKHRiLT5mYXN0cmV1c2Ug PiAwICYmCiAJCSAgICBzay0+c2tfcmV1c2UgJiYgc2stPnNrX3N0YXRlICE9IFRD UF9MSVNURU4pIHsKIAkJCWdvdG8gc3VjY2VzczsKLS0tIGxpbnV4L25ldC9pcHY2 L3RjcF9pcHY2LmMub3JpZwkyMDA0LTEyLTE1IDEyOjMxOjQxLjg4ODQyNTgwOCAr MDIwMAorKysgbGludXgvbmV0L2lwdjYvdGNwX2lwdjYuYwkyMDA0LTEyLTE1IDAx 
OjQyOjA4LjYwMzI2NTcyOCArMDIwMApAQCAtMTgsNiArMTgsNyBAQAogICoJQWxl eGV5IEt1em5ldHNvdgkJYWxsb3cgYm90aCBJUHY0IGFuZCBJUHY2IHNvY2tldHMg dG8gYmluZAogICoJCQkJCWEgc2luZ2xlIHBvcnQgYXQgdGhlIHNhbWUgdGltZS4K ICAqCVlPU0hJRlVKSSBIaWRlYWtpIEBVU0FHSToJY29udmVydCAvcHJvYy9uZXQv dGNwNiB0byBzZXFfZmlsZS4KKyAqCUlseWEgUGFzaGtvdnNreQkJOglhbGxvdyBy ZXVzZSBvZiBwb3J0IHdpdGggc2luZ2xlIGxpc3RlbmVyCiAgKgogICoJVGhpcyBw cm9ncmFtIGlzIGZyZWUgc29mdHdhcmU7IHlvdSBjYW4gcmVkaXN0cmlidXRlIGl0 IGFuZC9vcgogICogICAgICBtb2RpZnkgaXQgdW5kZXIgdGhlIHRlcm1zIG9mIHRo ZSBHTlUgR2VuZXJhbCBQdWJsaWMgTGljZW5zZQpAQCAtMTExLDcgKzExMiw4IEBA IHN0YXRpYyBpbmxpbmUgaW50IHRjcF92Nl9iaW5kX2NvbmZsaWN0KHMKIAkJICAg ICAhc2syLT5za19ib3VuZF9kZXZfaWYgfHwKIAkJICAgICBzay0+c2tfYm91bmRf ZGV2X2lmID09IHNrMi0+c2tfYm91bmRfZGV2X2lmKSAmJgogCQkgICAgKCFzay0+ c2tfcmV1c2UgfHwgIXNrMi0+c2tfcmV1c2UgfHwKLQkJICAgICBzazItPnNrX3N0 YXRlID09IFRDUF9MSVNURU4pICYmCisJCSAgICAgKHNrMi0+c2tfc3RhdGUgPT0g VENQX0xJU1RFTiAmJgorCQkJc2stPnNrX3N0YXRlID09IFRDUF9MSVNURU4pKSAm JgogCQkgICAgIGlwdjZfcmN2X3NhZGRyX2VxdWFsKHNrLCBzazIpKQogCQkJYnJl YWs7CiAJfQo= ------------4cQblArTdRJef3s7HFoYxt-- From zaitcev@redhat.com Wed Dec 15 13:14:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 13:14:38 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFLEAIq023685 for ; Wed, 15 Dec 2004 13:14:30 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBFLDi3k024796; Wed, 15 Dec 2004 16:13:44 -0500 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBFLDir07952; Wed, 15 Dec 2004 16:13:44 -0500 Received: from lembas.zaitcev.lan (vpn50-27.rdu.redhat.com [172.16.50.27]) by devserv.devel.redhat.com (8.12.11/8.12.10) with SMTP id iBFLDJfK006568; Wed, 15 Dec 2004 16:13:20 -0500 Date: Wed, 15 Dec 2004 13:13:42 -0800 From: Pete Zaitcev To: 
netdev@oss.sgi.com Cc: zaitcev@redhat.com, linux-kernel@vger.kernel.org, , davem@redhat.com Subject: udp_poll breaks vpnc Message-ID: <20041215131342.21768388@lembas.zaitcev.lan> Organization: Red Hat, Inc. X-Mailer: Sylpheed-Claws 0.9.12cvs126.2 (GTK+ 2.4.14; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12780 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: zaitcev@redhat.com Precedence: bulk X-list: netdev Hi, Guys: I found that the attached patch breaks VPNC. By looking at strace, it never gets any poll events about arriving encrypted data. It may be a bug in VPNC, but this is a rather old binary which I used even on 2.4... Unfortunately, I cannot investigate it closer at this time, because of, uhh, some work commitments. Sorry. I do not insist that we back this out, but let it be known that something was broken. 
Cheers, -- Pete diff -urp -X dontdiff linux-2.6.10-rc2-bk8-usb/include/net/udp.h linux-2.6.10-rc3-usb/include/net/udp.h --- linux-2.6.10-rc2-bk8-usb/include/net/udp.h 2004-08-19 17:16:10.000000000 -0700 +++ linux-2.6.10-rc3-usb/include/net/udp.h 2004-12-15 02:01:19.000000000 -0800 @@ -71,6 +71,8 @@ extern int udp_sendmsg(struct kiocb *ioc extern int udp_rcv(struct sk_buff *skb); extern int udp_ioctl(struct sock *sk, int cmd, unsigned long arg); extern int udp_disconnect(struct sock *sk, int flags); +extern unsigned int udp_poll(struct file *file, struct socket *sock, + poll_table *wait); DECLARE_SNMP_STAT(struct udp_mib, udp_statistics); #define UDP_INC_STATS(field) SNMP_INC_STATS(udp_statistics, field) diff -urp -X dontdiff linux-2.6.10-rc2-bk8-usb/net/ipv4/af_inet.c linux-2.6.10-rc3-usb/net/ipv4/af_inet.c --- linux-2.6.10-rc2-bk8-usb/net/ipv4/af_inet.c 2004-11-23 09:54:15.000000000 -0800 +++ linux-2.6.10-rc3-usb/net/ipv4/af_inet.c 2004-12-15 02:01:21.000000000 -0800 @@ -809,7 +809,7 @@ struct proto_ops inet_dgram_ops = { .socketpair = sock_no_socketpair, .accept = sock_no_accept, .getname = inet_getname, - .poll = datagram_poll, + .poll = udp_poll, .ioctl = inet_ioctl, .listen = sock_no_listen, .shutdown = inet_shutdown, diff -urp -X dontdiff linux-2.6.10-rc2-bk8-usb/net/ipv4/udp.c linux-2.6.10-rc3-usb/net/ipv4/udp.c --- linux-2.6.10-rc2-bk8-usb/net/ipv4/udp.c 2004-11-23 09:54:15.000000000 -0800 +++ linux-2.6.10-rc3-usb/net/ipv4/udp.c 2004-12-15 02:01:21.000000000 -0800 @@ -1303,6 +1303,52 @@ static int udp_getsockopt(struct sock *s return 0; } +/** + * udp_poll - wait for a UDP event. + * @file - file struct + * @sock - socket + * @wait - poll table + * + * This is same as datagram poll, except for the special case of + * blocking sockets. If application is using a blocking fd + * and a packet with checksum error is in the queue; + * then it could get return from select indicating data available + * but then block when reading it. 
Add special case code + * to work around these arguably broken applications. + */ +unsigned int udp_poll(struct file *file, struct socket *sock, poll_table *wait) +{ + unsigned int mask = datagram_poll(file, sock, wait); + struct sock *sk = sock->sk; + + /* Check for false positives due to checksum errors */ + if ( (mask & POLLRDNORM) && + !(file->f_flags & O_NONBLOCK) && + !(sk->sk_shutdown & RCV_SHUTDOWN)){ + struct sk_buff_head *rcvq = &sk->sk_receive_queue; + struct sk_buff *skb; + + spin_lock_irq(&rcvq->lock); + while ((skb = skb_peek(rcvq)) != NULL) { + if (udp_checksum_complete(skb)) { + UDP_INC_STATS_BH(UDP_MIB_INERRORS); + __skb_unlink(skb, rcvq); + kfree_skb(skb); + } else { + skb->ip_summed = CHECKSUM_UNNECESSARY; + break; + } + } + spin_unlock_irq(&rcvq->lock); + + /* nothing to see, move along */ + if (skb == NULL) + mask &= ~(POLLIN | POLLRDNORM); + } + + return mask; + +} struct proto udp_prot = { .name = "UDP", @@ -1517,6 +1563,7 @@ EXPORT_SYMBOL(udp_ioctl); EXPORT_SYMBOL(udp_port_rover); EXPORT_SYMBOL(udp_prot); EXPORT_SYMBOL(udp_sendmsg); +EXPORT_SYMBOL(udp_poll); #ifdef CONFIG_PROC_FS EXPORT_SYMBOL(udp_proc_register); diff -urp -X dontdiff linux-2.6.10-rc2-bk8-usb/net/ipv6/af_inet6.c linux-2.6.10-rc3-usb/net/ipv6/af_inet6.c --- linux-2.6.10-rc2-bk8-usb/net/ipv6/af_inet6.c 2004-11-23 09:54:15.000000000 -0800 +++ linux-2.6.10-rc3-usb/net/ipv6/af_inet6.c 2004-12-15 02:01:21.000000000 -0800 @@ -501,7 +501,7 @@ struct proto_ops inet6_dgram_ops = { .socketpair = sock_no_socketpair, /* a do nothing */ .accept = sock_no_accept, /* a do nothing */ .getname = inet6_getname, - .poll = datagram_poll, /* ok */ + .poll = udp_poll, /* ok */ .ioctl = inet6_ioctl, /* must change */ .listen = sock_no_listen, /* ok */ .shutdown = inet_shutdown, /* ok */ From davem@davemloft.net Wed Dec 15 13:19:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 13:19:28 -0800 (PST) Received: from cheetah.davemloft.net 
(mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFLJ1hQ024239 for ; Wed, 15 Dec 2004 13:19:21 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CegTF-0007px-00; Wed, 15 Dec 2004 13:14:05 -0800 Date: Wed, 15 Dec 2004 13:14:05 -0800 From: "David S. Miller" To: Pete Zaitcev Cc: netdev@oss.sgi.com, zaitcev@redhat.com, linux-kernel@vger.kernel.org, shemminger@osdl.org, davem@redhat.com Subject: Re: udp_poll breaks vpnc Message-Id: <20041215131405.7f5308a5.davem@davemloft.net> In-Reply-To: <20041215131342.21768388@lembas.zaitcev.lan> References: <20041215131342.21768388@lembas.zaitcev.lan> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12781 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 13:13:42 -0800 Pete Zaitcev wrote: > I found that the attached patch breaks VPNC. By looking at strace, it never > gets any poll events about arriving encrypted data. It may be a bug in VPNC, > but this is a rather old binary which I used even on 2.4... Fixed by a followon patch which is in BK as of 2 weeks ago. Basically, we were using UDP poll for RAW sockets by accident. 
From kernel@linuxace.com Wed Dec 15 13:20:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 13:20:49 -0800 (PST) Received: from linuxace.com (adsl-67-120-171-161.dsl.lsan03.pacbell.net [67.120.171.161]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBFLKLhM024516 for ; Wed, 15 Dec 2004 13:20:41 -0800 Received: (qmail 17979 invoked by uid 0); 15 Dec 2004 21:19:53 -0000 Date: Wed, 15 Dec 2004 13:19:53 -0800 From: Phil Oester To: Pete Zaitcev Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org, shemminger@osdl.org, davem@redhat.com Subject: Re: udp_poll breaks vpnc Message-ID: <20041215211953.GA17945@linuxace.com> References: <20041215131342.21768388@lembas.zaitcev.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041215131342.21768388@lembas.zaitcev.lan> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12782 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kernel@linuxace.com Precedence: bulk X-list: netdev On Wed, Dec 15, 2004 at 01:13:42PM -0800, Pete Zaitcev wrote: > Hi, Guys: > > I found that the attached patch breaks VPNC. By looking at strace, it never > gets any poll events about arriving encrypted data. It may be a bug in VPNC, > but this is a rather old binary which I used even on 2.4... This was fixed post-rc3. 
Phil From acme@conectiva.com.br Wed Dec 15 15:41:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 15:41:27 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBFNeocJ029228 for ; Wed, 15 Dec 2004 15:41:11 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1Ceilu-00012P-00; Wed, 15 Dec 2004 21:41:30 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 523B277AF7; Wed, 15 Dec 2004 21:40:09 -0200 (BRST) Message-ID: <41C0BDCA.1000305@conectiva.com.br> Date: Wed, 15 Dec 2004 20:42:18 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [RFC] struct stream_sock (aka struct sock shrink-me-harder) Content-Type: multipart/mixed; boundary="------------050301010404020702080302" X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12783 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050301010404020702080302 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi Dave, It compiles, boots, survives some testing, but is incomplete yet in the old protos that I have to convert to per protocol slabcaches, which will also allow us to get rid of sk_protinfo. 
Some cases are more tricky, like bluetooth, which uses both a per protocol
slabcache _and_ sk->sk_protinfo. Probably the best thing to do in this
case is to make bluetooth use sk->sk_prot, and that is what I plan to do
eventually to all families, to get rid of sk->sk_slab (i.e. when all
families use sk->sk_prot we can just use sk->sk_prot->slab).

I think that besides the savings per struct sock instance on non
stream/seqpacket protos this makes things clearer by separating the
stream stuff from the bare struct sock that is needed for all families.

Anyway, what do you think? Any member in struct stream_sock that should
not have been moved from struct sock?

Best Regards,

- Arnaldo

--------------050301010404020702080302
Content-Type: text/plain; name="stream_sock.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="stream_sock.patch"

You can import this changeset into BK by piping this whole message to:
'| bk receive [path to repository]' or apply the patch as usual.

===================================================================

ChangeSet@1.2175, 2004-12-15 20:31:22-02:00, acme@conectiva.com.br
  [SOCK] move stream specific stuff from struct sock to struct stream_sock

  [root@oldpandora ~]# diff -u before after
  --- before 2004-12-15 05:13:13.000000000 -0200
  +++ after 2004-12-15 05:13:21.000000000 -0200
  @@ -1,8 +1,8 @@
  -rawv6_sock  640
  -udpv6_sock  608
  +rawv6_sock  608
  +udpv6_sock  576
   tcpv6_sock 1120
   unix_sock   384
  -raw_sock    480
  -udp_sock    480
  +raw_sock    448
  +udp_sock    448
   tcp_sock   1024
  -sock        320
  +sock        288
  [root@oldpandora ~]#

  Enough said?
:-) Signed-off-by: Arnaldo Carvalho de Melo include/linux/atmdev.h | 18 ++ include/linux/ip.h | 7 - include/linux/ipv6.h | 34 ++--- include/linux/tcp.h | 11 + include/linux/udp.h | 9 - include/net/af_unix.h | 2 include/net/ax25.h | 15 ++ include/net/bluetooth/bluetooth.h | 13 + include/net/dn.h | 14 +- include/net/sctp/sctp.h | 23 ++- include/net/sock.h | 204 ------------------------------ include/net/stream_sock.h | 252 ++++++++++++++++++++++++++++++++++++++ include/net/tcp.h | 22 +-- include/net/tcp_ecn.h | 6 net/atm/common.c | 48 ++++--- net/atm/signaling.c | 6 net/atm/svc.c | 4 net/ax25/af_ax25.c | 55 ++++---- net/ax25/ax25_in.c | 13 - net/bluetooth/af_bluetooth.c | 7 - net/bluetooth/l2cap.c | 14 +- net/bluetooth/rfcomm/sock.c | 14 +- net/bluetooth/sco.c | 9 - net/core/sock.c | 24 ++- net/core/stream.c | 35 ++--- net/decnet/af_decnet.c | 8 - net/decnet/dn_nsp_in.c | 6 net/decnet/dn_nsp_out.c | 2 net/ipv4/af_inet.c | 14 +- net/ipv4/ip_output.c | 19 +- net/ipv4/tcp.c | 47 +++---- net/ipv4/tcp_diag.c | 4 net/ipv4/tcp_input.c | 30 ++-- net/ipv4/tcp_ipv4.c | 19 +- net/ipv4/tcp_minisocks.c | 13 + net/ipv4/tcp_output.c | 57 ++++---- net/ipv4/tcp_timer.c | 6 net/ipv6/af_inet6.c | 3 net/ipv6/ip6_output.c | 20 +-- net/ipv6/tcp_ipv6.c | 24 +-- net/irda/af_irda.c | 6 net/llc/af_llc.c | 3 net/netrom/af_netrom.c | 11 - net/rose/af_rose.c | 11 - net/sctp/associola.c | 4 net/sctp/endpointola.c | 2 net/sctp/sm_statefuns.c | 2 net/sctp/socket.c | 10 - net/unix/af_unix.c | 16 +- net/wanrouter/af_wanpipe.c | 4 net/x25/af_x25.c | 11 - 51 files changed, 694 insertions(+), 517 deletions(-) diff -Nru a/include/linux/atmdev.h b/include/linux/atmdev.h --- a/include/linux/atmdev.h 2004-12-15 20:32:26 -02:00 +++ b/include/linux/atmdev.h 2004-12-15 20:32:26 -02:00 @@ -11,6 +11,8 @@ #include #include #include +#include +#include #define ESI_LEN 6 @@ -30,9 +32,6 @@ #define ATM_DS3_PCR (8000*12) /* DS3: 12 cells in a 125 usec time slot */ -#define atm_sk(__sk) ((struct atm_vcc 
*)(__sk)->sk_protinfo) -#define ATM_SD(s) (atm_sk((s)->sk)) - #define __AAL_STAT_ITEMS \ __HANDLE_ITEM(tx); /* TX okay */ \ @@ -310,6 +309,19 @@ /* by CLIP and sch_atm. */ }; +struct atm_sock { + struct sock sk; + struct stream_sock *pssk; + struct stream_sock ssk; + struct atm_vcc vcc; +}; + +static inline struct atm_vcc *atm_sk(const struct sock *sk) +{ + return &((struct atm_sock *)sk)->vcc; +} + +#define ATM_SD(s) (atm_sk((s)->sk)) struct atm_dev_addr { struct sockaddr_atmsvc addr; /* ATM address */ diff -Nru a/include/linux/ip.h b/include/linux/ip.h --- a/include/linux/ip.h 2004-12-15 20:32:26 -02:00 +++ b/include/linux/ip.h 2004-12-15 20:32:26 -02:00 @@ -150,11 +150,12 @@ /* WARNING: don't change the layout of the members in inet_sock! */ struct inet_sock { - struct sock sk; + struct sock sk; + struct stream_sock *pssk; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; + struct ipv6_pinfo *pinet6; #endif - struct inet_opt inet; + struct inet_opt inet; }; static inline struct inet_opt * inet_sk(const struct sock *__sk) diff -Nru a/include/linux/ipv6.h b/include/linux/ipv6.h --- a/include/linux/ipv6.h 2004-12-15 20:32:26 -02:00 +++ b/include/linux/ipv6.h 2004-12-15 20:32:26 -02:00 @@ -256,27 +256,31 @@ /* WARNING: don't change the layout of the members in {raw,udp,tcp}6_sock! 
*/ struct raw6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; - struct raw6_opt raw6; - struct ipv6_pinfo inet6; + struct sock sk; + struct stream_sock *pssk; + struct ipv6_pinfo *pinet6; + struct inet_opt inet; + struct raw6_opt raw6; + struct ipv6_pinfo inet6; }; struct udp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; - struct udp_opt udp; - struct ipv6_pinfo inet6; + struct sock sk; + struct stream_sock *pssk; + struct ipv6_pinfo *pinet6; + struct inet_opt inet; + struct udp_opt udp; + struct ipv6_pinfo inet6; }; struct tcp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; - struct tcp_opt tcp; - struct ipv6_pinfo inet6; + struct sock sk; + struct stream_sock *pssk; + struct ipv6_pinfo *pinet6; + struct inet_opt inet; + struct tcp_opt tcp; + struct stream_sock ssk; + struct ipv6_pinfo inet6; }; static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk) diff -Nru a/include/linux/tcp.h b/include/linux/tcp.h --- a/include/linux/tcp.h 2004-12-15 20:32:26 -02:00 +++ b/include/linux/tcp.h 2004-12-15 20:32:26 -02:00 @@ -196,6 +196,7 @@ #include #include #include +#include /* This defines a selective acknowledgement block. */ struct tcp_sack_block { @@ -440,12 +441,14 @@ /* WARNING: don't change the layout of the members in tcp_sock! */ struct tcp_sock { - struct sock sk; + struct sock sk; + struct stream_sock *pssk; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; + struct ipv6_pinfo *pinet6; #endif - struct inet_opt inet; - struct tcp_opt tcp; + struct inet_opt inet; + struct tcp_opt tcp; + struct stream_sock ssk; }; static inline struct tcp_opt * tcp_sk(const struct sock *__sk) diff -Nru a/include/linux/udp.h b/include/linux/udp.h --- a/include/linux/udp.h 2004-12-15 20:32:26 -02:00 +++ b/include/linux/udp.h 2004-12-15 20:32:26 -02:00 @@ -53,12 +53,13 @@ /* WARNING: don't change the layout of the members in udp_sock! 
*/ struct udp_sock { - struct sock sk; + struct sock sk; + struct stream_sock *pssk; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; + struct ipv6_pinfo *pinet6; #endif - struct inet_opt inet; - struct udp_opt udp; + struct inet_opt inet; + struct udp_opt udp; }; static inline struct udp_opt * udp_sk(const struct sock *__sk) diff -Nru a/include/net/af_unix.h b/include/net/af_unix.h --- a/include/net/af_unix.h 2004-12-15 20:32:26 -02:00 +++ b/include/net/af_unix.h 2004-12-15 20:32:26 -02:00 @@ -62,6 +62,8 @@ struct unix_sock { /* WARNING: sk has to be the first member */ struct sock sk; + struct stream_sock *pssk; + struct stream_sock ssk; struct unix_address *addr; struct dentry *dentry; struct vfsmount *mnt; diff -Nru a/include/net/ax25.h b/include/net/ax25.h --- a/include/net/ax25.h 2004-12-15 20:32:26 -02:00 +++ b/include/net/ax25.h 2004-12-15 20:32:26 -02:00 @@ -12,6 +12,8 @@ #include #include #include +#include +#include #define AX25_T1CLAMPLO 1 #define AX25_T1CLAMPHI (30 * HZ) @@ -203,7 +205,17 @@ atomic_t refcount; } ax25_cb; -#define ax25_sk(__sk) ((ax25_cb *)(__sk)->sk_protinfo) +struct ax25_sock { + struct sock sk; + struct stream_sock *pssk; + struct stream_sock ssk; + struct ax25_cb cb; +}; + +static inline struct ax25_cb *ax25_sk(const struct sock *sk) +{ + return &((struct ax25_sock *)sk)->cb; +} #define ax25_for_each(__ax25, node, list) \ hlist_for_each_entry(__ax25, node, list, ax25_node) @@ -230,6 +242,7 @@ extern void ax25_send_to_raw(ax25_address *, struct sk_buff *, int); extern void ax25_destroy_socket(ax25_cb *); extern ax25_cb *ax25_create_cb(void); +extern void ax25_init_cb(ax25_cb *ax25); extern void ax25_fillin_cb(ax25_cb *, ax25_dev *); extern int ax25_create(struct socket *, int); extern struct sock *ax25_make_new(struct sock *, struct ax25_dev *); diff -Nru a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h --- a/include/net/bluetooth/bluetooth.h 2004-12-15 20:32:26 -02:00 +++ 
b/include/net/bluetooth/bluetooth.h	2004-12-15 20:32:26 -02:00
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 
 #ifndef AF_BLUETOOTH
 #define AF_BLUETOOTH 31
@@ -111,11 +112,13 @@
 #define bt_sk(__sk) ((struct bt_sock *) __sk)
 
 struct bt_sock {
-	struct sock sk;
-	bdaddr_t src;
-	bdaddr_t dst;
-	struct list_head accept_q;
-	struct sock *parent;
+	struct sock sk;
+	struct stream_sock *pssk;
+	struct stream_sock ssk;
+	bdaddr_t src;
+	bdaddr_t dst;
+	struct list_head accept_q;
+	struct sock *parent;
 };
 
 struct bt_sock_list {
diff -Nru a/include/net/dn.h b/include/net/dn.h
--- a/include/net/dn.h	2004-12-15 20:32:26 -02:00
+++ b/include/net/dn.h	2004-12-15 20:32:26 -02:00
@@ -2,6 +2,7 @@
 #define _NET_DN_H
 
 #include
+#include
 #include
 
 typedef unsigned short dn_address;
@@ -133,7 +134,18 @@
 
 };
 
-#define DN_SK(__sk) ((struct dn_scp *)(__sk)->sk_protinfo)
+/* WARNING: don't change the layout of the members in decnet_sock! */
+struct decnet_sock {
+	struct sock sk;
+	struct stream_sock *pssk;
+	struct dn_scp scp;
+	struct stream_sock ssk;
+};
+
+static inline struct dn_scp *DN_SK(const struct sock *sk)
+{
+	return &((struct decnet_sock *)sk)->scp;
+}
 
 /*
  * src,dst : Source and Destination DECnet addresses
diff -Nru a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
--- a/include/net/sctp/sctp.h	2004-12-15 20:32:26 -02:00
+++ b/include/net/sctp/sctp.h	2004-12-15 20:32:26 -02:00
@@ -88,6 +88,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -584,21 +585,25 @@
 /* WARNING: Do not change the layout of the members in sctp_sock!
*/ struct sctp_sock { - struct sock sk; + struct sock sk; + struct stream_sock *pssk; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; + struct ipv6_pinfo *pinet6; #endif /* CONFIG_IPV6 */ - struct inet_opt inet; - struct sctp_opt sctp; + struct inet_opt inet; + struct sctp_opt sctp; + struct stream_sock ssk; }; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) struct sctp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; - struct sctp_opt sctp; - struct ipv6_pinfo inet6; + struct sock sk; + struct stream_sock *pssk; + struct ipv6_pinfo *pinet6; + struct inet_opt inet; + struct sctp_opt sctp; + struct stream_sock ssk; + struct ipv6_pinfo inet6; }; #endif /* CONFIG_IPV6 */ diff -Nru a/include/net/sock.h b/include/net/sock.h --- a/include/net/sock.h 2004-12-15 20:32:26 -02:00 +++ b/include/net/sock.h 2004-12-15 20:32:26 -02:00 @@ -130,26 +130,18 @@ * @sk_wmem_alloc - transmit queue bytes committed * @sk_write_queue - Packet sending queue * @sk_omem_alloc - "o" is "option" or "other" - * @sk_wmem_queued - persistent queue size - * @sk_forward_alloc - space allocated forward * @sk_allocation - allocation mode * @sk_sndbuf - size of send buffer in bytes * @sk_flags - %SO_LINGER (l_onoff), %SO_BROADCAST, %SO_KEEPALIVE, %SO_OOBINLINE settings * @sk_no_check - %SO_NO_CHECK setting, wether or not checkup packets * @sk_debug - %SO_DEBUG setting * @sk_rcvtstamp - %SO_TIMESTAMP setting - * @sk_no_largesend - whether to sent large segments or not - * @sk_route_caps - route capabilities (e.g. %NETIF_F_TSO) - * @sk_lingertime - %SO_LINGER l_linger setting - * @sk_hashent - hash entry in several tables (e.g. 
tcp_ehash) * @sk_backlog - always used with the per-socket spinlock held * @sk_callback_lock - used with the callbacks in the end of this struct * @sk_error_queue - rarely used * @sk_prot - protocol handlers inside a network family * @sk_err - last error * @sk_err_soft - errors that don't cause failure but are the cause of a persistent failure not just 'timed out' - * @sk_ack_backlog - current listen backlog - * @sk_max_ack_backlog - listen backlog set in listen() * @sk_priority - %SO_PRIORITY setting * @sk_type - socket type (%SOCK_STREAM, etc) * @sk_localroute - route locally only, %SO_DONTROUTE setting @@ -166,11 +158,6 @@ * @sk_socket - Identd and reporting IO signals * @sk_user_data - RPC layer private data * @sk_owner - module that owns this socket - * @sk_sndmsg_page - cached page for sendmsg - * @sk_sndmsg_off - cached offset for sendmsg - * @sk_send_head - front of stuff to transmit - * @sk_write_pending - a write to stream socket waits to start - * @sk_queue_shrunk - write queue has been shrunk recently * @sk_state_change - callback to indicate change in the state of the sock * @sk_data_ready - callback to indicate there is data to be processed * @sk_write_space - callback to indicate there is bf sending space available @@ -206,18 +193,13 @@ atomic_t sk_wmem_alloc; struct sk_buff_head sk_write_queue; atomic_t sk_omem_alloc; - int sk_wmem_queued; - int sk_forward_alloc; unsigned int sk_allocation; int sk_sndbuf; unsigned long sk_flags; char sk_no_check; unsigned char sk_debug; unsigned char sk_rcvtstamp; - unsigned char sk_no_largesend; - int sk_route_caps; - unsigned long sk_lingertime; - int sk_hashent; + /* one byte hole, try to pack */ /* * The backlog queue is special, it is always used with * the per-socket spinlock held and requires low latency @@ -232,8 +214,6 @@ struct proto *sk_prot; int sk_err, sk_err_soft; - unsigned short sk_ack_backlog; - unsigned short sk_max_ack_backlog; __u32 sk_priority; unsigned short sk_type; unsigned char 
sk_localroute; @@ -250,13 +230,7 @@ struct socket *sk_socket; void *sk_user_data; struct module *sk_owner; - struct page *sk_sndmsg_page; - __u32 sk_sndmsg_off; - struct sk_buff *sk_send_head; - int sk_write_pending; void *sk_security; - __u8 sk_queue_shrunk; - /* three bytes hole, try to pack */ void (*sk_state_change)(struct sock *sk); void (*sk_data_ready)(struct sock *sk, int bytes); void (*sk_write_space)(struct sock *sk); @@ -408,59 +382,6 @@ return test_bit(flag, &sk->sk_flags); } -static inline void sk_acceptq_removed(struct sock *sk) -{ - sk->sk_ack_backlog--; -} - -static inline void sk_acceptq_added(struct sock *sk) -{ - sk->sk_ack_backlog++; -} - -static inline int sk_acceptq_is_full(struct sock *sk) -{ - return sk->sk_ack_backlog > sk->sk_max_ack_backlog; -} - -/* - * Compute minimal free write space needed to queue new packets. - */ -static inline int sk_stream_min_wspace(struct sock *sk) -{ - return sk->sk_wmem_queued / 2; -} - -static inline int sk_stream_wspace(struct sock *sk) -{ - return sk->sk_sndbuf - sk->sk_wmem_queued; -} - -extern void sk_stream_write_space(struct sock *sk); - -static inline int sk_stream_memory_free(struct sock *sk) -{ - return sk->sk_wmem_queued < sk->sk_sndbuf; -} - -extern void sk_stream_rfree(struct sk_buff *skb); - -static inline void sk_stream_set_owner_r(struct sk_buff *skb, struct sock *sk) -{ - skb->sk = sk; - skb->destructor = sk_stream_rfree; - atomic_add(skb->truesize, &sk->sk_rmem_alloc); - sk->sk_forward_alloc -= skb->truesize; -} - -static inline void sk_stream_free_skb(struct sock *sk, struct sk_buff *skb) -{ - sk->sk_queue_shrunk = 1; - sk->sk_wmem_queued -= skb->truesize; - sk->sk_forward_alloc += skb->truesize; - __kfree_skb(skb); -} - /* The per-socket spinlock must be held here. 
*/ #define sk_add_backlog(__sk, __skb) \ do { if (!(__sk)->sk_backlog.tail) { \ @@ -485,12 +406,6 @@ rc; \ }) -extern int sk_stream_wait_connect(struct sock *sk, long *timeo_p); -extern int sk_stream_wait_memory(struct sock *sk, long *timeo_p); -extern void sk_stream_wait_close(struct sock *sk, long timeo_p); -extern int sk_stream_error(struct sock *sk, int flags, int err); -extern void sk_stream_kill_queues(struct sock *sk); - extern int sk_wait_data(struct sock *sk, long *timeo); /* Networking protocol blocks we attach to sockets. @@ -656,37 +571,6 @@ return &container_of(socket, struct socket_alloc, socket)->vfs_inode; } -extern void __sk_stream_mem_reclaim(struct sock *sk); -extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind); - -#define SK_STREAM_MEM_QUANTUM ((int)PAGE_SIZE) - -static inline int sk_stream_pages(int amt) -{ - return (amt + SK_STREAM_MEM_QUANTUM - 1) / SK_STREAM_MEM_QUANTUM; -} - -static inline void sk_stream_mem_reclaim(struct sock *sk) -{ - if (sk->sk_forward_alloc >= SK_STREAM_MEM_QUANTUM) - __sk_stream_mem_reclaim(sk); -} - -static inline void sk_stream_writequeue_purge(struct sock *sk) -{ - struct sk_buff *skb; - - while ((skb = __skb_dequeue(&sk->sk_write_queue)) != NULL) - sk_stream_free_skb(sk, skb); - sk_stream_mem_reclaim(sk); -} - -static inline int sk_stream_rmem_schedule(struct sock *sk, struct sk_buff *skb) -{ - return (int)skb->truesize <= sk->sk_forward_alloc || - sk_stream_mem_schedule(sk, skb->truesize, 1); -} - /* Used by processes to "lock" a socket state, so that * interrupts and bottom half handlers won't change it * from under us. 
It essentially blocks any incoming @@ -1009,35 +893,6 @@ return dst; } -static inline void sk_charge_skb(struct sock *sk, struct sk_buff *skb) -{ - sk->sk_wmem_queued += skb->truesize; - sk->sk_forward_alloc -= skb->truesize; -} - -static inline int skb_copy_to_page(struct sock *sk, char __user *from, - struct sk_buff *skb, struct page *page, - int off, int copy) -{ - if (skb->ip_summed == CHECKSUM_NONE) { - int err = 0; - unsigned int csum = csum_and_copy_from_user(from, - page_address(page) + off, - copy, 0, &err); - if (err) - return err; - skb->csum = csum_block_add(skb->csum, csum, skb->len); - } else if (copy_from_user(page_address(page) + off, from, copy)) - return -EFAULT; - - skb->len += copy; - skb->data_len += copy; - skb->truesize += copy; - sk->sk_wmem_queued += copy; - sk->sk_forward_alloc -= copy; - return 0; -} - /* * Queue a received datagram if it will fit. Stream and sequenced * protocols can't normally use this as they need to fit buffers in @@ -1149,63 +1004,6 @@ if (sk->sk_socket && sk->sk_socket->fasync_list) sock_wake_async(sk->sk_socket, how, band); } - -#define SOCK_MIN_SNDBUF 2048 -#define SOCK_MIN_RCVBUF 256 - -static inline void sk_stream_moderate_sndbuf(struct sock *sk) -{ - if (!(sk->sk_userlocks & SOCK_SNDBUF_LOCK)) { - sk->sk_sndbuf = min(sk->sk_sndbuf, sk->sk_wmem_queued / 2); - sk->sk_sndbuf = max(sk->sk_sndbuf, SOCK_MIN_SNDBUF); - } -} - -static inline struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, - int size, int mem, int gfp) -{ - struct sk_buff *skb = alloc_skb(size + sk->sk_prot->max_header, gfp); - - if (skb) { - skb->truesize += mem; - if (sk->sk_forward_alloc >= (int)skb->truesize || - sk_stream_mem_schedule(sk, skb->truesize, 0)) { - skb_reserve(skb, sk->sk_prot->max_header); - return skb; - } - __kfree_skb(skb); - } else { - sk->sk_prot->enter_memory_pressure(); - sk_stream_moderate_sndbuf(sk); - } - return NULL; -} - -static inline struct sk_buff *sk_stream_alloc_skb(struct sock *sk, - int size, int gfp) -{ - 
return sk_stream_alloc_pskb(sk, size, 0, gfp); -} - -static inline struct page *sk_stream_alloc_page(struct sock *sk) -{ - struct page *page = NULL; - - if (sk->sk_forward_alloc >= (int)PAGE_SIZE || - sk_stream_mem_schedule(sk, PAGE_SIZE, 0)) - page = alloc_pages(sk->sk_allocation, 0); - else { - sk->sk_prot->enter_memory_pressure(); - sk_stream_moderate_sndbuf(sk); - } - return page; -} - -#define sk_stream_for_retrans_queue(skb, sk) \ - for (skb = (sk)->sk_write_queue.next; \ - (skb != (sk)->sk_send_head) && \ - (skb != (struct sk_buff *)&(sk)->sk_write_queue); \ - skb = skb->next) /* * Default write policy as shown to user space via poll/select/SIGIO diff -Nru a/include/net/stream_sock.h b/include/net/stream_sock.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/include/net/stream_sock.h 2004-12-15 20:32:26 -02:00 @@ -0,0 +1,252 @@ +/* + * INET An implementation of the TCP/IP protocol suite for the LINUX + * operating system. INET is implemented using the BSD Socket + * interface as the means of communication with the user level. + * + * Authors: Arnaldo Carvalho de Melo + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ +#ifndef _STREAM_SOCK_H +#define _STREAM_SOCK_H + +#include + +/** + * struct stream_sock - stream sock area (used in tcp_sock, etc) + * @ssk_lingertime - %SO_LINGER l_linger setting + * @ssk_ack_backlog - current listen backlog + * @ssk_max_ack_backlog - listen backlog set in listen() + * @ssk_forward_alloc - space allocated forward + * @ssk_wmem_queued - persistent queue size + * @ssk_write_pending - a write to stream socket waits to start + * @ssk_send_head - front of stuff to transmit + * @ssk_route_caps - route capabilities (e.g. 
%NETIF_F_TSO) + * @ssk_queue_shrunk - write queue has been shrunk recently + * @ssk_no_largesend - whether to sent large segments or not + * @ssk_sndmsg_page - cached page for sendmsg + * @ssk_sndmsg_off - cached offset for sendmsg + * @ssk_hashent - hash entry in several tables (e.g. tcp_ehash) + */ +struct stream_sock { + unsigned long ssk_lingertime; + unsigned short ssk_ack_backlog; + unsigned short ssk_max_ack_backlog; + int ssk_forward_alloc; + int ssk_wmem_queued; + int ssk_write_pending; + struct sk_buff *ssk_send_head; + int ssk_route_caps; + __u8 ssk_queue_shrunk; + unsigned char ssk_no_largesend; + /* two bytes hole, try to pack */ + struct page *ssk_sndmsg_page; + __u32 ssk_sndmsg_off; + int ssk_hashent; +}; + +struct pstream_sock { + struct sock sk; + struct stream_sock *pssk; +}; + +static inline struct stream_sock *sk_ssk(const struct sock *sk) +{ + return ((struct pstream_sock *)sk)->pssk; +} + +#define ssk_set_pointer(sk, type, member) { \ + ((struct pstream_sock *)sk)->pssk = \ + (struct stream_sock *)(((unsigned char *)sk) + \ + offsetof(type, member)); \ +} + +static inline void sk_acceptq_removed(struct sock *sk) +{ + sk_ssk(sk)->ssk_ack_backlog--; +} + +static inline void sk_acceptq_added(struct sock *sk) +{ + sk_ssk(sk)->ssk_ack_backlog++; +} + +static inline int sk_acceptq_is_full(struct sock *sk) +{ + return sk_ssk(sk)->ssk_ack_backlog > sk_ssk(sk)->ssk_max_ack_backlog; +} + +static inline void sk_stream_free_skb(struct sock *sk, struct sk_buff *skb) +{ + struct stream_sock *ssk = sk_ssk(sk); + + ssk->ssk_queue_shrunk = 1; + ssk->ssk_wmem_queued -= skb->truesize; + ssk->ssk_forward_alloc += skb->truesize; + __kfree_skb(skb); +} + +extern void sk_stream_rfree(struct sk_buff *skb); + +static inline void sk_stream_set_owner_r(struct sk_buff *skb, + struct sock *sk) +{ + skb->sk = sk; + skb->destructor = sk_stream_rfree; + atomic_add(skb->truesize, &sk->sk_rmem_alloc); + sk_ssk(sk)->ssk_forward_alloc -= skb->truesize; +} + +extern void 
__sk_stream_mem_reclaim(struct sock *sk); +extern int sk_stream_mem_schedule(struct sock *sk, int size, int kind); + +#define SK_STREAM_MEM_QUANTUM ((int)PAGE_SIZE) + +static inline int sk_stream_pages(int amt) +{ + return (amt + SK_STREAM_MEM_QUANTUM - 1) / SK_STREAM_MEM_QUANTUM; +} + +static inline void sk_stream_mem_reclaim(struct sock *sk) +{ + if (sk_ssk(sk)->ssk_forward_alloc >= SK_STREAM_MEM_QUANTUM) + __sk_stream_mem_reclaim(sk); +} + +static inline void sk_stream_writequeue_purge(struct sock *sk) +{ + struct sk_buff *skb; + + while ((skb = __skb_dequeue(&sk->sk_write_queue)) != NULL) + sk_stream_free_skb(sk, skb); + sk_stream_mem_reclaim(sk); +} + +static inline int sk_stream_rmem_schedule(struct sock *sk, + struct sk_buff *skb) +{ + return (int)skb->truesize <= sk_ssk(sk)->ssk_forward_alloc || + sk_stream_mem_schedule(sk, skb->truesize, 1); +} + +static inline void sk_charge_skb(struct sock *sk, struct sk_buff *skb) +{ + struct stream_sock *ssk = sk_ssk(sk); + + ssk->ssk_wmem_queued += skb->truesize; + ssk->ssk_forward_alloc -= skb->truesize; +} + +static inline int skb_copy_to_page(struct sock *sk, char __user *from, + struct sk_buff *skb, struct page *page, + int off, int copy) +{ + struct stream_sock *ssk; + + if (skb->ip_summed == CHECKSUM_NONE) { + int err = 0; + unsigned int csum = csum_and_copy_from_user(from, + page_address(page) + off, + copy, 0, &err); + if (err) + return err; + skb->csum = csum_block_add(skb->csum, csum, skb->len); + } else if (copy_from_user(page_address(page) + off, from, copy)) + return -EFAULT; + + skb->len += copy; + skb->data_len += copy; + skb->truesize += copy; + + ssk = sk_ssk(sk); + ssk->ssk_wmem_queued += copy; + ssk->ssk_forward_alloc -= copy; + return 0; +} + +#define SOCK_MIN_RCVBUF 256 +#define SOCK_MIN_SNDBUF 2048 + +static inline void sk_stream_moderate_sndbuf(struct sock *sk) +{ + if (!(sk->sk_userlocks & SOCK_SNDBUF_LOCK)) { + sk->sk_sndbuf = min(sk->sk_sndbuf, sk_ssk(sk)->ssk_wmem_queued / 2); + 
sk->sk_sndbuf = max(sk->sk_sndbuf, SOCK_MIN_SNDBUF); + } +} + +static inline struct sk_buff *sk_stream_alloc_pskb(struct sock *sk, + int size, int mem, int gfp) +{ + struct sk_buff *skb = alloc_skb(size + sk->sk_prot->max_header, gfp); + + if (skb) { + skb->truesize += mem; + if (sk_ssk(sk)->ssk_forward_alloc >= (int)skb->truesize || + sk_stream_mem_schedule(sk, skb->truesize, 0)) { + skb_reserve(skb, sk->sk_prot->max_header); + return skb; + } + __kfree_skb(skb); + } else { + sk->sk_prot->enter_memory_pressure(); + sk_stream_moderate_sndbuf(sk); + } + return NULL; +} + +static inline struct sk_buff *sk_stream_alloc_skb(struct sock *sk, + int size, int gfp) +{ + return sk_stream_alloc_pskb(sk, size, 0, gfp); +} + +static inline struct page *sk_stream_alloc_page(struct sock *sk) +{ + struct page *page = NULL; + + if (sk_ssk(sk)->ssk_forward_alloc >= (int)PAGE_SIZE || + sk_stream_mem_schedule(sk, PAGE_SIZE, 0)) + page = alloc_pages(sk->sk_allocation, 0); + else { + sk->sk_prot->enter_memory_pressure(); + sk_stream_moderate_sndbuf(sk); + } + return page; +} + +#define sk_stream_for_retrans_queue(skb, sk) \ + for (skb = (sk)->sk_write_queue.next; \ + (skb != sk_ssk(sk)->ssk_send_head) && \ + (skb != (struct sk_buff *)&(sk)->sk_write_queue); \ + skb = skb->next) + +/* + * Compute minimal free write space needed to queue new packets. 
+ */ +static inline int sk_stream_min_wspace(struct sock *sk) +{ + return sk_ssk(sk)->ssk_wmem_queued / 2; +} + +static inline int sk_stream_wspace(struct sock *sk) +{ + return sk->sk_sndbuf - sk_ssk(sk)->ssk_wmem_queued; +} + +extern void sk_stream_write_space(struct sock *sk); + +static inline int sk_stream_memory_free(struct sock *sk) +{ + return sk_ssk(sk)->ssk_wmem_queued < sk->sk_sndbuf; +} + +extern int sk_stream_wait_connect(struct sock *sk, long *timeo_p); +extern int sk_stream_wait_memory(struct sock *sk, long *timeo_p); +extern void sk_stream_wait_close(struct sock *sk, long timeo_p); +extern int sk_stream_error(struct sock *sk, int flags, int err); +extern void sk_stream_kill_queues(struct sock *sk); +#endif /* _STREAM_SOCK_H */ diff -Nru a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h 2004-12-15 20:32:26 -02:00 +++ b/include/net/tcp.h 2004-12-15 20:32:26 -02:00 @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE) @@ -1457,7 +1458,7 @@ extern void tcp_set_skb_tso_segs(struct sk_buff *, unsigned int); -/* This checks if the data bearing packet SKB (usually sk->sk_send_head) +/* This checks if the data bearing packet SKB (usually sk_ssk(sk)->ssk_send_head) * should be put on the wire right now. 
*/ static __inline__ int tcp_snd_test(const struct tcp_opt *tp, @@ -1523,7 +1524,7 @@ unsigned cur_mss, int nonagle) { - struct sk_buff *skb = sk->sk_send_head; + struct sk_buff *skb = sk_ssk(sk)->ssk_send_head; if (skb) { if (!tcp_skb_is_last(sk, skb)) @@ -1543,7 +1544,7 @@ static __inline__ int tcp_may_send_now(struct sock *sk, struct tcp_opt *tp) { - struct sk_buff *skb = sk->sk_send_head; + struct sk_buff *skb = sk_ssk(sk)->ssk_send_head; return (skb && tcp_snd_test(tp, skb, tcp_current_mss(sk, 1), @@ -1951,10 +1952,12 @@ static inline void tcp_v4_setup_caps(struct sock *sk, struct dst_entry *dst) { - sk->sk_route_caps = dst->dev->features; - if (sk->sk_route_caps & NETIF_F_TSO) { - if (sk->sk_no_largesend || dst->header_len) - sk->sk_route_caps &= ~NETIF_F_TSO; + struct stream_sock *ssk = sk_ssk(sk); + + ssk->ssk_route_caps = dst->dev->features; + if (ssk->ssk_route_caps & NETIF_F_TSO) { + if (ssk->ssk_no_largesend || dst->header_len) + ssk->ssk_route_caps &= ~NETIF_F_TSO; } } @@ -1963,13 +1966,14 @@ static inline int tcp_use_frto(const struct sock *sk) { const struct tcp_opt *tp = tcp_sk(sk); + const struct stream_sock *ssk = sk_ssk(sk); /* F-RTO must be activated in sysctl and there must be some * unsent new data, and the advertised window should allow * sending it. 
*/ - return (sysctl_tcp_frto && sk->sk_send_head && - !after(TCP_SKB_CB(sk->sk_send_head)->end_seq, + return (sysctl_tcp_frto && ssk->ssk_send_head && + !after(TCP_SKB_CB(ssk->ssk_send_head)->end_seq, tp->snd_una + tp->snd_wnd)); } diff -Nru a/include/net/tcp_ecn.h b/include/net/tcp_ecn.h --- a/include/net/tcp_ecn.h 2004-12-15 20:32:26 -02:00 +++ b/include/net/tcp_ecn.h 2004-12-15 20:32:26 -02:00 @@ -30,11 +30,13 @@ static __inline__ void TCP_ECN_send_syn(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) { + struct stream_sock *ssk = sk_ssk(sk); + tp->ecn_flags = 0; - if (sysctl_tcp_ecn && !(sk->sk_route_caps & NETIF_F_TSO)) { + if (sysctl_tcp_ecn && !(ssk->ssk_route_caps & NETIF_F_TSO)) { TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_ECE|TCPCB_FLAG_CWR; tp->ecn_flags = TCP_ECN_OK; - sk->sk_no_largesend = 1; + ssk->ssk_no_largesend = 1; } } diff -Nru a/net/atm/common.c b/net/atm/common.c --- a/net/atm/common.c 2004-12-15 20:32:26 -02:00 +++ b/net/atm/common.c 2004-12-15 20:32:26 -02:00 @@ -31,6 +31,7 @@ #include "addr.h" /* address registry */ #include "signaling.h" /* for WAITING and sigd_attach */ +static kmem_cache_t *atm_sk_cachep; #if 0 #define DPRINTK(format,args...) 
printk(KERN_DEBUG format,##args)
@@ -46,7 +47,7 @@
 	struct atm_vcc *vcc = atm_sk(sk);
 	struct hlist_head *head = &vcc_hash[vcc->vci & (VCC_HTABLE_SIZE - 1)];
 
-	sk->sk_hashent = vcc->vci & (VCC_HTABLE_SIZE - 1);
+	sk_ssk(sk)->ssk_hashent = vcc->vci & (VCC_HTABLE_SIZE - 1);
 	sk_add_node(sk, head);
 }
 
@@ -96,8 +97,6 @@
 	if (atomic_read(&vcc->sk->sk_wmem_alloc))
 		printk(KERN_DEBUG "vcc_sock_destruct: wmem leakage (%d bytes) detected.\n",
 		       atomic_read(&sk->sk_wmem_alloc));
-
-	kfree(sk->sk_protinfo);
 }
 
 static void vcc_def_wakeup(struct sock *sk)
@@ -139,7 +138,7 @@
 	sock->sk = NULL;
 	if (sock->type == SOCK_STREAM)
 		return -EINVAL;
-	sk = sk_alloc(family, GFP_KERNEL, 1, NULL);
+	sk = sk_alloc(family, GFP_KERNEL, sizeof(struct atm_sock), atm_sk_cachep);
 	if (!sk)
 		return -ENOMEM;
 	sock_init_data(sock, sk);
@@ -147,13 +146,7 @@
 	sk->sk_state_change = vcc_def_wakeup;
 	sk->sk_write_space = vcc_write_space;
 
-	vcc = sk->sk_protinfo = kmalloc(sizeof(*vcc), GFP_KERNEL);
-	if (!vcc) {
-		sk_free(sk);
-		return -ENOMEM;
-	}
-
-	memset(vcc, 0, sizeof(*vcc));
+	vcc = atm_sk(sk);
 	vcc->sk = sk;
 	vcc->dev = NULL;
 	memset(&vcc->local,0,sizeof(struct sockaddr_atmsvc));
@@ -779,26 +772,38 @@
 
 static int __init atm_init(void)
 {
-	int error;
+	int error = -ENOMEM;
+
+	atm_sk_cachep = kmem_cache_create("atm_sock", sizeof(struct atm_sock), 0,
+					  SLAB_HWCACHE_ALIGN, NULL, NULL);
+
+	if (atm_sk_cachep == NULL) {
+		printk(KERN_ERR "failed to alloc atm sock slab cache\n");
+		goto out;
+	}
 
 	if ((error = atmpvc_init()) < 0) {
 		printk(KERN_ERR "atmpvc_init() failed with %d\n", error);
-		goto failure;
+		goto out_free_slab;
 	}
 	if ((error = atmsvc_init()) < 0) {
 		printk(KERN_ERR "atmsvc_init() failed with %d\n", error);
-		goto failure;
+		goto out_pvc_exit;
 	}
 	if ((error = atm_proc_init()) < 0) {
 		printk(KERN_ERR "atm_proc_init() failed with %d\n",error);
-		goto failure;
+		goto out_svc_exit;
 	}
-	return 0;
-
-failure:
+out:
+	return error;
+out_svc_exit:
 	atmsvc_exit();
+out_pvc_exit:
 	atmpvc_exit();
-	return error;
+out_free_slab:
+	kmem_cache_destroy(atm_sk_cachep);
+	atm_sk_cachep = NULL;
+	goto out;
 }
 
 static void __exit atm_exit(void)
@@ -806,6 +811,11 @@
 	atm_proc_exit();
 	atmsvc_exit();
 	atmpvc_exit();
+
+	if (atm_sk_cachep) {
+		kmem_cache_destroy(atm_sk_cachep);
+		atm_sk_cachep = NULL;
+	}
 }
 
 module_init(atm_init);
diff -Nru a/net/atm/signaling.c b/net/atm/signaling.c
--- a/net/atm/signaling.c	2004-12-15 20:32:26 -02:00
+++ b/net/atm/signaling.c	2004-12-15 20:32:26 -02:00
@@ -133,13 +133,13 @@
 		vcc = *(struct atm_vcc **) &msg->listen_vcc;
 		DPRINTK("as_indicate!!!\n");
 		lock_sock(vcc->sk);
-		if (vcc->sk->sk_ack_backlog ==
-		    vcc->sk->sk_max_ack_backlog) {
+		if (sk_ssk(vcc->sk)->ssk_ack_backlog ==
+		    sk_ssk(vcc->sk)->ssk_max_ack_backlog) {
 			sigd_enq(NULL,as_reject,vcc,NULL,NULL);
 			dev_kfree_skb(skb);
 			goto as_indicate_complete;
 		}
-		vcc->sk->sk_ack_backlog++;
+		sk_ssk(vcc->sk)->ssk_ack_backlog++;
 		skb_queue_tail(&vcc->sk->sk_receive_queue, skb);
 		DPRINTK("waking vcc->sk->sk_sleep 0x%p\n", vcc->sk->sk_sleep);
 		vcc->sk->sk_state_change(vcc->sk);
diff -Nru a/net/atm/svc.c b/net/atm/svc.c
--- a/net/atm/svc.c	2004-12-15 20:32:26 -02:00
+++ b/net/atm/svc.c	2004-12-15 20:32:26 -02:00
@@ -320,7 +320,7 @@
 		goto out;
 	}
 	set_bit(ATM_VF_LISTEN,&vcc->flags);
-	vcc->sk->sk_max_ack_backlog = backlog > 0 ? backlog :
+	sk_ssk(vcc->sk)->ssk_max_ack_backlog = backlog > 0 ?
backlog : ATM_BACKLOG_DEFAULT; error = -sk->sk_err; out: @@ -387,7 +387,7 @@ error = vcc_connect(newsock, msg->pvc.sap_addr.itf, msg->pvc.sap_addr.vpi, msg->pvc.sap_addr.vci); dev_kfree_skb(skb); - old_vcc->sk->sk_ack_backlog--; + sk_ssk(old_vcc->sk)->ssk_ack_backlog--; if (error) { sigd_enq2(NULL,as_reject,old_vcc,NULL,NULL, &old_vcc->qos,error); diff -Nru a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c --- a/net/ax25/af_ax25.c 2004-12-15 20:32:26 -02:00 +++ b/net/ax25/af_ax25.c 2004-12-15 20:32:26 -02:00 @@ -49,7 +49,7 @@ #include #include - +static kmem_cache_t *ax25_sk_cachep; HLIST_HEAD(ax25_list); spinlock_t ax25_list_lock = SPIN_LOCK_UNLOCKED; @@ -466,14 +466,8 @@ /* * Create an empty AX.25 control block. */ -ax25_cb *ax25_create_cb(void) +void ax25_init_cb(ax25_cb *ax25) { - ax25_cb *ax25; - - if ((ax25 = kmalloc(sizeof(*ax25), GFP_ATOMIC)) == NULL) - return NULL; - - memset(ax25, 0x00, sizeof(*ax25)); atomic_set(&ax25->refcount, 1); skb_queue_head_init(&ax25->write_queue); @@ -490,6 +484,14 @@ ax25_fillin_cb(ax25, NULL); ax25->state = AX25_STATE_0; +} + +ax25_cb *ax25_create_cb(void) +{ + ax25_cb *ax25 = kmalloc(sizeof(*ax25), GFP_ATOMIC); + + if (ax25) + ax25_init_cb(ax25); return ax25; } @@ -743,7 +745,7 @@ lock_sock(sk); if (sk->sk_type == SOCK_SEQPACKET && sk->sk_state != TCP_LISTEN) { - sk->sk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; sk->sk_state = TCP_LISTEN; goto out; } @@ -805,14 +807,12 @@ return -ESOCKTNOSUPPORT; } - if ((sk = sk_alloc(PF_AX25, GFP_ATOMIC, 1, NULL)) == NULL) + if ((sk = sk_alloc(PF_AX25, GFP_ATOMIC, sizeof(struct ax25_sock), + ax25_sk_cachep)) == NULL) return -ENOMEM; - ax25 = sk->sk_protinfo = ax25_create_cb(); - if (!ax25) { - sk_free(sk); - return -ENOMEM; - } + ax25 = ax25_sk(sk); + ax25_init_cb(ax25); sock_init_data(sock, sk); sk_set_owner(sk, THIS_MODULE); @@ -831,13 +831,12 @@ struct sock *sk; ax25_cb *ax25, *oax25; - if ((sk = sk_alloc(PF_AX25, GFP_ATOMIC, 1, NULL)) == NULL) + if ((sk = 
sk_alloc(PF_AX25, GFP_ATOMIC, sizeof(struct ax25_sock), + ax25_sk_cachep)) == NULL) return NULL; - if ((ax25 = ax25_create_cb()) == NULL) { - sk_free(sk); - return NULL; - } + ax25 = ax25_sk(sk); + ax25_init_cb(ax25); switch (osk->sk_type) { case SOCK_DGRAM: @@ -893,7 +892,6 @@ memcpy(ax25->digipeat, oax25->digipeat, sizeof(ax25_digi)); } - sk->sk_protinfo = ax25; ax25->sk = sk; return sk; @@ -1335,7 +1333,7 @@ /* Now attach up the new socket */ kfree_skb(skb); - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; newsock->sk = newsk; newsock->state = SS_CONNECTED; @@ -1989,6 +1987,14 @@ static int __init ax25_init(void) { + + ax25_sk_cachep = kmem_cache_create("ax25_sock", sizeof(struct ax25_sock), 0, + SLAB_HWCACHE_ALIGN, NULL, NULL); + + if (ax25_sk_cachep == NULL) { + printk(KERN_ERR "failed to alloc ax25 sock slab cache\n"); + return -ENOMEM; + } sock_register(&ax25_family_ops); dev_add_pack(&ax25_packet_type); register_netdevice_notifier(&ax25_dev_notifier); @@ -2023,5 +2029,10 @@ dev_remove_pack(&ax25_packet_type); sock_unregister(PF_AX25); + + if (ax25_sk_cachep) { + kmem_cache_destroy(ax25_sk_cachep); + ax25_sk_cachep = NULL; + } } module_exit(ax25_exit); diff -Nru a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c --- a/net/ax25/ax25_in.c 2004-12-15 20:32:26 -02:00 +++ b/net/ax25/ax25_in.c 2004-12-15 20:32:26 -02:00 @@ -356,7 +356,7 @@ if (sk != NULL) { bh_lock_sock(sk); - if (sk->sk_ack_backlog == sk->sk_max_ack_backlog || + if (sk_ssk(sk)->ssk_ack_backlog == sk_ssk(sk)->ssk_max_ack_backlog || (make = ax25_make_new(sk, ax25_dev)) == NULL) { if (mine) ax25_return_dm(dev, &src, &dest, &dp); @@ -373,7 +373,7 @@ make->sk_state = TCP_ESTABLISHED; - sk->sk_ack_backlog++; + sk_ssk(sk)->ssk_ack_backlog++; bh_unlock_sock(sk); } else { if (!mine) { @@ -381,13 +381,8 @@ return 0; } - if ((ax25 = ax25_create_cb()) == NULL) { - ax25_return_dm(dev, &src, &dest, &dp); - kfree_skb(skb); - return 0; - } - - ax25_fillin_cb(ax25, ax25_dev); + ax25_init_cb(ax25_sk(sk)); + 
ax25_fillin_cb(ax25_sk(sk), ax25_dev); } ax25->source_addr = dest; diff -Nru a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c --- a/net/bluetooth/af_bluetooth.c 2004-12-15 20:32:26 -02:00 +++ b/net/bluetooth/af_bluetooth.c 2004-12-15 20:32:26 -02:00 @@ -127,6 +127,9 @@ sk->sk_protinfo = pi; } + if (sock->type == SOCK_STREAM) + ssk_set_pointer(sk, struct bt_sock, ssk); + sock_init_data(sock, sk); INIT_LIST_HEAD(&bt_sk(sk)->accept_q); @@ -161,7 +164,7 @@ sock_hold(sk); list_add_tail(&bt_sk(sk)->accept_q, &bt_sk(parent)->accept_q); bt_sk(sk)->parent = parent; - parent->sk_ack_backlog++; + sk_ssk(parent)->ssk_ack_backlog++; } EXPORT_SYMBOL(bt_accept_enqueue); @@ -170,7 +173,7 @@ BT_DBG("sk %p state %d", sk, sk->sk_state); list_del_init(&bt_sk(sk)->accept_q); - bt_sk(sk)->parent->sk_ack_backlog--; + sk_ssk(bt_sk(sk)->parent)->ssk_ack_backlog--; bt_sk(sk)->parent = NULL; sock_put(sk); } diff -Nru a/net/bluetooth/l2cap.c b/net/bluetooth/l2cap.c --- a/net/bluetooth/l2cap.c 2004-12-15 20:32:26 -02:00 +++ b/net/bluetooth/l2cap.c 2004-12-15 20:32:26 -02:00 @@ -604,8 +604,8 @@ goto done; } - sk->sk_max_ack_backlog = backlog; - sk->sk_ack_backlog = 0; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_ack_backlog = 0; sk->sk_state = BT_LISTEN; done: @@ -892,8 +892,9 @@ l2cap_sock_clear_timer(sk); __l2cap_sock_close(sk, 0); - if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime) - err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime); + if (sock_flag(sk, SOCK_LINGER) && sk_ssk(sk)->ssk_lingertime) + err = bt_sock_wait_state(sk, BT_CLOSED, + sk_ssk(sk)->ssk_lingertime); } release_sock(sk); return err; @@ -1406,8 +1407,9 @@ result = L2CAP_CR_NO_MEM; /* Check for backlog size */ - if (parent->sk_ack_backlog > parent->sk_max_ack_backlog) { - BT_DBG("backlog full %d", parent->sk_ack_backlog); + if (sk_ssk(parent)->ssk_ack_backlog > + sk_ssk(parent)->ssk_max_ack_backlog) { + BT_DBG("backlog full %d", sk_ssk(parent)->ssk_ack_backlog); goto response; } diff 
-Nru a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c --- a/net/bluetooth/rfcomm/sock.c 2004-12-15 20:32:26 -02:00 +++ b/net/bluetooth/rfcomm/sock.c 2004-12-15 20:32:26 -02:00 @@ -428,8 +428,8 @@ goto done; } - sk->sk_max_ack_backlog = backlog; - sk->sk_ack_backlog = 0; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_ack_backlog = 0; sk->sk_state = BT_LISTEN; done: @@ -739,8 +739,9 @@ sk->sk_shutdown = SHUTDOWN_MASK; __rfcomm_sock_close(sk); - if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime) - err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime); + if (sock_flag(sk, SOCK_LINGER) && sk_ssk(sk)->ssk_lingertime) + err = bt_sock_wait_state(sk, BT_CLOSED, + sk_ssk(sk)->ssk_lingertime); } release_sock(sk); return err; @@ -783,8 +784,9 @@ return 0; /* Check for backlog size */ - if (parent->sk_ack_backlog > parent->sk_max_ack_backlog) { - BT_DBG("backlog full %d", parent->sk_ack_backlog); + if (sk_ssk(parent)->ssk_ack_backlog > + sk_ssk(parent)->ssk_max_ack_backlog) { + BT_DBG("backlog full %d", sk_ssk(parent)->ssk_ack_backlog); goto done; } diff -Nru a/net/bluetooth/sco.c b/net/bluetooth/sco.c --- a/net/bluetooth/sco.c 2004-12-15 20:32:26 -02:00 +++ b/net/bluetooth/sco.c 2004-12-15 20:32:26 -02:00 @@ -540,8 +540,8 @@ goto done; } - sk->sk_max_ack_backlog = backlog; - sk->sk_ack_backlog = 0; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_ack_backlog = 0; sk->sk_state = BT_LISTEN; done: @@ -733,9 +733,10 @@ sco_sock_close(sk); - if (sock_flag(sk, SOCK_LINGER) && sk->sk_lingertime) { + if (sock_flag(sk, SOCK_LINGER) && sk_ssk(sk)->ssk_lingertime) { lock_sock(sk); - err = bt_sock_wait_state(sk, BT_CLOSED, sk->sk_lingertime); + err = bt_sock_wait_state(sk, BT_CLOSED, + sk_ssk(sk)->ssk_lingertime); release_sock(sk); } diff -Nru a/net/core/sock.c b/net/core/sock.c --- a/net/core/sock.c 2004-12-15 20:32:26 -02:00 +++ b/net/core/sock.c 2004-12-15 20:32:26 -02:00 @@ -307,7 +307,8 @@ break; case SO_LINGER: - if(optlen<sizeof(ling)) { + if (sk->sk_type == 
SOCK_STREAM || + optlen < sizeof(ling)) { ret = -EINVAL; /* 1003.1g */ break; } @@ -320,10 +321,10 @@ else { #if (BITS_PER_LONG == 32) if (ling.l_linger >= MAX_SCHEDULE_TIMEOUT/HZ) - sk->sk_lingertime = MAX_SCHEDULE_TIMEOUT; + sk_ssk(sk)->ssk_lingertime = MAX_SCHEDULE_TIMEOUT; else #endif - sk->sk_lingertime = ling.l_linger * HZ; + sk_ssk(sk)->ssk_lingertime = ling.l_linger * HZ; sock_set_flag(sk, SOCK_LINGER); } break; @@ -513,9 +514,11 @@ break; case SO_LINGER: + if (sk->sk_type != SOCK_STREAM) + return -EINVAL; lv = sizeof(v.ling); v.ling.l_onoff = !!sock_flag(sk, SOCK_LINGER); - v.ling.l_linger = sk->sk_lingertime / HZ; + v.ling.l_linger = sk_ssk(sk)->ssk_lingertime / HZ; break; case SO_BSDCOMPAT: @@ -1156,8 +1159,6 @@ skb_queue_head_init(&sk->sk_write_queue); skb_queue_head_init(&sk->sk_error_queue); - sk->sk_send_head = NULL; - init_timer(&sk->sk_timer); sk->sk_allocation = GFP_KERNEL; @@ -1184,13 +1185,18 @@ sk->sk_error_report = sock_def_error_report; sk->sk_destruct = sock_def_destruct; - sk->sk_sndmsg_page = NULL; - sk->sk_sndmsg_off = 0; + if (sk->sk_type == SOCK_STREAM) { + struct stream_sock *ssk = sk_ssk(sk); + + ssk->ssk_send_head = NULL; + ssk->ssk_sndmsg_page = NULL; + ssk->ssk_sndmsg_off = 0; + ssk->ssk_write_pending = 0; + } sk->sk_peercred.pid = 0; sk->sk_peercred.uid = -1; sk->sk_peercred.gid = -1; - sk->sk_write_pending = 0; sk->sk_rcvlowat = 1; sk->sk_rcvtimeo = MAX_SCHEDULE_TIMEOUT; sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT; diff -Nru a/net/core/stream.c b/net/core/stream.c --- a/net/core/stream.c 2004-12-15 20:32:26 -02:00 +++ b/net/core/stream.c 2004-12-15 20:32:26 -02:00 @@ -64,13 +64,13 @@ return sock_intr_errno(*timeo_p); prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); - sk->sk_write_pending++; + sk_ssk(sk)->ssk_write_pending++; if (sk_wait_event(sk, timeo_p, !((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)))) break; finish_wait(sk->sk_sleep, &wait); - sk->sk_write_pending--; + sk_ssk(sk)->ssk_write_pending--; } 
return 0; } @@ -136,10 +136,10 @@ break; set_bit(SOCK_NOSPACE, &sk->sk_socket->flags); - sk->sk_write_pending++; + sk_ssk(sk)->ssk_write_pending++; sk_wait_event(sk, &current_timeo, sk_stream_memory_free(sk) && vm_wait); - sk->sk_write_pending--; + sk_ssk(sk)->ssk_write_pending--; if (vm_wait) { vm_wait -= current_timeo; @@ -173,7 +173,7 @@ struct sock *sk = skb->sk; atomic_sub(skb->truesize, &sk->sk_rmem_alloc); - sk->sk_forward_alloc += skb->truesize; + sk_ssk(sk)->ssk_forward_alloc += skb->truesize; } EXPORT_SYMBOL(sk_stream_rfree); @@ -191,10 +191,12 @@ void __sk_stream_mem_reclaim(struct sock *sk) { - if (sk->sk_forward_alloc >= SK_STREAM_MEM_QUANTUM) { - atomic_sub(sk->sk_forward_alloc / SK_STREAM_MEM_QUANTUM, + struct stream_sock *ssk = sk_ssk(sk); + + if (ssk->ssk_forward_alloc >= SK_STREAM_MEM_QUANTUM) { + atomic_sub(ssk->ssk_forward_alloc / SK_STREAM_MEM_QUANTUM, sk->sk_prot->memory_allocated); - sk->sk_forward_alloc &= SK_STREAM_MEM_QUANTUM - 1; + ssk->ssk_forward_alloc &= SK_STREAM_MEM_QUANTUM - 1; if (*sk->sk_prot->memory_pressure && (atomic_read(sk->sk_prot->memory_allocated) < sk->sk_prot->sysctl_mem[0])) @@ -207,8 +209,9 @@ int sk_stream_mem_schedule(struct sock *sk, int size, int kind) { int amt = sk_stream_pages(size); + struct stream_sock *ssk = sk_ssk(sk); - sk->sk_forward_alloc += amt * SK_STREAM_MEM_QUANTUM; + ssk->ssk_forward_alloc += amt * SK_STREAM_MEM_QUANTUM; atomic_add(amt, sk->sk_prot->memory_allocated); /* Under limit. 
*/ @@ -231,14 +234,14 @@ if (kind) { if (atomic_read(&sk->sk_rmem_alloc) < sk->sk_prot->sysctl_rmem[0]) return 1; - } else if (sk->sk_wmem_queued < sk->sk_prot->sysctl_wmem[0]) + } else if (ssk->ssk_wmem_queued < sk->sk_prot->sysctl_wmem[0]) return 1; if (!*sk->sk_prot->memory_pressure || sk->sk_prot->sysctl_mem[2] > atomic_read(sk->sk_prot->sockets_allocated) * - sk_stream_pages(sk->sk_wmem_queued + + sk_stream_pages(ssk->ssk_wmem_queued + atomic_read(&sk->sk_rmem_alloc) + - sk->sk_forward_alloc)) + ssk->ssk_forward_alloc)) return 1; suppress_allocation: @@ -249,12 +252,12 @@ /* Fail only if socket is _under_ its sndbuf. * In this case we cannot block, so that we have to fail. */ - if (sk->sk_wmem_queued + size >= sk->sk_sndbuf) + if (ssk->ssk_wmem_queued + size >= sk->sk_sndbuf) return 1; } /* Alas. Undo changes. */ - sk->sk_forward_alloc -= amt * SK_STREAM_MEM_QUANTUM; + ssk->ssk_forward_alloc -= amt * SK_STREAM_MEM_QUANTUM; atomic_sub(amt, sk->sk_prot->memory_allocated); return 0; } @@ -275,8 +278,8 @@ /* Account for returned memory. */ sk_stream_mem_reclaim(sk); - BUG_TRAP(!sk->sk_wmem_queued); - BUG_TRAP(!sk->sk_forward_alloc); + BUG_TRAP(!sk_ssk(sk)->ssk_wmem_queued); + BUG_TRAP(!sk_ssk(sk)->ssk_forward_alloc); /* It is _impossible_ for the backlog to contain anything * when we get here. 
All user references to this socket diff -Nru a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c --- a/net/decnet/af_decnet.c 2004-12-15 20:32:26 -02:00 +++ b/net/decnet/af_decnet.c 2004-12-15 20:32:26 -02:00 @@ -942,7 +942,7 @@ fl.proto = DNPROTO_NSP; if (dn_route_output_sock(&sk->sk_dst_cache, &fl, sk, flags) < 0) goto out; - sk->sk_route_caps = sk->sk_dst_cache->dev->features; + sk_ssk(sk)->ssk_route_caps = sk->sk_dst_cache->dev->features; sock->state = SS_CONNECTING; scp->state = DN_CI; scp->segsize_loc = dst_path_metric(sk->sk_dst_cache, RTAX_ADVMSS); @@ -1080,7 +1080,7 @@ } cb = DN_SKB_CB(skb); - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; newsk = dn_alloc_sock(newsock, sk->sk_allocation); if (newsk == NULL) { release_sock(sk); @@ -1268,8 +1268,8 @@ if ((DN_SK(sk)->state != DN_O) || (sk->sk_state == TCP_LISTEN)) goto out; - sk->sk_max_ack_backlog = backlog; - sk->sk_ack_backlog = 0; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_ack_backlog = 0; sk->sk_state = TCP_LISTEN; err = 0; dn_rehash_sock(sk); diff -Nru a/net/decnet/dn_nsp_in.c b/net/decnet/dn_nsp_in.c --- a/net/decnet/dn_nsp_in.c 2004-12-15 20:32:26 -02:00 +++ b/net/decnet/dn_nsp_in.c 2004-12-15 20:32:26 -02:00 @@ -324,12 +324,14 @@ static void dn_nsp_conn_init(struct sock *sk, struct sk_buff *skb) { - if (sk->sk_ack_backlog >= sk->sk_max_ack_backlog) { + struct stream_sock *ssk = sk_ssk(sk); + + if (ssk->ssk_ack_backlog >= ssk->ssk_max_ack_backlog) { kfree_skb(skb); return; } - sk->sk_ack_backlog++; + ssk->ssk_ack_backlog++; skb_queue_tail(&sk->sk_receive_queue, skb); sk->sk_state_change(sk); } diff -Nru a/net/decnet/dn_nsp_out.c b/net/decnet/dn_nsp_out.c --- a/net/decnet/dn_nsp_out.c 2004-12-15 20:32:26 -02:00 +++ b/net/decnet/dn_nsp_out.c 2004-12-15 20:32:26 -02:00 @@ -99,7 +99,7 @@ fl.proto = DNPROTO_NSP; if (dn_route_output_sock(&sk->sk_dst_cache, &fl, sk, 0) == 0) { dst = sk_dst_get(sk); - sk->sk_route_caps = dst->dev->features; + sk_ssk(sk)->ssk_route_caps = 
dst->dev->features; goto try_again; } diff -Nru a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c --- a/net/ipv4/af_inet.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/af_inet.c 2004-12-15 20:32:26 -02:00 @@ -148,8 +148,8 @@ BUG_TRAP(!atomic_read(&sk->sk_rmem_alloc)); BUG_TRAP(!atomic_read(&sk->sk_wmem_alloc)); - BUG_TRAP(!sk->sk_wmem_queued); - BUG_TRAP(!sk->sk_forward_alloc); + BUG_TRAP(!(sk->sk_type == SOCK_STREAM && sk_ssk(sk)->ssk_wmem_queued)); + BUG_TRAP(!(sk->sk_type == SOCK_STREAM && sk_ssk(sk)->ssk_forward_alloc)); if (inet->opt) kfree(inet->opt); @@ -215,7 +215,7 @@ if (err) goto out; } - sk->sk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; err = 0; out: @@ -308,9 +308,13 @@ inet->id = 0; - sock_init_data(sock, sk); sk_set_owner(sk, sk->sk_prot->owner); + if (sock->type == SOCK_STREAM) + ssk_set_pointer(sk, struct tcp_sock, ssk); + + sock_init_data(sock, sk); + sk->sk_destruct = inet_sock_destruct; sk->sk_family = PF_INET; sk->sk_protocol = protocol; @@ -375,7 +379,7 @@ timeout = 0; if (sock_flag(sk, SOCK_LINGER) && !(current->flags & PF_EXITING)) - timeout = sk->sk_lingertime; + timeout = sk_ssk(sk)->ssk_lingertime; sock->sk = NULL; sk->sk_prot->close(sk, timeout); } diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c --- a/net/ipv4/ip_output.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/ip_output.c 2004-12-15 20:32:26 -02:00 @@ -747,8 +747,10 @@ inet->cork.fragsize = mtu = dst_pmtu(&rt->u.dst); inet->cork.rt = rt; inet->cork.length = 0; - sk->sk_sndmsg_page = NULL; - sk->sk_sndmsg_off = 0; + if (sk->sk_type == SOCK_STREAM) { + sk_ssk(sk)->ssk_sndmsg_page = NULL; + sk_ssk(sk)->ssk_sndmsg_off = 0; + } if ((exthdrlen = rt->u.dst.header_len) != 0) { length += exthdrlen; transhdrlen += exthdrlen; @@ -914,8 +916,9 @@ } else { int i = skb_shinfo(skb)->nr_frags; skb_frag_t *frag = &skb_shinfo(skb)->frags[i-1]; - struct page *page = sk->sk_sndmsg_page; - int off = sk->sk_sndmsg_off; + struct stream_sock *ssk = sk_ssk(sk); + struct page *page = 
ssk->ssk_sndmsg_page; + int off = ssk->ssk_sndmsg_off; unsigned int left; if (page && (left = PAGE_SIZE - off) > 0) { @@ -927,7 +930,7 @@ goto error; } get_page(page); - skb_fill_page_desc(skb, i, page, sk->sk_sndmsg_off, 0); + skb_fill_page_desc(skb, i, page, ssk->ssk_sndmsg_off, 0); frag = &skb_shinfo(skb)->frags[i]; } } else if (i < MAX_SKB_FRAGS) { @@ -938,8 +941,8 @@ err = -ENOMEM; goto error; } - sk->sk_sndmsg_page = page; - sk->sk_sndmsg_off = 0; + ssk->ssk_sndmsg_page = page; + ssk->ssk_sndmsg_off = 0; skb_fill_page_desc(skb, i, page, 0, 0); frag = &skb_shinfo(skb)->frags[i]; @@ -953,7 +956,7 @@ err = -EFAULT; goto error; } - sk->sk_sndmsg_off += copy; + ssk->ssk_sndmsg_off += copy; frag->size += copy; skb->len += copy; skb->data_len += copy; diff -Nru a/net/ipv4/tcp.c b/net/ipv4/tcp.c --- a/net/ipv4/tcp.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp.c 2004-12-15 20:32:26 -02:00 @@ -462,10 +462,10 @@ { struct inet_opt *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); + struct stream_sock *ssk = sk_ssk(sk); struct tcp_listen_opt *lopt; - sk->sk_max_ack_backlog = 0; - sk->sk_ack_backlog = 0; + ssk->ssk_max_ack_backlog = ssk->ssk_ack_backlog = 0; tp->accept_queue = tp->accept_queue_tail = NULL; rwlock_init(&tp->syn_wait_lock); tcp_delack_init(tp); @@ -575,7 +575,7 @@ sk_acceptq_removed(sk); tcp_openreq_fastfree(req); } - BUG_TRAP(!sk->sk_ack_backlog); + BUG_TRAP(!sk_ssk(sk)->ssk_ack_backlog); } static inline void tcp_mark_push(struct tcp_opt *tp, struct sk_buff *skb) @@ -599,8 +599,8 @@ TCP_SKB_CB(skb)->sacked = 0; __skb_queue_tail(&sk->sk_write_queue, skb); sk_charge_skb(sk, skb); - if (!sk->sk_send_head) - sk->sk_send_head = skb; + if (!sk_ssk(sk)->ssk_send_head) + sk_ssk(sk)->ssk_send_head = skb; else if (tp->nonagle&TCP_NAGLE_PUSH) tp->nonagle &= ~TCP_NAGLE_PUSH; } @@ -618,7 +618,7 @@ static inline void tcp_push(struct sock *sk, struct tcp_opt *tp, int flags, int mss_now, int nonagle) { - if (sk->sk_send_head) { + if (sk_ssk(sk)->ssk_send_head) { 
struct sk_buff *skb = sk->sk_write_queue.prev; if (!(flags & MSG_MORE) || forced_push(tp)) tcp_mark_push(tp, skb); @@ -658,7 +658,8 @@ int offset = poffset % PAGE_SIZE; int size = min_t(size_t, psize, PAGE_SIZE - offset); - if (!sk->sk_send_head || (copy = mss_now - skb->len) <= 0) { + if (!sk_ssk(sk)->ssk_send_head || + (copy = mss_now - skb->len) <= 0) { new_segment: if (!sk_stream_memory_free(sk)) goto wait_for_sndbuf; @@ -707,7 +708,7 @@ if (forced_push(tp)) { tcp_mark_push(tp, skb); __tcp_push_pending_frames(sk, tp, mss_now, TCP_NAGLE_PUSH); - } else if (skb == sk->sk_send_head) + } else if (skb == sk_ssk(sk)->ssk_send_head) tcp_push_one(sk, mss_now); continue; @@ -740,11 +741,12 @@ { ssize_t res; struct sock *sk = sock->sk; + struct stream_sock *ssk = sk_ssk(sk); #define TCP_ZC_CSUM_FLAGS (NETIF_F_IP_CSUM | NETIF_F_NO_CSUM | NETIF_F_HW_CSUM) - if (!(sk->sk_route_caps & NETIF_F_SG) || - !(sk->sk_route_caps & TCP_ZC_CSUM_FLAGS)) + if (!(ssk->ssk_route_caps & NETIF_F_SG) || + !(ssk->ssk_route_caps & TCP_ZC_CSUM_FLAGS)) return sock_no_sendpage(sock, page, offset, size, flags); #undef TCP_ZC_CSUM_FLAGS @@ -757,14 +759,14 @@ return res; } -#define TCP_PAGE(sk) (sk->sk_sndmsg_page) -#define TCP_OFF(sk) (sk->sk_sndmsg_off) +#define TCP_PAGE(sk) (sk_ssk(sk)->ssk_sndmsg_page) +#define TCP_OFF(sk) (sk_ssk(sk)->ssk_sndmsg_off) static inline int select_size(struct sock *sk, struct tcp_opt *tp) { int tmp = tp->mss_cache_std; - if (sk->sk_route_caps & NETIF_F_SG) { + if (sk_ssk(sk)->ssk_route_caps & NETIF_F_SG) { int pgbreak = SKB_MAX_HEAD(MAX_TCP_HEADER); if (tmp >= pgbreak && @@ -821,7 +823,7 @@ skb = sk->sk_write_queue.prev; - if (!sk->sk_send_head || + if (!sk_ssk(sk)->ssk_send_head || (copy = mss_now - skb->len) <= 0) { new_segment: @@ -839,7 +841,7 @@ /* * Check whether we can use HW checksum. 
*/ - if (sk->sk_route_caps & + if (sk_ssk(sk)->ssk_route_caps & (NETIF_F_IP_CSUM | NETIF_F_NO_CSUM | NETIF_F_HW_CSUM)) skb->ip_summed = CHECKSUM_HW; @@ -872,7 +874,7 @@ merge = 1; } else if (i == MAX_SKB_FRAGS || (!i && - !(sk->sk_route_caps & NETIF_F_SG))) { + !(sk_ssk(sk)->ssk_route_caps & NETIF_F_SG))) { /* Need to add new fragment and cannot * do this because interface is non-SG, * or because all the page slots are @@ -951,7 +953,7 @@ if (forced_push(tp)) { tcp_mark_push(tp, skb); __tcp_push_pending_frames(sk, tp, mss_now, TCP_NAGLE_PUSH); - } else if (skb == sk->sk_send_head) + } else if (skb == sk_ssk(sk)->ssk_send_head) tcp_push_one(sk, mss_now); continue; @@ -977,8 +979,8 @@ do_fault: if (!skb->len) { - if (sk->sk_send_head == skb) - sk->sk_send_head = NULL; + if (sk_ssk(sk)->ssk_send_head == skb) + sk_ssk(sk)->ssk_send_head = NULL; __skb_unlink(skb, skb->list); sk_stream_free_skb(sk, skb); } @@ -1654,7 +1656,8 @@ NET_INC_STATS_USER(LINUX_MIB_TCPABORTONCLOSE); tcp_set_state(sk, TCP_CLOSE); tcp_send_active_reset(sk, GFP_KERNEL); - } else if (sock_flag(sk, SOCK_LINGER) && !sk->sk_lingertime) { + } else if (sock_flag(sk, SOCK_LINGER) && + !sk_ssk(sk)->ssk_lingertime) { /* Check zero linger _after_ checking for unread data. 
*/ sk->sk_prot->disconnect(sk, 0); NET_INC_STATS_USER(LINUX_MIB_TCPABORTONDATA); @@ -1739,7 +1742,7 @@ if (sk->sk_state != TCP_CLOSE) { sk_stream_mem_reclaim(sk); if (atomic_read(&tcp_orphan_count) > sysctl_tcp_max_orphans || - (sk->sk_wmem_queued > SOCK_MIN_SNDBUF && + (sk_ssk(sk)->ssk_wmem_queued > SOCK_MIN_SNDBUF && atomic_read(&tcp_memory_allocated) > sysctl_tcp_mem[2])) { if (net_ratelimit()) printk(KERN_INFO "TCP: too many of orphaned " @@ -1818,7 +1821,7 @@ tcp_set_ca_state(tp, TCP_CA_Open); tcp_clear_retrans(tp); tcp_delack_init(tp); - sk->sk_send_head = NULL; + sk_ssk(sk)->ssk_send_head = NULL; tp->saw_tstamp = 0; tcp_sack_reset(tp); __sk_dst_reset(sk); diff -Nru a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c --- a/net/ipv4/tcp_diag.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp_diag.c 2004-12-15 20:32:26 -02:00 @@ -158,8 +158,8 @@ if (minfo) { minfo->tcpdiag_rmem = atomic_read(&sk->sk_rmem_alloc); - minfo->tcpdiag_wmem = sk->sk_wmem_queued; - minfo->tcpdiag_fmem = sk->sk_forward_alloc; + minfo->tcpdiag_wmem = sk_ssk(sk)->ssk_wmem_queued; + minfo->tcpdiag_fmem = sk_ssk(sk)->ssk_forward_alloc; minfo->tcpdiag_tmem = atomic_read(&sk->sk_wmem_alloc); } diff -Nru a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp_input.c 2004-12-15 20:32:26 -02:00 @@ -962,6 +962,7 @@ tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_una) { struct tcp_opt *tp = tcp_sk(sk); + struct stream_sock *ssk = sk_ssk(sk); unsigned char *ptr = ack_skb->h.raw + TCP_SKB_CB(ack_skb)->sacked; struct tcp_sack_block *sp = (struct tcp_sack_block *)(ptr+2); int num_sacks = (ptr[1] - TCPOLEN_SACK_BASE)>>3; @@ -973,9 +974,9 @@ /* So, SACKs for already sent large segments will be lost. * Not good, but alternative is to resegment the queue. 
*/ - if (sk->sk_route_caps & NETIF_F_TSO) { - sk->sk_route_caps &= ~NETIF_F_TSO; - sk->sk_no_largesend = 1; + if (ssk->ssk_route_caps & NETIF_F_TSO) { + ssk->ssk_route_caps &= ~NETIF_F_TSO; + ssk->ssk_no_largesend = 1; tp->mss_cache = tp->mss_cache_std; } @@ -2435,7 +2436,7 @@ __s32 seq_rtt = -1; while ((skb = skb_peek(&sk->sk_write_queue)) && - skb != sk->sk_send_head) { + skb != sk_ssk(sk)->ssk_send_head) { struct tcp_skb_cb *scb = TCP_SKB_CB(skb); __u8 sacked = scb->sacked; @@ -2529,7 +2530,7 @@ /* Was it a usable window open? */ - if (!after(TCP_SKB_CB(sk->sk_send_head)->end_seq, + if (!after(TCP_SKB_CB(sk_ssk(sk)->ssk_send_head)->end_seq, tp->snd_una + tp->snd_wnd)) { tp->backoff = 0; tcp_clear_xmit_timer(sk, TCP_TIME_PROBE0); @@ -2967,7 +2968,7 @@ * being used to time the probes, and is probably far higher than * it needs to be for normal retransmission. */ - if (sk->sk_send_head) + if (sk_ssk(sk)->ssk_send_head) tcp_ack_probe(sk); return 1; @@ -3943,7 +3944,7 @@ /* When incoming ACK allowed to free some skb from write_queue, - * we remember this event in flag sk->sk_queue_shrunk and wake up socket + * we remember this event in flag sk_ssk(sk)->ssk_queue_shrunk and wake up socket * on the exit from tcp input handler. * * PROBLEM: sndbuf expansion does not work well with largesend. 
@@ -3971,8 +3972,8 @@ static inline void tcp_check_space(struct sock *sk) { - if (sk->sk_queue_shrunk) { - sk->sk_queue_shrunk = 0; + if (sk_ssk(sk)->ssk_queue_shrunk) { + sk_ssk(sk)->ssk_queue_shrunk = 0; if (sk->sk_socket && test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) tcp_new_space(sk); @@ -3991,7 +3992,7 @@ static __inline__ void tcp_data_snd_check(struct sock *sk) { - struct sk_buff *skb = sk->sk_send_head; + struct sk_buff *skb = sk_ssk(sk)->ssk_send_head; if (skb != NULL) __tcp_data_snd_check(sk, skb); @@ -4337,7 +4338,7 @@ tcp_rcv_rtt_measure_ts(tp, skb); - if ((int)skb->truesize > sk->sk_forward_alloc) + if ((int)skb->truesize > sk_ssk(sk)->ssk_forward_alloc) goto step5; NET_INC_STATS_BH(LINUX_MIB_TCPHPHITS); @@ -4515,7 +4516,7 @@ TCP_ECN_rcv_synack(tp, th); if (tp->ecn_flags&TCP_ECN_OK) - sk->sk_no_largesend = 1; + sk_ssk(sk)->ssk_no_largesend = 1; tp->snd_wl1 = TCP_SKB_CB(skb)->seq; tcp_ack(sk, skb, FLAG_SLOWPATH); @@ -4585,7 +4586,8 @@ sk_wake_async(sk, 0, POLL_OUT); } - if (sk->sk_write_pending || tp->defer_accept || tp->ack.pingpong) { + if (sk_ssk(sk)->ssk_write_pending || + tp->defer_accept || tp->ack.pingpong) { /* Save one ACK. Data will be ready after * several ticks, if write_pending is set. 
* @@ -4653,7 +4655,7 @@ TCP_ECN_rcv_syn(tp, th); if (tp->ecn_flags&TCP_ECN_OK) - sk->sk_no_largesend = 1; + sk_ssk(sk)->ssk_no_largesend = 1; tcp_sync_mss(sk, tp->pmtu_cookie); tcp_initialize_rcv_mss(sk); diff -Nru a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c --- a/net/ipv4/tcp_ipv4.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp_ipv4.c 2004-12-15 20:32:26 -02:00 @@ -359,8 +359,8 @@ lock = &tcp_lhash_lock; tcp_listen_wlock(); } else { - list = &tcp_ehash[(sk->sk_hashent = tcp_sk_hashfn(sk))].chain; - lock = &tcp_ehash[sk->sk_hashent].lock; + list = &tcp_ehash[(sk_ssk(sk)->ssk_hashent = tcp_sk_hashfn(sk))].chain; + lock = &tcp_ehash[sk_ssk(sk)->ssk_hashent].lock; write_lock(lock); } __sk_add_node(sk, list); @@ -391,7 +391,7 @@ tcp_listen_wlock(); lock = &tcp_lhash_lock; } else { - struct tcp_ehash_bucket *head = &tcp_ehash[sk->sk_hashent]; + struct tcp_ehash_bucket *head = &tcp_ehash[sk_ssk(sk)->ssk_hashent]; lock = &head->lock; write_lock_bh(&head->lock); } @@ -612,7 +612,7 @@ * in hash table socket with a funny identity. */ inet->num = lport; inet->sport = htons(lport); - sk->sk_hashent = hash; + sk_ssk(sk)->ssk_hashent = hash; BUG_TRAP(sk_unhashed(sk)); __sk_add_node(sk, &head->chain); sock_prot_inc_use(sk->sk_prot); @@ -864,7 +864,7 @@ /* This unhashes the socket and releases the local port, if necessary. */ tcp_set_state(sk, TCP_CLOSE); ip_rt_put(rt); - sk->sk_route_caps = 0; + sk_ssk(sk)->ssk_route_caps = 0; inet->dport = 0; return err; } @@ -1954,7 +1954,7 @@ } /* Routing failed... */ - sk->sk_route_caps = 0; + sk_ssk(sk)->ssk_route_caps = 0; if (!sysctl_ip_dynaddr || sk->sk_state != TCP_SYN_SENT || @@ -2095,6 +2095,7 @@ int tcp_v4_destroy_sock(struct sock *sk) { struct tcp_opt *tp = tcp_sk(sk); + struct stream_sock *ssk = sk_ssk(sk); tcp_clear_xmit_timers(sk); @@ -2114,9 +2115,9 @@ /* * If sendmsg cached page exists, toss it. 
*/ - if (sk->sk_sndmsg_page) { - __free_page(sk->sk_sndmsg_page); - sk->sk_sndmsg_page = NULL; + if (ssk->ssk_sndmsg_page) { + __free_page(ssk->ssk_sndmsg_page); + ssk->ssk_sndmsg_page = NULL; } atomic_dec(&tcp_sockets_allocated); diff -Nru a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c --- a/net/ipv4/tcp_minisocks.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp_minisocks.c 2004-12-15 20:32:26 -02:00 @@ -294,7 +294,7 @@ */ static void __tcp_tw_hashdance(struct sock *sk, struct tcp_tw_bucket *tw) { - struct tcp_ehash_bucket *ehead = &tcp_ehash[sk->sk_hashent]; + struct tcp_ehash_bucket *ehead = &tcp_ehash[sk_ssk(sk)->ssk_hashent]; struct tcp_bind_hashbucket *bhead; /* Step 1: Put TW into bind hash. Original socket stays there too. @@ -354,7 +354,7 @@ tw->tw_rcv_wscale = tp->rcv_wscale; atomic_set(&tw->tw_refcnt, 1); - tw->tw_hashent = sk->sk_hashent; + tw->tw_hashent = sk_ssk(sk)->ssk_hashent; tw->tw_rcv_nxt = tp->rcv_nxt; tw->tw_snd_nxt = tp->snd_nxt; tw->tw_rcv_wnd = tcp_receive_window(tp); @@ -695,6 +695,7 @@ memcpy(newsk, sk, sizeof(struct tcp_sock)); newsk->sk_state = TCP_SYN_RECV; + ssk_set_pointer(newsk, struct tcp_sock, ssk); /* SANITY */ sk_node_init(&newsk->sk_node); @@ -712,13 +713,13 @@ atomic_set(&newsk->sk_wmem_alloc, 0); skb_queue_head_init(&newsk->sk_write_queue); atomic_set(&newsk->sk_omem_alloc, 0); - newsk->sk_wmem_queued = 0; - newsk->sk_forward_alloc = 0; + sk_ssk(newsk)->ssk_wmem_queued = 0; + sk_ssk(newsk)->ssk_forward_alloc = 0; sock_reset_flag(newsk, SOCK_DONE); newsk->sk_userlocks = sk->sk_userlocks & ~SOCK_BINDPORT_LOCK; newsk->sk_backlog.head = newsk->sk_backlog.tail = NULL; - newsk->sk_send_head = NULL; + sk_ssk(newsk)->ssk_send_head = NULL; rwlock_init(&newsk->sk_callback_lock); skb_queue_head_init(&newsk->sk_error_queue); newsk->sk_write_space = sk_stream_write_space; @@ -839,7 +840,7 @@ newtp->mss_clamp = req->mss; TCP_ECN_openreq_child(newtp, req); if (newtp->ecn_flags&TCP_ECN_OK) - newsk->sk_no_largesend = 1; + 
sk_ssk(newsk)->ssk_no_largesend = 1; tcp_ca_init(newtp); diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c --- a/net/ipv4/tcp_output.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp_output.c 2004-12-15 20:32:26 -02:00 @@ -54,9 +54,11 @@ static __inline__ void update_send_head(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) { - sk->sk_send_head = skb->next; - if (sk->sk_send_head == (struct sk_buff *)&sk->sk_write_queue) - sk->sk_send_head = NULL; + struct stream_sock *ssk = sk_ssk(sk); + + ssk->ssk_send_head = skb->next; + if (ssk->ssk_send_head == (struct sk_buff *)&sk->sk_write_queue) + ssk->ssk_send_head = NULL; tp->snd_nxt = TCP_SKB_CB(skb)->end_seq; tcp_packets_out_inc(sk, tp, skb); } @@ -404,8 +406,8 @@ sk_charge_skb(sk, skb); /* Queue it, remembering where we must start sending. */ - if (sk->sk_send_head == NULL) - sk->sk_send_head = skb; + if (sk_ssk(sk)->ssk_send_head == NULL) + sk_ssk(sk)->ssk_send_head = skb; } /* Send _single_ skb sitting at the send head. This function requires @@ -414,13 +416,13 @@ void tcp_push_one(struct sock *sk, unsigned cur_mss) { struct tcp_opt *tp = tcp_sk(sk); - struct sk_buff *skb = sk->sk_send_head; + struct sk_buff *skb = sk_ssk(sk)->ssk_send_head; if (tcp_snd_test(tp, skb, cur_mss, TCP_NAGLE_PUSH)) { /* Send it out now. 
*/ TCP_SKB_CB(skb)->when = tcp_time_stamp; if (!tcp_transmit_skb(sk, skb_clone(skb, sk->sk_allocation))) { - sk->sk_send_head = NULL; + sk_ssk(sk)->ssk_send_head = NULL; tp->snd_nxt = TCP_SKB_CB(skb)->end_seq; tcp_packets_out_inc(sk, tp, skb); return; @@ -566,6 +568,8 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len) { + struct stream_sock *ssk; + if (skb_cloned(skb) && pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) return -ENOMEM; @@ -581,9 +585,11 @@ skb->ip_summed = CHECKSUM_HW; skb->truesize -= len; - sk->sk_queue_shrunk = 1; - sk->sk_wmem_queued -= len; - sk->sk_forward_alloc += len; + + ssk = sk_ssk(sk); + ssk->ssk_queue_shrunk = 1; + ssk->ssk_wmem_queued -= len; + ssk->ssk_forward_alloc += len; /* Any change of skb->len requires recalculation of tso * factor and mss. @@ -679,7 +685,7 @@ } do_large = (large && - (sk->sk_route_caps & NETIF_F_TSO) && + (sk_ssk(sk)->ssk_route_caps & NETIF_F_TSO) && !tp->urg_mode); if (do_large) { @@ -745,7 +751,7 @@ */ mss_now = tcp_current_mss(sk, 1); - while ((skb = sk->sk_send_head) && + while ((skb = sk_ssk(sk)->ssk_send_head) && tcp_snd_test(tp, skb, mss_now, tcp_skb_is_last(sk, skb) ? nonagle : TCP_NAGLE_PUSH)) { @@ -772,7 +778,8 @@ return 0; } - return !tcp_get_pcount(&tp->packets_out) && sk->sk_send_head; + return !tcp_get_pcount(&tp->packets_out) && + sk_ssk(sk)->ssk_send_head; } return 0; } @@ -1017,6 +1024,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb) { struct tcp_opt *tp = tcp_sk(sk); + struct stream_sock *ssk = sk_ssk(sk); unsigned int cur_mss = tcp_current_mss(sk, 0); int err; @@ -1024,16 +1032,16 @@ * copying overhead: frgagmentation, tunneling, mangling etc. 
*/ if (atomic_read(&sk->sk_wmem_alloc) > - min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf)) + min(ssk->ssk_wmem_queued + (ssk->ssk_wmem_queued >> 2), sk->sk_sndbuf)) return -EAGAIN; if (before(TCP_SKB_CB(skb)->seq, tp->snd_una)) { if (before(TCP_SKB_CB(skb)->end_seq, tp->snd_una)) BUG(); - if (sk->sk_route_caps & NETIF_F_TSO) { - sk->sk_route_caps &= ~NETIF_F_TSO; - sk->sk_no_largesend = 1; + if (ssk->ssk_route_caps & NETIF_F_TSO) { + ssk->ssk_route_caps &= ~NETIF_F_TSO; + ssk->ssk_no_largesend = 1; tp->mss_cache = tp->mss_cache_std; } @@ -1067,7 +1075,7 @@ /* Collapse two adjacent packets if worthwhile and we can. */ if(!(TCP_SKB_CB(skb)->flags & TCPCB_FLAG_SYN) && (skb->len < (cur_mss >> 1)) && - (skb->next != sk->sk_send_head) && + (skb->next != ssk->ssk_send_head) && (skb->next != (struct sk_buff *)&sk->sk_write_queue) && (skb_shinfo(skb)->nr_frags == 0 && skb_shinfo(skb->next)->nr_frags == 0) && (sysctl_tcp_retrans_collapse != 0)) @@ -1245,7 +1253,7 @@ */ mss_now = tcp_current_mss(sk, 1); - if (sk->sk_send_head != NULL) { + if (sk_ssk(sk)->ssk_send_head != NULL) { TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_FIN; TCP_SKB_CB(skb)->end_seq++; tp->write_seq++; @@ -1635,9 +1643,10 @@ { if (sk->sk_state != TCP_CLOSE) { struct tcp_opt *tp = tcp_sk(sk); + struct stream_sock *ssk = sk_ssk(sk); struct sk_buff *skb; - if ((skb = sk->sk_send_head) != NULL && + if ((skb = ssk->ssk_send_head) != NULL && before(TCP_SKB_CB(skb)->seq, tp->snd_una+tp->snd_wnd)) { int err; unsigned int mss = tcp_current_mss(sk, 0); @@ -1658,9 +1667,9 @@ return -1; /* SWS override triggered forced fragmentation. * Disable TSO, the connection is too sick. 
*/ - if (sk->sk_route_caps & NETIF_F_TSO) { - sk->sk_no_largesend = 1; - sk->sk_route_caps &= ~NETIF_F_TSO; + if (ssk->ssk_route_caps & NETIF_F_TSO) { + ssk->ssk_no_largesend = 1; + ssk->ssk_route_caps &= ~NETIF_F_TSO; tp->mss_cache = tp->mss_cache_std; } } else if (!tcp_skb_pcount(skb)) @@ -1693,7 +1702,7 @@ err = tcp_write_wakeup(sk); - if (tcp_get_pcount(&tp->packets_out) || !sk->sk_send_head) { + if (tcp_get_pcount(&tp->packets_out) || !sk_ssk(sk)->ssk_send_head) { /* Cancel probe timer, if it is not required. */ tp->probes_out = 0; tp->backoff = 0; diff -Nru a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c --- a/net/ipv4/tcp_timer.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv4/tcp_timer.c 2004-12-15 20:32:26 -02:00 @@ -114,7 +114,7 @@ orphans <<= 1; if (orphans >= sysctl_tcp_max_orphans || - (sk->sk_wmem_queued > SOCK_MIN_SNDBUF && + (sk_ssk(sk)->ssk_wmem_queued > SOCK_MIN_SNDBUF && atomic_read(&tcp_memory_allocated) > sysctl_tcp_mem[2])) { if (net_ratelimit()) printk(KERN_INFO "Out of socket memory\n"); @@ -271,7 +271,7 @@ struct tcp_opt *tp = tcp_sk(sk); int max_probes; - if (tcp_get_pcount(&tp->packets_out) || !sk->sk_send_head) { + if (tcp_get_pcount(&tp->packets_out) || !sk_ssk(sk)->ssk_send_head) { tp->probes_out = 0; return; } @@ -608,7 +608,7 @@ elapsed = keepalive_time_when(tp); /* It is alive without keepalive 8) */ - if (tcp_get_pcount(&tp->packets_out) || sk->sk_send_head) + if (tcp_get_pcount(&tp->packets_out) || sk_ssk(sk)->ssk_send_head) goto resched; elapsed = tcp_time_stamp - tp->rcv_tstamp; diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c --- a/net/ipv6/af_inet6.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv6/af_inet6.c 2004-12-15 20:32:26 -02:00 @@ -173,6 +173,9 @@ if (sk == NULL) goto out; + if (sock->type == SOCK_STREAM) + ssk_set_pointer(sk, struct tcp6_sock, ssk); + sock_init_data(sock, sk); sk->sk_prot = answer_prot; sk_set_owner(sk, sk->sk_prot->owner); diff -Nru a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c --- a/net/ipv6/ip6_output.c 
2004-12-15 20:32:26 -02:00 +++ b/net/ipv6/ip6_output.c 2004-12-15 20:32:26 -02:00 @@ -847,8 +847,10 @@ np->cork.hop_limit = hlimit; inet->cork.fragsize = mtu = dst_pmtu(&rt->u.dst); inet->cork.length = 0; - sk->sk_sndmsg_page = NULL; - sk->sk_sndmsg_off = 0; + if (sk->sk_type == SOCK_STREAM) { + sk_ssk(sk)->ssk_sndmsg_page = NULL; + sk_ssk(sk)->ssk_sndmsg_off = 0; + } exthdrlen = rt->u.dst.header_len + (opt ? opt->opt_flen : 0); length += exthdrlen; transhdrlen += exthdrlen; @@ -1028,8 +1030,9 @@ } else { int i = skb_shinfo(skb)->nr_frags; skb_frag_t *frag = &skb_shinfo(skb)->frags[i-1]; - struct page *page = sk->sk_sndmsg_page; - int off = sk->sk_sndmsg_off; + struct stream_sock *ssk = sk_ssk(sk); + struct page *page = ssk->ssk_sndmsg_page; + int off = ssk->ssk_sndmsg_off; unsigned int left; if (page && (left = PAGE_SIZE - off) > 0) { @@ -1041,7 +1044,8 @@ goto error; } get_page(page); - skb_fill_page_desc(skb, i, page, sk->sk_sndmsg_off, 0); + skb_fill_page_desc(skb, i, page, + ssk->ssk_sndmsg_off, 0); frag = &skb_shinfo(skb)->frags[i]; } } else if(i < MAX_SKB_FRAGS) { @@ -1052,8 +1056,8 @@ err = -ENOMEM; goto error; } - sk->sk_sndmsg_page = page; - sk->sk_sndmsg_off = 0; + ssk->ssk_sndmsg_page = page; + ssk->ssk_sndmsg_off = 0; skb_fill_page_desc(skb, i, page, 0, 0); frag = &skb_shinfo(skb)->frags[i]; @@ -1067,7 +1071,7 @@ err = -EFAULT; goto error; } - sk->sk_sndmsg_off += copy; + ssk->ssk_sndmsg_off += copy; frag->size += copy; skb->len += copy; skb->data_len += copy; diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c --- a/net/ipv6/tcp_ipv6.c 2004-12-15 20:32:26 -02:00 +++ b/net/ipv6/tcp_ipv6.c 2004-12-15 20:32:26 -02:00 @@ -220,9 +220,11 @@ lock = &tcp_lhash_lock; tcp_listen_wlock(); } else { - sk->sk_hashent = tcp_v6_sk_hashfn(sk); - list = &tcp_ehash[sk->sk_hashent].chain; - lock = &tcp_ehash[sk->sk_hashent].lock; + struct stream_sock *ssk = sk_ssk(sk); + + ssk->ssk_hashent = tcp_v6_sk_hashfn(sk); + list = &tcp_ehash[ssk->ssk_hashent].chain; + lock = 
&tcp_ehash[ssk->ssk_hashent].lock; write_lock(lock); } @@ -492,7 +494,7 @@ unique: BUG_TRAP(sk_unhashed(sk)); __sk_add_node(sk, &head->chain); - sk->sk_hashent = hash; + sk_ssk(sk)->ssk_hashent = hash; sock_prot_inc_use(sk->sk_prot); write_unlock_bh(&head->lock); @@ -695,7 +697,7 @@ inet->rcv_saddr = LOOPBACK4_IPV6; ip6_dst_store(sk, dst, NULL); - sk->sk_route_caps = dst->dev->features & + sk_ssk(sk)->ssk_route_caps = dst->dev->features & ~(NETIF_F_IP_CSUM | NETIF_F_TSO); tp->ext_header_len = 0; @@ -729,7 +731,7 @@ __sk_dst_reset(sk); failure: inet->dport = 0; - sk->sk_route_caps = 0; + sk_ssk(sk)->ssk_route_caps = 0; return err; } @@ -1386,7 +1388,7 @@ #endif ip6_dst_store(newsk, dst, NULL); - newsk->sk_route_caps = dst->dev->features & + sk_ssk(newsk)->ssk_route_caps = dst->dev->features & ~(NETIF_F_IP_CSUM | NETIF_F_TSO); newtcp6sk = (struct tcp6_sock *)newsk; @@ -1776,7 +1778,7 @@ err = ip6_dst_lookup(sk, &dst, &fl); if (err) { - sk->sk_route_caps = 0; + sk_ssk(sk)->ssk_route_caps = 0; return err; } if (final_p) @@ -1789,7 +1791,7 @@ } ip6_dst_store(sk, dst, NULL); - sk->sk_route_caps = dst->dev->features & + sk_ssk(sk)->ssk_route_caps = dst->dev->features & ~(NETIF_F_IP_CSUM | NETIF_F_TSO); tcp_sk(sk)->ext2_header_len = dst->header_len; } @@ -1837,13 +1839,13 @@ ipv6_addr_copy(&fl.fl6_dst, final_p); if ((err = xfrm_lookup(&dst, &fl, sk, 0)) < 0) { - sk->sk_route_caps = 0; + sk_ssk(sk)->ssk_route_caps = 0; dst_release(dst); return err; } ip6_dst_store(sk, dst, NULL); - sk->sk_route_caps = dst->dev->features & + sk_ssk(sk)->ssk_route_caps = dst->dev->features & ~(NETIF_F_IP_CSUM | NETIF_F_TSO); tcp_sk(sk)->ext2_header_len = dst->header_len; } diff -Nru a/net/irda/af_irda.c b/net/irda/af_irda.c --- a/net/irda/af_irda.c 2004-12-15 20:32:26 -02:00 +++ b/net/irda/af_irda.c 2004-12-15 20:32:26 -02:00 @@ -760,8 +760,8 @@ return -EOPNOTSUPP; if (sk->sk_state != TCP_LISTEN) { - sk->sk_max_ack_backlog = backlog; - sk->sk_state = TCP_LISTEN; + 
sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk->sk_state = TCP_LISTEN; return 0; } @@ -937,7 +937,7 @@ skb->sk = NULL; skb->destructor = NULL; kfree_skb(skb); - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; newsock->state = SS_CONNECTED; diff -Nru a/net/llc/af_llc.c b/net/llc/af_llc.c --- a/net/llc/af_llc.c 2004-12-15 20:32:26 -02:00 +++ b/net/llc/af_llc.c 2004-12-15 20:32:26 -02:00 @@ -464,9 +464,7 @@ rc = 0; if (!(unsigned)backlog) /* BSDism */ backlog = 1; - sk->sk_max_ack_backlog = backlog; if (sk->sk_state != TCP_LISTEN) { - sk->sk_ack_backlog = 0; sk->sk_state = TCP_LISTEN; } sk->sk_socket->flags |= __SO_ACCEPTCON; @@ -648,7 +646,6 @@ /* put original socket back into a clean listen state. */ sk->sk_state = TCP_LISTEN; - sk->sk_ack_backlog--; skb->sk = NULL; dprintk("%s: ok success on %02X, client on %02X\n", __FUNCTION__, llc_sk(sk)->addr.sllc_sap, newllc->daddr.lsap); diff -Nru a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c --- a/net/netrom/af_netrom.c 2004-12-15 20:32:26 -02:00 +++ b/net/netrom/af_netrom.c 2004-12-15 20:32:26 -02:00 @@ -406,8 +406,8 @@ lock_sock(sk); if (sk->sk_state != TCP_LISTEN) { memset(&nr_sk(sk)->user_addr, 0, AX25_ADDR_LEN); - sk->sk_max_ack_backlog = backlog; - sk->sk_state = TCP_LISTEN; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk->sk_state = TCP_LISTEN; release_sock(sk); return 0; } @@ -806,7 +806,7 @@ /* Now attach up the new socket */ kfree_skb(skb); - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; newsock->sk = newsk; out: @@ -939,7 +939,8 @@ user = (ax25_address *)(skb->data + 21); - if (!sk || sk->sk_ack_backlog == sk->sk_max_ack_backlog || + if (!sk || sk_ssk(sk)->ssk_ack_backlog == + sk_ssk(sk)->ssk_max_ack_backlog || (make = nr_make_new(sk)) == NULL) { nr_transmit_refusal(skb, 0); if (sk) @@ -992,7 +993,7 @@ nr_make->vr = 0; nr_make->vl = 0; nr_make->state = NR_STATE_3; - sk->sk_ack_backlog++; + sk_ssk(sk)->ssk_ack_backlog++; nr_insert_socket(make); diff -Nru a/net/rose/af_rose.c 
b/net/rose/af_rose.c --- a/net/rose/af_rose.c 2004-12-15 20:32:26 -02:00 +++ b/net/rose/af_rose.c 2004-12-15 20:32:26 -02:00 @@ -497,8 +497,8 @@ memset(&rose->dest_addr, 0, ROSE_ADDR_LEN); memset(&rose->dest_call, 0, AX25_ADDR_LEN); memset(rose->dest_digis, 0, AX25_ADDR_LEN * ROSE_MAX_DIGIS); - sk->sk_max_ack_backlog = backlog; - sk->sk_state = TCP_LISTEN; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk->sk_state = TCP_LISTEN; return 0; } @@ -888,7 +888,7 @@ /* Now attach up the new socket */ skb->sk = NULL; kfree_skb(skb); - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; newsock->sk = newsk; out: @@ -954,7 +954,8 @@ /* * We can't accept the Call Request. */ - if (!sk || sk->sk_ack_backlog == sk->sk_max_ack_backlog || + if (!sk || sk_ssk(sk)->ssk_ack_backlog == + sk_ssk(sk)->ssk_max_ack_backlog || (make = rose_make_new(sk)) == NULL) { rose_transmit_clear_request(neigh, lci, ROSE_NETWORK_CONGESTION, 120); return 0; @@ -994,7 +995,7 @@ make_rose->va = 0; make_rose->vr = 0; make_rose->vl = 0; - sk->sk_ack_backlog++; + sk_ssk(sk)->ssk_ack_backlog++; rose_insert_socket(make); diff -Nru a/net/sctp/associola.c b/net/sctp/associola.c --- a/net/sctp/associola.c 2004-12-15 20:32:26 -02:00 +++ b/net/sctp/associola.c 2004-12-15 20:32:26 -02:00 @@ -310,7 +310,7 @@ /* Decrement the backlog value for a TCP-style listening socket. */ if (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING)) - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; /* Mark as dead, so other users can know this structure is * going away. @@ -915,7 +915,7 @@ /* Decrement the backlog value for a TCP-style socket. */ if (sctp_style(oldsk, TCP)) - oldsk->sk_ack_backlog--; + sk_ssk(oldsk)->ssk_ack_backlog--; /* Release references to the old endpoint and the sock. 
*/ sctp_endpoint_put(assoc->ep); diff -Nru a/net/sctp/endpointola.c b/net/sctp/endpointola.c --- a/net/sctp/endpointola.c 2004-12-15 20:32:26 -02:00 +++ b/net/sctp/endpointola.c 2004-12-15 20:32:26 -02:00 @@ -171,7 +171,7 @@ /* Increment the backlog value for a TCP-style listening socket. */ if (sctp_style(sk, TCP) && sctp_sstate(sk, LISTENING)) - sk->sk_ack_backlog++; + sk_ssk(sk)->ssk_ack_backlog++; } /* Free the endpoint structure. Delay cleanup until diff -Nru a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c --- a/net/sctp/sm_statefuns.c 2004-12-15 20:32:26 -02:00 +++ b/net/sctp/sm_statefuns.c 2004-12-15 20:32:26 -02:00 @@ -216,7 +216,7 @@ */ if (!sctp_sstate(sk, LISTENING) || (sctp_style(sk, TCP) && - (sk->sk_ack_backlog >= sk->sk_max_ack_backlog))) + (sk_ssk(sk)->ssk_ack_backlog >= sk_ssk(sk)->ssk_max_ack_backlog))) return sctp_sf_tabort_8_4_8(ep, asoc, type, arg, commands); /* 3.1 A packet containing an INIT chunk MUST have a zero Verification diff -Nru a/net/sctp/socket.c b/net/sctp/socket.c --- a/net/sctp/socket.c 2004-12-15 20:32:26 -02:00 +++ b/net/sctp/socket.c 2004-12-15 20:32:26 -02:00 @@ -143,7 +143,7 @@ *((struct sctp_chunk **)(chunk->skb->cb)) = chunk; asoc->sndbuf_used += SCTP_DATA_SNDSIZE(chunk); - sk->sk_wmem_queued += SCTP_DATA_SNDSIZE(chunk); + sk_ssk(sk)->ssk_wmem_queued += SCTP_DATA_SNDSIZE(chunk); } /* Verify that this is a valid address. 
*/ @@ -976,7 +976,7 @@ sctp_association_free(asoc); } else if (sock_flag(sk, SOCK_LINGER) && - !sk->sk_lingertime) + !sk_ssk(sk)->ssk_lingertime) sctp_primitive_ABORT(asoc, NULL); else sctp_primitive_SHUTDOWN(asoc, NULL); @@ -3853,7 +3853,7 @@ return -EAGAIN; } sk->sk_state = SCTP_SS_LISTENING; - sk->sk_max_ack_backlog = backlog; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; sctp_hash_endpoint(ep); return 0; } @@ -4321,7 +4321,7 @@ asoc = chunk->asoc; sk = asoc->base.sk; asoc->sndbuf_used -= SCTP_DATA_SNDSIZE(chunk); - sk->sk_wmem_queued -= SCTP_DATA_SNDSIZE(chunk); + sk_ssk(sk)->ssk_wmem_queued -= SCTP_DATA_SNDSIZE(chunk); __sctp_write_space(asoc); sctp_association_put(asoc); @@ -4415,7 +4415,7 @@ { int amt = 0; - amt = sk->sk_sndbuf - sk->sk_wmem_queued; + amt = sk->sk_sndbuf - sk_ssk(sk)->ssk_wmem_queued; if (amt < 0) amt = 0; return amt; diff -Nru a/net/unix/af_unix.c b/net/unix/af_unix.c --- a/net/unix/af_unix.c 2004-12-15 20:32:26 -02:00 +++ b/net/unix/af_unix.c 2004-12-15 20:32:26 -02:00 @@ -430,10 +430,10 @@ unix_state_wlock(sk); if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN) goto out_unlock; - if (backlog > sk->sk_max_ack_backlog) + if (backlog > sk_ssk(sk)->ssk_max_ack_backlog) wake_up_interruptible_all(&u->peer_wait); - sk->sk_max_ack_backlog = backlog; - sk->sk_state = TCP_LISTEN; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk->sk_state = TCP_LISTEN; /* set credentials so connect can copy them */ sk->sk_peercred.pid = current->tgid; sk->sk_peercred.uid = current->euid; @@ -547,11 +547,13 @@ atomic_inc(&unix_nr_socks); + ssk_set_pointer(sk, struct unix_sock, ssk); + sk_ssk(sk)->ssk_max_ack_backlog = sysctl_unix_max_dgram_qlen; + sock_init_data(sock,sk); sk_set_owner(sk, THIS_MODULE); sk->sk_write_space = unix_write_space; - sk->sk_max_ack_backlog = sysctl_unix_max_dgram_qlen; sk->sk_destruct = unix_sock_destructor; u = unix_sk(sk); u->dentry = NULL; @@ -922,7 +924,7 @@ sched = !sock_flag(other, SOCK_DEAD) && !(other->sk_shutdown & 
RCV_SHUTDOWN) && (skb_queue_len(&other->sk_receive_queue) > - other->sk_max_ack_backlog); + sk_ssk(other)->ssk_max_ack_backlog); unix_state_runlock(other); @@ -995,7 +997,7 @@ goto out_unlock; if (skb_queue_len(&other->sk_receive_queue) > - other->sk_max_ack_backlog) { + sk_ssk(other)->ssk_max_ack_backlog) { err = -EAGAIN; if (!timeo) goto out_unlock; @@ -1364,7 +1366,7 @@ if (unix_peer(other) != sk && (skb_queue_len(&other->sk_receive_queue) > - other->sk_max_ack_backlog)) { + sk_ssk(other)->ssk_max_ack_backlog)) { if (!timeo) { err = -EAGAIN; goto out_unlock; diff -Nru a/net/wanrouter/af_wanpipe.c b/net/wanrouter/af_wanpipe.c --- a/net/wanrouter/af_wanpipe.c 2004-12-15 20:32:26 -02:00 +++ b/net/wanrouter/af_wanpipe.c 2004-12-15 20:32:26 -02:00 @@ -415,7 +415,6 @@ sll->sll_halen = 0; skb->dev = dev; - sk->sk_ack_backlog++; /* We must do this manually, since the sock_queue_rcv_skb() * function sets the skb->dev to NULL. However, we use @@ -425,7 +424,6 @@ wanpipe_unlink_driver(newsk); wanpipe_kill_sock_irq (newsk); - --sk->sk_ack_backlog; return -ENOMEM; } @@ -1519,7 +1517,6 @@ sk->sk_family = PF_WANPIPE; wp_sk(sk)->num = protocol; sk->sk_state = WANSOCK_DISCONNECTED; - sk->sk_ack_backlog = 0; sk->sk_bound_dev_if = 0; atomic_inc(&wanpipe_socks_nr); @@ -2427,7 +2424,6 @@ newsk->sk_sleep = &newsock->wait; /* Now attach up the new socket */ - sk->sk_ack_backlog--; newsock->sk = newsk; kfree_skb(skb); diff -Nru a/net/x25/af_x25.c b/net/x25/af_x25.c --- a/net/x25/af_x25.c 2004-12-15 20:32:26 -02:00 +++ b/net/x25/af_x25.c 2004-12-15 20:32:26 -02:00 @@ -423,8 +423,8 @@ if (sk->sk_state != TCP_LISTEN) { memset(&x25_sk(sk)->dest_addr, 0, X25_ADDR_LEN); - sk->sk_max_ack_backlog = backlog; - sk->sk_state = TCP_LISTEN; + sk_ssk(sk)->ssk_max_ack_backlog = backlog; + sk->sk_state = TCP_LISTEN; rc = 0; } @@ -776,7 +776,7 @@ /* Now attach up the new socket */ skb->sk = NULL; kfree_skb(skb); - sk->sk_ack_backlog--; + sk_ssk(sk)->ssk_ack_backlog--; newsock->sk = newsk; 
newsock->state = SS_CONNECTED; rc = 0; @@ -835,7 +835,8 @@ /* * We can't accept the Call Request. */ - if (!sk || sk->sk_ack_backlog == sk->sk_max_ack_backlog) + if (!sk || sk_ssk(sk)->ssk_ack_backlog == + sk_ssk(sk)->ssk_max_ack_backlog) goto out_clear_request; /* @@ -886,7 +887,7 @@ makex25->state = X25_STATE_3; - sk->sk_ack_backlog++; + sk_ssk(sk)->ssk_ack_backlog++; x25_insert_socket(make); --------------050301010404020702080302-- From pp@ee.oulu.fi Wed Dec 15 16:33:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 16:33:14 -0800 (PST) Received: from ee.oulu.fi (ee.oulu.fi [130.231.61.23]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG0WjdH002038 for ; Wed, 15 Dec 2004 16:33:06 -0800 Received: from tk28.oulu.fi (tk28 [130.231.48.68]) by ee.oulu.fi (8.13.1/8.13.1) with ESMTP id iBG0WIst015068; Thu, 16 Dec 2004 02:32:18 +0200 (EET) Received: (from pp@localhost) by tk28.oulu.fi (8.13.0/8.13.0/Submit) id iBG0WHFP006299; Thu, 16 Dec 2004 02:32:17 +0200 (EET) Date: Thu, 16 Dec 2004 02:32:17 +0200 From: Pekka Pietikainen To: "YOSHIFUJI Hideaki / 吉藤英明" Cc: netdev@oss.sgi.com Subject: Re: unregister_netdevice: waiting for tun6to4 to become free. 
Message-ID: <20041216003217.GA6058@ee.oulu.fi> References: <41523EC5.20805@tomt.net> <41527796.4010204@tomt.net> <4152843D.6010204@tomt.net> <20040923.172253.125681726.yoshfuji@linux-ipv6.org> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline In-Reply-To: <20040923.172253.125681726.yoshfuji@linux-ipv6.org> User-Agent: Mutt/1.4.2i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12784 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pp@ee.oulu.fi Precedence: bulk X-list: netdev On Thu, Sep 23, 2004 at 05:22:53PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote: > > > If I boot with net.ipv6.conf.default.forward = 1 in sysctl.conf - but > > > then after the boot do echo 1 > /proc/sys/net/ipv6/conf/all/forward (to > > > actually get it running) > > > I have the ipv6/conf/default/forwarding set in sysctl.conf on all the > > > routers, and it seems zebra sets the ipv6/conf/all/forwarding later on. > > if default is set first, then tunnels created, followed by setting > > "all"; everything breaks down on ifdown. > > > > pretty neat :) > > Thank you everyone for tracking down this bug. > I will fix this bug and come up with patch tomorrow (or so). > Thanks again. Hiya, There seems to be a regression: I've recently bumped into this bug again (2.6.9-fc3-latest-thing; I didn't see anything relevant in the latest Linus bk). It did work some time ago; since then my IPv6 configuration has changed and I've updated the kernel quite a bit. if [ "$device" != "sit0" ]; then if ipv6_exec_ip tunnel show $device 2>/dev/null | LC_ALL=C grep -q -w "ipv6/ip"; then ipv6_exec_ip tunnel del $device is what triggers it: tunnel del gets stuck with a refcnt of 1, and the device is no longer in /proc/net/dev. The tunnel runs on top of PPPoE and it's a dynamic sixxs.net aiccu thing. IPv6 forwarding is on. 
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=282023 seems to be related too. From alan@lxorguk.ukuu.org.uk Wed Dec 15 16:59:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 16:59:49 -0800 (PST) Received: from localhost.localdomain (clock-tower.bc.nu [81.2.110.250] (may be forged)) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG0xKlq003242 for ; Wed, 15 Dec 2004 16:59:42 -0800 Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.12.11/8.12.11) with ESMTP id iBFNwUHs003639; Wed, 15 Dec 2004 23:58:30 GMT Received: (from alan@localhost) by localhost.localdomain (8.12.11/8.12.11/Submit) id iBFNwTdP003638; Wed, 15 Dec 2004 23:58:29 GMT X-Authentication-Warning: localhost.localdomain: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: [2.6 patch] net/lapb/lapb_iface.c: remove unused code From: Alan Cox To: Adrian Bunk Cc: Jonathan Naylor , netdev@oss.sgi.com, Linux Kernel Mailing List In-Reply-To: <20041215010745.GB12937@stusta.de> References: <20041215010745.GB12937@stusta.de> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1103155101.3585.26.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Wed, 15 Dec 2004 23:58:22 +0000 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12785 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Mer, 2004-12-15 at 01:07, Adrian Bunk wrote: > The patch below removes the following unused code: > - EXPORT_SYMBOL'ed functions lapb_getparms and lapb_setparms > - struct lapb_parms_struct > > Please review whether it's correct or conflicts with pending changes. 
Please keep these, they are used by out of kernel code and they are also required in order to run test sets on the full code functionality. From rusty@rustcorp.com.au Wed Dec 15 17:41:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 17:41:42 -0800 (PST) Received: from ozlabs.org (ozlabs.org [203.10.76.45]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG1fEo4005017 for ; Wed, 15 Dec 2004 17:41:34 -0800 Received: from localhost.localdomain (localhost [127.0.0.1]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTP id 541CC2BE83; Thu, 16 Dec 2004 12:40:44 +1100 (EST) Subject: Re: [netfilter-core] [2.6 patch] net/ipv4/netfilter/: misc possible cleanups From: Rusty Russell To: Harald Welte Cc: Adrian Bunk , Netfilter Core Team , netdev@oss.sgi.com, Netfilter development mailing list , lkml - Kernel Mailing List In-Reply-To: <20041215090322.GA2862@sunbeam.de.gnumonks.org> References: <20041215011931.GD12937@stusta.de> <20041215090322.GA2862@sunbeam.de.gnumonks.org> Content-Type: text/plain Date: Thu, 16 Dec 2004 12:40:48 +1100 Message-Id: <1103161248.2200.7.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12786 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rusty@rustcorp.com.au Precedence: bulk X-list: netdev On Wed, 2004-12-15 at 10:03 +0100, Harald Welte wrote: > On Wed, Dec 15, 2004 at 02:19:31AM +0100, Adrian Bunk wrote: > > The patch below contains the following possible cleanups: ... > As you might be aware, netfilter/iptables has an enormously large > codebase (I'd say even larger than what is in the tree) in the so-called > patch-o-matic subsystem. 
The abovementioned exports facilitate those > modules, and a certain amount of those new modules (especially the ones > requiring the functions above) are scheduled for mainline inclusion over > the next couple of months. True, but some of these cleanups are genuine. Deleting code also increases the coverage of the testsuite: I've put this in my patch set and will merge them in pieces. At the rate I work, those that are needed in the next few months won't be deleted. If patches are not due to be merged in that timeframe, it'd be nice if they contained the exports etc. that they need rather than relying on long-term unused features of the tree. Cheers, Rusty. -- A bad analogy is like a leaky screwdriver -- Richard Braakman From acme@conectiva.com.br Wed Dec 15 18:05:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 18:05:09 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG24bLT006392 for ; Wed, 15 Dec 2004 18:05:01 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1Cel1D-0001Qn-00; Thu, 16 Dec 2004 00:05:27 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 1B4BF77B4D; Thu, 16 Dec 2004 00:04:09 -0200 (BRST) Message-ID: <41C0DF8B.2020007@conectiva.com.br> Date: Wed, 15 Dec 2004 23:06:19 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: John Richard Moser Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Sockets from kernel space? 
References: <41C0E720.8050201@comcast.net> In-Reply-To: <41C0E720.8050201@comcast.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12787 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev John Richard Moser wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Is it possible to create socket connections (AF_UNIX for example) from > the kernel to local user processes that are listen()ing? > > A good link to somewhere to help with this would be nice. Please send networking development related messages to netdev@oss.sgi.com; there are several networking hackers that don't even subscribe to lkml. Having said that, look at the svc_makesock and svc_create_socket functions in net/sunrpc/svcsock.c as a starting point. - Arnaldo From nigelenki@comcast.net Wed Dec 15 19:15:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 19:15:24 -0800 (PST) Received: from rwcrmhc13.comcast.net (rwcrmhc13.comcast.net [204.127.198.39]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG3Ep6h008781 for ; Wed, 15 Dec 2004 19:15:11 -0800 Received: from [192.168.0.2] (pcp485126pcs.whtmrs01.md.comcast.net[68.33.188.198]) by comcast.net (rwcrmhc13) with ESMTP id <2004121603141201500s020me>; Thu, 16 Dec 2004 03:14:20 +0000 Message-ID: <41C0FDBA.5060406@comcast.net> Date: Wed, 15 Dec 2004 22:15:06 -0500 From: John Richard Moser User-Agent: Mozilla Thunderbird 1.0 (X11/20041211) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Arnaldo Carvalho de Melo CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Sockets from kernel space? 
References: <41C0E720.8050201@comcast.net> <41C0DF8B.2020007@conectiva.com.br> In-Reply-To: <41C0DF8B.2020007@conectiva.com.br> X-Enigmail-Version: 0.89.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12788 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: nigelenki@comcast.net Precedence: bulk X-list: netdev -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Thanks. I'll look at those. I'm aiming at potentially writing an LSM that allows a process to attach to the kernel, which will then be sent messages through an AF_UNIX (these are the app<->app sockets, right?) socket with the details of any listen(2) or connect(2) calls made. I was going to do it in userspace, but realized it was easily avoidable that way. If this works, I can pretty much securely create a host firewall that regulates based on network operations, user, and program. This would allow the creation of discretionary firewalls, like Zone Alarm, Norton PF, McAfee PF, etc. The daemon sits in userspace, the kernel asks it for policy decisions, it asks connected/authenticated clients about unknown policy, and makes them re-authenticate to get an answer. The authentication is in userspace (PAM), hence the daemon. Arnaldo Carvalho de Melo wrote: [...] | | Please send networking development related messages to netdev@oss.sgi.com, | there are several networking hackers that don't even subscribe to lkml. | | Having said that, look at the svc_makesock and svc_create_socket functions | in net/sunrpc/svcsock.c as a starting point. | | - Arnaldo - -- All content of all messages exchanged herein are left in the Public Domain, unless otherwise explicitly stated. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFBwP26hDd4aOud5P8RAhSjAJ956RdBt9deoh3RgW7UKWdEgNeLMACeOR+b nVFR/uA/ZNXkv2b6HYcRczw= =VUfC -----END PGP SIGNATURE----- From acme@conectiva.com.br Wed Dec 15 19:28:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 19:28:32 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG3S39C009513 for ; Wed, 15 Dec 2004 19:28:24 -0800 Received: from [201.14.39.194] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CemJx-0001eC-00; Thu, 16 Dec 2004 01:28:53 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id E52EA77B65; Thu, 16 Dec 2004 01:27:34 -0200 (BRST) Message-ID: <41C0F31A.4050305@conectiva.com.br> Date: Thu, 16 Dec 2004 00:29:46 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: John Richard Moser Cc: netdev@oss.sgi.com Subject: Re: Sockets from kernel space? References: <41C0E720.8050201@comcast.net> <41C0DF8B.2020007@conectiva.com.br> <41C0FDBA.5060406@comcast.net> In-Reply-To: <41C0FDBA.5060406@comcast.net> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12789 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev John Richard Moser wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Thanks. I'll look at those. 
> > I'm aiming at potentially writing an LSM that allows a process to attach > to the kernel, which will then be sent messages through an AF_UNIX > (these are the app<->app sockets, right?) socket with the details of any > listen(2) or connect(2) calls made. I was going to do it in userspace, > but realized it was easily avoidable that way. > > If this works, I can pretty much securely create a host firewall that > regulates based on network operations, user, and program. This would > allow the creation of discretionary firewalls, like Zone Alarm, Norton > PF, McAfee PF, etc. The daemon sits in userspace, the kernel asks it > for policy decisions, it asks connected/authenticated clients about > unknown policy, and makes them re-authenticate to get an answer. The > authentication is in userspace (PAM), hence the daemon. Look at the iproute2 code to see how to use netlink 8) - Arnaldo From jgarzik@pobox.com Wed Dec 15 23:47:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 15 Dec 2004 23:47:46 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG7lHGu021521 for ; Wed, 15 Dec 2004 23:47:37 -0800 Received: from rdu74-155-169.nc.rr.com ([24.74.155.169] helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CeqLa-0008DK-CK; Thu, 16 Dec 2004 07:46:50 +0000 Message-ID: <41C13D55.3070002@pobox.com> Date: Thu, 16 Dec 2004 02:46:29 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Netdev CC: Herbert Xu , Arnaldo Carvalho de Melo , =?UTF-8?B?WU9TSElGVUpJIEhpZGVha2kgLyDlkInol6Toi7HmmI4=?= , "David S. 
Miller" Subject: Badness in dst_release Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12790 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev After a week or more of uptime, the "Badness in dst_release" messages started again. Kernel 2.6.10-rc3-bk2. The first message is below. Arnaldo and Yoshifuji should have already have access to this machine (although I just rebooted into -bk-latest) IMO, if you want additional debugging code, add some non-intrusive checks in the upstream kernel. Since this problem doesn't appear for a while, additional checks would be useful. Badness in dst_release at include/net/dst.h:149 [] ip6_dst_check+0x64/0x6a [ipv6] [] ip6_dst_lookup+0x1a7/0x1c1 [ipv6] [] udpv6_sendmsg+0x297/0x931 [ipv6] [] udp_recvmsg+0x60/0x2e9 [] inet_sendmsg+0x4d/0x59 [] sock_sendmsg+0xe8/0x103 [] find_busiest_group+0xcf/0x2db [] load_balance_newidle+0x36/0xb4 [] copy_from_user+0x42/0x6e [] autoremove_wake_function+0x0/0x57 [] sys_sendmsg+0x189/0x1e6 [] schedule_timeout+0xbd/0xbf [] unqueue_me+0x5a/0xa5 [] add_wait_queue+0x1a/0x46 [] futex_wait+0x152/0x170 [] find_extend_vma+0x29/0x7e [] default_wake_function+0x0/0x12 [] futex_wake+0x74/0xc4 [] copy_from_user+0x42/0x6e [] sys_socketcall+0x236/0x254 [] sysenter_past_esp+0x52/0x71 From laforge@gnumonks.org Thu Dec 16 01:29:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 01:29:26 -0800 (PST) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG9Sv9g028639 for ; Thu, 16 Dec 2004 01:29:18 -0800 Received: from dsl-082-082-096-175.arcor-ip.net ([82.82.96.175] helo=sunbeam.gnumonks.org) by ganesha.gnumonks.org with 
asmtp (TLS-1.0:RSA_ARCFOUR_SHA:16) (Exim 4.34) id 1Cerw1-0001ox-FA; Thu, 16 Dec 2004 10:28:33 +0100 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1Cerw0-0002Mq-6c; Thu, 16 Dec 2004 10:28:32 +0100 Date: Thu, 16 Dec 2004 10:28:32 +0100 From: Harald Welte To: Henrik Nordstrom Cc: Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses Message-ID: <20041216092831.GO2862@sunbeam.de.gnumonks.org> References: <41912F7A.6000408@redhat.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="4NNqsQ/a67jpuySk" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12791 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@gnumonks.org Precedence: bulk X-list: netdev --4NNqsQ/a67jpuySk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable [Cc'ing netdev, since nobody replied since november] On Wed, Nov 10, 2004 at 12:38:19AM +0100, Henrik Nordstrom wrote: > On Tue, 9 Nov 2004, Neil Horman wrote: > > >Can anyone tell me why when removing a primary ip address from an > >interface, all of its associated secondary ip addresses (those addresses > >on the same subnet), are also removed? Theres a comment in the kernel > >thats pretty clear on the subject indicating it needs to be done, but I > >don't quite grasp the reasoning behind why. > > A more valid question is why one can not add more than one primary in the > same subnet. > > ip addr add x.x.x.x/y dev eth0 primary > > would make sense to me, clearly indicating that these addresses are > maintained separately. agreed. 
> or put in another way why the addresses automatically become secondary > addresses to the first added address. ... and why does removing the primary address remove all secondary addresses. This makes it complicated if you have one interface with multiple ip addresse, that change over time (failover, let's say.) You don't know yet, which address you need to remove, because you don't know which of your peers fails. So you have all this complicated userspace magic that checks whether the address that is about to be deleted is the primary, and if yes, re-add all the other addresses. If if is a secondary, you're happy and only need to delete that one. Can anyone comment what the idea of all this was? Thanks. -- - Harald Welte http://www.gnumonks.org/ ============================================================================ Programming is like sex: One mistake and you have to support it your lifetime --4NNqsQ/a67jpuySk Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBwVU/XaXGVTD0i/8RAj5VAJ9UZR9hFLrDdh8AUjlT6069T4kW3QCghhCz vdaKFJ7GWshT8AAwO6rpJSM= =+ao0 -----END PGP SIGNATURE----- --4NNqsQ/a67jpuySk-- From hasso@estpak.ee Thu Dec 16 01:54:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 01:54:57 -0800 (PST) Received: from arena (test.estpak.ee [194.126.115.47]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBG9sTeJ030019 for ; Thu, 16 Dec 2004 01:54:49 -0800 Received: from arena ([127.0.0.1] ident=hasso) by arena with esmtp (Exim 3.36 #1 (Debian)) id 1CesKo-0000Jd-00; Thu, 16 Dec 2004 11:54:10 +0200 From: Hasso Tepper To: Harald Welte Subject: Re: primary and secondary ip addresses Date: Thu, 16 Dec 2004 11:53:51 +0200 
User-Agent: KMail/1.7.2 Cc: Henrik Nordstrom , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com References: <41912F7A.6000408@redhat.com> <20041216092831.GO2862@sunbeam.de.gnumonks.org> In-Reply-To: <20041216092831.GO2862@sunbeam.de.gnumonks.org> Organization: Elion Enterprises Ltd. MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412161153.51251.hasso@estpak.ee> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12792 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hasso@estpak.ee Precedence: bulk X-list: netdev Harald Welte wrote: > [Cc'ing netdev, since nobody replied since november] > On Wed, Nov 10, 2004 at 12:38:19AM +0100, Henrik Nordstrom wrote: > > A more valid question is why one can not add more than one primary in > > the same subnet. > > > > ip addr add x.x.x.x/y dev eth0 primary > > > > would make sense to me, clearly indicating that these addresses are > > maintained separately. > > agreed. And why I can't even choose which address is primary? > > or put in another way why the addresses automatically become secondary > > addresses to the first added address. > > ... and why does removing the primary address remove all secondary > addresses. This makes it complicated if you have one interface with > multiple ip addresse, that change over time (failover, let's say.) You > don't know yet, which address you need to remove, because you don't know > which of your peers fails. > > So you have all this complicated userspace magic that checks whether the > address that is about to be deleted is the primary, and if yes, re-add > all the other addresses. If if is a secondary, you're happy and only > need to delete that one. > > Can anyone comment what the idea of all this was? 
This reminds me of a related issue ... Actually there is a concept in Junos software I'd love to see in Linux as well. Junos has both a "preferred address" and a "primary address". Quoting the Junos documentation: ************************************************************************** Primary ======= Configure this address to be the primary address of the protocol on the interface. If the logical unit has more than one address, the primary address is used by default as the source address when packets originate from the interface and the destination does not indicate the subnet (ie. multicast destination for example). Default For unicast traffic, the primary address is the lowest non-127 preferred address on the unit. Preferred ========= Configure this address to be the preferred address on the interface. If you configure more than one address on the same subnet, the preferred source address is chosen by default as the source address when you originate packets to destinations on the subnet. Default The lowest numbered address on the subnet is the preferred address. ************************************************************************** Would it be overkill for Linux? From a router's point of view it makes perfect sense. And of course you can manually override the defaults in Junos. -- Hasso Tepper Elion Enterprises Ltd. 
WAN administrator From hno@marasystems.com Thu Dec 16 02:08:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 02:08:25 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGA7ua4031033 for ; Thu, 16 Dec 2004 02:08:17 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBGA7Bj30718; Thu, 16 Dec 2004 11:07:11 +0100 Date: Thu, 16 Dec 2004 11:07:11 +0100 (CET) From: Henrik Nordstrom To: Hasso Tepper cc: Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <200412161153.51251.hasso@estpak.ee> Message-ID: References: <41912F7A.6000408@redhat.com> <20041216092831.GO2862@sunbeam.de.gnumonks.org> <200412161153.51251.hasso@estpak.ee> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12793 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Thu, 16 Dec 2004, Hasso Tepper wrote: > And why I can't even choose which address is primary? It is the first you add in a subnet. > Primary > ======= > > Configure this address to be the primary address of the protocol on > the interface. If the logical unit has more than one address, the > primary address is used by default as the source address when packets > originate from the interface and the destination does not indicate > the subnet (ie. multicast destination for example). This is what the primary addresses is used for, indirectly via the automatically created route entries. > Preferred > ========= > > Configure this address to be the preferred address on the interface. 
> If you configure more than one address on the same subnet, the > preferred source address is chosen by default as the source address > when you originate packets to destinations on the subnet. What is the difference from primary here? Broadcast traffic using the primary, the rest using preferred? Regards Henrik From hasso@estpak.ee Thu Dec 16 03:03:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 03:03:31 -0800 (PST) Received: from arena (test.estpak.ee [194.126.115.47]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGB2vUY002475 for ; Thu, 16 Dec 2004 03:03:17 -0800 Received: from arena ([127.0.0.1] ident=hasso) by arena with esmtp (Exim 3.36 #1 (Debian)) id 1CetPC-0000LM-00; Thu, 16 Dec 2004 13:02:46 +0200 From: Hasso Tepper To: Henrik Nordstrom Subject: Re: primary and secondary ip addresses Date: Thu, 16 Dec 2004 13:02:42 +0200 User-Agent: KMail/1.7.2 Cc: Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> In-Reply-To: Organization: Elion Enterprises Ltd. MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412161302.42357.hasso@estpak.ee> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12794 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hasso@estpak.ee Precedence: bulk X-list: netdev Henrik Nordstrom wrote: > On Thu, 16 Dec 2004, Hasso Tepper wrote: > > And why I can't even choose which address is primary? > > It is the first you add in a subnet. Yes, but to change the primary I have remove old primary (and therefore all secondaries) and assign new primary to the interface and then all secondaries etc. I can't just tell "make this address which is secondary at the moment, primary". 
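The behaviour complained about in this thread can be modelled in a few lines of C. This is only a toy sketch of the semantics as the posters describe them (the first address added on a subnet is primary, later ones are secondary, and deleting the primary takes its secondaries with it). It is not kernel code, and the names `struct iface`, `iface_add` and `iface_del` are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ADDRS 8

/* Toy model only; these types are illustrative, not kernel structures. */
struct ifaddr {
    uint32_t addr;      /* IPv4 address, host byte order */
    uint32_t mask;      /* netmask, host byte order */
    int      secondary; /* 1 if a primary on this subnet already existed */
};

struct iface {
    struct ifaddr a[MAX_ADDRS];
    size_t n;
};

static int same_subnet(const struct ifaddr *x, uint32_t addr, uint32_t mask)
{
    return x->mask == mask && ((x->addr ^ addr) & mask) == 0;
}

/* Add an address: the first one on a subnet is primary, later ones secondary. */
int iface_add(struct iface *ifc, uint32_t addr, uint32_t mask)
{
    int secondary = 0;
    size_t i;

    if (ifc->n == MAX_ADDRS)
        return -1;
    for (i = 0; i < ifc->n; i++)
        if (!ifc->a[i].secondary && same_subnet(&ifc->a[i], addr, mask))
            secondary = 1;
    ifc->a[ifc->n++] = (struct ifaddr){ addr, mask, secondary };
    return 0;
}

/* Delete an address; deleting a primary also drops its secondaries.
 * Returns the number of entries removed. */
int iface_del(struct iface *ifc, uint32_t addr)
{
    size_t i, out = 0;
    int removed = 0, was_primary = 0;
    uint32_t mask = 0;

    for (i = 0; i < ifc->n; i++)
        if (ifc->a[i].addr == addr) {
            was_primary = !ifc->a[i].secondary;
            mask = ifc->a[i].mask;
            break;
        }
    if (i == ifc->n)
        return 0; /* address not configured */

    for (i = 0; i < ifc->n; i++) {
        const struct ifaddr *x = &ifc->a[i];
        int drop = x->addr == addr ||
                   (was_primary && x->secondary && same_subnet(x, addr, mask));
        if (drop)
            removed++;
        else
            ifc->a[out++] = *x;
    }
    assert(out + (size_t)removed == ifc->n);
    ifc->n = out;
    return removed;
}
```

In this model, changing which address is primary means deleting the old primary (losing every secondary on that subnet) and re-adding all of them in the desired order, which is the userspace dance described above.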
> > Primary > > ======= > > > > Configure this address to be the primary address of the protocol on > > the interface. If the logical unit has more than one address, the > > primary address is used by default as the source address when packets > > originate from the interface and the destination does not indicate > > the subnet (ie. multicast destination for example). > > This is what the primary addresses is used for, indirectly via the > automatically created route entries. No. There is only one primary address per interface and it is used if destination address doesn't indicate which source address to use. Ie. if packet is sent over eth1 to the address 224.0.0.6 and eth1 has addresses 192.168.0.1/24 and 10.10.10.1/24 which one choose for source address? > > Preferred > > ========= > > > > Configure this address to be the preferred address on the interface. > > If you configure more than one address on the same subnet, the > > preferred source address is chosen by default as the source address > > when you originate packets to destinations on the subnet. > > What is the difference from primary here? This has same meaning as primary in the Linux. There is as many preferred addresses on the interfaces as there is subnets - every subnet has one preferred address. Preferred aadress is used if next hop indicates subnet - if packet is sent to the 192.168.0.2, 192.168.0.1 is used as source; if packet is sent to the 10.10.10.28, 10.10.10.1 is used as source. -- Hasso Tepper Elion Enterprises Ltd. 
WAN administrator From Robert.Olsson@data.slu.se Thu Dec 16 04:02:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 04:03:05 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGC2bZA005354 for ; Thu, 16 Dec 2004 04:02:58 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iBGC26nN015877; Thu, 16 Dec 2004 13:02:06 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id AA9AAEC001; Thu, 16 Dec 2004 13:02:06 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16833.31038.665676.278498@robur.slu.se> Date: Thu, 16 Dec 2004 13:02:06 +0100 To: jchapman@katalix.com Cc: gandalf@wlug.westbo.se, robert.olsson@data.slu.se, netdev@oss.sgi.com Subject: Re: Re: [PATCH] e1000 poll behavior In-Reply-To: <1103104917.41c00b9533464@www.katalix.com> References: <1103104917.41c00b9533464@www.katalix.com> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12795 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev jchapman@katalix.com writes: > Is there any advantage in exiting NAPI polled state as soon as > possible with the work_done < work_to_do test? Why not stay in > polled mode until no rx or tx work is done, i.e. only exit polled > state if work_done is zero? Yes it's a variant. > When working on e100 a while ago, I found better and more consistent > pps numbers by staying in polled mode until the interface went idle. > See current e100 code. Unfortunately I don't have e1000 hardware to > test if the same is true for e1000. 
I'll try this with the forwarding setup when I got a chance but I'll think server-like setups are more interesting to test. --ro From buytenh@wantstofly.org Thu Dec 16 06:43:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 06:43:43 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGEhB5Z013729 for ; Thu, 16 Dec 2004 06:43:31 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id D3F282B0EE; Thu, 16 Dec 2004 15:42:47 +0100 (MET) Date: Thu, 16 Dec 2004 15:42:47 +0100 From: Lennert Buytenhek To: Robert Olsson Cc: netdev@oss.sgi.com Subject: Re: [PATCH,RFC] pktgen sleeping/timing rework Message-ID: <20041216144247.GA4109@xi.wantstofly.org> References: <20041210222058.GA5984@xi.wantstofly.org> <16830.58158.882590.971712@robur.slu.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <16830.58158.882590.971712@robur.slu.se> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12796 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Tue, Dec 14, 2004 at 01:57:18PM +0100, Robert Olsson wrote: > Timing code gets cleaner and bit more predictable at least we know > better now how to arrange delay for lower rates. It still need some > experimentation. Sorting out the pps rate discrepancies for >10kpps was next on my list of things to do. I think fixing this is doable. > Appplied this to the development version and also renamed ipg to delay. > It breaks some scripts. We need a variant and of this and the FCS patch > for the kernel version. Is it sane to try and keep pktgen-2.6 up to date when the devel version is so much more flexible? 
cheers, Lennert From Robert.Olsson@data.slu.se Thu Dec 16 07:58:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 07:58:09 -0800 (PST) Received: from mail1.slu.se (mail1.slu.se [130.238.96.11]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGFvfBR016284 for ; Thu, 16 Dec 2004 07:58:02 -0800 Received: from robur.slu.se (robur.slu.se [130.238.98.12]) by mail1.slu.se (8.12.10/8.12.10) with ESMTP id iBGFvInN019043; Thu, 16 Dec 2004 16:57:18 +0100 Received: by robur.slu.se (Postfix, from userid 1000) id A50F6EC289; Thu, 16 Dec 2004 16:57:18 +0100 (CET) From: Robert Olsson MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16833.45150.618941.690334@robur.slu.se> Date: Thu, 16 Dec 2004 16:57:18 +0100 To: Lennert Buytenhek Cc: Robert Olsson , netdev@oss.sgi.com Subject: Re: [PATCH,RFC] pktgen sleeping/timing rework In-Reply-To: <20041216144247.GA4109@xi.wantstofly.org> References: <20041210222058.GA5984@xi.wantstofly.org> <16830.58158.882590.971712@robur.slu.se> <20041216144247.GA4109@xi.wantstofly.org> X-Mailer: VM 7.18 under Emacs 21.3.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12797 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Robert.Olsson@data.slu.se Precedence: bulk X-list: netdev Lennert Buytenhek writes: Hello! > Is it sane to try and keep pktgen-2.6 up to date when the devel > version is so much more flexible? Not really. The development version had some untested areas (and still has) so it was thought for 2.7 but it's been waiting for more than a year. Do we know anything about 2.7 plans now? BTW BSD claims new performance heights. Just saw something on the quagga list. 
--ro From hno@marasystems.com Thu Dec 16 08:04:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 08:04:07 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGG3cTh016921 for ; Thu, 16 Dec 2004 08:03:59 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBGG2xk00884; Thu, 16 Dec 2004 17:02:59 +0100 Date: Thu, 16 Dec 2004 17:02:59 +0100 (CET) From: Henrik Nordstrom To: Hasso Tepper cc: Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <200412161302.42357.hasso@estpak.ee> Message-ID: References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12798 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Thu, 16 Dec 2004, Hasso Tepper wrote: > No. There is only one primary address per interface and it is used if > destination address doesn't indicate which source address to use. In the linux world your first primary is used for this purpose if I am not mistaken. But this shouldn't happen much in any sane network, and when it does it is not that unlikely the return traffic won't know how to reach you correctly.. 
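The two-level selection discussed in this thread (a per-subnet "preferred" address, with an interface-wide "primary" as fallback when the destination matches no configured subnet) can be sketched in C as follows. This is a hypothetical illustration of the scheme as quoted from the Junos documentation, not how Linux or Junos actually implement it. `struct src_addr` and `pick_source` are made-up names, and which address carries the `primary` flag is an assumption left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: one configured address on an interface. */
struct src_addr {
    uint32_t addr;      /* IPv4 address, host byte order */
    uint32_t mask;      /* netmask, host byte order */
    int      preferred; /* preferred address for its own subnet */
    int      primary;   /* interface-wide fallback address */
};

/* Pick a source address for dst: the preferred address of the subnet
 * containing dst if there is one, otherwise the first address marked
 * primary. Returns 0 if nothing suitable is configured. */
uint32_t pick_source(const struct src_addr *a, size_t n, uint32_t dst)
{
    uint32_t fallback = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i].preferred && ((a[i].addr ^ dst) & a[i].mask) == 0)
            return a[i].addr;
        if (a[i].primary && fallback == 0)
            fallback = a[i].addr;
    }
    return fallback;
}
```

With eth1 holding 192.168.0.1/24 and 10.10.10.1/24, a packet to 192.168.0.2 gets 192.168.0.1 and a packet to 10.10.10.28 gets 10.10.10.1. A packet to 224.0.0.6 matches neither subnet and gets whichever address the administrator flagged as primary.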
Regards Henrik From paul@clubi.ie Thu Dec 16 08:49:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 08:50:06 -0800 (PST) Received: from hibernia.jakma.org (IDENT:U2FsdGVkX199OChaoPiXa6k2a2uKHG9GpfJHUKd+tyU@hibernia.jakma.org [212.17.55.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGGnY93021789 for ; Thu, 16 Dec 2004 08:49:57 -0800 Received: from sheen.jakma.org (sheen.jakma.org [212.17.55.53]) by hibernia.jakma.org (8.13.1/8.12.11) with ESMTP id iBGGmV6o003558; Thu, 16 Dec 2004 16:48:34 GMT Date: Thu, 16 Dec 2004 16:48:31 +0000 (UTC) From: Paul Jakma X-X-Sender: paul@sheen.jakma.org To: Harald Welte cc: Henrik Nordstrom , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <20041216092831.GO2862@sunbeam.de.gnumonks.org> Message-ID: References: <41912F7A.6000408@redhat.com> <20041216092831.GO2862@sunbeam.de.gnumonks.org> Mail-Followup-To: paul@hibernia.jakma.org X-NSA: arafat al aqsar jihad musharef jet-A1 avgas ammonium qran inshallah allah al-akbar martyr iraq saddam hammas hisballah rabin ayatollah korea vietnam revolt mustard gas british airways washington MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 19:53:11 2004 clamav-milter version 0.80j on hibernia.jakma.org X-Virus-Status: Clean X-Virus-Status: Clean X-archive-position: 12799 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paul@clubi.ie Precedence: bulk X-list: netdev On Thu, 16 Dec 2004, Harald Welte wrote: > ... and why does removing the primary address remove all secondary > addresses. Because applications may have state which relies on the implied source address selection, I'd hazard a guess. 
If you dont remove the secondaries, you might break the apps - but i dont know. (certainly, the primary/secondary thing in kernel is useful - though, it possibly doesnt need to be in kernel.) > This makes it complicated if you have one interface > with multiple ip addresse, that change over time (failover, let's > say.) You don't know yet, which address you need to remove, > because you don't know which of your peers fails. That's bad planning. ;) You should assign your failover addresses seperately to your primary. Ie, the primary should never fail over - its unique to the machine/interface. > So you have all this complicated userspace magic that checks > whether the address that is about to be deleted is the primary, and > if yes, re-add all the other addresses. If if is a secondary, > you're happy and only need to delete that one. Ick ;) > Can anyone comment what the idea of all this was? Source address selection. ? > Thanks. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: Prizes are for children. -- Charles Ives, upon being given, but refusing, the Pulitzer prize From shemminger@osdl.org Thu Dec 16 11:09:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 11:10:42 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGJ9WZb027987 for ; Thu, 16 Dec 2004 11:09:52 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iBGJ8H900465; Thu, 16 Dec 2004 11:08:17 -0800 Date: Thu, 16 Dec 2004 11:08:17 -0800 From: Stephen Hemminger To: tcc_linux@thinkthink.com Cc: netdev@oss.sgi.com, Jeff Garzik Subject: Re: [Bug 3904] New: CHECKSUM_HW gets set even w/o hardware support. 
Message-Id: <20041216110817.0d3a78d4@dxpl.pdx.osdl.net> In-Reply-To: <200412150642.iBF6geO1013040@fire-1.osdl.org> References: <200412150642.iBF6geO1013040@fire-1.osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12800 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev On Tue, 14 Dec 2004 22:42:40 -0800 bugme-daemon@osdl.org wrote: > http://bugme.osdl.org/show_bug.cgi?id=3904 > > Summary: CHECKSUM_HW gets set even w/o hardware support. > Kernel Version: 2.6.9 > Status: NEW > Severity: normal > Owner: shemminger@osdl.org > Submitter: tcc_linux@thinkthink.com > > > Distribution: > Debian w/ kernel from kernel.org > Hardware Environment: > Fujitsu LifeBook N5010 > Software Environment: > Debian unstable distro > Problem Description: > CHECKSUM_HW is improperly set in udp.c resulting in udp frames being sent > with bad checksums. The hardware and therefore the driver (8139cp) is not > setting CHECKSUM_HW (I checked). Possibly a problem with any NIC that doesn't do > hardware checksumming. #if 0'ing out the hw checksum code in > udp.c:udp_push_pending_frames() works as a workaround (but obviously does not > identify who set CHECKSUM_HW). > Steps to reproduce: > Need board that uses 8139cp driver. Write a small program that sends udp > data. Observe with tcpdump. Sendto, send, and write all behave the same. > Tcpdump also has a problem displaying what the checksum should be, but does > properly flag the bad checksum. > > ------- You are receiving this mail because: ------- > You are the assignee for the bug, or are watching the assignee. 
I believe the problem is that the 8139cp driver allows enabling TX csum, via ethtool but does not support it. What does ethtool report as being enabled on that ethernet. I doubt it is a udp problem. From penguin@muskoka.com Thu Dec 16 15:32:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 15:32:53 -0800 (PST) Received: from gold.muskoka.com (gold.muskoka.com [216.123.107.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBGNWOwX000448 for ; Thu, 16 Dec 2004 15:32:44 -0800 Received: from muskoka.com (ppp72.muskoka.com [216.123.108.82]) by gold.muskoka.com (8.12.3/8.12.3/Debian-6.4) with ESMTP id iBGNYYGl023220; Thu, 16 Dec 2004 18:34:34 -0500 Message-ID: <41C21664.2070404@muskoka.com> Date: Thu, 16 Dec 2004 18:12:36 -0500 From: Paul Gortmaker User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3.1) Gecko/20030425 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: NetDev Subject: [PATCH] 2.6.9 Use skb_padto() in drivers/net/8390.c X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12801 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: penguin@muskoka.com Precedence: bulk X-list: netdev The 8390 driver had been fixed for leaking information in short packets prior to skb_padto() existing. This change gets rid of the scratch area on the stack and makes it use skb_padto(). Thanks to Mark Smith for bringing this to my attention. Signed-off-by: Paul Gortmaker --- linux-386/drivers/net/8390.c~ Mon Oct 18 17:55:55 2004 +++ linux-386/drivers/net/8390.c Thu Dec 16 18:06:23 2004 @@ -43,6 +43,7 @@ Paul Gortmaker : Separate out Tx timeout code from Tx path. Paul Gortmaker : Remove old unused single Tx buffer code. Hayato Fujiwara : Add m32r support. 
+ Paul Gortmaker : use skb_padto() instead of stack scratch area Sources: The National Semiconductor LAN Databook, and the 3Com 3c503 databook. @@ -272,11 +273,15 @@ static int ei_start_xmit(struct sk_buff { long e8390_base = dev->base_addr; struct ei_device *ei_local = (struct ei_device *) netdev_priv(dev); - int length, send_length, output_page; + int send_length = skb->len, output_page; unsigned long flags; - char scratch[ETH_ZLEN]; - length = skb->len; + if (skb->len < ETH_ZLEN) { + skb = skb_padto(skb, ETH_ZLEN); + if (skb == NULL) + return 0; + send_length = ETH_ZLEN; + } /* Mask interrupts from the ethercard. SMP: We have to grab the lock here otherwise the IRQ handler @@ -298,8 +303,6 @@ static int ei_start_xmit(struct sk_buff ei_local->irqlock = 1; - send_length = ETH_ZLEN < length ? length : ETH_ZLEN; - /* * We have two Tx slots available for use. Find the first free * slot, and then perform some sanity checks. With two Tx bufs, @@ -344,13 +347,7 @@ static int ei_start_xmit(struct sk_buff * trigger the send later, upon receiving a Tx done interrupt. */ - if (length == send_length) - ei_block_output(dev, length, skb->data, output_page); - else { - memset(scratch, 0, ETH_ZLEN); - memcpy(scratch, skb->data, skb->len); - ei_block_output(dev, ETH_ZLEN, scratch, output_page); - } + ei_block_output(dev, send_length, skb->data, output_page); if (! 
ei_local->txing) { From daniele@orlandi.com Thu Dec 16 20:30:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 20:30:27 -0800 (PST) Received: from relay3.uli.it (relay3.uli.it [62.212.0.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBH4TwjQ014907 for ; Thu, 16 Dec 2004 20:30:19 -0800 Received: from nabla.orlandi.com (nabla.orlandi.com [62.212.12.10]) (using TLSv1 with cipher RC4-MD5 (128/128 bits)) (No client certificate requested) by greg.uli.it (Postfix) with ESMTP id DDA59ADF6 for ; Fri, 17 Dec 2004 05:29:29 +0100 (CET) From: Daniele Orlandi To: netdev@oss.sgi.com Subject: Clarifications on queues Date: Fri, 17 Dec 2004 05:29:28 +0100 User-Agent: KMail/1.7.1 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412170529.28898.daniele@orlandi.com> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12802 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: daniele@orlandi.com Precedence: bulk X-list: netdev Hello, I'm trying to understand how netdev works wrt queueing. Here is what I have deduced; please correct me if something or everything is wrong :) There is a transmit queue associated with the device, which is managed by the Qdisc discipline manager. Once an skb has been enqueued into the Qdisc queue (by dev_queue_xmit), it is destined for the device via hard_start_xmit and the sender shouldn't mess with it. In the upper layer there is sk_write_queue, which is managed by the protocol implementation, and enqueuing/dequeuing is done through skb_queue_*. If the protocol needs to cope with retransmissions, I can use this queue. Now, some questions :) The (low-speed) device I'm working with has a FIFO managed by the hardware into which I write frames to be transmitted. 
Is it OK if I put frames in the FIFO without restrictions? I mean, from a performance viewpoint, is it good if the FIFO eventually fills up, or is it better to keep it empty or almost empty and let the device queueing be handled by netdev?

Thank you,
Bye!

-- 
Daniele Orlandi

From jmorris@redhat.com Thu Dec 16 21:16:44 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 21:16:58 -0800 (PST)
Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBH5GNbU016501 for ; Thu, 16 Dec 2004 21:16:44 -0800
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBH5Fsn8020788; Fri, 17 Dec 2004 00:15:54 -0500
Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBH5Fsr10763; Fri, 17 Dec 2004 00:15:54 -0500
Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id iBH5FqZV030813; Fri, 17 Dec 2004 00:15:52 -0500
Date: Fri, 17 Dec 2004 00:15:54 -0500 (EST)
From: James Morris
X-X-Sender: jmorris@thoron.boston.redhat.com
To: Bryan Fulton
cc: linux-kernel@vger.kernel.org, ,
Subject: Re: [Coverity] Untrusted user data in kernel
In-Reply-To: <1103247211.3071.74.camel@localhost.localdomain>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12803
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: jmorris@redhat.com
Precedence: bulk
X-list: netdev

This at least needs CAP_NET_ADMIN.
On Thu, 16 Dec 2004, Bryan Fulton wrote: > //////////////////////////////////////////////////////// > // 3: /net/ipv6/netfilter/ip6_tables.c::do_replace // > //////////////////////////////////////////////////////// > > - tainted unsigned scalar tmp.num_counters multiplied and passed to > vmalloc (1161) and memset (1166) which could overflow or be too large > > Call to function "copy_from_user" TAINTS argument "tmp" > > 1143 if (copy_from_user(&tmp, user, sizeof(tmp)) != 0) > 1144 return -EFAULT; > > ... > > TAINTED variable "((tmp).num_counters * 16)" was passed to a tainted > sink. > > 1161 counters = vmalloc(tmp.num_counters * sizeof(struct > ip6t_counters)); > 1162 if (!counters) { > 1163 ret = -ENOMEM; > 1164 goto free_newinfo; > 1165 } > > TAINTED variable "((tmp).num_counters * 16)" was passed to a tainted > sink. > > 1166 memset(counters, 0, tmp.num_counters * sizeof(struct > ip6t_counters)); > -- James Morris From kaber@trash.net Thu Dec 16 21:26:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 21:26:38 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBH5QA3R017096 for ; Thu, 16 Dec 2004 21:26:31 -0800 Received: from localhost ([127.0.0.1]) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CfAcT-000484-Gh; Fri, 17 Dec 2004 06:25:37 +0100 Message-ID: <41C26DD1.7070006@trash.net> Date: Fri, 17 Dec 2004 06:25:37 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.3) Gecko/20041008 Debian/1.7.3-5 X-Accept-Language: en MIME-Version: 1.0 To: James Morris CC: Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean 
X-archive-position: 12804 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev James Morris wrote: >This at least needs CAP_NET_ADMIN. > It is already checked in do_ip6t_set_ctl(). Otherwise anyone could replace iptables rules :) Regards Patrick > >On Thu, 16 Dec 2004, Bryan Fulton wrote: > > >>//////////////////////////////////////////////////////// >>// 3: /net/ipv6/netfilter/ip6_tables.c::do_replace // >>//////////////////////////////////////////////////////// >> >>- tainted unsigned scalar tmp.num_counters multiplied and passed to >>vmalloc (1161) and memset (1166) which could overflow or be too large >> >>Call to function "copy_from_user" TAINTS argument "tmp" >> >>1143 if (copy_from_user(&tmp, user, sizeof(tmp)) != 0) >>1144 return -EFAULT; >> >>... >> >>TAINTED variable "((tmp).num_counters * 16)" was passed to a tainted >>sink. >> >>1161 counters = vmalloc(tmp.num_counters * sizeof(struct >>ip6t_counters)); >>1162 if (!counters) { >>1163 ret = -ENOMEM; >>1164 goto free_newinfo; >>1165 } >> >>TAINTED variable "((tmp).num_counters * 16)" was passed to a tainted >>sink. 
>> >>1166 memset(counters, 0, tmp.num_counters * sizeof(struct >>ip6t_counters)); >> >> >> > > > From jmorris@redhat.com Thu Dec 16 22:45:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 16 Dec 2004 22:46:00 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBH6jWkO019289 for ; Thu, 16 Dec 2004 22:45:53 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBH6j6Sc008164; Fri, 17 Dec 2004 01:45:06 -0500 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBH6j6r25433; Fri, 17 Dec 2004 01:45:06 -0500 Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id iBH6j3ZV003241; Fri, 17 Dec 2004 01:45:03 -0500 Date: Fri, 17 Dec 2004 01:45:05 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Patrick McHardy cc: Bryan Fulton , , , Subject: Re: [Coverity] Untrusted user data in kernel In-Reply-To: <41C26DD1.7070006@trash.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12805 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, Patrick McHardy wrote: > James Morris wrote: > > >This at least needs CAP_NET_ADMIN. > > > It is already checked in do_ip6t_set_ctl(). 
Otherwise anyone could > replace iptables rules :) That's what I meant, you need the capability to do anything bad :-) - James -- James Morris From tom@dbservice.com Fri Dec 17 05:17:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 05:17:35 -0800 (PST) Received: from matterhorn.neopsis.com (neopsis.com [213.239.204.14]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHDH7cg007849 for ; Fri, 17 Dec 2004 05:17:28 -0800 Received: from [192.168.0.11] (wan_emmen [62.65.141.13]) by matterhorn.dbservice.com (Postfix) with ESMTP id A74849AB3; Fri, 17 Dec 2004 14:15:16 +0100 (MET) Message-ID: <41C2DCBC.1080302@dbservice.com> Date: Fri, 17 Dec 2004 14:18:52 +0100 From: Tomas Carnecky User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: James Morris Cc: Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: In-Reply-To: X-Enigmail-Version: 0.89.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Neopsis-MailScanner-Information: Please contact the ISP for more information X-Neopsis-MailScanner: Found to be clean X-MailScanner-From: tom@dbservice.com X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12806 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tom@dbservice.com Precedence: bulk X-list: netdev James Morris wrote: > That's what I meant, you need the capability to do anything bad :-) > But.. even if you have the 'permission' to do bad things, it shouldn't be possible. It's a bug, and only because you can't exploit it if you haven't the right capabilities doesn't make the bug disappear. 
IMHO such things (passing values between user/kernel space) should always be checked. tom From pavel@atrey.karlin.mff.cuni.cz Fri Dec 17 07:11:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 07:11:23 -0800 (PST) Received: from atrey.karlin.mff.cuni.cz (postfix@atrey.karlin.mff.cuni.cz [195.113.31.123]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHFAtMG011899 for ; Fri, 17 Dec 2004 07:11:16 -0800 Received: by atrey.karlin.mff.cuni.cz (Postfix, from userid 512) id 607554B40F8; Fri, 17 Dec 2004 16:10:31 +0100 (CET) Date: Fri, 17 Dec 2004 16:10:31 +0100 From: Pavel Machek To: James Morris Cc: Bryan Fulton , linux-kernel@vger.kernel.org, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org Subject: Re: [Coverity] Untrusted user data in kernel Message-ID: <20041217151031.GA27170@atrey.karlin.mff.cuni.cz> References: <1103247211.3071.74.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.6i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12807 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pavel@ucw.cz Precedence: bulk X-list: netdev Hi! > This at least needs CAP_NET_ADMIN. Hmm, but that means that CAP_NET_ADMIN implies all other capabilities, unless you fix this. Pavel > > TAINTED variable "((tmp).num_counters * 16)" was passed to a tainted > > sink. > > > > 1161 counters = vmalloc(tmp.num_counters * sizeof(struct > > ip6t_counters)); > > 1162 if (!counters) { > > 1163 ret = -ENOMEM; > > 1164 goto free_newinfo; > > 1165 } > > > > TAINTED variable "((tmp).num_counters * 16)" was passed to a tainted > > sink. > > > > 1166 memset(counters, 0, tmp.num_counters * sizeof(struct > > ip6t_counters)); -- Boycott Kodak -- for their patent abuse against Java. 
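
The concern in this thread is that tmp.num_counters arrives straight from copy_from_user() and is multiplied by sizeof(struct ip6t_counters) before it reaches vmalloc() and memset(), so the product can wrap around. A minimal user-space sketch of the kind of check being asked for is shown here. checked_array_size() is a hypothetical helper written for illustration; it is not a kernel function and not necessarily the fix that was applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not kernel code):
 * compute count * elem_size into *out, refusing any request
 * whose product would not fit in a size_t.
 * Returns 0 on success, -1 if the multiplication would overflow.
 */
static int checked_array_size(size_t count, size_t elem_size, size_t *out)
{
	assert(out != NULL);

	/* Divide instead of multiplying so the test itself cannot wrap. */
	if (elem_size != 0 && count > SIZE_MAX / elem_size)
		return -1;

	*out = count * elem_size;
	return 0;
}
```

In a function like do_replace(), a check of this shape would sit between the copy_from_user() call and the allocation, with the caller returning an error when it fails. Which error code to return, and whether to also cap the count at some sane maximum, are design choices left open here.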
From andreaf@cs.columbia.edu Fri Dec 17 07:12:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 07:12:34 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHFC6kh012202 for ; Fri, 17 Dec 2004 07:12:27 -0800 Received: from lion.cs.columbia.edu (IDENT:RUHDGBMrT5M5xIOp8Ia5fCuyzv5N5hD0@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iBHFAbe9018778 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT); Fri, 17 Dec 2004 10:10:37 -0500 (EST) Received: from [128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iBHFAXK8004423; Fri, 17 Dec 2004 10:10:33 -0500 Message-ID: <41C2F6E5.5010607@cs.columbia.edu> Date: Fri, 17 Dec 2004 10:10:29 -0500 From: Andrea G Forte User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Hasso Tepper CC: Henrik Nordstrom , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> In-Reply-To: <200412161302.42357.hasso@estpak.ee> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.16.22 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0' X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12808 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andreaf@cs.columbia.edu Precedence: bulk X-list: netdev Yes, using the same concept used in Junos would make 
things much easier. On top of this, when I change the primary IP, it takes about 500ms to see the change applied which is critical for real time applications. Using more than one primary IP would solve this as well. Also it seems that if I have a primary address and a secondary one (alias), even though I make the primary address invalid (trying to force the kernel to use the secondary first), the kernel still tries the primary first regardless of its validity (I set the primary IP to 0.0.0.0). Anybody has any comments about this? Can anyone point out to me where is the code for primary/secondary IPs in the kernel? Thanks a lot, Andrea Hasso Tepper wrote: >Henrik Nordstrom wrote: > > >>On Thu, 16 Dec 2004, Hasso Tepper wrote: >> >> >>>And why I can't even choose which address is primary? >>> >>> >>It is the first you add in a subnet. >> >> > >Yes, but to change the primary I have remove old primary (and therefore all >secondaries) and assign new primary to the interface and then all >secondaries etc. I can't just tell "make this address which is secondary at >the moment, primary". > > > >>>Primary >>>======= >>> >>>Configure this address to be the primary address of the protocol on >>>the interface. If the logical unit has more than one address, the >>>primary address is used by default as the source address when packets >>>originate from the interface and the destination does not indicate >>>the subnet (ie. multicast destination for example). >>> >>> >>This is what the primary addresses is used for, indirectly via the >>automatically created route entries. >> >> > >No. There is only one primary address per interface and it is used if >destination address doesn't indicate which source address to use. Ie. if >packet is sent over eth1 to the address 224.0.0.6 and eth1 has addresses >192.168.0.1/24 and 10.10.10.1/24 which one choose for source address? > > > >>>Preferred >>>========= >>> >>>Configure this address to be the preferred address on the interface. 
>>>If you configure more than one address on the same subnet, the >>>preferred source address is chosen by default as the source address >>>when you originate packets to destinations on the subnet. >>> >>> >>What is the difference from primary here? >> >> > >This has same meaning as primary in the Linux. There is as many preferred >addresses on the interfaces as there is subnets - every subnet has one >preferred address. Preferred aadress is used if next hop indicates subnet - >if packet is sent to the 192.168.0.2, 192.168.0.1 is used as source; if >packet is sent to the 10.10.10.28, 10.10.10.1 is used as source. > > > > From hno@marasystems.com Fri Dec 17 07:29:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 07:29:11 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHFSgwE013357 for ; Fri, 17 Dec 2004 07:29:03 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBHFROD17028; Fri, 17 Dec 2004 16:27:24 +0100 Date: Fri, 17 Dec 2004 16:27:24 +0100 (CET) From: Henrik Nordstrom To: Andrea G Forte cc: Hasso Tepper , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <41C2F6E5.5010607@cs.columbia.edu> Message-ID: References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12809 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, Andrea G Forte wrote: > Yes, using the same concept used in 
Junos would make things much easier. On
> top of this, when I change the primary IP, it takes about 500ms to see the
> change applied which is critical for real time applications. Using more than
> one primary IP would solve this as well.
> Also it seems that if I have a primary address and a secondary one (alias),
> even though I make the primary address invalid (trying to force the kernel to
> use the secondary first), the kernel still tries the primary first regardless
> of its validity (I set the primary IP to 0.0.0.0).

Which source IP the kernel uses is determined primarily by your routing tables.

The requirement for an IP address to be usable in the routing table is that the address exists on one of your interfaces, either as primary or secondary.

When you add a primary address to an interface or delete one from it, the kernel automatically adds or deletes routes accordingly, including source IP address selection.

If the routing table has no information about which source IP address to use for this traffic, the kernel searches the interface for a valid primary address.
Regards Henrik From jmorris@redhat.com Fri Dec 17 07:39:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 07:39:41 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHFdDqU014502 for ; Fri, 17 Dec 2004 07:39:34 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBHFclVi017620; Fri, 17 Dec 2004 10:38:47 -0500 Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBHFckr16928; Fri, 17 Dec 2004 10:38:46 -0500 Received: from thoron.boston.redhat.com (thoron.boston.redhat.com [172.16.80.63]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id iBHFciZV003220; Fri, 17 Dec 2004 10:38:44 -0500 Date: Fri, 17 Dec 2004 10:38:46 -0500 (EST) From: James Morris X-X-Sender: jmorris@thoron.boston.redhat.com To: Pavel Machek cc: Bryan Fulton , , , Subject: Re: [Coverity] Untrusted user data in kernel In-Reply-To: <20041217151031.GA27170@atrey.karlin.mff.cuni.cz> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12810 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jmorris@redhat.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, Pavel Machek wrote: > Hi! > > > This at least needs CAP_NET_ADMIN. > > Hmm, but that means that CAP_NET_ADMIN implies all other capabilities, > unless you fix this. I'm not saying it doesn't need to be fixed, but that it is not exploitable by unprivileged users. 
- James -- James Morris From davidsen@tmr.com Fri Dec 17 07:48:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 07:48:14 -0800 (PST) Received: from oddball.prodigy.com (prgy-npn1.prodigy.com [207.115.54.37]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHFldFB015518 for ; Fri, 17 Dec 2004 07:47:59 -0800 Received: from [127.0.0.1] (oddball.prodigy.com [127.0.0.1]) by oddball.prodigy.com (8.11.6/8.11.6) with ESMTP id iBHFlcq12154; Fri, 17 Dec 2004 10:47:40 -0500 Message-ID: <41C2FF99.3020908@tmr.com> Date: Fri, 17 Dec 2004 10:47:37 -0500 From: Bill Davidsen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040913 X-Accept-Language: en-us, en MIME-Version: 1.0 To: James Morris CC: Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: <41C26DD1.7070006@trash.net> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12811 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davidsen@tmr.com Precedence: bulk X-list: netdev James Morris wrote: > On Fri, 17 Dec 2004, Patrick McHardy wrote: > > >>James Morris wrote: >> >> >>>This at least needs CAP_NET_ADMIN. >>> >> >>It is already checked in do_ip6t_set_ctl(). Otherwise anyone could >>replace iptables rules :) > > > That's what I meant, you need the capability to do anything bad :-) Are you saying that processes with capability don't make mistakes? This isn't a bug related to untrusted users doing privileged operations, it's a case of using unchecked user data. 
-- -bill davidsen (davidsen@tmr.com) "The secret to procrastination is to put things off until the last possible moment - but no longer" -me From andreaf@cs.columbia.edu Fri Dec 17 07:59:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 07:59:31 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHFx3mj017070 for ; Fri, 17 Dec 2004 07:59:24 -0800 Received: from lion.cs.columbia.edu (IDENT:qfg9YwCARNiKcgDtc357aG19pfrIG0aA@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iBHFwEe9027260 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT); Fri, 17 Dec 2004 10:58:14 -0500 (EST) Received: from [128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iBHFwDK8007440; Fri, 17 Dec 2004 10:58:13 -0500 Message-ID: <41C30212.6000906@cs.columbia.edu> Date: Fri, 17 Dec 2004 10:58:10 -0500 From: Andrea G Forte User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Henrik Nordstrom CC: Hasso Tepper , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.16.22 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0' X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12812 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com
X-original-sender: andreaf@cs.columbia.edu
Precedence: bulk
X-list: netdev

> Which source IP is used by the kernel is determined primary by your
> routing tables.
>
> The requirements for an IP address to be allowed to be used in the
> routing table is that the IP address does exists on any of your
> interfaces, either as primary or secondary.
>
> When you add/delete a primary address to a interface the kernel
> automatically adds/deletes routes accordingly, including source IP
> address selection.
>
This does not help: if I want to use my secondary IP address instead of my primary, I cannot delete the primary, because then all of my secondary IPs are lost as well (and I can have only one primary IP address).

> If the routing table does not have information about which source IP
> address to use for this traffic then the kernel searches the interface
> for a valid primary address.
>
I update all the routing entries and eventually things start to work again. The problem is that:
- If I use a secondary IP and try to invalidate the primary (i.e. by removing its routing table entry), it takes about 500ms for the actual change (data packets sent on the secondary IP instead of the primary) to take effect.
- If I try to update the primary address directly without creating any secondary IP, it still takes about 300ms for the change to take place.

I honestly do not understand what harm it could do to have more than one primary address, especially on different subnets.
Cheers, Andrea > Regards > Henrik From linux-os@chaos.analogic.com Fri Dec 17 08:14:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 08:14:48 -0800 (PST) Received: from chaos.analogic.com (alog0090.analogic.com [208.224.220.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHGEKPP017908 for ; Fri, 17 Dec 2004 08:14:41 -0800 Received: from chaos.analogic.com (localhost.localdomain [127.0.0.1]) by chaos.analogic.com (8.12.11/8.12.11) with ESMTP id iBHGBntL004228; Fri, 17 Dec 2004 11:11:49 -0500 Received: (from linux-os@localhost) by chaos.analogic.com (8.12.11/8.12.11/Submit) id iBHGBboc004227; Fri, 17 Dec 2004 11:11:37 -0500 Date: Fri, 17 Dec 2004 11:11:37 -0500 (EST) From: linux-os Reply-To: linux-os@analogic.com To: Bill Davidsen cc: James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel In-Reply-To: <41C2FF99.3020908@tmr.com> Message-ID: References: <41C26DD1.7070006@trash.net> <41C2FF99.3020908@tmr.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12813 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux-os@chaos.analogic.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, Bill Davidsen wrote: > James Morris wrote: >> On Fri, 17 Dec 2004, Patrick McHardy wrote: >> >> >>> James Morris wrote: >>> >>> >>>> This at least needs CAP_NET_ADMIN. >>>> >>> >>> It is already checked in do_ip6t_set_ctl(). Otherwise anyone could >>> replace iptables rules :) >> >> >> That's what I meant, you need the capability to do anything bad :-) > > Are you saying that processes with capability don't make mistakes? 
This isn't
> a bug related to untrusted users doing privileged operations, it's a case of
> using unchecked user data.
>

But isn't there always the possibility of "unchecked user data"? I can, as root, do `cp /dev/zero /dev/mem` and have the most spectacular crash you've ever seen. I can even make my filesystems unrecoverable.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.9 on an i686 machine (5537.79 BogoMips).
Notice : All mail here is now cached for review by Dictator Bush.
98.36% of all statistics are fiction.

From oliver@neukum.org Fri Dec 17 08:31:55 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 08:32:02 -0800 (PST)
Received: from Mail1.KONTENT.De (IDENT:30@mail1.kontent.de [81.88.34.36]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHGVYV8022548 for ; Fri, 17 Dec 2004 08:31:55 -0800
Received: from p3ee1ebe1.dip.t-dialin.net (p3EE1EBE1.dip.t-dialin.net [62.225.235.225]) by Mail1.KONTENT.De (Postfix) with ESMTP id 7B55F49451D; Fri, 17 Dec 2004 17:31:18 +0100 (CET)
From: Oliver Neukum
To: linux-os@analogic.com
Subject: Re: [Coverity] Untrusted user data in kernel
Date: Fri, 17 Dec 2004 17:31:05 +0100
User-Agent: KMail/1.6.2
Cc: Bill Davidsen , James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org
References: <41C26DD1.7070006@trash.net> <41C2FF99.3020908@tmr.com>
In-Reply-To:
MIME-Version: 1.0
Content-Disposition: inline
Content-Type: text/plain; charset="iso-8859-2"
Content-Transfer-Encoding: 7bit
Message-Id: <200412171731.05735.oliver@neukum.org>
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12814
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: oliver@neukum.org
Precedence: bulk
X-list: netdev

> > Are you saying that processes with capability don't make mistakes? 
This isn't > > a bug related to untrusted users doing privileged operations, it's a case of > > using unchecked user data. > > > > But isn't there always the possibility of "unchecked user data"? > I can, as root, do `cp /dev/zero /dev/mem` and have the most > spectacular crask you've evet seen. I can even make my file- > systems unrecoverable. Only if you have the capability for raw hardware access. The same is true for the firmware interface. What other subsystems might be dangerous? Regards Oliver From hno@marasystems.com Fri Dec 17 08:41:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 08:41:18 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHGeoJJ023251 for ; Fri, 17 Dec 2004 08:41:11 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBHGdli17494; Fri, 17 Dec 2004 17:39:47 +0100 Date: Fri, 17 Dec 2004 17:39:47 +0100 (CET) From: Henrik Nordstrom To: Andrea G Forte cc: Hasso Tepper , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <41C30212.6000906@cs.columbia.edu> Message-ID: References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12815 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, Andrea G Forte wrote: > This does not help, since if I want to use my secondary IP address instead of > my primary, I cannot delete the primary 
otherwise all of my secondary IPs are > lost as well (and since I can only have only one primary IP address). Why change the primary address? What is wrong with simply changing the route to use the other source IP? > -If I use a secondary IP and try to invalidate the primary (i.e. by removing > its routing table entry), it takes about 500ms for the actual change (data > packets sent on the secondary IP instead of the primary) to take effect. This is most likely the routing cache or something. > I honestly do not understand what harm could do to have more than one primary > address, especially on different subnets. How it works today is that the first IP you add in a subnet becomes a primary, any additional IPs you add in the same subnet becomes secondary. You can have any number of primary IPs with each any number of secondary IPs, the primary IPs just can't be in the same subnet. Regards Henrik From andreaf@cs.columbia.edu Fri Dec 17 09:19:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 09:19:32 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHHJ4Ri024568 for ; Fri, 17 Dec 2004 09:19:25 -0800 Received: from lion.cs.columbia.edu (IDENT:qBxRl2FrqbRSFvDkEq6soYnCSRNvBWgW@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iBHHHse9026666 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT); Fri, 17 Dec 2004 12:17:55 -0500 (EST) Received: from [128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iBHHHpK8012478; Fri, 17 Dec 2004 12:17:51 -0500 Message-ID: <41C314BC.3060507@cs.columbia.edu> Date: Fri, 17 Dec 2004 12:17:48 -0500 From: Andrea G Forte User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Henrik Nordstrom CC: Hasso Tepper , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com 
Subject: Re: primary and secondary ip addresses References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.7.0 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0' X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12816 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andreaf@cs.columbia.edu Precedence: bulk X-list: netdev > How it works today is that the first IP you add in a subnet becomes a > primary, any additional IPs you add in the same subnet becomes > secondary. You can have any number of primary IPs with each any number > of secondary IPs, the primary IPs just can't be in the same subnet. > Well, when I assign a new IP (for a new subnet) to the same interface (using ip route add...), even though the routing table is updated, it still takes about 500ms for the change to happen. This new IP should be detected as primary, right? Also, am I right to say that any IP address assigned to an alias interface is a secondary IP? Do you know where the code for the routing cache is? 
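For reference, the rule Henrik describes in the quoted text above (the first address in a subnet is primary, later ones in the same subnet are secondary) can be modelled in a few lines of C. This is only a sketch of the described behaviour, not the kernel's code: struct model_addr, model_add_addr() and the IP4() macro are names invented here, and the assumption that two addresses share a subnet when they have the same netmask and the same masked prefix is mine.

```c
#include <stddef.h>
#include <stdint.h>

/* Build an IPv4 address in host byte order from four octets (illustrative helper). */
#define IP4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical model of one address configured on an interface. */
struct model_addr {
    uint32_t addr;     /* address, host byte order */
    uint32_t mask;     /* netmask, host byte order */
    int secondary;     /* 0 = primary, 1 = secondary */
};

/*
 * Append an address to the list and classify it. It is marked secondary if an
 * address already in the list has the same netmask and the same masked prefix;
 * otherwise it is the first address in its subnet and becomes primary.
 * Returns the new number of entries, or n unchanged if the list is full.
 */
size_t model_add_addr(struct model_addr *list, size_t n, size_t cap,
                      uint32_t addr, uint32_t mask)
{
    size_t i;
    int secondary = 0;

    if (n >= cap)
        return n;
    for (i = 0; i < n; i++) {
        if (list[i].mask == mask &&
            (list[i].addr & mask) == (addr & mask)) {
            secondary = 1;
            break;
        }
    }
    list[n].addr = addr;
    list[n].mask = mask;
    list[n].secondary = secondary;
    return n + 1;
}
```

Adding 192.168.90.250/24, then 192.168.90.251/24, then 10.0.0.1/8 to an empty list marks the first and third as primary and the second as secondary, which matches the description: any number of primaries, as long as they are in different subnets.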
Thanks for your help,
Andrea

> Regards
> Henrik

From hasso@estpak.ee Fri Dec 17 10:04:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 10:05:00 -0800 (PST) Received: from arena (dream.estpak.ee [194.126.115.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHI4UJ8001368 for ; Fri, 17 Dec 2004 10:04:53 -0800 Received: from arena ([127.0.0.1] ident=hasso) by arena with esmtp (Exim 3.36 #1 (Debian)) id 1CfMSB-0000NU-00; Fri, 17 Dec 2004 20:03:47 +0200 From: Hasso Tepper To: Henrik Nordstrom Subject: Re: primary and secondary ip addresses Date: Fri, 17 Dec 2004 20:03:22 +0200 User-Agent: KMail/1.7.2 Cc: Andrea G Forte , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com References: <41912F7A.6000408@redhat.com> <41C30212.6000906@cs.columbia.edu> In-Reply-To: Organization: Elion Enterprises Ltd. MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412172003.22319.hasso@estpak.ee> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12817 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hasso@estpak.ee Precedence: bulk X-list: netdev

Henrik Nordstrom wrote:
> On Fri, 17 Dec 2004, Andrea G Forte wrote:
> > This does not help, since if I want to use my secondary IP address
> > instead of my primary, I cannot delete the primary otherwise all of my
> > secondary IPs are lost as well (and since I can only have only one
> > primary IP address).
>
> Why change the primary address? What is wrong with simply changing the
> route to use the other source IP?

There is no support for it in most user space software. None of the
routing protocol suites support it, etc.

--
Hasso Tepper
Elion Enterprises Ltd.
WAN administrator From mabrown@securepipe.com Fri Dec 17 10:37:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 10:37:35 -0800 (PST) Received: from spmx.securepipe.com (IDENT:AUser@spmx.securepipe.com [64.73.37.194]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBHIb5HL002835 for ; Fri, 17 Dec 2004 10:37:25 -0800 Received: (qmail 32415 invoked from network); 17 Dec 2004 18:37:03 -0000 Received: from unknown (HELO cartero.wi.securepipe.com) (64.73.37.245) by spmx.securepipe.com with SMTP; 17 Dec 2004 18:37:03 -0000 Received: (qmail 8674 invoked from network); 17 Dec 2004 18:36:48 -0000 Received: from unknown (HELO gargoyle.wi.securepipe.com) (imapmabrown@10.10.14.2) by 0 with SMTP; 17 Dec 2004 18:36:48 -0000 Date: Fri, 17 Dec 2004 12:37:02 -0600 (CST) From: "Martin A. Brown" X-X-Sender: mabrown@gargoyle.wi.securepipe.com To: Hasso Tepper cc: Henrik Nordstrom , Andrea G Forte , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <200412172003.22319.hasso@estpak.ee> Message-ID: References: <41912F7A.6000408@redhat.com> <41C30212.6000906@cs.columbia.edu> <200412172003.22319.hasso@estpak.ee> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12819 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mabrown@securepipe.com Precedence: bulk X-list: netdev Hello Hasso and Andrea, We've gotten a little far afield from Neil Horman's initial question about why there are primary and secondary IPs, and I can't address your concern Andrea about the (route cache?) 500ms latency between the time that an address is added (or removed) from an interface and the time that the address is actually used. 
Even so, the Linux routing code lets you suggest a source IP to the
kernel with the "src" keyword.

: > Why change the primary address? What is wrong with simply changing the
: > route to use the other source IP?
:
: There is no support for it in most of user space software.
: None of the routing protocols suites support it etc.

Though some software provides support for explicit configuration of
source address for initiated sockets, you can use INADDR_ANY and let the
kernel perform source address selection for you. Linux selects an IP
based on the routing table. [0] Example:

# ip route show 192.168.90.0/24
192.168.90.0/24 dev eth0 scope link src 192.168.90.250
# ip route change 192.168.88.0/24 dev eth0 scope link src $SECONDARY

If you want to be fancy about it, you can have a higher preference
routing table (make sure there's an entry in /etc/iproute2/rt_tables for
$SECONDARY_TABLE). Then you can add and remove routes in this routing
table instead of changing the route in the main routing table.

# ip rule add prio table $SECONDARY_TABLE
# ip route add table $SECONDARY_TABLE $DESTNET dev $REALDEV src $SECONDARY

Best of luck!

-Martin

[0] http://linux-ip.net/gl/ip-cref/node155.html

--
Martin A. Brown --- SecurePipe, Inc.
--- mabrown@securepipe.com From davidsen@tmr.com Fri Dec 17 10:37:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 10:37:15 -0800 (PST) Received: from oddball.prodigy.com (prgy-npn1.prodigy.com [207.115.54.37]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHIaiE3002823 for ; Fri, 17 Dec 2004 10:37:05 -0800 Received: from [127.0.0.1] (oddball.prodigy.com [127.0.0.1]) by oddball.prodigy.com (8.11.6/8.11.6) with ESMTP id iBHIbHq12702; Fri, 17 Dec 2004 13:37:26 -0500 Message-ID: <41C3275C.2010505@tmr.com> Date: Fri, 17 Dec 2004 13:37:16 -0500 From: Bill Davidsen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040913 X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-os@analogic.com CC: James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: <41C2FF99.3020908@tmr.com> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12818 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davidsen@tmr.com Precedence: bulk X-list: netdev linux-os wrote: > On Fri, 17 Dec 2004, Bill Davidsen wrote: > >> James Morris wrote: >> >>> On Fri, 17 Dec 2004, Patrick McHardy wrote: >>> >>> >>>> James Morris wrote: >>>> >>>> >>>>> This at least needs CAP_NET_ADMIN. >>>>> >>>> >>>> It is already checked in do_ip6t_set_ctl(). Otherwise anyone could >>>> replace iptables rules :) >>> >>> >>> >>> That's what I meant, you need the capability to do anything bad :-) >> >> >> Are you saying that processes with capability don't make mistakes? 
>> This isn't a bug related to untrusted users doing privileged
>> operations, it's a case of using unchecked user data.
>>
>
> But isn't there always the possibility of "unchecked user data"?
> I can, as root, do `cp /dev/zero /dev/mem` and have the most
> spectacular crash you've ever seen. I can even make my
> filesystems unrecoverable.

But that's not the type of thing you would do by accident. The kernel
can't protect against deliberate abuse by trusted users, nor should it.
But the type of problem caused by an application program bug can, and I
believe should, be caught.

It's the difference between "oops" and "take that!"

--
-bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

From hasso@estpak.ee Fri Dec 17 10:54:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 10:54:37 -0800 (PST) Received: from arena (dream.estpak.ee [194.126.115.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHIs9Jv004278 for ; Fri, 17 Dec 2004 10:54:30 -0800 Received: from arena ([127.0.0.1] ident=hasso) by arena with esmtp (Exim 3.36 #1 (Debian)) id 1CfNEU-0000SN-00; Fri, 17 Dec 2004 20:53:42 +0200 From: Hasso Tepper To: "Martin A. Brown" Subject: Re: primary and secondary ip addresses Date: Fri, 17 Dec 2004 20:53:27 +0200 User-Agent: KMail/1.7.2 Cc: Henrik Nordstrom , Andrea G Forte , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com References: <41912F7A.6000408@redhat.com> <200412172003.22319.hasso@estpak.ee> In-Reply-To: Organization: Elion Enterprises Ltd.
MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200412172053.28016.hasso@estpak.ee> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBHIs9Jv004278 X-archive-position: 12820 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hasso@estpak.ee Precedence: bulk X-list: netdev

On one fine day (Friday, 17 December 2004 20:37) Martin A. Brown wrote:
> Hello Hasso and Andrea,
>
> We've gotten a little far afield from Neil Horman's initial question
> about why there are primary and secondary IPs, and I can't address your
> concern Andrea about the (route cache?) 500ms latency between the time
> that an address is added (or removed) from an interface and the time that
> the address is actually used. Even so, the Linux routing code allows the
> kernel to suggest an IP with the "src" keyword.

I know.

> : > Why change the primary address? What is wrong with simply changing
> : > the route to use the other source IP?
> :
> : There is no support for it in most of user space software.
> : None of the routing protocols suites support it etc.
>
> Though some software provides support for explicit configuration of
> source address for initiated sockets, you can use INADDR_ANY and let the
> kernel perform source address selection for you.

Well, that's the point - we want to have full control over this selection
process without doing fancy things in user space.

> Linux selects an IP based on the routing table.
[0] Example: > > # ip route show 192.168.90.0/24 > 192.168.90.0/24 dev eth0 scope link src 192.168.90.250 > # ip route change 192.168.88.0/24 dev eth0 scope link src $SECONDARY > > If you want to be fancy about it, you can have a higher preference > routing table (make sure there's an entry in /etc/iproute2/rt_tables for > $SECONDARY_TABLE). Then you can add and remove tables in this routing > table instead of changing the route in the main routing table. > > # ip rule add prio table $SECONDARY_TABLE > # ip route add table $SECONDARY_TABLE $DESTNET dev $REALDEV src > $SECONDARY All these tricks don't help if you are using dynamic routing. -- Hasso Tepper Elion Enterprises Ltd. WAN administrator From tom@dbservice.com Fri Dec 17 11:14:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:14:47 -0800 (PST) Received: from matterhorn.neopsis.com (neopsis.com [213.239.204.14]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJEHFO006070 for ; Fri, 17 Dec 2004 11:14:38 -0800 Received: from [192.168.0.11] (wan_emmen [62.65.141.13]) by matterhorn.dbservice.com (Postfix) with ESMTP id DD45E9AB3; Fri, 17 Dec 2004 20:12:28 +0100 (MET) Message-ID: <41C330F7.4000806@dbservice.com> Date: Fri, 17 Dec 2004 20:18:15 +0100 From: Tomas Carnecky User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-os@analogic.com Cc: Bill Davidsen , James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: <41C26DD1.7070006@trash.net> <41C2FF99.3020908@tmr.com> In-Reply-To: X-Enigmail-Version: 0.89.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Neopsis-MailScanner-Information: Please contact the ISP for more information X-Neopsis-MailScanner: Found to be clean X-MailScanner-From: 
tom@dbservice.com X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12821 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tom@dbservice.com Precedence: bulk X-list: netdev

linux-os wrote:
> On Fri, 17 Dec 2004, Bill Davidsen wrote:
>
>> James Morris wrote:
>>
>>> On Fri, 17 Dec 2004, Patrick McHardy wrote:
>>>
>>> That's what I meant, you need the capability to do anything bad :-)
>>
>>
>> Are you saying that processes with capability don't make mistakes?
>> This isn't a bug related to untrusted users doing privileged
>> operations, it's a case of using unchecked user data.
>>
>
> But isn't there always the possibility of "unchecked user data"?
> I can, as root, do `cp /dev/zero /dev/mem` and have the most
> spectacular crash you've ever seen. I can even make my
> filesystems unrecoverable.
>

But the difference between your example (cp /dev/zero /dev/mem) and
passing unchecked data to the kernel is... you _can_ check the data and
do something about it if you discover that the data is not valid/within a
range/whatever, even if the user has full permissions.
No sane person would do a 'cp /dev/zero /dev/mem', but passing bad data
is more likely to happen: badly written userspace configuration tools etc.
tom From hno@marasystems.com Fri Dec 17 11:18:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:18:53 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJIMq9006671 for ; Fri, 17 Dec 2004 11:18:44 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBHJH7A18420; Fri, 17 Dec 2004 20:17:07 +0100 Date: Fri, 17 Dec 2004 20:17:07 +0100 (CET) From: Henrik Nordstrom To: Andrea G Forte cc: Hasso Tepper , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <41C314BC.3060507@cs.columbia.edu> Message-ID: References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <41C314BC.3060507@cs.columbia.edu> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12822 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, Andrea G Forte wrote: > Well, when I assign a new IP (for a new subnet) to the same interface (using > ip route add...), even though the routing table is updated, it still takes > about 500ms for the change to happen. This new IP should be detected as > primary, right? If it is on a new subnet yes. > Also, am I right to say that any IP address assigned to an > alias interface is a secondary IP? It is irrelevant if you label the address or not. The "alias" labels are just a compatibility layer to work with old userspace programs (mainly ifconfig), not interfaces of their own. 
See /sbin/ip addr for all the details. Regards Henrik From davem@davemloft.net Fri Dec 17 11:22:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:22:36 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJM8NU007227 for ; Fri, 17 Dec 2004 11:22:29 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CfNac-00073o-00; Fri, 17 Dec 2004 11:16:34 -0800 Date: Fri, 17 Dec 2004 11:16:34 -0800 From: "David S. Miller" To: Tomas Carnecky Cc: jmorris@redhat.com, kaber@trash.net, bryan@coverity.com, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel Message-Id: <20041217111634.740d4d46.davem@davemloft.net> In-Reply-To: <41C2DCBC.1080302@dbservice.com> References: <41C2DCBC.1080302@dbservice.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12823 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 17 Dec 2004 14:18:52 +0100 Tomas Carnecky wrote: > IMHO such things (passing values between user/kernel space) should > always be checked. As per Patrick's posting, which James was responding to, it is checked at the level above this function. 
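The distinction being argued in this thread, a check of who is calling versus a check of what they passed, can be sketched in plain userspace C. This is not the ip6tables code and makes no claim about what do_ip6t_set_ctl() actually does: struct demo_replace_req, DEMO_MAX_ENTRIES and demo_check_request() are invented for illustration, and the capability is modelled as a simple flag.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_ENTRIES 1024u  /* arbitrary limit chosen for this sketch */

/* Hypothetical request header, standing in for data copied from userspace. */
struct demo_replace_req {
    unsigned int num_entries;   /* number of entries the caller claims follow */
    unsigned int size;          /* total size in bytes of those entries */
};

/*
 * Validate a request in two independent steps:
 *   1. permission: the caller must hold the capability (modelled as a flag);
 *   2. content: counts and lengths must be consistent with what was passed,
 *      even when the caller is fully privileged.
 * Returns 0 if acceptable, -EPERM or -EINVAL otherwise.
 */
int demo_check_request(int has_cap_net_admin,
                       const struct demo_replace_req *req, size_t len)
{
    if (!has_cap_net_admin)
        return -EPERM;
    if (req == NULL || len < sizeof(*req))
        return -EINVAL;
    if (req->num_entries == 0 || req->num_entries > DEMO_MAX_ENTRIES)
        return -EINVAL;
    /* The payload size the header claims must match the bytes actually given. */
    if (req->size != len - sizeof(*req))
        return -EINVAL;
    return 0;
}
```

A privileged caller with a sane header passes; the same privileged caller with an absurd entry count or a size that does not match the buffer gets -EINVAL, which is the "oops" case a buggy configuration tool would hit.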
From hno@marasystems.com Fri Dec 17 11:25:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:26:04 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJPZEP007781 for ; Fri, 17 Dec 2004 11:25:56 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBHJPA818545; Fri, 17 Dec 2004 20:25:10 +0100 Date: Fri, 17 Dec 2004 20:25:10 +0100 (CET) From: Henrik Nordstrom To: Hasso Tepper cc: "Martin A. Brown" , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <200412172053.28016.hasso@estpak.ee> Message-ID: References: <41912F7A.6000408@redhat.com> <200412172003.22319.hasso@estpak.ee> <200412172053.28016.hasso@estpak.ee> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12824 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev

On Fri, 17 Dec 2004, Hasso Tepper wrote:

> All these tricks don't help if you are using dynamic routing.

Are you seriously saying you are doing dynamic routing for your locally
attached LANs?

Source address assignment for routed traffic via gateways is
automatically derived from the source address assignment for traffic
addressed to the gateway itself. So even with routing protocols etc. you
can control the source address assignment simply by setting up routing to
use the correct source address to speak to your gateways; the added
routes will then inherit the intended source address.

But I am not sure how you can control this for multicast or broadcast
traffic in a reasonable manner, mostly because I rarely have to care
about this on such traffic.
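To see which source address the kernel actually selects for a given destination, under whatever routes and "src" hints happen to be installed, a program can leave the socket unbound and ask after connecting. A minimal sketch, assuming IPv4 and a POSIX sockets API; demo_kernel_source_for() is a name made up here, and port 9 is an arbitrary choice since a UDP connect() transmits nothing.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Report the IPv4 source address the kernel would use to reach dst.
 * The socket is never bound, so the local address stays unspecified until
 * connect() performs the route lookup; getsockname() then returns the
 * address that was chosen. Returns 0 on success with the dotted-quad
 * address written to buf, or -1 on any failure.
 */
int demo_kernel_source_for(const char *dst, char *buf, socklen_t buflen)
{
    struct sockaddr_in peer, local;
    socklen_t len = sizeof(local);
    int ret = -1;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(9);   /* arbitrary; no packet is sent */
    if (inet_pton(AF_INET, dst, &peer.sin_addr) == 1 &&
        connect(fd, (struct sockaddr *)&peer, sizeof(peer)) == 0 &&
        getsockname(fd, (struct sockaddr *)&local, &len) == 0 &&
        inet_ntop(AF_INET, &local.sin_addr, buf, buflen) != NULL)
        ret = 0;
    close(fd);
    return ret;
}
```

Calling it for a gateway address and then for a destination routed via that gateway is one way to check the inheritance described above on a particular machine.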
Regards Henrik From davem@davemloft.net Fri Dec 17 11:26:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:26:31 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJQ3Oi007802 for ; Fri, 17 Dec 2004 11:26:23 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CfNeL-00074k-00; Fri, 17 Dec 2004 11:20:25 -0800 Date: Fri, 17 Dec 2004 11:20:25 -0800 From: "David S. Miller" To: Andrea G Forte Cc: hno@marasystems.com, hasso@estpak.ee, laforge@gnumonks.org, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses Message-Id: <20041217112025.27688eb6.davem@davemloft.net> In-Reply-To: <41C30212.6000906@cs.columbia.edu> References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12825 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 17 Dec 2004 10:58:10 -0500 Andrea G Forte wrote: > This does not help, since if I want to use my secondary IP address > instead of my primary, I cannot delete the primary otherwise all of my > secondary IPs are lost as well (and since I can only have only one > primary IP 
address).

By definition, a secondary IP address on an interface is not to be used
as a source.

It is the whole reason for the distinction between primary and secondary
IP addresses, and it is why all secondaries are deleted once the primary
is removed (because there are no valid source addresses to choose from
any longer, therefore valid IP communications are no longer possible).

From tom@dbservice.com Fri Dec 17 11:30:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:30:52 -0800 (PST) Received: from matterhorn.neopsis.com (neopsis.com [213.239.204.14]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJUNsv008826 for ; Fri, 17 Dec 2004 11:30:44 -0800 Received: from [192.168.0.11] (wan_emmen [62.65.141.13]) by matterhorn.dbservice.com (Postfix) with ESMTP id 862D69AB3; Fri, 17 Dec 2004 20:28:34 +0100 (MET) Message-ID: <41C334DF.107@dbservice.com> Date: Fri, 17 Dec 2004 20:34:55 +0100 From: Tomas Carnecky User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S.
Miller" Cc: jmorris@redhat.com, kaber@trash.net, bryan@coverity.com, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: <41C2DCBC.1080302@dbservice.com> <20041217111634.740d4d46.davem@davemloft.net> In-Reply-To: <20041217111634.740d4d46.davem@davemloft.net> X-Enigmail-Version: 0.89.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Neopsis-MailScanner-Information: Please contact the ISP for more information X-Neopsis-MailScanner: Found to be clean X-MailScanner-From: tom@dbservice.com X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12826 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tom@dbservice.com Precedence: bulk X-list: netdev

David S. Miller wrote:
> On Fri, 17 Dec 2004 14:18:52 +0100
> Tomas Carnecky wrote:
>
>
>>IMHO such things (passing values between user/kernel space) should
>>always be checked.
>
>
> As per Patrick's posting, which James was responding to, it is
> checked at the level above this function.

Is only the capability checked, or also the data passed to the kernel?
It's not clear from Patrick's reply:

> It is already checked in do_ip6t_set_ctl(). Otherwise anyone could
> replace iptables rules :)

To me it seems that only CAP_NET_ADMIN is checked and not the data.
tom From oliver@neukum.org Fri Dec 17 11:30:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:30:59 -0800 (PST) Received: from Mail1.KONTENT.De (IDENT:30@mail1.kontent.de [81.88.34.36]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJUVKG008834 for ; Fri, 17 Dec 2004 11:30:51 -0800 Received: from p3ee1e728.dip.t-dialin.net (p3EE1E728.dip.t-dialin.net [62.225.231.40]) by Mail1.KONTENT.De (Postfix) with ESMTP id 7C6B249786F; Fri, 17 Dec 2004 20:30:07 +0100 (CET) From: Oliver Neukum To: Tomas Carnecky Subject: Re: [Coverity] Untrusted user data in kernel Date: Fri, 17 Dec 2004 20:30:04 +0100 User-Agent: KMail/1.6.2 Cc: linux-os@analogic.com, Bill Davidsen , James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org References: <41C26DD1.7070006@trash.net> <41C330F7.4000806@dbservice.com> In-Reply-To: <41C330F7.4000806@dbservice.com> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-2" Content-Transfer-Encoding: 7bit Message-Id: <200412172030.04831.oliver@neukum.org> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12827 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: oliver@neukum.org Precedence: bulk X-list: netdev > But the difference between you example (cp /dev/zero /dev/mem) and > passing unchecked data to the kernel is... you _can_ check the data and This is the difference: static int open_port(struct inode * inode, struct file * filp) { return capable(CAP_SYS_RAWIO) ? 
0 : -EPERM; } (from mem.c) Regards Oliver From tom@dbservice.com Fri Dec 17 11:35:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:35:22 -0800 (PST) Received: from matterhorn.neopsis.com (neopsis.com [213.239.204.14]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJYsd5009892 for ; Fri, 17 Dec 2004 11:35:15 -0800 Received: from [192.168.0.11] (wan_emmen [62.65.141.13]) by matterhorn.dbservice.com (Postfix) with ESMTP id A35479AB3; Fri, 17 Dec 2004 20:33:06 +0100 (MET) Message-ID: <41C335FA.2050009@dbservice.com> Date: Fri, 17 Dec 2004 20:39:38 +0100 From: Tomas Carnecky User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Oliver Neukum Cc: linux-os@analogic.com, Bill Davidsen , James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel References: <41C26DD1.7070006@trash.net> <41C330F7.4000806@dbservice.com> <200412172030.04831.oliver@neukum.org> In-Reply-To: <200412172030.04831.oliver@neukum.org> X-Enigmail-Version: 0.89.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit X-Neopsis-MailScanner-Information: Please contact the ISP for more information X-Neopsis-MailScanner: Found to be clean X-MailScanner-From: tom@dbservice.com X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12828 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tom@dbservice.com Precedence: bulk X-list: netdev Oliver Neukum wrote: >>But the difference between you example (cp /dev/zero /dev/mem) and >>passing unchecked data to the kernel is... 
you _can_ check the data and > > > This is the difference: > static int open_port(struct inode * inode, struct file * filp) > { > return capable(CAP_SYS_RAWIO) ? 0 : -EPERM; > } > (from mem.c) > OK, but my point was, whenever you can check the 'contents' of the data passed to the kernel, do it. You can't check if the data someone writes to /dev/mem is valid or not, but you can check for out-of-range/etc. data in ioctl & friends. tom From davem@davemloft.net Fri Dec 17 11:35:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:36:02 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJZYIQ009958 for ; Fri, 17 Dec 2004 11:35:55 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CfNni-00076Z-00; Fri, 17 Dec 2004 11:30:06 -0800 Date: Fri, 17 Dec 2004 11:30:06 -0800 From: "David S. 
Miller" To: Tomas Carnecky Cc: jmorris@redhat.com, kaber@trash.net, bryan@coverity.com, netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel Message-Id: <20041217113006.3cbae2ba.davem@davemloft.net> In-Reply-To: <41C334DF.107@dbservice.com> References: <41C2DCBC.1080302@dbservice.com> <20041217111634.740d4d46.davem@davemloft.net> <41C334DF.107@dbservice.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12829 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 17 Dec 2004 20:34:55 +0100 Tomas Carnecky wrote: > > It is already checked in do_ip6t_set_ctl(). Otherwise anyone could > > replace iptables rules :) > For me it seems that only CAP_NET_ADMIN is checked and not the data. If that's the case then I agree with you Tomas. From hno@marasystems.com Fri Dec 17 11:49:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 11:50:04 -0800 (PST) Received: from filer.marasystems.com (marasystems.com [83.241.133.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHJnZJF011154 for ; Fri, 17 Dec 2004 11:49:56 -0800 Received: from localhost (henrik@localhost) by filer.marasystems.com (8.11.6/8.11.6) with ESMTP id iBHJmsv18771; Fri, 17 Dec 2004 20:49:00 +0100 Date: Fri, 17 Dec 2004 20:48:54 +0100 (CET) From: Henrik Nordstrom To: "David S. 
Miller" cc: Andrea G Forte , hasso@estpak.ee, laforge@gnumonks.org, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses In-Reply-To: <20041217112025.27688eb6.davem@davemloft.net> Message-ID: References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12830 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hno@marasystems.com Precedence: bulk X-list: netdev On Fri, 17 Dec 2004, David S. Miller wrote: > By definition, a secondary IP address on an interface is not to be used > as a source. But you can, by adding a route with such an address as source, or by applications explicitly binding to this source address. And it is highly useful to be able to use different source addresses in the same subnet for different purposes. > It is the whole reason for the distinction between primary and secondary > IP addresses, and it is why all secondaries are deleted once the primary > is removed (because there are no valid source addresses to choose from > any longer, therefore IP valid communications are no longer possible). Which is a false assumption in very many situations. 
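A minimal sketch of the second case above, an application explicitly binding to a chosen source address before it sends anything. The helper name udp_socket_from() is hypothetical, not something from this thread, and the address passed in stands for whatever secondary address is configured on the interface:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket whose source address is fixed to src_ip
 * (hypothetical helper, for illustration only). Port 0 lets the
 * kernel pick an ephemeral port. Returns the socket, or -1 on failure.
 */
int udp_socket_from(const char *src_ip)
{
	struct sockaddr_in src;
	int fd;

	memset(&src, 0, sizeof(src));
	src.sin_family = AF_INET;
	src.sin_port = htons(0);
	if (inet_pton(AF_INET, src_ip, &src.sin_addr) != 1)
		return -1;	/* not a valid dotted-quad address */

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	/* bind() fails if src_ip is not configured on this host */
	if (bind(fd, (struct sockaddr *)&src, sizeof(src)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Anything sent on the returned socket then carries src_ip as its source, whichever address on the interface is the primary.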
Regards Henrik From dave@thedillows.org Fri Dec 17 12:07:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 12:08:00 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBHK7Whn014578 for ; Fri, 17 Dec 2004 12:07:53 -0800 Received: (qmail 31458 invoked by uid 0); 17 Dec 2004 20:07:31 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp2.knology.net with SMTP; 17 Dec 2004 20:07:31 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBHK76oe004833; Fri, 17 Dec 2004 15:07:06 -0500 Received: (from il1@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBHK76tk004832; Fri, 17 Dec 2004 15:07:06 -0500 X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f Subject: [BK netdev-2.6] Update Typhoon firmware From: David Dillow To: Jeff Garzik Cc: Netdev Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Fri, 17 Dec 2004 15:07:05 -0500 Message-Id: <1103314025.4217.1.camel@ori.thedillows.org> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12831 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev Jeff, please do a bk pull http://typhoon.bkbits.net/typhoon-2.6-firmware This will update the following files: drivers/net/typhoon-firmware.h | 5568 ++++++++++++++++++----------------------- drivers/net/typhoon.c | 13 2 files changed, 2533 insertions(+), 3048 deletions(-) through these ChangeSets: (04/12/17 1.2099) Update the Typhoon firmware to the latest version from 3Com. 
Due to the size of the patch (~400K), I've put it up at http://www.thedillows.org/typhoon-firmware-03.001.008.patch.bz2 -- David Dillow From hasso@estpak.ee Fri Dec 17 12:55:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 12:56:03 -0800 (PST) Received: from arena (dream.estpak.ee [194.126.115.147]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHKtZG3029080 for ; Fri, 17 Dec 2004 12:55:56 -0800 Received: from arena ([127.0.0.1] ident=hasso) by arena with esmtp (Exim 3.36 #1 (Debian)) id 1CfP8F-0000fM-00; Fri, 17 Dec 2004 22:55:23 +0200 From: Hasso Tepper To: Henrik Nordstrom Subject: Re: primary and secondary ip addresses Date: Fri, 17 Dec 2004 22:55:21 +0200 User-Agent: KMail/1.7.2 Cc: "Martin A. Brown" , linux-net@vger.kernel.org, netdev@oss.sgi.com References: <41912F7A.6000408@redhat.com> <200412172053.28016.hasso@estpak.ee> In-Reply-To: Organization: Elion Enterprises Ltd. MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412172255.21316.hasso@estpak.ee> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12832 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hasso@estpak.ee Precedence: bulk X-list: netdev Henrik Nordstrom wrote: > On Fri, 17 Dec 2004, Hasso Tepper wrote: > > All these tricks don't help if you are using dynamic routing. > > Are you seriously saying you are doing dynamic routing for your locally > attached lans? > > source address assignment for routed traffic via gateways is > automatically derived by the source address assignment for traffic > addressed to the gateway itself. So? Router learns via rip that default route should go via 10.0.0.1 and it has 10.0.0.2/24 and 10.0.0.3/24 addresses on eth0. 
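One way to see which source address the kernel would pick in a setup like this is to connect() a UDP socket to the destination and read the local address back with getsockname(); no packet is sent. A rough sketch, where kernel_source_for() is a hypothetical helper name and port 9 is an arbitrary choice:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Ask the kernel which source address it would select for dst_ip
 * (hypothetical helper, for illustration only). connect() on a UDP
 * socket only does the route lookup and fixes the local address,
 * which getsockname() then reports. The dotted-quad result is written
 * to out, which needs room for INET_ADDRSTRLEN bytes.
 * Returns 0 on success, -1 on failure.
 */
int kernel_source_for(const char *dst_ip, char *out, socklen_t outlen)
{
	struct sockaddr_in dst, src;
	socklen_t len = sizeof(src);
	int fd, ret = -1;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(9);	/* arbitrary; nothing is transmitted */
	if (inet_pton(AF_INET, dst_ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0 &&
	    getsockname(fd, (struct sockaddr *)&src, &len) == 0 &&
	    inet_ntop(AF_INET, &src.sin_addr, out, outlen) != NULL)
		ret = 0;

	close(fd);
	return ret;
}
```

Running this against a destination reached through the learned default route shows whether 10.0.0.2 or 10.0.0.3 ends up as the source.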
> So even with routing protocols etc you > can control the source address assignment simply by setting up routing to > use the correct source address to speak to your gateways, the added > routes will then inherit the intended source address. No. It would be true in very trivial case only - ie. "use address I'm using for announcing these routes as gateway for these routes". All routing protocols can carry nexthop information. Moreover, addresses routing protocols use for transport don't have to have anything common with addresses used for routing. You can even carry IPv4 routing info with routing protocol using IPv6 for transport (ospfv3) or iso (is-is). -- Hasso Tepper Elion Enterprises Ltd. WAN administrator From andreaf@cs.columbia.edu Fri Dec 17 12:56:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 12:56:17 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHKtkUG029086 for ; Fri, 17 Dec 2004 12:56:07 -0800 Received: from lion.cs.columbia.edu (IDENT:80UUmpDXpt87EbRGIUhGCHIiY7r9jjcv@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iBHKsbe9004613 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT); Fri, 17 Dec 2004 15:54:37 -0500 (EST) Received: from [128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iBHKsZK8025706; Fri, 17 Dec 2004 15:54:35 -0500 Message-ID: <41C34788.4070203@cs.columbia.edu> Date: Fri, 17 Dec 2004 15:54:32 -0500 From: Andrea G Forte User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Martin A. 
Brown" CC: Hasso Tepper , Henrik Nordstrom , Harald Welte , Neil Horman , linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses References: <41912F7A.6000408@redhat.com> <41C30212.6000906@cs.columbia.edu> <200412172003.22319.hasso@estpak.ee> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.17.11 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0' X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12833 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andreaf@cs.columbia.edu Precedence: bulk X-list: netdev Martin A. Brown wrote: >Hello Hasso and Andrea, > >We've gotten a little far afield from Neil Horman's initial question about >why there are primary and secondary IPs, and I can't address your concern >Andrea about the (route cache?) 500ms latency between the time that an >address is added (or removed) from an interface and the time that the >address is actually used. Even so, the Linux routing code allows the >kernel to suggest an IP with the "src" keyword. > > > I apologize if I moved the topic too far from its original scope. > : > Why change the primary address? What is wrong with simply changing the > : > route to use the other source IP? > : > : There is no support for it in most of user space software. > : None of the routing protocols suites support it etc. > >Though some software provides support for explicit configuration of source >address for initiated sockets, you can use INADDR_ANY and let the kernel >perform source address selection for you. > >Linux select an IP based on the routing table. 
[0] Example: > > # ip route show 192.168.90.0/24 > 192.168.90.0/24 dev eth0 scope link src 192.168.90.250 > # ip route change 192.168.88.0/24 dev eth0 scope link src $SECONDARY > >If you want to be fancy about it, you can have a higher preference routing >table (make sure there's an entry in /etc/iproute2/rt_tables for >$SECONDARY_TABLE). Then you can add and remove tables in this routing >table instead of changing the route in the main routing table. > > # ip rule add prio table $SECONDARY_TABLE > # ip route add table $SECONDARY_TABLE $DESTNET dev $REALDEV src $SECONDARY > > > I will give it a try and see if I get any improvement. However, as last question (after this I will stop moving this topic further away from its original scope), do you guys know where I can find the code related to the routing cache in the kernel? I would like to see if there is a kind of timer for the update of such a cache. Thank you again for all of your help. Andrea >Best of luck! > >-Martin > > [0] http://linux-ip.net/gl/ip-cref/node155.html > > > From shemminger@osdl.org Fri Dec 17 13:08:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 13:08:17 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHL7k7d030978 for ; Fri, 17 Dec 2004 13:08:07 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iBHL7G908257; Fri, 17 Dec 2004 13:07:16 -0800 Date: Fri, 17 Dec 2004 13:07:16 -0800 From: Stephen Hemminger To: Andrew Morton , cliff white , netdev@oss.sgi.com Subject: 2.6.10-rc3-mm1 FIONREAD breakge Message-Id: <20041217130716.1464a068@dxpl.pdx.osdl.net> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j 
on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12834 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev

Latest -mm kernel broke FIONREAD on UDP sockets. This stopped STP in its tracks because names then don't resolve. The following works on 2.6.10-rc3 but breaks on 2.6.10-rc3-mm1

=====================================
/* Did FIONREAD break on UDP? */
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(int argc, char **argv)
{
	int s;
	static struct sockaddr_in sin = { AF_INET };
	socklen_t len;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (bind(s, (struct sockaddr *) &sin, sizeof(sin)) < 0)
		perror("bind");

	/* report the ephemeral port the kernel picked */
	len = sizeof(sin);
	getsockname(s, (struct sockaddr *) &sin, &len);
	printf("Bound to %s.%d\n", inet_ntoa(sin.sin_addr), ntohs(sin.sin_port));

	for (;;) {
		int count = 0;	/* FIONREAD stores an int */

		fflush(stdout);
		if (ioctl(s, FIONREAD, &count) < 0)
			perror("ioctl(FIONREAD)");
		printf("nread = %d\n", count);
		sleep(5);
	}
}

From akpm@osdl.org Fri Dec 17 13:28:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 13:28:10 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHLRgIb032328 for ; Fri, 17 Dec 2004 13:28:03 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iBHLRA911902; Fri, 17 Dec 2004 13:27:10 -0800 Date: Fri, 17 Dec 2004 13:26:48 -0800 From: Andrew Morton To: Stephen Hemminger Cc: akpm@digeo.com, cliffw@osdl.org, netdev@oss.sgi.com Subject: Re: 2.6.10-rc3-mm1 FIONREAD breakge Message-Id: <20041217132648.41f3c97b.akpm@osdl.org> In-Reply-To: <20041217130716.1464a068@dxpl.pdx.osdl.net> References: <20041217130716.1464a068@dxpl.pdx.osdl.net> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: 
ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12835 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev

Stephen Hemminger wrote:
>
> Latest -mm kernel broke FIONREAD on UDP sockets.

yup, you'll need this:

--- 25/fs/ioctl.c~ioctl-cleanups-broke-fionread-et-al	2004-12-13 11:12:37.687951760 -0800
+++ 25-akpm/fs/ioctl.c	2004-12-13 11:12:37.690951304 -0800
@@ -91,10 +91,8 @@ asmlinkage long sys_ioctl(unsigned int f
 			int block;
 			int res;
 
-			if (!S_ISREG(inode->i_mode)) {
-				error = -ENOTTY;
-				goto done;
-			}
+			if (!S_ISREG(inode->i_mode))
+				break;
 			/* do we support this mess? */
 			if (!mapping->a_ops->bmap) {
 				error = -EINVAL;
@@ -112,19 +110,15 @@ asmlinkage long sys_ioctl(unsigned int f
 			goto done;
 		}
 		case FIGETBSZ:
-			if (!S_ISREG(inode->i_mode)) {
-				error = -ENOTTY;
-				goto done;
-			}
+			if (!S_ISREG(inode->i_mode))
+				break;
 			error = -EBADF;
 			if (inode->i_sb)
 				error = put_user(inode->i_sb->s_blocksize, p);
 			goto done;
 		case FIONREAD:
-			if (!S_ISREG(inode->i_mode)) {
-				error = -ENOTTY;
-				goto done;
-			}
+			if (!S_ISREG(inode->i_mode))
+				break;
 			error = put_user(i_size_read(inode) - filp->f_pos, p);
 			goto done;
 	}
_

From roland@topspin.com Fri Dec 17 13:58:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 13:58:39 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHLw3pi001072 for ; Fri, 17 Dec 2004 13:58:24 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Fri, 17 Dec 2004 13:57:40 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Fri, 17 Dec 2004 13:57:40 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CfQ6W-000135-1f; Fri, 17 Dec 2004 13:57:40 
-0800 To: netdev@oss.sgi.com, davem@redhat.com Cc: openib-general@openib.org X-Message-Flag: Warning: May contain useful information From: Roland Dreier Date: Fri, 17 Dec 2004 13:57:40 -0800 Message-ID: <52llbwoaej.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 17 Dec 2004 21:57:40.0508 (UTC) FILETIME=[6DAC65C0:01C4E483] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12836 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev

While testing my IP-over-InfiniBand driver, I discovered that if a net device sets NETIF_F_LLTX, it seems the device's hard_start_xmit method can be called even after a netif_stop_queue(). This is because in the LLTX case, qdisc_restart() holds no locks while calling hard_start_xmit, so something like the following can happen:

    CPU 1                               CPU 2

    qdisc_restart:
      drop queue lock
      call hard_start_xmit()
        net driver:
          acquire TX lock
          queue packet to HW
                                        acquire queue lock...
                                        qdisc_restart:
                                          drop queue lock
                                          call hard_start_xmit:
          queue full,
            call netif_stop_queue()
          release TX lock
                                            net driver:
                                              acquire TX lock
                                              queue is already full!

Is my understanding correct? If so it seems the patch below would make sense. (e1000 seems to handle this properly already)

Thanks,
  Roland

Since tg3 and sungem now use lockless TX (NETIF_F_LLTX), it's possible for their hard_start_xmit method to be called even after they call netif_stop_queue. 
Therefore a full queue no longer indicates a bug -- this patch fixes the comment and removes the KERN_ERR printk.

Signed-off-by: Roland Dreier

Index: linux-bk/drivers/net/sungem.c
===================================================================
--- linux-bk.orig/drivers/net/sungem.c	2004-12-16 15:56:19.000000000 -0800
+++ linux-bk/drivers/net/sungem.c	2004-12-17 13:46:43.307064457 -0800
@@ -976,12 +976,10 @@
 		return NETDEV_TX_LOCKED;
 	}
 
-	/* This is a hard error, log it. */
+	/* This may happen, since we have NETIF_F_LLTX set */
 	if (TX_BUFFS_AVAIL(gp) <= (skb_shinfo(skb)->nr_frags + 1)) {
 		netif_stop_queue(dev);
 		spin_unlock_irqrestore(&gp->tx_lock, flags);
-		printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
-		       dev->name);
 		return NETDEV_TX_BUSY;
 	}
 
Index: linux-bk/drivers/net/tg3.c
===================================================================
--- linux-bk.orig/drivers/net/tg3.c	2004-12-16 15:56:06.000000000 -0800
+++ linux-bk/drivers/net/tg3.c	2004-12-17 13:46:25.952622672 -0800
@@ -3076,12 +3076,10 @@
 		return NETDEV_TX_LOCKED;
 	}
 
-	/* This is a hard error, log it. */
+	/* This may happen, since we have NETIF_F_LLTX set */
 	if (unlikely(TX_BUFFS_AVAIL(tp) <= (skb_shinfo(skb)->nr_frags + 1))) {
 		netif_stop_queue(dev);
 		spin_unlock_irqrestore(&tp->tx_lock, flags);
-		printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
-		       dev->name);
 		return NETDEV_TX_BUSY;
 	}
 

From jdmason@us.ibm.com Fri Dec 17 14:14:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 14:14:58 -0800 (PST) Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.130]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBHMEJBc001944 for ; Fri, 17 Dec 2004 14:14:50 -0800 Received: from westrelay03.boulder.ibm.com (westrelay03.boulder.ibm.com [9.17.195.12]) by e32.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id iBHMDoFJ827322 for ; Fri, 17 Dec 2004 17:13:50 -0500 Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by westrelay03.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBHMDoL1215134 for ; Fri, 17 Dec 2004 15:13:50 -0700 Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBHMDo0G021295 for ; Fri, 17 Dec 2004 15:13:50 -0700 Received: from dreadnought.austin.ibm.com (dreadnought.austin.ibm.com [9.41.94.123]) by d03av01.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBHMDoCC021289; Fri, 17 Dec 2004 15:13:50 -0700 From: Jon Mason Organization: IBM To: jgarzik@pobox.com Subject: No Maintainer entry for dl2k driver Date: Fri, 17 Dec 2004 16:13:43 -0600 User-Agent: KMail/1.7 Cc: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200412171613.43330.jdmason@us.ibm.com> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12837 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jdmason@us.ibm.com Precedence: bulk X-list: netdev

It has come to my attention that there is no entry in the kernel MAINTAINERS file for the D-Link Gigabit Ethernet Driver (and possibly no maintainer). 
So, I have created a patch for that file with an entry for the driver status as Orphaned. --- MAINTAINERS.orig 2004-12-17 16:01:37.955336376 -0600 +++ MAINTAINERS 2004-12-17 16:07:21.027181528 -0600 @@ -704,6 +704,10 @@ M: mvw@planets.elm.net L: linux-kernel@vger.kernel.org S: Maintained +DL2K NETWORK DRIVER +L: linux-net@vger.kernel.org +S: Orphan + DAVICOM FAST ETHERNET (DMFE) NETWORK DRIVER P: Tobias Ringstrom M: tori@unhappy.mine.nu -- Jon Mason jdmason@us.ibm.com From ccaputo-dated-1105918335.00e26c@alt.net Fri Dec 17 15:32:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 15:33:06 -0800 (PST) Received: from nacho.alt.net (nacho.alt.net [207.14.113.18]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBHNWc7n006434 for ; Fri, 17 Dec 2004 15:32:58 -0800 Received: (qmail 13977 invoked by uid 500); 17 Dec 2004 23:32:15 -0000 Received: by nacho.alt.net (tmda-sendmail, from uid 500); Fri, 17 Dec 2004 15:32:15 -0800 (PST) Date: Fri, 17 Dec 2004 15:32:13 -0800 (PST) To: netdev@oss.sgi.com Subject: NMI watchdog lockups with e1000 in 2.6.9 / 2.6.9-ac14 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII From: Chris Caputo X-Delivery-Agent: TMDA/1.0.2 (Bold Forbes) X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12838 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ccaputo@alt.net Precedence: bulk X-list: netdev Is using the NMI Watchdog with the e1000 driver a bad thing? I ask because I have seen NMI "LOCKUP"s (below). I am guessing they are caused by mdelay/udelay calls. If there is something I should try to help narrow this down, please let me know. Chris --- 2.6.9: swapper: page allocation failure. 
order:0, mode:0x20 [] __alloc_pages+0x1b9/0x35e [] __get_free_pages+0x25/0x3f [] kmem_getpages+0x21/0xc9 [] cache_grow+0xab/0x14d [] cache_alloc_refill+0x174/0x219 [] __kmalloc+0x85/0x8c [] alloc_skb+0x47/0xe0 [] e1000_alloc_rx_buffers+0x44/0xe3 [] e1000_clean_rx_irq+0x18e/0x447 [] __mod_timer+0xeb/0x12a [] e1000_clean+0x51/0xca [] net_rx_action+0x77/0xf6 [] __do_softirq+0xb7/0xc6 <4>NMI Watchdog detected LOCKUP on CPU2, eip c010f334, registers: printk: 31 messages suppressed. ntpd: page allocation failure. order:0, mode:0x20 Stack pointer is garbage, not printing trace Modules linked in: CPU: 2 EIP: 0060:[] Not tainted VLI EFLAGS: 00000097 (2.6.9) EIP is at delay_tsc+0xd/0x15 eax: d5b4a6f8 ebx: 00000be8 ecx: d5b4a4cc edx: 000276b1 esi: 0000266d edi: c0459560 ebp: c0442371 esp: c0431d58 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c0431000 task=c3190af0) Stack: 00000000 c025318e 00000be8 c027a93b 00000be8 00000005 0000000a 00000024 0000000d 00000025 c03bd2e0 00000025 0000686d 00000004 c0119fd0 c03bd2e0 c044234d 00000025 00006892 000078d3 0000686d c011a0dd 0000686d 00006892 Call Trace: Stack pointer is garbage, not printing trace Code: af fe 89 04 24 01 f9 8b 04 24 8d 1c 0b 89 5c 24 04 8b 54 24 04 0f ac d0 0a c1 ea 0a eb b2 53 8b 5c 24 08 0f 31 89 c1 f3 90 0f 31 <29> c8 39 d8 72 f6 5b c3 55 b8 48 7d 37 c0 57 56 53 83 ec 24 8b console shuts up ... And with 2.6.9-ac14: swapper: page allocation failure. 
order:0, mode:0x20 [] __alloc_pages+0x1b9/0x35e [] __get_free_pages+0x25/0x3f [] kmem_getpages+0x21/0xc9 [] cache_grow+0xab/0x14d [] cache_alloc_refill+0x174/0x219 [] __kmalloc+0x85/0x8c [] alloc_skb+0x47/0xe0 [] e1000_alloc_rx_buffers+0x44/0xe3 [] e1000_clean_rx_irq+0x18e/0x447 [] __kfree_skb+0x55/0xbd [] e1000_clean+0x51/0xca [] net_rx_action+0x77/0xf6 [] __do_softirq+0xb7/0x<4>NMI Watchdog detected LOCKUP on CPU2, eip c02538a6, registers: Modules linked in: CPU: 2 EIP: 0060:[] Not tainted VLI EFLAGS: 00000002 (2.6.9-ac14) EIP is at __const_udelay+0x1b/0x39 eax: 00000000 ebx: 002e8000 ecx: 0000431c edx: 00000200 esi: 0000262e edi: c045b560 ebp: c0443cd4 esp: c0433d60 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c0433000 task=c3190af0) Stack: 00000000 c027b049 000010c7 00000005 00000078 00000022 0000000d 00000025 c03bf300 00000025 000161d2 00000004 c011a34c c03bf300 c0443cb2 00000025 000161f7 000167df 000161d2 c011a459 000161d2 000161f7 00000004 34ffffff Call Trace: Stack pointer is garbage, not printing trace Code: e2 89 d1 83 c1 01 89 4c 24 08 5b e9 71 ff ff ff 53 ba 00 f0 ff ff 21 e2 8b 52 10 8b 4c 24 08 c1 e2 08 c1 e1 02 8b 9a 88 0a 40 c0 <89> c8 69 db <4>printk: 38 messages suppressed. syslog-ng: page allocation failure. order:0, mode:0x20 Stack pointer is garbage, not printing trace fa 00 00 00 89 da f7 e2 89 d1 83 c1 01 89 4c 24 08 console shuts up ... 
From xose@wanadoo.es Fri Dec 17 16:22:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 16:22:20 -0800 (PST) Received: from smtp1.jazztel.es (smtp1.jazztel.es [62.14.3.161]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBI0Lfno024224 for ; Fri, 17 Dec 2004 16:22:02 -0800 Received: from antivirus by smtp1.jazztel.es with antivirus id 1CfSLY-0002Ik-00 Sat, 18 Dec 2004 01:21:20 +0100 Received: from [212.106.244.34] (helo=wanadoo.es) by smtp1.jazztel.es with esmtp id 1CfSLY-0002I8-00 Sat, 18 Dec 2004 01:21:20 +0100 Message-ID: <41C377E9.1040305@wanadoo.es> Date: Sat, 18 Dec 2004 01:20:57 +0100 From: Xose Vazquez Perez User-Agent: Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.4.3) Gecko/20041005 X-Accept-Language: gl, es, en MIME-Version: 1.0 To: netdev@oss.sgi.com, jdmason@us.ibm.com Subject: Re: No Maintainer entry for dl2k driver Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by antivirus X-Virus-Status: Clean X-archive-position: 12839 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xose@wanadoo.es Precedence: bulk X-list: netdev Jon Mason wrote: > It has come to my attention that there is no entry in the kernel MAINTAINERS > file for the D-Link Gigabit Ethernet Driver (and possibly no maintainer). > So, I have created a patch for that file with an entry for the driver status > as Orphaned. Edward Peng is the current maintainer. The 2.4 kernel already has latest release, and 'I think' he is working on the 2.6 driver. 
From xose@wanadoo.es Fri Dec 17 16:42:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 16:42:16 -0800 (PST) Received: from smtp2.jazztel.es (smtp2.jazztel.es [62.14.3.162]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBI0fldb025208 for ; Fri, 17 Dec 2004 16:42:08 -0800 Received: from antivirus by smtp2.jazztel.es with antivirus id 1CfSeo-00043W-00 Sat, 18 Dec 2004 01:41:14 +0100 Received: from [212.106.244.34] (helo=wanadoo.es) by smtp2.jazztel.es with esmtp id 1CfSeo-000438-00 Sat, 18 Dec 2004 01:41:14 +0100 Message-ID: <41C37CAA.3030608@wanadoo.es> Date: Sat, 18 Dec 2004 01:41:14 +0100 From: Xose Vazquez Perez User-Agent: Mozilla/5.0 (X11; U; Linux i686; es-ES; rv:1.4.3) Gecko/20041005 X-Accept-Language: gl, es, en MIME-Version: 1.0 To: marcelo.tosatti@cyclades.com, netdev@oss.sgi.com Subject: Re: Ax88190 specs (was 8390 specs) Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by antivirus X-Virus-Status: Clean X-archive-position: 12840 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: xose@wanadoo.es Precedence: bulk X-list: netdev Tosatti wrote: > I'm having a problem with a Linksys EtherFast card (driven by axnet_cs), > where the interrupt status reports ENISR_RX_ERR (receive error) > under load. > > I would like to know more details about this status bit, > what can causes it to be turned on, etc. > > Do you know where I can find 8390 specs? > > 8390.c mentions > Sources: > The National Semiconductor LAN Databook, and the 3Com 3c503 databook. > > But I can't find those available in either Google search or the > National Semiconductor website. > > Any information or pointers are welcome, thanks! Maybe these http://www.asix.com.tw/download/Ax88190.pdf http://www.asix.com.tw/download/Ax88190a.pdf can help you. 
More docs at -> http://www.asix.com.tw/download_datasheet.htm regards, -- TLOZ OOT: worse than drugs. From bunk@stusta.de Fri Dec 17 16:51:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 16:51:25 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBI0owql025892 for ; Fri, 17 Dec 2004 16:51:19 -0800 Received: (qmail 19175 invoked from network); 18 Dec 2004 00:50:29 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 18 Dec 2004 00:50:29 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id A8FDCBBEAF; Sat, 18 Dec 2004 01:50:27 +0100 (CET) Date: Sat, 18 Dec 2004 01:50:27 +0100 From: Adrian Bunk To: Steven Whitehouse Cc: patrick@tykepenguin.com, Steve Whitehouse , linux-decnet-user@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/decnet/: misc possible cleanups Message-ID: <20041218005027.GC21288@stusta.de> References: <20041214125838.GC23151@stusta.de> <20041214133235.GB10131@souterrain.chygwyn.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041214133235.GB10131@souterrain.chygwyn.com> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12841 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev On Tue, Dec 14, 2004 at 01:32:35PM +0000, Steven Whitehouse wrote: > Hi, Hi Steven, >... > Also, when I was writing the routing code - a lot of the design was "borrowed" > from the ipv4 routing code. 
It might be worth doing a comparison to see where > the two have diverged (something I used to do now and again) to pick up any > bugs I'd inadvertently copied over, if you are working on clean ups in this > area, unfortunately, I'm not working especially in this area. I'm currently going through the complete kernel sources searching for global code that can be made static or even removed. > Steve. cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed From bunk@stusta.de Fri Dec 17 17:01:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 17:01:23 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBI10s8T026594 for ; Fri, 17 Dec 2004 17:01:15 -0800 Received: (qmail 19857 invoked from network); 18 Dec 2004 01:00:26 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 18 Dec 2004 01:00:26 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 64DD4BBEAF; Sat, 18 Dec 2004 02:00:25 +0100 (CET) Date: Sat, 18 Dec 2004 02:00:25 +0100 From: Adrian Bunk To: Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: Re: [2.6 patch] net/sched/: possible cleanups Message-ID: <20041218010024.GD21288@stusta.de> References: <20041215012754.GH12937@stusta.de> <41BF9A95.5050902@conectiva.com.br> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41BF9A95.5050902@conectiva.com.br> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12842 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev On Tue, 
Dec 14, 2004 at 11:59:49PM -0200, Arnaldo Carvalho de Melo wrote: > > Adrian Bunk wrote: > >The patch below contains the following possible cleanups: > >- make some needlessly global code static > >- sch_htb.c: #undef HTB_DEBUG > > > > > >diffstat output: > > include/net/act_api.h | 3 --- > > Adrian, may I suggest that you post the networking related patches > only to netdev? Until now I thought it's never a bad idea to Cc linux-kernel on any patches. Is there a specific reason why you consider this being a bad thing (well, bandwidth shouldn't be that much of an issue considering how high-volume linux-kernel is...)? > - Arnaldo cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed From acme@conectiva.com.br Fri Dec 17 20:14:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 20:15:04 -0800 (PST) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBI4EUoN000680 for ; Fri, 17 Dec 2004 20:14:51 -0800 Received: by perninha.conectiva.com.br (Postfix, from userid 568) id BF91D4731F; Sat, 18 Dec 2004 02:14:05 -0200 (BRST) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id 34D4747495 for ; Sat, 18 Dec 2004 02:14:05 -0200 (BRST) Received: (qmail 1072 invoked by uid 0); 18 Dec 2004 05:10:21 -0000 Received: from mapi8.distro.conectiva (HELO oops.ghostprotocols.net) (10.0.16.10) by burns.conectiva with SMTP; 18 Dec 2004 05:10:21 -0000 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 1B0321463D; Sat, 18 Dec 2004 02:14:01 -0200 (BRST) Message-ID: <41C3AF1C.2010103@conectiva.com.br> Date: Sat, 18 Dec 2004 02:16:28 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A.
User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Adrian Bunk Cc: netdev@oss.sgi.com Subject: Re: [2.6 patch] net/sched/: possible cleanups References: <20041215012754.GH12937@stusta.de> <41BF9A95.5050902@conectiva.com.br> <20041218010024.GD21288@stusta.de> In-Reply-To: <20041218010024.GD21288@stusta.de> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Bogosity: No, tests=bogofilter, spamicity=0.120640, version=0.16.3 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12843 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Adrian Bunk wrote: > On Tue, Dec 14, 2004 at 11:59:49PM -0200, Arnaldo Carvalho de Melo wrote: > >>Adrian Bunk wrote: >> >>>The patch below contains the following possible cleanups: >>>- make some needlessly global code static >>>- sch_htb.c: #undef HTB_DEBUG >>> >>> >>>diffstat output: >>>include/net/act_api.h | 3 --- >> >>Adrian, may I suggest that you post the networking related patches >>only to netdev? > > > Until now I thought it's never a bad idea to Cc linux-kernel on any > patches. Is there a specific reason why you consider this being a bad > thing (well, bandwidth shouldn't be that much of an issue considering how > high-volume linux-kernel is...)? I mentioned that when making the same request to another person one or two days ago: all the networking hackers I know are subscribed to netdev, but several of them aren't subscribed to linux-kernel.
- Arnaldo From davem@davemloft.net Fri Dec 17 21:50:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 21:50:35 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBI5o533007069 for ; Fri, 17 Dec 2004 21:50:27 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CfXOK-0008CI-00; Fri, 17 Dec 2004 21:44:32 -0800 Date: Fri, 17 Dec 2004 21:44:32 -0800 From: "David S. Miller" To: Roland Dreier Cc: netdev@oss.sgi.com, openib-general@openib.org Subject: Re: LLTX and netif_stop_queue Message-Id: <20041217214432.07b7b21e.davem@davemloft.net> In-Reply-To: <52llbwoaej.fsf@topspin.com> References: <52llbwoaej.fsf@topspin.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12844 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 17 Dec 2004 13:57:40 -0800 Roland Dreier wrote: > While testing my IP-over-InfiniBand driver, I discovered that if a net > device sets NETIF_F_LLTX, it seems the device's hard_start_xmit method > can be called even after a netif_stop_queue(). > > This is because in the LLTX case, qdisc_restart() holds no locks while > calling hard_start_xmit, so something like the following can happen: ... 
> if (TX_BUFFS_AVAIL(gp) <= (skb_shinfo(skb)->nr_frags + 1)) { > netif_stop_queue(dev); > spin_unlock_irqrestore(&gp->tx_lock, flags); > - printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", > - dev->name); > return NETDEV_TX_BUSY; > } I understand the bug, but we're not going to fix it this way. This is a crucial invariant that we need to check for because it indicates a pretty serious state error except in this bug case you've discovered. Perhaps one way to fix this is to add a pointer to a spinlock to the netdev struct, and have the upper level grab that lock, when NETIF_F_LLTX is set, while doing queue state checks. Actually, that could end up being racy too. If we can't find a good fix for this, besides removing the necessary debugging message, we might have to pull NETIF_F_LLTX out or disable it temporarily until we figure out a way. From herbert@gondor.apana.org.au Fri Dec 17 23:27:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 23:27:54 -0800 (PST) Received: from arnor.apana.org.au (c211-30-229-77.rivrw4.nsw.optusnet.com.au [211.30.229.77]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBI7ROKN031150 for ; Fri, 17 Dec 2004 23:27:44 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1CfYyw-0005aB-00; Sat, 18 Dec 2004 18:26:26 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1CfYwb-0005LX-00; Sat, 18 Dec 2004 18:24:01 +1100 Date: Sat, 18 Dec 2004 18:24:01 +1100 To: Jeff Garzik Cc: Netdev , Arnaldo Carvalho de Melo , YOSHIFUJI Hideaki / ???????????? , "David S.
Miller" Subject: Re: Badness in dst_release Message-ID: <20041218072401.GA20492@gondor.apana.org.au> References: <41C13D55.3070002@pobox.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="liOOAslEiF7prFVr" Content-Disposition: inline In-Reply-To: <41C13D55.3070002@pobox.com> User-Agent: Mutt/1.5.6+20040722i From: Herbert Xu X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12845 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev --liOOAslEiF7prFVr Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Thu, Dec 16, 2004 at 02:46:29AM -0500, Jeff Garzik wrote: > > After a week or more of uptime, the "Badness in dst_release" messages > started again. Kernel 2.6.10-rc3-bk2. The first message is below. Could you please check your kern.log for any abnormal messages prior to this badness? > IMO, if you want additional debugging code, add some non-intrusive > checks in the upstream kernel. Since this problem doesn't appear for a > while, additional checks would be useful. Here's a check which might help. Please apply it and see if it produces anything. BTW, there are still races left in addrconf.c. However, they are less likely to trigger compared to the one that's been fixed already. The races are similar in nature to it. For example, dad_complete may race with dad_failure to lead to the sequence ip6_del_rt/ip6_ins_rt. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt --liOOAslEiF7prFVr Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename=p ===== net/ipv6/route.c 1.102 vs edited ===== --- 1.102/net/ipv6/route.c 2004-11-30 14:24:46 +11:00 +++ edited/net/ipv6/route.c 2004-12-18 18:13:59 +11:00 @@ -379,6 +379,7 @@ int err; write_lock_bh(&rt6_lock); + WARN_ON(rt->u.dst->obsolete); err = fib6_add(&ip6_routing_table, rt, nlh, _rtattr); write_unlock_bh(&rt6_lock); --liOOAslEiF7prFVr-- From ak@suse.de Fri Dec 17 23:51:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 17 Dec 2004 23:51:46 -0800 (PST) Received: from Cantor.suse.de (news.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBI7pIGj001016 for ; Fri, 17 Dec 2004 23:51:39 -0800 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id E9844123644D; Sat, 18 Dec 2004 08:50:49 +0100 (CET) To: Crazy AMD K7 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: do_IRQ: stack overflow: 872.. References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> From: Andi Kleen Date: 18 Dec 2004 08:50:49 +0100 In-Reply-To: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> Message-ID: Lines: 27 User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3 MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12846 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev Crazy AMD K7 writes: > Hi! 
> I have found a few days ago strange messages in /var/log/messages > More than 10 times there was do_IRQ: stack overflow: (number).... followed > with code. If needed I can send all this data. I have run > ksymoops with only first 3 cases. Here is the first, the second and > the third are in attachment. > After those oopses my system continued to work. It's not really an oops, just a warning that stack space got quite tight. The problem seems to be that the br netfilter code is nesting far too deeply and recursing several times. Looks like a design bug to me, it shouldn't do that. > uname uname -a > Linux linux 2.4.28 #2 Втр Ноя 30 15:43:35 MSK 2004 i686 unknown > gcc -v > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs > gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113) > I have applied ebtables_brnf patch (http://bridge.sf.net) and a Don't do that then or contact the author to fix it. Unfortunately the code is also in 2.6 mainline. -Andi From ast@domdv.de Sat Dec 18 02:55:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 02:56:03 -0800 (PST) Received: from hermes.domdv.de (hermes.domdv.de [193.102.202.1]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIAtWft012004 for ; Sat, 18 Dec 2004 02:55:53 -0800 Received: from [10.1.1.3] (helo=castor.lan.domdv.de) by zeus.domdv.de with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.42) id 1CfcEq-0004Tn-0p; Sat, 18 Dec 2004 11:55:04 +0100 Received: from pcast2.lan.domdv.de ([10.1.9.234]) by castor.lan.domdv.de with esmtpa (Exim 4.42) id 1CfcEp-0000wQ-JY; Sat, 18 Dec 2004 11:55:04 +0100 Message-ID: <41C40C86.80707@domdv.de> Date: Sat, 18 Dec 2004 11:55:02 +0100 From: Andreas Steinmetz User-Agent: Mozilla Thunderbird 1.0 (X11/20041207) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: 2.4.27 network related Oopses X-Enigmail-Version: 0.89.5.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: multipart/mixed; boundary="------------050103020009060705080807"
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12847 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ast@domdv.de Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050103020009060705080807 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit [please CC me on replies as I'm not subscribed] Below are two somewhat similar Oopses from two different systems with an identical kernel configuration (attached) captured with netconsole. The kernel is a 2.4.27 with some patches. The Oopses happen not often (once every 4 to 8 weeks on average). They do have in common that they happen when there is lots of disk I/O and network I/O. ============================================================================ ksymoops 2.4.9 on i686 2.4.27. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.27/ (default) -m /boot/System.map (specified) Oops: 0002 CPU: 0 EIP: 0010:[] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010013 eax: 000ff0c4 ebx: ffffffff ecx: f7d22140 edx: 00000000 esi: c1c0f6a8 edi: 00000286 ebp: cc98d020 esp: ed541bac ds: 0018 es: 0018 ss: 0018 Process cut (pid: 25393, stackpage=ed541000) cc31f3c0 0025ca70 0000c98d e1b55a40 00000003 e1b55aa0 cc98d020 c02f1ea3 cc98d000 00000358 e1b55a40 c02f1ff9 e1b55a40 00000202 fffffff4 c0332733 e1b55a40 c05f6290 cc98d020 e1b55a40 00000006 e1b55a40 cc98d020 c0332849 Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] [] Warning (Oops_read): Code line not seen, dumping what data is available >>EIP; c013ddc1 <===== >>ecx; f7d22140 <_end+376713c8/38275308> >>esi; c1c0f6a8 <_end+155e930/38275308> >>ebp; cc98d020 <_end+c2dc2a8/38275308> >>esp; ed541bac <_end+2ce90e34/38275308> Trace; c02f1ea3 Trace; c02f1ff9 
<__kfree_skb+f9/150> Trace; c0332733 Trace; c0332849 Trace; c0313831 Trace; c02fe903 Trace; c0313760 Trace; c03132e4 Trace; c0313760 Trace; c0313ab9 Trace; c03138d0 Trace; c02fe903 Trace; c03138d0 Trace; c03136cb 1 warning issued. Results may not be reliable. ================================================================================ ksymoops 2.4.9 on i686 2.4.27. Options used -V (default) -k /proc/ksyms (default) -l /proc/modules (default) -o /lib/modules/2.4.27/ (default) -m /boot/System.map (specified) Oops: 0002 CPU: 0 EIP: 0010:[] Not tainted Using defaults from ksymoops -t elf32-i386 -a i386 EFLAGS: 00010013 eax: 000e5896 ebx: ffffffff ecx: f7d22240 edx: 00000000 esi: c1c0f6a8 edi: 00000286 ebp: c6626020 esp: d373bbac ds: 0018 es: 0018 ss: 0018 Process userupdate.cvs (pid: 10934, stackpage=d373b000) f5e16780 00132720 00006626 f670f600 00000003 f670f660 c6626020 c02f1ea3 c6626000 00000358 f670f600 c02f1ff9 f670f600 00000202 fffffff4 c0332733 f670f600 c05f6290 c6626020 f670f600 00000006 f670f600 c6626020 c0332849 Call Trace: [] [] [] [] [] [] [] [] [] [] [] [] [] Warning (Oops_read): Code line not seen, dumping what data is available >>EIP; c013ddc1 <===== >>ecx; f7d22240 <_end+376714c8/38275308> >>esi; c1c0f6a8 <_end+155e930/38275308> >>ebp; c6626020 <_end+5f752a8/38275308> >>esp; d373bbac <_end+1308ae34/38275308> Trace; c02f1ea3 Trace; c02f1ff9 <__kfree_skb+f9/150> Trace; c0332733 Trace; c0332849 Trace; c0313831 Trace; c02fe903 Trace; c0313760 Trace; c03132e4 Trace; c0313760 Trace; c0313ab9 Trace; c03138d0 Trace; c02fe903 Trace; c03138d0 1 warning issued. Results may not be reliable. 
-- Andreas Steinmetz SPAMmers use robotrap@domdv.de --------------050103020009060705080807 Content-Type: application/x-gunzip; name="config.gz" Content-Transfer-Encoding: base64 Content-Disposition: inline; filename="config.gz" [base64-encoded gzip attachment "config.gz" (kernel configuration) omitted] --------------050103020009060705080807-- From bdschuym@pandora.be Sat Dec 18 03:09:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 03:09:26 -0800 (PST) Received: from
astra.telenet-ops.be (astra.telenet-ops.be [195.130.132.58]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIB90cH012772 for ; Sat, 18 Dec 2004 03:09:20 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by astra.telenet-ops.be (Postfix) with SMTP id 470843281B2; Sat, 18 Dec 2004 12:08:36 +0100 (MET) Received: from 192.168.0.138 (D5763CA9.kabel.telenet.be [213.118.60.169]) by astra.telenet-ops.be (Postfix) with ESMTP id 7482F32848B; Sat, 18 Dec 2004 12:08:35 +0100 (MET) Subject: Re: do_IRQ: stack overflow: 872.. From: Bart De Schuymer To: Andi Kleen Cc: Crazy AMD K7 , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> Content-Type: text/plain; charset=utf-8 Date: Sat, 18 Dec 2004 12:12:09 +0100 Message-Id: <1103368330.3566.15.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 8bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12848 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev On Sat, 18-12-2004 at 08:50 +0100, Andi Kleen wrote: > Crazy AMD K7 writes: > > > Hi! > > I have found a few days ago strange messages in /var/log/messages > > More than 10 times there was do_IRQ: stack overflow: (number).... followed > > with code. If needed I can send all this data. I have run > > ksymoops with only first 3 cases. Here is the first, the second and > > the third are in attachment. > > After those oopses my system continued to work. > > It's not really an oops, just a warning that stack space got quite tight. > > The problem seems to be that the br netfilter code is nesting far too > deeply and recursing several times. Looks like a design bug to me, > it shouldn't do that.
> > > uname uname -a > > Linux linux 2.4.28 #2 Втр Ноя 30 15:43:35 MSK 2004 i686 unknown > > gcc -v > > Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs > > gcc version 2.96 20000731 (Red Hat Linux 7.3 2.96-113) > > I have applied ebtables_brnf patch (http://bridge.sf.net) and a > > Don't do that then or contact the author to fix it. Unfortunately > the code is also in 2.6 mainline. > Hi. The bridge-nf code does not use recursive function calls and there is no long consecutive function calling. Furthermore, there is no function in the bridge-nf code that uses a large part of the stack. Andi, if you make such statements then please point out the code part you have (of course) read after which you decided to make the statement. The bridge-nf code is used by quite a few people and by commercial companies and I have never had a report like this. AMD has been having all sorts of strange problems for weeks now, they're all somehow related to bridge-nf, but I doubt he is using the bridge-nf patch on a clean 2.4 kernel. AMD, is there any chance you can use the latest 2.6 kernel, without extra patches? Bart From ak@suse.de Sat Dec 18 03:15:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 03:15:17 -0800 (PST) Received: from Cantor.suse.de (ns.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIBEnhX013520 for ; Sat, 18 Dec 2004 03:15:09 -0800 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id B163612372AC; Sat, 18 Dec 2004 12:14:20 +0100 (CET) Date: Sat, 18 Dec 2004 12:14:20 +0100 From: Andi Kleen To: Bart De Schuymer Cc: Andi Kleen , Crazy AMD K7 , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: do_IRQ: stack overflow: 872..
Message-ID: <20041218111420.GE338@wotan.suse.de> References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> <1103368330.3566.15.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103368330.3566.15.camel@localhost.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12849 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev > The bridge-nf code does not use recursive function calls and there is no > long consecutive function calling. Furthermore, there is no function in > the bridge-nf code that uses a large part of the stack. Just take a look at the backtrace in the original post. It clearly shows a problem. And it points strongly towards br-netfilter. From bdschuym@pandora.be Sat Dec 18 03:48:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 03:48:52 -0800 (PST) Received: from europa.telenet-ops.be (europa.telenet-ops.be [195.130.132.60]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIBmO96014871 for ; Sat, 18 Dec 2004 03:48:45 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by europa.telenet-ops.be (Postfix) with SMTP id 6F0B51980C2; Sat, 18 Dec 2004 12:48:01 +0100 (MET) Received: from 192.168.0.138 (D5763CA9.kabel.telenet.be [213.118.60.169]) by europa.telenet-ops.be (Postfix) with ESMTP id 1AFEE198324; Sat, 18 Dec 2004 12:48:01 +0100 (MET) Subject: Re: do_IRQ: stack overflow: 872.. 
From: Bart De Schuymer To: Andi Kleen Cc: Crazy AMD K7 , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <20041218111420.GE338@wotan.suse.de> References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> <1103368330.3566.15.camel@localhost.localdomain> <20041218111420.GE338@wotan.suse.de> Content-Type: text/plain Date: Sat, 18 Dec 2004 12:51:30 +0100 Message-Id: <1103370690.3566.33.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12850 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev On Sat, 18-12-2004 at 12:14 +0100, Andi Kleen wrote: > > The bridge-nf code does not use recursive function calls and there is no > > long consecutive function calling. Furthermore, there is no function in > > the bridge-nf code that uses a large part of the stack. > > Just take a look at the backtrace in the original post. It clearly > shows a problem. And it points strongly towards br-netfilter. I don't doubt you are a much better reader of such backtraces than me. However, let's count the number of times a function from net/bridge/br_netfilter.c is in the backtrace: 1. br_nf*: 6 times 2. *sabotage*: 3 times Seriously, out of 222 lines, only 9 are from bridge-nf. The function ip_queue_xmit, OTOH, appears 8 times in the trace. Anyway, as I already suspected weeks ago, AMD must be seeing some incompatibility between ip_queue (he's using snort) and the bridge-nf patch. He is using the patch below (I gave it to him) on top of the bridge-nf patch. Before using that patch he got a kernel panic occasionally. However he seems not to get a message in his syslog.
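A tally like the one above can be sketched in a few lines of Python, assuming the decoded trace is available as ksymoops-style "Trace;" lines. The sample lines, their addresses, and the names SAMPLE_TRACE, GROUPS and tally_symbols are made up for this illustration; they are not taken from the backtrace under discussion, so the resulting counts say nothing about the real trace.

```python
import re
from collections import Counter

# Invented sample in ksymoops "Trace;" format (addresses are placeholders,
# NOT lines from the real backtrace discussed in this thread).
SAMPLE_TRACE = """\
Trace; c0000001 <br_nf_forward_ip+10/100>
Trace; c0000002 <ip_sabotage_out+20/100>
Trace; c0000003 <ip_queue_xmit+30/100>
Trace; c0000004 <ip_queue_xmit+40/100>
Trace; c0000005 <tcp_transmit_skb+50/100>
"""

# Label -> regular expression applied to the bare symbol name.
GROUPS = {
    "br_nf": r"^br_nf",
    "sabotage": r"sabotage",
    "ip_queue_xmit": r"^ip_queue_xmit$",
}

# Symbol name inside <...>, with an optional [module] prefix (assumed format).
SYMBOL_RE = re.compile(r"<(?:\[[^\]]*\])?([A-Za-z_]\w*)\+")


def tally_symbols(trace_text, groups):
    """Return (number of decoded trace lines, Counter of hits per group)."""
    counts = Counter()
    total = 0
    for line in trace_text.splitlines():
        match = SYMBOL_RE.search(line)
        if match is None:
            continue  # line carries no decoded symbol
        total += 1
        symbol = match.group(1)
        for label, pattern in groups.items():
            if re.search(pattern, symbol):
                counts[label] += 1
                break  # count each line in at most one group
    return total, counts


if __name__ == "__main__":
    total, counts = tally_symbols(SAMPLE_TRACE, GROUPS)
    for label in GROUPS:
        print(f"{label}: {counts[label]} of {total} lines")
```

Feeding the real ksymoops output to tally_symbols instead of SAMPLE_TRACE would give the per-group counts for that trace.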
Bart --- linux-2.4.28-ebt-brnf/net/bridge/br_netfilter.c.old 2004-11-27 23:43:18.000000000 +0100 +++ linux-2.4.28-ebt-brnf/net/bridge/br_netfilter.c 2004-11-27 23:52:05.000000000 +0100 @@ -870,6 +870,10 @@ static unsigned int ip_sabotage_out(unsi { struct sk_buff *skb = *pskb; +if (!skb) { + printk("TROUBLE IN IP_SABOTAGE_OUT: skb==NULL\n"); + goto in_trouble; +} #ifdef CONFIG_SYSCTL if (!skb->nf_bridge) { struct vlan_ethhdr *hdr = @@ -884,6 +888,10 @@ static unsigned int ip_sabotage_out(unsi } #endif +if (!out) { + printk("TROUBLE IN IP_SABOTAGE_OUT: out == NULL\n"); + goto in_trouble; +} if ((out->hard_start_xmit == br_dev_xmit && okfn != br_nf_forward_finish && okfn != br_nf_local_out_finish && @@ -920,6 +928,9 @@ static unsigned int ip_sabotage_out(unsi } return NF_ACCEPT; +in_trouble: + dump_stack(); + return NF_DROP; } /* For br_nf_local_out we need (prio = NF_BR_PRI_FIRST), to insure that innocent From random@72616e646f6d20323030342d30342d31360a.nosense.org Sat Dec 18 05:22:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 05:22:20 -0800 (PST) Received: from ubu.nosense.org (7.cust11.sa.dsl.ozemail.com.au [210.84.234.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIDLqxH021478 for ; Sat, 18 Dec 2004 05:22:12 -0800 Received: from ubu.nosense.org (ubu.nosense.org [127.0.0.1]) by ubu.nosense.org (Postfix) with SMTP id 1D46562A9F for ; Sat, 18 Dec 2004 23:51:27 +1030 (CST) Date: Sat, 18 Dec 2004 23:51:26 +1030 From: Mark Smith To: netdev@oss.sgi.com Subject: [RFC / (sort of) PATCH - 2.6.10-rc3] Structured locally generated Ethernet addresses Message-Id: <20041218235126.4a77a127.random@72616e646f6d20323030342d30342d31360a.nosense.org> Organization: The No Sense Organisation X-Mailer: Sylpheed version 1.0.0beta1 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Location: Adelaide, Australia Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter 
version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12851 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: random@72616e646f6d20323030342d30342d31360a.nosense.org Precedence: bulk X-list: netdev Hi, I've been playing around with quite a number of interfaces which are using locally generated MAC addresses, via the random_ether_addr() function. Although these random ethernet addresses work fine, I thought it might be more useful if they were more structured, by having a common, random, local OUI, and then using an incremental serial number. It would help identify MAC addresses belonging to the same host in things such as bridge / switch station caches, or ARP tables. It is also closer to how vendors use their officially assigned OUIs. I've put together a patch against 2.6.10-rc3. I've modified the dummy interface code to use it. I'll admit straight up that I don't know much about kernel programming, so I may be doing things wrong. However, I think it is ok, and it seems to be working alright. A few other drivers would need to be modified to use it, the tun/tap one as an example. Prior to developing this patch I discovered that the tun/tap driver wasn't using random_ether_addr() either, so I've fixed that and sent a patch off to the maintainer. Running multiple tun interfaces is the main application where I've been dealing with a number of random, local ethernet addresses. I think it would be useful to have something similar to this incorporated into the Linux kernel. I'm interested in any comments, including those as to why this might not be a good idea. I think you can learn as much from finding out what is wrong to do as what is right to do. The patch is below. Thanks, Mark. 
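As a rough illustration of the address layout described above (a shared random local OUI, a zero fourth octet, then a 16-bit serial number), here is a minimal userspace sketch. It is not the kernel patch: the struct laa_state holder and the function name structured_ether_addr() are hypothetical, and rand() merely stands in for the kernel's get_random_bytes().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN   6  /* octets in one Ethernet address */
#define ETH_OUILEN 3  /* octets in the OUI part of the address */

/* Hypothetical state holder; the patch keeps this in function-local statics. */
struct laa_state {
    uint8_t  oui[ETH_OUILEN];
    int      oui_valid;
    uint16_t serial;
};

/* Fill addr with <random local OUI> : 00 : <16-bit serial, big-endian>. */
static void structured_ether_addr(struct laa_state *st, uint8_t *addr)
{
    assert(st != NULL && addr != NULL);

    if (!st->oui_valid) {
        for (int i = 0; i < ETH_OUILEN; i++)
            st->oui[i] = (uint8_t)(rand() & 0xff); /* stand-in for get_random_bytes() */
        st->oui[0] &= 0xfe; /* clear multicast bit */
        st->oui[0] |= 0x02; /* set locally assigned bit (IEEE 802) */
        st->oui_valid = 1;
        st->serial = 0;
    }

    st->serial++;
    memcpy(addr, st->oui, ETH_OUILEN);
    addr[3] = 0x00;
    addr[4] = (uint8_t)(st->serial >> 8);   /* octets 5 and 6 carry the serial */
    addr[5] = (uint8_t)(st->serial & 0xff);

    /* Serial space exhausted: force a fresh OUI on the next call. */
    if (st->serial == 0xffff)
        st->oui_valid = 0;
}
```

Two consecutive calls on the same state yield addresses that share the first three octets and differ only in the trailing serial, which is the property that makes a host easy to spot in ARP tables or bridge station caches. Writing the serial byte by byte avoids the htons()/memcpy() step; a kernel version would additionally need locking around the shared state.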
diff -ur linux-2.6.10-rc3/drivers/net/dummy.c linux-2.6.10-rc3-mrs/drivers/net/dummy.c --- linux-2.6.10-rc3/drivers/net/dummy.c 2004-12-18 23:01:16.000000000 +1030 +++ linux-2.6.10-rc3-mrs/drivers/net/dummy.c 2004-12-18 23:00:10.000000000 +1030 @@ -72,7 +72,10 @@ dev->flags |= IFF_NOARP; dev->flags &= ~IFF_MULTICAST; SET_MODULE_OWNER(dev); - random_ether_addr(dev->dev_addr); + + /* random_ether_addr(dev->dev_addr); - MRS */ + + locally_assigned_ether_addr(dev->dev_addr); } static int dummy_xmit(struct sk_buff *skb, struct net_device *dev) diff -ur linux-2.6.10-rc3/include/linux/etherdevice.h linux-2.6.10-rc3-mrs/include/linux/etherdevice.h --- linux-2.6.10-rc3/include/linux/etherdevice.h 2004-10-19 07:25:06.000000000 +0930 +++ linux-2.6.10-rc3-mrs/include/linux/etherdevice.h 2004-12-18 23:06:27.000000000 +1030 @@ -78,6 +78,18 @@ addr [0] &= 0xfe; /* clear multicast bit */ addr [0] |= 0x02; /* set local assignment bit (IEEE802) */ } + +/** + * locally_assigned_ether_addr - Generate locally assigned Ethernet address + * using an occasionally generated random local OUI, and then an incrementing + * serial number + * @addr: Pointer to a six-byte array containing the Ethernet address + * + * Mark Smith + */ + +extern void locally_assigned_ether_addr(u8 *addr); + #endif #endif /* _LINUX_ETHERDEVICE_H */ diff -ur linux-2.6.10-rc3/include/linux/if_ether.h linux-2.6.10-rc3-mrs/include/linux/if_ether.h --- linux-2.6.10-rc3/include/linux/if_ether.h 2004-10-19 07:24:37.000000000 +0930 +++ linux-2.6.10-rc3-mrs/include/linux/if_ether.h 2004-12-18 19:02:19.000000000 +1030 @@ -27,6 +27,7 @@ */ #define ETH_ALEN 6 /* Octets in one ethernet addr */ +#define ETH_OUILEN 3 /* Octets in OUI part of eth addr */ #define ETH_HLEN 14 /* Total octets in header. */ #define ETH_ZLEN 60 /* Min. octets in frame sans FCS */ #define ETH_DATA_LEN 1500 /* Max. 
octets in payload */ diff -ur linux-2.6.10-rc3/Makefile linux-2.6.10-rc3-mrs/Makefile --- linux-2.6.10-rc3/Makefile 2004-12-18 23:00:44.000000000 +1030 +++ linux-2.6.10-rc3-mrs/Makefile 2004-12-18 17:28:45.000000000 +1030 @@ -1,7 +1,7 @@ VERSION = 2 PATCHLEVEL = 6 SUBLEVEL = 10 -EXTRAVERSION =-rc3 +EXTRAVERSION =-rc3-mrs NAME=Woozy Numbat # *DOCUMENTATION* diff -ur linux-2.6.10-rc3/net/ethernet/eth.c linux-2.6.10-rc3-mrs/net/ethernet/eth.c --- linux-2.6.10-rc3/net/ethernet/eth.c 2004-12-18 23:01:48.000000000 +1030 +++ linux-2.6.10-rc3-mrs/net/ethernet/eth.c 2004-12-18 23:09:01.000000000 +1030 @@ -306,3 +306,67 @@ return alloc_netdev(sizeof_priv, "eth%d", ether_setup); } EXPORT_SYMBOL(alloc_etherdev); + +/** + * locally_assigned_ether_addr - Generate locally assigned Ethernet address + * using an occasionally generated random local OUI, then an incrementing + * serial number + * @addr: Pointer to a six-byte array containing the Ethernet address + * + * Make sure that random OUI has multicast bit reset, and has locally + * assigned bit set. Note that the random OUI is very occasionally generated ie. + * most of the time, it will be the same for calls to this function. + * All interfaces on host will then have nearly the same OUI, and only + * vary in their serial number. + * This will make identifying a single Linux host with multiple + * generated MAC addresses easier in things such as ARP tables + * and bridge / switch station caches. + * + * This probably should be called within a lock / semaphore; I know what they + * are, I know what they're for, I just don't know how to make sure they + * are being used or how to check for them yet :-( + * Oh well, you've got to start somewhere. 
+ * + * Mark Smith + */ + +void locally_assigned_ether_addr(u8 *addr) +{ + + static u8 random_OUI[ETH_OUILEN]; /* Just FYI, OUIs are 3 octets */ + static u8 random_OUI_generated = 0; + static u16 serial_number = 0; + u16 serial_num_hton; + + if (random_OUI_generated == 0) { + get_random_bytes(&random_OUI[0], ETH_OUILEN); + random_OUI[0] &= 0xfe; /* clear multicast bit */ + random_OUI[0] |= 0x02; /* set local assignment bit (IEEE802) */ + random_OUI_generated = 1; + } + + /* Copy OUI into addr */ + memcpy(addr, random_OUI, ETH_OUILEN); + + /* now the 4th octet */ + addr[3] = 0x00; + + serial_number++; + serial_num_hton = htons(serial_number); + /* now octets 5 and 6 */ + memcpy(&addr[4], &serial_num_hton, sizeof(u16)); + + /* We might have run out of serial numbers, so pick a new OUI + * next time we're called. + * (Unlikely, supposedly we've now generated at 2^16 local + * addresses. Still, need to handle this corner case, there are some + * crazy networking people out there, must be caffeine poisoning.) */ + + if (serial_number == 0xffff) { + random_OUI_generated = 0; + serial_number = 0; + } + +} + +EXPORT_SYMBOL(locally_assigned_ether_addr); From ak@suse.de Sat Dec 18 05:54:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 05:54:17 -0800 (PST) Received: from Cantor.suse.de (cantor.suse.de [195.135.220.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIDrnhB022767 for ; Sat, 18 Dec 2004 05:54:10 -0800 Received: from hermes.suse.de (hermes-ext.suse.de [195.135.221.8]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by Cantor.suse.de (Postfix) with ESMTP id 4470612368D4; Sat, 18 Dec 2004 14:53:21 +0100 (CET) Date: Sat, 18 Dec 2004 14:53:20 +0100 From: Andi Kleen To: Bart De Schuymer Cc: Andi Kleen , Crazy AMD K7 , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: do_IRQ: stack overflow: 872.. 
Message-ID: <20041218135320.GA10030@wotan.suse.de> References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> <1103368330.3566.15.camel@localhost.localdomain> <20041218111420.GE338@wotan.suse.de> <1103370690.3566.33.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103370690.3566.33.camel@localhost.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12852 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@suse.de Precedence: bulk X-list: netdev On Sat, Dec 18, 2004 at 12:51:30PM +0100, Bart De Schuymer wrote: > > > > Just take a look at the backtrace in the original post. It clearly > > shows a problem. And it points strongly towards br-netfilter. > > I don't doubt you are a much better reader of such backtraces than me. > However, let's count the number of times a function from > net/bridge/br_netfilter.c is in the backtrace: > 1. br_nf*: 6 times > 2. *sabotage*: 3 times > Seriously, out of 222 lines, only 9 from bridge-nf. > The function ip_queue_xmit, OTOH, is 8 times in the trace. Yep, but ip_queue_xmit doesn't call itself recursively. Someone must be doing it. And that's likely the bridge code. BTW, probably not all of these entries are real; there can be a lot of false positives. > Anyway, as I already suspected weeks ago, AMD must be seeing some > incompatibility between ip_queue (he's using snort) and the bridge-nf > patch. > > He is using the patch (I gave it to him) below on top of the bridge-nf > patch. Before using that patch he got a kernel panic occasionally. > However he seems not to get a message in his syslog. Ok, since this report seems to be for a totally non-standard, severely hacked-up kernel, I suppose nothing can be concluded from it for the mainline kernel. 
Thanks for clearing this up. Note to the original poster: when you report a bug with a patched kernel always mention it. -Andi From vonbrand@laptop11.inf.utfsm.cl Sat Dec 18 06:22:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 06:22:59 -0800 (PST) Received: from inti.inf.utfsm.cl (inti.inf.utfsm.cl [200.1.21.155]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIEMVHs024135 for ; Sat, 18 Dec 2004 06:22:52 -0800 Received: from paynac.inf.utfsm.cl (paynac.inf.utfsm.cl [200.1.19.204]) by inti.inf.utfsm.cl (8.12.11/8.12.11) with ESMTP id iBIEL0pq019473; Sat, 18 Dec 2004 11:21:50 -0300 Received: from laptop11.inf.utfsm.cl (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.13.1/8.12.11) with ESMTP id iBI1gHW0007368; Fri, 17 Dec 2004 22:42:18 -0300 Received: from laptop11.inf.utfsm.cl (vonbrand@localhost) by laptop11.inf.utfsm.cl (8.13.1/8.13.1/Submit) with ESMTP id iBI1g5Gd007361; Fri, 17 Dec 2004 22:42:10 -0300 Message-Id: <200412180142.iBI1g5Gd007361@laptop11.inf.utfsm.cl> To: linux-os@analogic.com cc: Bill Davidsen , James Morris , Patrick McHardy , Bryan Fulton , netdev@oss.sgi.com, netfilter-devel@lists.netfilter.org, linux-kernel@vger.kernel.org Subject: Re: [Coverity] Untrusted user data in kernel In-Reply-To: Message from linux-os of "Fri, 17 Dec 2004 11:11:37 CDT." X-Mailer: MH-E 7.4.2; nmh 1.0.4; XEmacs 21.4 (patch 15) Date: Fri, 17 Dec 2004 22:42:04 -0300 From: Horst von Brand X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12853 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: vonbrand@inf.utfsm.cl Precedence: bulk X-list: netdev linux-os said: > On Fri, 17 Dec 2004, Bill Davidsen wrote: [...] > > Are you saying that processes with capability don't make mistakes? 
This > > isn't a bug related to untrusted users doing privileged operations, > > it's a case of using unchecked user data. > But isn't there always the possibility of "unchecked user data"? Yes. But it should be kept to a minimum. > I can, as root, do `cp /dev/zero /dev/mem` and have the most > spectacular crash you've ever seen. I can even make my > filesystems unrecoverable. Right. And you can get rid of /dev/mem if you don't want to screw yourself this way (which is well-known). The problem at hand is _not_ in this same league. -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 654431 Universidad Tecnica Federico Santa Maria +56 32 654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513 From roland@topspin.com Sat Dec 18 07:36:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 07:36:41 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIFaDJf026623 for ; Sat, 18 Dec 2004 07:36:34 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sat, 18 Dec 2004 07:35:50 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sat, 18 Dec 2004 07:35:50 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CfgcY-0001qe-0T; Sat, 18 Dec 2004 07:35:50 -0800 To: "David S. Miller" Cc: netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> From: Roland Dreier Date: Sat, 18 Dec 2004 07:35:49 -0800 In-Reply-To: <20041217214432.07b7b21e.davem@davemloft.net> (David S. 
Miller's message of "Fri, 17 Dec 2004 21:44:32 -0800") Message-ID: <528y7vobze.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 18 Dec 2004 15:35:50.0563 (UTC) FILETIME=[40B1F730:01C4E517] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12854 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev David> I understand the bug, but we're not going to fix it this David> way. This is a crucial invariant that we need to check for David> because it indicates a pretty serious state error except in David> this bug case you've discovered. OK, that's fair enough. I was pretty bummed myself when I realized hard_start_xmit was getting called even after I stopped the queue. Thanks for confirming my analysis. David> Perhaps one way to fix this is to add a pointer to a David> spinlock to the netdev struct, and have hold that the upper David> level grab that when NETIF_F_LLTX when doing queue state David> checks. Actually, that could end up being racy too. I may be missing something, but it seems to me that we get all of the benefits of LLTX by just documenting that device drivers can use the xmit_lock member of struct net_device to synchronize other parts of the driver against hard_start_xmit. I guess the driver also should set xmit_lock_owner to -1 after it acquires xmit_lock. 
This would mean that NETIF_F_LLTX can go away, and LLTX drivers would just replace their use of their own private tx_lock with xmit_lock (except in hard_start_xmit, where the upper layer already holds xmit_lock). By the way, am I correct in thinking that the use of xmit_lock_owner in qdisc_restart() is racy? if (!spin_trylock(&dev->xmit_lock)) { /* get the lock */ if (!spin_trylock(&dev->xmit_lock)) { /* fail */ if (dev->xmit_lock_owner == smp_processor_id()) { /* test the wrong value */ /* set the value too late */ dev->xmit_lock_owner = smp_processor_id(); I don't see a simple way to fix this unfortunately. Thanks, Roland From snort2004@mail.ru Sat Dec 18 08:07:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 08:07:50 -0800 (PST) Received: from smtp.direct.ru (customers.imt.ru [212.16.0.33]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBIG7LDM028054 for ; Sat, 18 Dec 2004 08:07:42 -0800 Received: (qmail 25224 invoked from network); 18 Dec 2004 16:06:52 -0000 Received: from unknown (HELO AMDK7) (212.16.24.4) by 212.16.0.33 with SMTP; 18 Dec 2004 16:06:52 -0000 Date: Sat, 18 Dec 2004 19:07:42 +0300 From: Crazy AMD K7 X-Mailer: The Bat! (v1.46d) Reply-To: Crazy AMD K7 X-Priority: 3 (Normal) Message-ID: <1421293830.20041218190742@mail.ru> To: Andi Kleen CC: Bart De Schuymer , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re[2]: do_IRQ: stack overflow: 872.. 
In-reply-To: <20041218135320.GA10030@wotan.suse.de> References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> <1103368330.3566.15.camel@localhost.localdomain> <20041218111420.GE338@wotan.suse.de> <1103370690.3566.33.camel@localhost.localdomain> <20041218135320.GA10030@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12855 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: snort2004@mail.ru Precedence: bulk X-list: netdev > Note to the original poster: when you report a bug with a patched > kernel always mention it. I mentioned it earlier, and Bart knows it. I use 2.4.28 + ebtables-brnf-8_vs_2.4.28.diff + U32 patch from patch-o-matic-ng-20040621.tar.bz2 + patch for br_netfilter.c made by Bart to find out why the kernel panic happens (it was a few messages ago) All patches applied cleanly. The U32 patch doesn't touch br_netfilter.c [root@linux kernel]# md5sum linux-2.4.28.tar.bz2 ac7735000d185bc7778c08288760a8a3 linux-2.4.28.tar.bz2 (taken from http://www.ru.kernel.org/pub/linux/kernel/v2.4/linux-2.4.28.tar.bz2) [root@linux bridge]# md5sum ebtables-brnf-8_vs_2.4.28.diff.gz 30542b1a7a502593afb4d37055ec5e35 ebtables-brnf-8_vs_2.4.28.diff.gz [root@linux iptables]# md5sum patch-o-matic-ng-20040621.tar.bz2 4fd3c744bf55f119fef6c7c3c4acc4b6 patch-o-matic-ng-20040621.tar.bz2 If the problem continues to appear and cannot be solved in any way, I will of course switch to a 2.6 kernel, but right now I am not ready to use it. 
Pasha From bdschuym@pandora.be Sat Dec 18 08:43:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 08:43:41 -0800 (PST) Received: from poros.telenet-ops.be (poros.telenet-ops.be [195.130.132.44]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIGh89D032681 for ; Sat, 18 Dec 2004 08:43:28 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by poros.telenet-ops.be (Postfix) with SMTP id CFF1A3BC360; Sat, 18 Dec 2004 17:42:44 +0100 (MET) Received: from 192.168.0.138 (D5763CA9.kabel.telenet.be [213.118.60.169]) by poros.telenet-ops.be (Postfix) with ESMTP id 8D21F3BC326; Sat, 18 Dec 2004 17:42:44 +0100 (MET) Subject: Re: Re[2]: do_IRQ: stack overflow: 872.. From: Bart De Schuymer To: Crazy AMD K7 Cc: Andi Kleen , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <1421293830.20041218190742@mail.ru> References: <1131604877.20041218092730@mail.ru.suse.lists.linux.kernel> <1103368330.3566.15.camel@localhost.localdomain> <20041218111420.GE338@wotan.suse.de> <1103370690.3566.33.camel@localhost.localdomain> <20041218135320.GA10030@wotan.suse.de> <1421293830.20041218190742@mail.ru> Content-Type: text/plain Date: Sat, 18 Dec 2004 17:46:20 +0100 Message-Id: <1103388380.3557.15.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12856 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev On Sat, 18-12-2004 at 19:07 +0300, Crazy AMD K7 wrote: > > Note to the original poster: when you report a bug with a patched > > kernel always mention it. > I have mentioned earlier and Bart knows it. 
> > I use 2.4.28 > + ebtables-brnf-8_vs_2.4.28.diff > + U32 patch from patch-o-matic-ng-20040621.tar.bz2 > + patch for br_netfilter.c made by Bart to find out why kernel panic happens(it was a few > letters ago) > All patches has applies cleanly. > U32 doesn't affect on br_netfilter.c Sorry, I don't know the ip_queue mechanism and I don't know what could possibly go wrong. All we know is that you no longer have kernel panics with the simple patch I gave you (which just drops packets when a kernel panic would happen otherwise, and tells about this with a printk). However, you state there are no entries in your syslog that tell about this dropping. Is your syslog working right? Do you have a console open on which kernel messages get printed? I still secretly suspect the snort code of inserting packets back into the kernel that don't have an output device (I don't know if that's possible, though). cheers, Bart From tgraf@suug.ch Sat Dec 18 09:00:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 09:00:50 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIH0Lsn001123 for ; Sat, 18 Dec 2004 09:00:42 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id DDAFDF; Sat, 18 Dec 2004 17:59:34 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 013561C0EA; Sat, 18 Dec 2004 18:00:17 +0100 (CET) Date: Sat, 18 Dec 2004 18:00:17 +0100 From: Thomas Graf To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs Message-ID: <20041218170017.GH17998@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12857 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Dave, Correctly handle shared and cloned skbs by copying them before writing and dequeue unwriteable skbs unchanged. Assumes that IP/IPv6 header is always linear so no pulling required. Signed-off-by: Thomas Graf --- linux-2.6.10-rc3-bk12.orig/net/sched/sch_dsmark.c 2004-12-18 15:16:29.000000000 +0100 +++ linux-2.6.10-rc3-bk12/net/sched/sch_dsmark.c 2004-12-18 16:24:58.000000000 +0100 @@ -195,7 +195,8 @@ D2PRINTK("dsmark_enqueue(skb %p,sch %p,[qdisc %p])\n",skb,sch,p); if (p->set_tc_index) { - /* FIXME: Safe with non-linear skbs? --RR */ + /* Safe with non-linear skbs? 
--RR + IP/IPv6 header is always linear --TGR */ switch (skb->protocol) { case __constant_htons(ETH_P_IP): skb->tc_index = ipv4_get_dsfield(skb->nh.iph); @@ -250,6 +251,34 @@ return ret; } +static inline int +dsmark_make_writeable(struct sk_buff **pskb, int offset) +{ + struct sk_buff *nskb; + + if ((offset + (*pskb)->mac_len) > (*pskb)->len) + return 0; + + if (skb_shared(*pskb) || skb_cloned(*pskb)) + goto copy_skb; + + /* IP/IPv6 header is always linear, no need to pull */ + return 1; + +copy_skb: + nskb = skb_copy(*pskb, GFP_ATOMIC); + if (!nskb) + return 0; + BUG_ON(skb_is_nonlinear(nskb)); + + /* Rest of kernel will get very unhappy if we pass it a + suddenly-orphaned skbuff */ + if ((*pskb)->sk) + skb_set_owner_w(nskb, (*pskb)->sk); + kfree_skb(*pskb); + *pskb = nskb; + return 1; +} static struct sk_buff *dsmark_dequeue(struct Qdisc *sch) { @@ -266,10 +295,14 @@ D2PRINTK("index %d->%d\n",skb->tc_index,index); switch (skb->protocol) { case __constant_htons(ETH_P_IP): + if (!dsmark_make_writeable(&skb, sizeof(struct iphdr))) + goto unwriteable; ipv4_change_dsfield(skb->nh.iph, p->mask[index],p->value[index]); break; case __constant_htons(ETH_P_IPV6): + if (!dsmark_make_writeable(&skb, sizeof(struct ipv6hdr))) + goto unwriteable; ipv6_change_dsfield(skb->nh.ipv6h, p->mask[index],p->value[index]); break; @@ -280,12 +313,17 @@ * and don't need yet another qdisc as a bypass. 
*/ if (p->mask[index] != 0xff || p->value[index]) - printk(KERN_WARNING "dsmark_dequeue: " - "unsupported protocol %d\n", - htons(skb->protocol)); + if (net_ratelimit()) + printk(KERN_WARNING "dsmark_dequeue: " + "unsupported protocol %d\n", + htons(skb->protocol)); break; }; return skb; +unwriteable: + if (net_ratelimit()) + printk(KERN_WARNING "dsmark_dequeue: skb not writable\n"); + return skb; } From roland@topspin.com Sat Dec 18 09:58:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 09:59:06 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIHwcLk003230 for ; Sat, 18 Dec 2004 09:58:59 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sat, 18 Dec 2004 09:58:15 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sat, 18 Dec 2004 09:58:15 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CfiqN-0001wb-3w; Sat, 18 Dec 2004 09:58:15 -0800 To: "David S. 
Miller" Cc: netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <528y7vobze.fsf@topspin.com> From: Roland Dreier Date: Sat, 18 Dec 2004 09:58:15 -0800 In-Reply-To: <528y7vobze.fsf@topspin.com> (Roland Dreier's message of "Sat, 18 Dec 2004 07:35:49 -0800") Message-ID: <52sm63mqtk.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [openib-general] Re: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 18 Dec 2004 17:58:15.0591 (UTC) FILETIME=[25EDC370:01C4E52B] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12858 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Roland> I may be missing something, but it seems to me that we get Roland> all of the benefits of LLTX by just documenting that Roland> device drivers can use the xmit_lock member of struct Roland> net_device to synchronize other parts of the driver Roland> against hard_start_xmit. I guess the driver also should Roland> set xmit_lock_owner to -1 after it acquires xmit_lock. Thinking about this a little more, I realize that there's no reason for the driver to set xmit_lock_owner -- if the driver is able to acquire the lock, then xmit_lock_owner will already be -1. So it seems LLTX can be replaced by just having drivers use net_device.xmit_lock instead of their own private tx_lock. 
Assuming this works (and I don't see anything wrong with it) then this seems like a pretty nice solution: we remove some code from the networking core and get rid of all the "trylock" logic in driver's hard_start_xmit. - Roland From roland@topspin.com Sat Dec 18 10:27:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 10:27:31 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIIR4e5004487 for ; Sat, 18 Dec 2004 10:27:24 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sat, 18 Dec 2004 10:26:41 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sat, 18 Dec 2004 10:26:40 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CfjHs-0001z9-Dy; Sat, 18 Dec 2004 10:26:40 -0800 To: "David S. Miller" Cc: netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <528y7vobze.fsf@topspin.com> <52sm63mqtk.fsf@topspin.com> From: Roland Dreier Date: Sat, 18 Dec 2004 10:26:40 -0800 In-Reply-To: <52sm63mqtk.fsf@topspin.com> (Roland Dreier's message of "Sat, 18 Dec 2004 09:58:15 -0800") Message-ID: <52oegrmpi7.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [openib-general] Re: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 18 Dec 2004 18:26:40.0919 (UTC) FILETIME=[1E622A70:01C4E52F] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 
12859 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Roland> So it seems LLTX can be replaced by just having drivers Roland> use net_device.xmit_lock instead of their own private Roland> tx_lock. Assuming this works (and I don't see anything Roland> wrong with it) then this seems like a pretty nice Roland> solution: we remove some code from the networking core and Roland> get rid of all the "trylock" logic in driver's Roland> hard_start_xmit. Actually trying it instead of talking out of my ass... Just doing this naively without changing the net core can deadlock because the net core acquires dev->xmit_lock without disabling interrupts. So if the driver tries to use xmit_lock in its interrupt handler, it will deadlock if the interrupt occurred during hard_start_xmit. Even doing local_irq_save() in hard_start_xmit isn't good enough, because there's still a window between the net core's call to hard_start_xmit and the actual local_irq_save where xmit_lock is held with interrupts on. Maybe it makes sense to change NETIF_F_LLTX to NETIF_F_TX_IRQ_DIS or something like that and have that flag mean "disable interrupts when calling hard_start_xmit." (We could just do this unconditionally but I'm not sure if any drivers rely on having interrupts enabled during hard_start_xmit and I'm worried about making a change in semantics like that -- not to mention some drivers may not need interrupts disabled and may not want the cost). 
- Roland From safari-b-netdev=oss.sgi.com-1103394645-3imsxvtvtrg55ikq@b.safari.iki.fi Sat Dec 18 10:31:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 10:31:39 -0800 (PST) Received: from fep18.inet.fi (fep18.inet.fi [194.251.242.243]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIIVAcm005025 for ; Sat, 18 Dec 2004 10:31:31 -0800 Received: from safari.iki.fi ([80.223.105.208]) by fep18.inet.fi with ESMTP id <20041218183046.SPFT17327.fep18.inet.fi@safari.iki.fi> for ; Sat, 18 Dec 2004 20:30:46 +0200 Received: (qmail 12535 invoked by uid 500); 18 Dec 2004 18:30:45 -0000 Date: Sat, 18 Dec 2004 20:30:45 +0200 From: Sami Farin <7atbggg02@sneakemail.com> To: Linux Networking Mailing List Subject: Re: ss shows invalid ipv6 addresses Message-ID: <20041218183045.GB11231@m.safari.iki.fi> Mail-Followup-To: Linux Networking Mailing List References: <20041111230054.GD9589@m.safari.iki.fi> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041111230054.GD9589@m.safari.iki.fi> User-Agent: Mutt/1.5.6i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12860 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 7atbggg02@sneakemail.com Precedence: bulk X-list: netdev On Fri, Nov 12, 2004 at 01:00:54AM +0200, Sami Farin wrote: > thttpd does this: > 00:43:51.115227 bind(4, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0 <0.000119> > > which results into... 
>
> # ss -l|head -n 2 ; svc -t /service/thttpd ; sleep 2 ; ss -l|head -n 2
> Recv-Q Send-Q Local Address:Port Peer Address:Port
> 0 0 ::fa38:ebe3:fe5f:fb3a:ffa6:b35e:http ::e541:73fb:7f26:747d:b382:4122:*
> Recv-Q Send-Q Local Address:Port Peer Address:Port
> 0 0 ::81:7111:de4f:c527:81:701d:http ::81:46b5:9313:b133:81:c2cd:*
>
> there are some comments about ``Cursed "v4 mapped" addresses''
> in ss.c already ;) but is it possible fix this thing so that ss
> can show some sane addresses for v4 mapped addresses?

This has nothing to do with v4-mapped addresses, it seems. net/ipv4/tcp_diag.c just doesn't do ipv6_addr_copy() if IPv6 support is built as a module. When I compile IPv6 support directly into the kernel, ss works:

ESTAB 0 0 2002:50df:69d0::1:50011 2001:4118:10:4000::2205:silc

-- 
From tgraf@suug.ch Sat Dec 18 13:01:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 13:01:26 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBIL0wHs013062 for ; Sat, 18 Dec 2004 13:01:18 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 3B704F; Sat, 18 Dec 2004 22:00:12 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 087411C0EA; Sat, 18 Dec 2004 22:00:54 +0100 (CET) Date: Sat, 18 Dec 2004 22:00:54 +0100 From: Thomas Graf To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs Message-ID: <20041218210054.GI17998@postel.suug.ch> References: <20041218170017.GH17998@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041218170017.GH17998@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12861 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

> Correctly handle shared and cloned skbs by copying them before writing
> and dequeue unwriteable skbs unchanged. Assumes that IP/IPv6 header
> is always linear so no pulling required.

Hmm.. OTOH, setting skb->tc_index in enqueue() might need a pskb_copy() in some cases. I'm not aware of such a path though; does anyone know of a path where it would be required? If so, it would be better to make the skb writeable in enqueue() to save a copy.
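The "copy before writing, pass unwriteable buffers through unchanged" rule from the quoted patch description can be illustrated with a small userspace model. This is a hedged sketch, not the dsmark code: `struct pkt`, `pkt_unshare()` and `pkt_set_dsfield()` are hypothetical stand-ins, and a plain user count stands in for the shared/cloned state of an skb.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer with a user count. */
struct pkt {
	int users;            /* how many holders reference this buffer */
	unsigned char hdr[4]; /* hdr[1] plays the role of the DS field */
};

/*
 * Return a buffer that is safe to write to: the original if we are the
 * only user, otherwise a private copy. Returns NULL if copying fails.
 */
static struct pkt *pkt_unshare(struct pkt *p)
{
	struct pkt *copy;

	if (p->users == 1)
		return p;

	copy = malloc(sizeof(*copy));
	if (!copy)
		return NULL;
	memcpy(copy, p, sizeof(*copy));
	copy->users = 1;
	p->users--;           /* we drop our reference to the shared original */
	return copy;
}

/*
 * Rewrite the DS field. If the buffer cannot be made writeable, hand it
 * back unchanged instead of scribbling on data other users can see.
 */
static struct pkt *pkt_set_dsfield(struct pkt *p, unsigned char value)
{
	struct pkt *w = pkt_unshare(p);

	if (!w)
		return p;
	w->hdr[1] = value;
	return w;
}
```

Where the copy happens (enqueue versus dequeue) is exactly the open question above; the sketch only shows the copy-on-write step itself.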
From random@72616e646f6d20323030342d30342d31360a.nosense.org Sat Dec 18 17:22:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 17:23:07 -0800 (PST) Received: from ubu.nosense.org (7.cust11.sa.dsl.ozemail.com.au [210.84.234.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJ1MaLA024024 for ; Sat, 18 Dec 2004 17:22:57 -0800 Received: from ubu.nosense.org (ubu.nosense.org [127.0.0.1]) by ubu.nosense.org (Postfix) with SMTP id 2C8DB62A9F for ; Sun, 19 Dec 2004 11:52:12 +1030 (CST) Date: Sun, 19 Dec 2004 11:52:11 +1030 From: Mark Smith To: netdev@oss.sgi.com Subject: An example scenario ([RFC / (sort of) PATCH - 2.6.10-rc3] Structured locally generated Ethernet addresses) Message-Id: <20041219115211.1a3f205f.random@72616e646f6d20323030342d30342d31360a.nosense.org> In-Reply-To: <20041218235126.4a77a127.random@72616e646f6d20323030342d30342d31360a.nosense.org> References: <20041218235126.4a77a127.random@72616e646f6d20323030342d30342d31360a.nosense.org> Organization: The No Sense Organisation X-Mailer: Sylpheed version 1.0.0beta1 (GTK+ 1.2.10; i686-pc-linux-gnu) X-Location: Adelaide, Australia Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12863 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: random@72616e646f6d20323030342d30342d31360a.nosense.org Precedence: bulk X-list: netdev I've realised I didn't really provide a good scenario where this is useful. I've been using Qemu to emulate multiple PCs, am then running Linux on them, and then running Quagga routing daemons within each instance. Qemu can create tun tap/tun style interfaces on the host kernel, corresponding to the virtual NE2K NICs within the Qemu instance. 
I've been bridging these tun interfaces together to interconnect the Qemu instances, creating a "virtual" LAN between them (don't you love it when you have to overload a word to make it more generic :-) ).

It occurred to me that I could also add a real, physical ethernet interface to these tun bridge groups, which would then make the random, locally assigned MAC addresses, assigned to the tun interfaces, visible to devices outside of the host kernel running the Qemu instances. (With multiple Qemu tun bridge groups, it could be possible to bridge in a vlan subinterface, rather than the main ethernet interface. I'll give that a try soon.)

I don't think the structure or values of the random ethernet addresses matter much if they are only visible within a host. However, when they start being visible to devices outside of the host, as in the above scenario, I think having them come from the same random OUI would be useful.

Thanks,
Mark.

On Sat, 18 Dec 2004 23:51:26 +1030 Mark Smith wrote:

> Hi,
>
> I've been playing around with quite a number of interfaces which are
> using locally generated MAC addresses, via the random_ether_addr()
> function.
>
> Although these random ethernet addresses work fine, I thought it might
> be more useful if they were more structured, by having a common, random,
> local OUI, and then using an incremental serial number. It would help
> identify MAC addresses belonging to the same host in things such as
> bridge / switch station caches, or ARP tables. It is also closer to how
> vendors use their officially assigned OUIs.
>
> I've put together a patch against 2.6.10-rc3. I've modified the dummy
> interface code to use it. I'll admit straight up that I don't know much
> about kernel programming, so I may be doing things wrong. However, I
> think it is ok, and it seems to be working alright.
>
> A few other drivers would need to be modified to use it, the tun/tap one
> as an example.
Prior to developing this patch I discovered that the > tun/tap driver wasn't using random_ether_addr() either, so I've fixed > that and sent a patch off to the maintainer. Multiple tun interfaces is > the main application where I've been dealing with a number of random, > local ethernet addresses. > > I think it would be useful to have something similar to this > incorporated into the Linux kernel. > > I'm interested in any comments, including those as to why this might not > be a good idea. I think you can learn as much from finding out what is > wrong to do as what is right to do. > > The patch is below. > > Thanks, > Mark. > > diff -ur linux-2.6.10-rc3/drivers/net/dummy.c linux-2.6.10-rc3-mrs/drivers/net/dummy.c > --- linux-2.6.10-rc3/drivers/net/dummy.c 2004-12-18 23:01:16.000000000 +1030 > +++ linux-2.6.10-rc3-mrs/drivers/net/dummy.c 2004-12-18 23:00:10.000000000 +1030 > @@ -72,7 +72,10 @@ > dev->flags |= IFF_NOARP; > dev->flags &= ~IFF_MULTICAST; > SET_MODULE_OWNER(dev); > - random_ether_addr(dev->dev_addr); > + > + /* random_ether_addr(dev->dev_addr); - MRS */ > + > + locally_assigned_ether_addr(dev->dev_addr); > } > > static int dummy_xmit(struct sk_buff *skb, struct net_device *dev) > diff -ur linux-2.6.10-rc3/include/linux/etherdevice.h linux-2.6.10-rc3-mrs/include/linux/etherdevice.h > --- linux-2.6.10-rc3/include/linux/etherdevice.h 2004-10-19 07:25:06.000000000 +0930 > +++ linux-2.6.10-rc3-mrs/include/linux/etherdevice.h 2004-12-18 23:06:27.000000000 +1030 > @@ -78,6 +78,18 @@ > addr [0] &= 0xfe; /* clear multicast bit */ > addr [0] |= 0x02; /* set local assignment bit (IEEE802) */ > } > + > +/** > + * locally_assigned_ether_addr - Generate locally assigned Ethernet address > + * using an occasionally generated random local OUI, and then an incrementing > + * serial number > + * @addr: Pointer to a six-byte array containing the Ethernet address > + * > + * Mark Smith > + */ > + > +extern void locally_assigned_ether_addr(u8 *addr); > + > #endif > > 
#endif /* _LINUX_ETHERDEVICE_H */ > diff -ur linux-2.6.10-rc3/include/linux/if_ether.h linux-2.6.10-rc3-mrs/include/linux/if_ether.h > --- linux-2.6.10-rc3/include/linux/if_ether.h 2004-10-19 07:24:37.000000000 +0930 > +++ linux-2.6.10-rc3-mrs/include/linux/if_ether.h 2004-12-18 19:02:19.000000000 +1030 > @@ -27,6 +27,7 @@ > */ > > #define ETH_ALEN 6 /* Octets in one ethernet addr */ > +#define ETH_OUILEN 3 /* Octets in OUI part of eth addr */ > #define ETH_HLEN 14 /* Total octets in header. */ > #define ETH_ZLEN 60 /* Min. octets in frame sans FCS */ > #define ETH_DATA_LEN 1500 /* Max. octets in payload */ > diff -ur linux-2.6.10-rc3/Makefile linux-2.6.10-rc3-mrs/Makefile > --- linux-2.6.10-rc3/Makefile 2004-12-18 23:00:44.000000000 +1030 > +++ linux-2.6.10-rc3-mrs/Makefile 2004-12-18 17:28:45.000000000 +1030 > @@ -1,7 +1,7 @@ > VERSION = 2 > PATCHLEVEL = 6 > SUBLEVEL = 10 > -EXTRAVERSION =-rc3 > +EXTRAVERSION =-rc3-mrs > NAME=Woozy Numbat > > # *DOCUMENTATION* > diff -ur linux-2.6.10-rc3/net/ethernet/eth.c linux-2.6.10-rc3-mrs/net/ethernet/eth.c > --- linux-2.6.10-rc3/net/ethernet/eth.c 2004-12-18 23:01:48.000000000 +1030 > +++ linux-2.6.10-rc3-mrs/net/ethernet/eth.c 2004-12-18 23:09:01.000000000 +1030 > @@ -306,3 +306,67 @@ > return alloc_netdev(sizeof_priv, "eth%d", ether_setup); > } > EXPORT_SYMBOL(alloc_etherdev); > + > +/** > + * locally_assigned_ether_addr - Generate locally assigned Ethernet address > + * using an occasionally generated random local OUI, then an incrementing > + * serial number > + * @addr: Pointer to a six-byte array containing the Ethernet address > + * > + * Make sure that random OUI has multicast bit reset, and has locally > + * assigned bit set. Note that the random OUI is very occasionally generated ie. > + * most of the time, it will be the same for calls to this function. > + * All interfaces on host will then have nearly the same OUI, and only > + * vary in their serial number. 
> + * This will make identifying a single Linux host with multiple > + * generated MAC addresses easier in things such as ARP tables > + * and bridge / switch station caches. > + * > + * This probably should be called within a lock / semaphore; I know what they > + * are, I know what they're for, I just don't know how to make sure they > + * are being used or how to check for them yet :-( > + * Oh well, you've got to start somewhere. > + * > + * Mark Smith > + */ > + > +void locally_assigned_ether_addr(u8 *addr) > +{ > + > + static u8 random_OUI[ETH_OUILEN]; /* Just FYI, OUIs are 3 octets */ > + static u8 random_OUI_generated = 0; > + static u16 serial_number = 0; > + u16 serial_num_hton; > + > + if (random_OUI_generated == 0) { > + get_random_bytes(&random_OUI[0], ETH_OUILEN); > + random_OUI[0] &= 0xfe; /* clear multicast bit */ > + random_OUI[0] |= 0x02; /* set local assignment bit (IEEE802) */ > + random_OUI_generated = 1; > + } > + > + /* Copy OUI into addr */ > + memcpy(addr, random_OUI, ETH_OUILEN); > + > + /* now the 4th octet */ > + addr[3] = 0x00; > + > + serial_number++; > + serial_num_hton = htons(serial_number); > + /* now octets 5 and 6 */ > + memcpy(&addr[4], &serial_num_hton, sizeof(u16)); > + > + /* We might have run out of serial numbers, so pick a new OUI > + * next time we're called. > + * (Unlikely, supposedly we've now generated at 2^16 local > + * addresses. Still, need to handle this corner case, there are some > + * crazy networking people out there, must be caffeine poisoning.) 
*/ > + > + if (serial_number == 0xffff) { > + random_OUI_generated = 0; > + serial_number = 0; > + } > + > +} > + > +EXPORT_SYMBOL(locally_assigned_ether_addr); -- From acme@conectiva.com.br Sat Dec 18 20:38:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 20:38:39 -0800 (PST) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJ4cAVw032677 for ; Sat, 18 Dec 2004 20:38:31 -0800 Received: by perninha.conectiva.com.br (Postfix, from userid 568) id 520DE4745A; Sun, 19 Dec 2004 02:37:45 -0200 (BRST) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id 043DE47454 for ; Sun, 19 Dec 2004 02:37:45 -0200 (BRST) Received: (qmail 3009 invoked by uid 0); 19 Dec 2004 05:34:03 -0000 Received: from mapi8.distro.conectiva (HELO oops.ghostprotocols.net) (10.0.16.10) by burns.conectiva with SMTP; 19 Dec 2004 05:34:03 -0000 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id AE7A51463E; Sun, 19 Dec 2004 02:37:40 -0200 (BRST) Message-ID: <41C50633.1010102@conectiva.com.br> Date: Sun, 19 Dec 2004 02:40:19 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: FYI: struct sock class hierarchy Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Bogosity: No, tests=bogofilter, spamicity=0.483429, version=0.16.3 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12864 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev

Dave,

Further info on that struct sock hierarchy stuff I mentioned that I planned on doing while at the netsummit: with the changes I have in one of my pending trees, things are now looking like this:

struct inet_sock {
	struct sock sk;
	struct stream_sock *pssk;
#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
	struct ipv6_pinfo *pinet6;
#endif
	struct inet_opt inet;
};

I.e. inet_sock is a specialization of struct sock (a pointer to an instance of either struct points to the same memory address).

Now tcp:

struct tcp_sock {
	struct inet_sock inet;
	struct tcp_opt tcp;
	struct stream_sock ssk;
};

Now it is tcp_sock that is a specialization of struct inet_sock, which is, in turn, as said above, a specialization of struct sock (a pointer to an instance of any of the three structs points to the same memory address).

And finally struct tcp6_sock:

struct tcp6_sock {
	struct tcp_sock tcp;
	struct ipv6_pinfo inet6;
};

I guess you got the idea:

tcp6_sock <- tcp_sock <- inet_sock <- sock

This was done for struct udp_sock, raw6_sock, sctp_sock, etc.

Using this approach we see clearly how the layouts are organized and the relations among the, ho-hum, classes, i.e. the class hierarchy.

For further eye candy please take a look at:

http://master.kernel.org/~acme/sock_class_hierarchy.ps

That has the complete struct sock class hierarchy subtree for inet protocols, including SCTP, DCCP, raw, raw6, tcp_tw_bucket, etc.
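The "same memory address" property relied on above follows from embedding the parent struct as the first member. A minimal userspace sketch, assuming heavily simplified stand-ins for the kernel structs (the int fields and the helper bodies here are illustrative, not the real definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real structs carry far more state. */
struct sock      { int sk_state; };
struct inet_sock { struct sock sk;        int inet_id;    };
struct tcp_sock  { struct inet_sock inet; int snd_cwnd;   };
struct tcp6_sock { struct tcp_sock tcp;   int flow_label; };

/*
 * Downcasts are plain pointer casts: C guarantees that a pointer to a
 * struct, suitably converted, points to its first member and vice versa.
 * The caller must know the object really is of the derived type.
 */
static inline struct inet_sock *inet_sk(struct sock *sk)
{
	return (struct inet_sock *)sk;
}

static inline struct tcp_sock *tcp_sk(struct sock *sk)
{
	return (struct tcp_sock *)sk;
}

/* Returns 1 if every "class" view of the object shares one address. */
static int same_address(struct tcp6_sock *t6)
{
	struct sock *sk = &t6->tcp.inet.sk;

	return (void *)sk == (void *)t6 &&
	       (void *)inet_sk(sk) == (void *)&t6->tcp.inet &&
	       (void *)tcp_sk(sk) == (void *)&t6->tcp;
}
```

Generic code can therefore pass a struct sock pointer around, and protocol code can recover the derived view with a cast, without any per-protocol copy of the common layout.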
Apart from the introduction of struct stream_sock it is completely equivalent to the current state of things in Linus tree, but much clearer, IMHO, by not having all those cut'n'pasted layouts, complete with the #ifdef IPV6 optimization for when IPv6 is not compiled. BTW, this #ifdef IPV6 is wrong, as it leads to a kernel where IPV6 can't be later compiled and loaded, but this remains a futile exercise while the other #ifdef IPV6 is scattered in the common IPV6/IPV4 code: [acme@oldpandora stream_sock-2.6]$ grep -l IPV6 net/ipv4/*.c | wc -l 6 But this is something we'll possibly address in the future... :-) - Arnaldo From p_gortmaker@yahoo.com Sat Dec 18 23:38:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 18 Dec 2004 23:38:13 -0800 (PST) Received: from gold.muskoka.com (gold.muskoka.com [216.123.107.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJ7bj0o004894 for ; Sat, 18 Dec 2004 23:38:05 -0800 Received: from yahoo.com (ppp73.muskoka.com [216.123.108.83]) by gold.muskoka.com (8.12.3/8.12.3/Debian-6.4) with ESMTP id iBJ7d4Gl019539; Sun, 19 Dec 2004 02:39:26 -0500 Message-ID: <41C52FBD.5080705@yahoo.com> Date: Sun, 19 Dec 2004 02:37:33 -0500 From: Paul Gortmaker User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3.1) Gecko/20030425 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Xose Vazquez Perez CC: marcelo.tosatti@cyclades.com, netdev@oss.sgi.com Subject: Re: Ax88190 specs (was 8390 specs) References: <41C37CAA.3030608@wanadoo.es> In-Reply-To: <41C37CAA.3030608@wanadoo.es> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12865 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: p_gortmaker@yahoo.com Precedence: bulk X-list: netdev I sent Marcelo the relevant pages a while ago. 
If anyone wants to compare the crap-o AX to the NS part, the NS documents in full are at: http://www.national.com/ds.cgi/DP/DP8390D.pdf The AT/LANTIC is the more modern version - the one that could do both NE2000 and wd8013 on pretty much one chip. I believe the 83905 datasheet contains all the core 8390 info in it and may have been more up to date: http://www.national.com/ds.cgi/DP/DP83905.pdf Paul. Xose Vazquez Perez wrote: > Tosatti wrote: > >> I'm having a problem with a Linksys EtherFast card (driven by >> axnet_cs), where the interrupt status reports ENISR_RX_ERR (receive >> error) under load. >> >> I would like to know more details about this status bit, what can >> causes it to be turned on, etc. >> >> Do you know where I can find 8390 specs? >> >> 8390.c mentions >> Sources: >> The National Semiconductor LAN Databook, and the 3Com 3c503 databook. >> >> But I can't find those available in either Google search or the >> National Semiconductor website. >> >> Any information or pointers are welcome, thanks! > > > > Maybe these > > http://www.asix.com.tw/download/Ax88190.pdf > http://www.asix.com.tw/download/Ax88190a.pdf > > can help you. 
More docs at -> > http://www.asix.com.tw/download_datasheet.htm > > regards, From bunk@stusta.de Sun Dec 19 08:08:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 08:09:01 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBJG8U87001472 for ; Sun, 19 Dec 2004 08:08:51 -0800 Received: (qmail 21823 invoked from network); 19 Dec 2004 16:08:01 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 19 Dec 2004 16:08:01 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 63D72BBE01; Sun, 19 Dec 2004 17:07:58 +0100 (CET) Date: Sun, 19 Dec 2004 17:07:58 +0100 From: Adrian Bunk To: Marcel Holtmann Cc: Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List Subject: Re: [2.6 patch] net/bluetooth/: misc possible cleanups Message-ID: <20041219160758.GY21288@stusta.de> References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103009649.2143.65.camel@pegasus> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12866 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev On Tue, Dec 14, 2004 at 08:34:08AM +0100, Marcel Holtmann wrote: > Hi Adrian, Hi Marcel, >... > > Please comment on which of these changes are correct and which conflict > > with pending patches. > > Please send a separate patch for all the RFCOMM changes, because these > conflicts with some pending patches and then it will make it easier for > me to merge them. 
>
> The rest of the changes are fine with me, but I like to see also a
> separate patch for the CMTP stuff and cmtp_send_capimsg() don't need a
> forward declaration. Simply move the function to another place in the
> source code.

Split patches follow as replies to this email.

> Regards
>
> Marcel

cu
Adrian

-- 
"Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

From bunk@stusta.de Sun Dec 19 08:14:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 08:14:26 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBJGDv6F002175 for ; Sun, 19 Dec 2004 08:14:18 -0800 Received: (qmail 22280 invoked from network); 19 Dec 2004 16:13:29 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 19 Dec 2004 16:13:29 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 9ABFFBBE01; Sun, 19 Dec 2004 17:13:24 +0100 (CET) Date: Sun, 19 Dec 2004 17:13:23 +0100 From: Adrian Bunk To: Marcel Holtmann Cc: Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List Subject: [2.6 patch] bluetooth/rfcomm/: make some code static Message-ID: <20041219161323.GZ21288@stusta.de> References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> <20041219160758.GY21288@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041219160758.GY21288@stusta.de> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12867 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk 
X-list: netdev The patch below makes some needlessly global code static. diffstat output: include/net/bluetooth/rfcomm.h | 27 ------------------------ net/bluetooth/rfcomm/core.c | 37 ++++++++++++++++++++++++++------- net/bluetooth/rfcomm/sock.c | 4 +-- 3 files changed, 32 insertions(+), 36 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/bluetooth/rfcomm.h.old 2004-12-14 02:19:37.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/bluetooth/rfcomm.h 2004-12-14 02:27:24.000000000 +0100 @@ -216,22 +216,6 @@ #define RFCOMM_CFC_DISABLED 0 #define RFCOMM_CFC_ENABLED RFCOMM_MAX_CREDITS -extern struct task_struct *rfcomm_thread; -extern unsigned long rfcomm_event; - -static inline void rfcomm_schedule(uint event) -{ - if (!rfcomm_thread) - return; - //set_bit(event, &rfcomm_event); - set_bit(RFCOMM_SCHED_WAKEUP, &rfcomm_event); - wake_up_process(rfcomm_thread); -} - -extern struct semaphore rfcomm_sem; -#define rfcomm_lock() down(&rfcomm_sem); -#define rfcomm_unlock() up(&rfcomm_sem); - /* ---- RFCOMM DLCs (channels) ---- */ struct rfcomm_dlc *rfcomm_dlc_alloc(int prio); void rfcomm_dlc_free(struct rfcomm_dlc *d); @@ -271,11 +255,6 @@ } /* ---- RFCOMM sessions ---- */ -struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state); -struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst); -struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err); -void rfcomm_session_del(struct rfcomm_session *s); -void rfcomm_session_close(struct rfcomm_session *s, int err); void rfcomm_session_getaddr(struct rfcomm_session *s, bdaddr_t *src, bdaddr_t *dst); static inline void rfcomm_session_hold(struct rfcomm_session *s) @@ -283,12 +262,6 @@ atomic_inc(&s->refcnt); } -static inline void rfcomm_session_put(struct rfcomm_session *s) -{ - if (atomic_dec_and_test(&s->refcnt)) - rfcomm_session_del(s); -} - /* ---- RFCOMM chechsum ---- */ extern u8 rfcomm_crc_table[]; --- 
linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/core.c.old 2004-12-14 02:19:43.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/core.c 2004-12-14 02:27:41.000000000 +0100 @@ -61,8 +61,12 @@ struct proc_dir_entry *proc_bt_rfcomm; #endif -struct task_struct *rfcomm_thread; -DECLARE_MUTEX(rfcomm_sem); +static struct task_struct *rfcomm_thread; + +static DECLARE_MUTEX(rfcomm_sem); +#define rfcomm_lock() down(&rfcomm_sem); +#define rfcomm_unlock() up(&rfcomm_sem); + unsigned long rfcomm_event; static LIST_HEAD(session_list); @@ -81,6 +85,10 @@ static void rfcomm_process_connect(struct rfcomm_session *s); +static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err); +static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst); +static void rfcomm_session_del(struct rfcomm_session *s); + /* ---- RFCOMM frame parsing macros ---- */ #define __get_dlci(b) ((b & 0xfc) >> 2) #define __get_channel(b) ((b & 0xf8) >> 3) @@ -111,6 +119,21 @@ #define __get_rpn_stop_bits(line) (((line) >> 2) & 0x1) #define __get_rpn_parity(line) (((line) >> 3) & 0x3) +static inline void rfcomm_schedule(uint event) +{ + if (!rfcomm_thread) + return; + //set_bit(event, &rfcomm_event); + set_bit(RFCOMM_SCHED_WAKEUP, &rfcomm_event); + wake_up_process(rfcomm_thread); +} + +static inline void rfcomm_session_put(struct rfcomm_session *s) +{ + if (atomic_dec_and_test(&s->refcnt)) + rfcomm_session_del(s); +} + /* ---- RFCOMM FCS computation ---- */ /* CRC on 2 bytes */ @@ -458,7 +481,7 @@ } /* ---- RFCOMM sessions ---- */ -struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state) +static struct rfcomm_session *rfcomm_session_add(struct socket *sock, int state) { struct rfcomm_session *s = kmalloc(sizeof(*s), GFP_KERNEL); if (!s) @@ -487,7 +510,7 @@ return s; } -void rfcomm_session_del(struct rfcomm_session *s) +static void rfcomm_session_del(struct rfcomm_session *s) { int state = s->state; @@ -505,7 +528,7 @@ 
module_put(THIS_MODULE); } -struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst) +static struct rfcomm_session *rfcomm_session_get(bdaddr_t *src, bdaddr_t *dst) { struct rfcomm_session *s; struct list_head *p, *n; @@ -521,7 +544,7 @@ return NULL; } -void rfcomm_session_close(struct rfcomm_session *s, int err) +static void rfcomm_session_close(struct rfcomm_session *s, int err) { struct rfcomm_dlc *d; struct list_head *p, *n; @@ -542,7 +565,7 @@ rfcomm_session_put(s); } -struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err) +static struct rfcomm_session *rfcomm_session_create(bdaddr_t *src, bdaddr_t *dst, int *err) { struct rfcomm_session *s = NULL; struct sockaddr_l2 addr; --- linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/sock.c.old 2004-12-14 02:28:14.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/rfcomm/sock.c 2004-12-14 02:28:33.000000000 +0100 @@ -393,7 +393,7 @@ return err; } -int rfcomm_sock_listen(struct socket *sock, int backlog) +static int rfcomm_sock_listen(struct socket *sock, int backlog) { struct sock *sk = sock->sk; int err = 0; @@ -437,7 +437,7 @@ return err; } -int rfcomm_sock_accept(struct socket *sock, struct socket *newsock, int flags) +static int rfcomm_sock_accept(struct socket *sock, struct socket *newsock, int flags) { DECLARE_WAITQUEUE(wait, current); struct sock *sk = sock->sk, *nsk; From bunk@stusta.de Sun Dec 19 08:35:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 08:35:22 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBJGYrtO006574 for ; Sun, 19 Dec 2004 08:35:14 -0800 Received: (qmail 23884 invoked from network); 19 Dec 2004 16:34:25 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 19 Dec 2004 16:34:25 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 4E1C1BBE01; Sun, 19 Dec 2004 17:34:22 
+0100 (CET) Date: Sun, 19 Dec 2004 17:34:21 +0100 From: Adrian Bunk To: Marcel Holtmann Cc: Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List Subject: [2.6 patch] bluetooth/cmtp/capi.c: make a function static Message-ID: <20041219163421.GB21288@stusta.de> References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> <20041219160758.GY21288@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041219160758.GY21288@stusta.de> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12868 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes a needlessly global function static. diffstat output: net/bluetooth/cmtp/capi.c | 28 +++++++++++++--------------- net/bluetooth/cmtp/cmtp.h | 1 - 2 files changed, 13 insertions(+), 16 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/cmtp.h.old 2004-12-18 01:44:36.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/cmtp.h 2004-12-18 02:28:40.000000000 +0100 @@ -120,7 +120,6 @@ void cmtp_detach_device(struct cmtp_session *session); void cmtp_recv_capimsg(struct cmtp_session *session, struct sk_buff *skb); -void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb); static inline void cmtp_schedule(struct cmtp_session *session) { --- linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/capi.c.old 2004-12-18 01:44:43.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/cmtp/capi.c 2004-12-19 17:09:00.000000000 +0100 @@ -139,6 +139,19 @@ return session->msgnum; } +static void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb) +{ + struct cmtp_scb *scb = 
(void *) skb->cb; + + BT_DBG("session %p skb %p len %d", session, skb, skb->len); + + scb->id = -1; + scb->data = (CAPIMSG_COMMAND(skb->data) == CAPI_DATA_B3); + + skb_queue_tail(&session->transmit, skb); + + cmtp_schedule(session); +} static void cmtp_send_interopmsg(struct cmtp_session *session, __u8 subcmd, __u16 appl, __u16 msgnum, @@ -337,21 +350,6 @@ capi_ctr_handle_message(ctrl, appl, skb); } -void cmtp_send_capimsg(struct cmtp_session *session, struct sk_buff *skb) -{ - struct cmtp_scb *scb = (void *) skb->cb; - - BT_DBG("session %p skb %p len %d", session, skb, skb->len); - - scb->id = -1; - scb->data = (CAPIMSG_COMMAND(skb->data) == CAPI_DATA_B3); - - skb_queue_tail(&session->transmit, skb); - - cmtp_schedule(session); -} - - static int cmtp_load_firmware(struct capi_ctr *ctrl, capiloaddata *data) { BT_DBG("ctrl %p data %p", ctrl, data); From bunk@stusta.de Sun Dec 19 08:35:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 08:35:43 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBJGZEjM006582 for ; Sun, 19 Dec 2004 08:35:34 -0800 Received: (qmail 23923 invoked from network); 19 Dec 2004 16:34:45 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailout.stusta.mhn.de with SMTP; 19 Dec 2004 16:34:45 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 01111BBE01; Sun, 19 Dec 2004 17:34:42 +0100 (CET) Date: Sun, 19 Dec 2004 17:34:42 +0100 From: Adrian Bunk To: Marcel Holtmann Cc: Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List Subject: [2.6 patch] bluetooth: make some code static Message-ID: <20041219163442.GC21288@stusta.de> References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> <20041219160758.GY21288@stusta.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20041219160758.GY21288@stusta.de> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12869 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev The patch below makes some needlessly global code static. diffstat output: include/net/bluetooth/hci_core.h | 2 -- net/bluetooth/hci_conn.c | 2 +- net/bluetooth/hci_core.c | 4 ++-- net/bluetooth/hci_sock.c | 10 +++++----- net/bluetooth/l2cap.c | 2 +- 5 files changed, 9 insertions(+), 11 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc3-mm1-full/include/net/bluetooth/hci_core.h.old 2004-12-14 02:13:54.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/include/net/bluetooth/hci_core.h 2004-12-14 02:31:34.000000000 +0100 @@ -277,7 +277,6 @@ return NULL; } -void hci_acl_connect(struct hci_conn *conn); void hci_acl_disconn(struct hci_conn *conn, __u8 reason); void hci_add_sco(struct hci_conn *conn, __u16 handle); @@ -589,6 +583,5 @@ #define hci_req_unlock(d) up(&d->req_lock) void hci_req_complete(struct hci_dev *hdev, int result); -void hci_req_cancel(struct hci_dev *hdev, int err); #endif /* __HCI_CORE_H */ --- linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_conn.c.old 2004-12-14 02:14:10.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_conn.c 2004-12-14 02:14:15.000000000 +0100 @@ -53,7 +53,7 @@ #define BT_DBG(D...) 
#endif -void hci_acl_connect(struct hci_conn *conn) +static void hci_acl_connect(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev; struct inquiry_entry *ie; --- linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_core.c.old 2004-12-14 02:14:53.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_core.c 2004-12-14 02:31:09.000000000 +0100 @@ -59,7 +59,7 @@ static void hci_tx_task(unsigned long arg); static void hci_notify(struct hci_dev *hdev, int event); -rwlock_t hci_task_lock = RW_LOCK_UNLOCKED; +static rwlock_t hci_task_lock = RW_LOCK_UNLOCKED; /* HCI device list */ LIST_HEAD(hci_dev_list); @@ -106,7 +106,7 @@ } } -void hci_req_cancel(struct hci_dev *hdev, int err) +static void hci_req_cancel(struct hci_dev *hdev, int err) { BT_DBG("%s err 0x%2.2x", hdev->name, err); --- linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_sock.c.old 2004-12-14 02:16:59.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/hci_sock.c 2004-12-14 02:17:59.000000000 +0100 @@ -447,7 +447,7 @@ goto done; } -int hci_sock_setsockopt(struct socket *sock, int level, int optname, char __user *optval, int len) +static int hci_sock_setsockopt(struct socket *sock, int level, int optname, char __user *optval, int len) { struct hci_ufilter uf = { .opcode = 0 }; struct sock *sk = sock->sk; @@ -514,7 +514,7 @@ return err; } -int hci_sock_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) +static int hci_sock_getsockopt(struct socket *sock, int level, int optname, char __user *optval, int __user *optlen) { struct hci_ufilter uf; struct sock *sk = sock->sk; @@ -567,7 +567,7 @@ return 0; } -struct proto_ops hci_sock_ops = { +static struct proto_ops hci_sock_ops = { .family = PF_BLUETOOTH, .owner = THIS_MODULE, .release = hci_sock_release, @@ -647,13 +647,13 @@ return NOTIFY_DONE; } -struct net_proto_family hci_sock_family_ops = { +static struct net_proto_family hci_sock_family_ops = { .family = PF_BLUETOOTH, .owner = THIS_MODULE, .create = 
hci_sock_create, }; -struct notifier_block hci_sock_nblock = { +static struct notifier_block hci_sock_nblock = { .notifier_call = hci_sock_dev_event }; --- linux-2.6.10-rc3-mm1-full/net/bluetooth/l2cap.c.old 2004-12-14 02:18:13.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/net/bluetooth/l2cap.c 2004-12-14 02:18:21.000000000 +0100 @@ -61,7 +61,7 @@ static struct proto_ops l2cap_sock_ops; -struct bt_sock_list l2cap_sk_list = { +static struct bt_sock_list l2cap_sk_list = { .lock = RW_LOCK_UNLOCKED }; From hadi@cyberus.ca Sun Dec 19 11:30:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 11:30:15 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJJTkn0012744 for ; Sun, 19 Dec 2004 11:30:06 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cg6k5-0008Hy-Lu for netdev@oss.sgi.com; Sun, 19 Dec 2004 14:29:21 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg6jz-0005bS-Jy; Sun, 19 Dec 2004 14:29:15 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Spatzier Cc: "David S. 
Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, Paul Jakma In-Reply-To: References: Content-Type: text/plain Organization: jamalopolous Message-Id: <1103484552.1046.155.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 14:29:12 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12870 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-15 at 10:03, Thomas Spatzier wrote: > jamal wrote on 15.12.2004 14:50:27: > > > When you netif_stop_queue you should never receive packets anymore > > at the device level. If you receive any its a bug and you should drop > > them and bitch violently. In other words i think what you have at the > > moment is bandaid not the solution. > > When we do a netif_stop_queue, we do not get any more packets. > So this behaviour is ok. The problem is that the socket write > queues fill up then and get blocked as soon as they are full. > This is the strange part. Anyone still willing to provide a sample program that hangs? > > Can you describe how your driver uses the netif_start/stop/wake > > interfaces? > > Before the patch we are talking about: > When we detect a cable pull (or something like this) we call > netif_stop_queue and set switch off the IFF_RUNNING flag. > Then when we detect that the cable is plugged in again, we > call netif_wake_queue and switch the IFF_RUNNING flag on. > Not too bad except user space doesnt get async notification. > And with the patch: > On cable pull we switch off IFF_RUNNING and call > netif_carrier_off. We still get packets but drop them. > When the cable is plugged in we switch on IFF_RUNNING and > call netif_carrier_on. Ok, thats something you need to change. 
Why are you setting that IFF_RUNNING flag? Look at e1000; they do it right there.

On link up:
    netif_carrier_on(netdev);
    netif_wake_queue(netdev);

On link down:
    netif_carrier_off(netdev); // wonder if these need swapping
    netif_stop_queue(netdev);

Still, I think we need to resolve the original problem. And I have a feeling we won't be seeing any program which reproduces it ;->

cheers,
jamal

From hadi@cyberus.ca Sun Dec 19 11:32:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 11:32:58 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJJWUQc013282 for ; Sun, 19 Dec 2004 11:32:50 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Cg6mZ-00020V-85 for netdev@oss.sgi.com; Sun, 19 Dec 2004 14:31:55 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg6ly-0005v3-5l; Sun, 19 Dec 2004 14:31:18 -0500 Subject: Re: LLTX and netif_stop_queue From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: Roland Dreier , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <20041217214432.07b7b21e.davem@davemloft.net> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103484675.1050.158.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 14:31:15 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12871 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sat, 2004-12-18 at 00:44, David S.
Miller wrote: > Perhaps one way to fix this is to add a pointer to a spinlock to > the netdev struct, and have hold that the upper level grab that > when NETIF_F_LLTX when doing queue state checks. Actually, that > could end up being racy too. How about releasing the qlock only when the LLTX transmit lock is grabbed? That should bring it to par with what it was originally. cheers, jamal From hadi@cyberus.ca Sun Dec 19 11:34:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 11:34:51 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJJYPnp013787 for ; Sun, 19 Dec 2004 11:34:45 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cg6oa-0003MQ-On for netdev@oss.sgi.com; Sun, 19 Dec 2004 14:34:00 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg6oY-0006Ja-Ps; Sun, 19 Dec 2004 14:33:59 -0500 Subject: Re: LLTX and netif_stop_queue From: jamal Reply-To: hadi@cyberus.ca To: Roland Dreier Cc: "David S. 
Miller" , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <528y7vobze.fsf@topspin.com> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <528y7vobze.fsf@topspin.com> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103484835.1050.160.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 14:33:55 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12872 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sat, 2004-12-18 at 10:35, Roland Dreier wrote: > By the way, am I correct in thinking that the use of xmit_lock_owner > in qdisc_restart() is racy? No. > if (!spin_trylock(&dev->xmit_lock)) { > /* get the lock */ > > if (!spin_trylock(&dev->xmit_lock)) { > /* fail */ > if (dev->xmit_lock_owner == smp_processor_id()) { > /* test the wrong value */ > > /* set the value too late */ > dev->xmit_lock_owner = smp_processor_id(); > The setting is protected by the queue lock. So no race. 
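To spell that out with a minimal userspace sketch (hypothetical names throughout: `struct fake_dev`, `try_grab_driver()`, and an explicit `cpu` argument standing in for `smp_processor_id()`). It models only the trylock/owner bookkeeping and assumes the caller holds the queue lock across the whole call; it is not the real `qdisc_restart()`:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the net_device fields discussed above. */
struct fake_dev {
    atomic_flag xmit_lock;  /* models dev->xmit_lock, used as a trylock */
    int xmit_lock_owner;    /* models dev->xmit_lock_owner, -1 = nobody */
};

enum grab_result { GRABBED, CPU_COLLISION, DEAD_LOOP };

void fake_dev_init(struct fake_dev *dev)
{
    atomic_flag_clear(&dev->xmit_lock);
    dev->xmit_lock_owner = -1;
}

/*
 * Assumption: the caller already holds the queue lock, so no other CPU
 * can enter this function between the trylock and the owner store.
 */
enum grab_result try_grab_driver(struct fake_dev *dev, int cpu)
{
    assert(cpu >= 0);   /* -1 is reserved for "no owner" */

    if (!atomic_flag_test_and_set(&dev->xmit_lock)) {
        /* The lock was free and is now ours: record the owner. */
        dev->xmit_lock_owner = cpu;
        return GRABBED;
    }
    /* Lock is busy: either we recursed into ourselves or another CPU has it. */
    return dev->xmit_lock_owner == cpu ? DEAD_LOOP : CPU_COLLISION;
}

void release_driver(struct fake_dev *dev)
{
    dev->xmit_lock_owner = -1;
    atomic_flag_clear(&dev->xmit_lock);
}
```

With the queue lock held around `try_grab_driver()`, a second CPU cannot test `xmit_lock_owner` between the first CPU's trylock and its owner store.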
cheers, jamal From hadi@cyberus.ca Sun Dec 19 11:55:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 11:55:51 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJJtMjX014864 for ; Sun, 19 Dec 2004 11:55:43 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1Cg78r-0007wf-Tr for netdev@oss.sgi.com; Sun, 19 Dec 2004 14:54:57 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg78o-0000GC-1n; Sun, 19 Dec 2004 14:54:54 -0500 Subject: Re: LLTX and netif_stop_queue From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: Roland Dreier , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <1103484675.1050.158.camel@jzny.localdomain> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> Content-Type: multipart/mixed; boundary="=-WbwKvjCEiZp2XREqtXF+" Organization: jamalopolous Message-Id: <1103486090.1047.166.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 14:54:51 -0500 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12873 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-WbwKvjCEiZp2XREqtXF+ Content-Type: text/plain Content-Transfer-Encoding: 7bit On Sun, 2004-12-19 at 14:31, jamal wrote: > How about releasing the qlock only when the LLTX transmit lock is > grabbed? That should bring it to par with what it was originally. Something like two attached patches... one showing how to do it for e1000. 
untested (not even compiled) cheers, jamal --=-WbwKvjCEiZp2XREqtXF+ Content-Disposition: attachment; filename=p1 Content-Type: text/plain; name=p1; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- a/net/sched/bak.sch_generic.c 2004-12-19 13:46:19.799356432 -0500 +++ b/net/sched/sch_generic.c 2004-12-19 13:49:14.384815408 -0500 @@ -128,12 +128,11 @@ } /* Remember that the driver is grabbed by us. */ dev->xmit_lock_owner = smp_processor_id(); + /* And release queue */ + spin_unlock(&dev->queue_lock); } { - /* And release queue */ - spin_unlock(&dev->queue_lock); - if (!netif_queue_stopped(dev)) { int ret; if (netdev_nit) @@ -141,15 +140,14 @@ ret = dev->hard_start_xmit(skb, dev); if (ret == NETDEV_TX_OK) { + dev->xmit_lock_owner = -1; if (!nolock) { - dev->xmit_lock_owner = -1; spin_unlock(&dev->xmit_lock); } spin_lock(&dev->queue_lock); return -1; } if (ret == NETDEV_TX_LOCKED && nolock) { - spin_lock(&dev->queue_lock); goto collision; } } --=-WbwKvjCEiZp2XREqtXF+ Content-Disposition: attachment; filename=p2 Content-Type: text/plain; name=p2; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- a/drivers/net/e1000/b.e1000_main.c 2004-12-19 13:50:59.481838232 -0500 +++ b/drivers/net/e1000/e1000_main.c 2004-12-19 13:53:34.326298296 -0500 @@ -1809,6 +1809,10 @@ return NETDEV_TX_LOCKED; } + /* new from sch_generic for LLTX */ + spin_unlock(&dev->queue_lock); + dev->xmit_lock_owner = smp_processor_id(); + /* need: count + 2 desc gap to keep tail from touching * head, otherwise try next time */ if(E1000_DESC_UNUSED(&adapter->tx_ring) < count + 2) { --=-WbwKvjCEiZp2XREqtXF+-- From hadi@znyx.com Sun Dec 19 12:03:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 12:03:52 -0800 (PST) Received: from lotus.znyx.com (znx208-2-156-007.znyx.com [208.2.156.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJK3Ic9015582 for ; Sun, 19 Dec 2004 12:03:38 -0800 Received: from [127.0.0.1] ([208.2.156.2]) by lotus.znyx.com (Lotus Domino Release 5.0.11) with 
ESMTP id 2004121912054736:82284 ; Sun, 19 Dec 2004 12:05:47 -0800 Subject: Re: LLTX and netif_stop_queue From: Jamal Hadi Salim Reply-To: hadi@znyx.com To: "David S. Miller" Cc: Roland Dreier , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <1103486090.1047.166.camel@jzny.localdomain> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <1103486090.1047.166.camel@jzny.localdomain> Organization: ZNYX Networks Message-Id: <1103486544.1049.169.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 15:02:24 -0500 X-MIMETrack: Itemize by SMTP Server on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 12/19/2004 12:05:47 PM, Serialize by Router on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 12/19/2004 12:06:37 PM, Serialize complete at 12/19/2004 12:06:37 PM Content-Type: multipart/mixed; boundary="=-G0Wb++BCRuJsoWHykTou" X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12874 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@znyx.com Precedence: bulk X-list: netdev --=-G0Wb++BCRuJsoWHykTou Content-Transfer-Encoding: 7bit Content-Type: text/plain On Sun, 2004-12-19 at 14:54, jamal wrote: > On Sun, 2004-12-19 at 14:31, jamal wrote: > > > How about releasing the qlock only when the LLTX transmit lock is > > grabbed? That should bring it to par with what it was originally. > > Something like two attached patches... one showing how to do it for > e1000. untested (not even compiled) After attempting to compile, p2 take2. 
cheers, jamal --=-G0Wb++BCRuJsoWHykTou Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename=p2-take2 Content-Type: text/plain; name=p2-take2; charset=ISO-8859-1 --- 2610-rc3-bk12/drivers/net/e1000/bak.e1000_main.c 2004-12-19 13:50:59.000000000 -0500 +++ 2610-rc3-bk12/drivers/net/e1000/e1000_main.c 2004-12-19 13:58:17.000000000 -0500 @@ -1809,6 +1809,10 @@ return NETDEV_TX_LOCKED; } + /* new from sch_generic for LLTX */ + spin_unlock(&netdev->queue_lock); + netdev->xmit_lock_owner = smp_processor_id(); + /* need: count + 2 desc gap to keep tail from touching * head, otherwise try next time */ if(E1000_DESC_UNUSED(&adapter->tx_ring) < count + 2) { --=-G0Wb++BCRuJsoWHykTou-- From hadi@cyberus.ca Sun Dec 19 12:19:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 12:19:41 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJKJDiD019771 for ; Sun, 19 Dec 2004 12:19:33 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1Cg7Vy-0006P7-6o for netdev@oss.sgi.com; Sun, 19 Dec 2004 15:18:50 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg7Vo-0002hs-Jt; Sun, 19 Dec 2004 15:18:40 -0500 Subject: Re: primary and secondary ip addresses From: jamal Reply-To: hadi@cyberus.ca To: Henrik Nordstrom Cc: "David S. 
Miller" , Andrea G Forte , hasso@estpak.ee, laforge@gnumonks.org, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103487517.1047.181.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 15:18:37 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12875 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev

On Fri, 2004-12-17 at 14:48, Henrik Nordstrom wrote:
> On Fri, 17 Dec 2004, David S. Miller wrote:
>
> > By definition, a secondary IP address on an interface is not to be used
> > as a source.
>
> But you can, by adding a route with such address as source or
> applications explicitly binding to this source address.

Even when not bound, a secondary address could be used as src if within a fitting scope+mask.

> And it is
> highly useful to be able to use different source addresses in the same
> subnet for different purposes.

You can. You can also have multiple primary addresses each on different subnets and scope. And each primary can have multiple secondary addresses.

> > It is the whole reason for the distinction between primary and secondary
> > IP addresses, and it is why all secondaries are deleted once the primary
> > is removed (because there are no valid source addresses to choose from
> > any longer, therefore IP valid communications are no longer possible).
>
> Which is a false assumption in very many situations.
The operative term is "IP valid communications are no longer possible". When you attach an IP address for the first time on a port/interface, that's a signal "IP communication using this device is now Valid". It's like ifconfiging up a device, but only for IP processing purposes. When you delete the IP address that created that signal, i.e. the primary address/first address attached, you are signaling "IP communication using this device is now no longer Valid". That's why those secondary addresses are deleted.

Someone please take note of this somewhere in some FAQ since it has been an issue of contention for a long time. A routing protocol implementation MUST take the above into consideration.

Having said the above, I think it would make sense to have a "promotion" scheme so that in the case a primary address is deleted, one could promote the next secondary address in line. But that should be optional. Now where is the fireman who wants to do this? I could help cheering since I know the code.

cheers,
jamal

From hadi@cyberus.ca Sun Dec 19 12:20:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 12:20:30 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJKK2I6019955 for ; Sun, 19 Dec 2004 12:20:23 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cg7Wk-0005XG-7t for netdev@oss.sgi.com; Sun, 19 Dec 2004 15:19:38 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg6f6-0004uI-Kb; Sun, 19 Dec 2004 14:24:12 -0500 Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , "David S.
Miller" , netdev@oss.sgi.com In-Reply-To: <41C05B60.6030504@trash.net> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103484249.1046.143.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 14:24:09 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12876 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev

On Wed, 2004-12-15 at 10:42, Patrick McHardy wrote:
> I value regression testing, but I prefer to use my eyes first. Since
> this problem is not related to the policer oops fix it doesn't
> convince me that my time would have been well invested doing the
> tests you described.

But it is _absolutely_ related to the policer oops. If those tests were run to begin with there would be no oops nor this latest problem. Hopefully with the regression tests in place this will get better.

[You fear Murphy less than I, and that's a style difference. Your style is actually more effective in Linux because you can distribute the burden onto users. As a matter of fact it is within Dave's tolerance range (but not mine[1]). So you should do just fine]

cheers,
jamal

[1] I am not saying anyone can write perfect code (Alexey was close;->), but if there are known things to check for, such as those I proposed (and have been checking on for a long time), trusting your eyes may be insufficient.
From hadi@cyberus.ca Sun Dec 19 12:24:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 12:24:45 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJKOHZ9020924 for ; Sun, 19 Dec 2004 12:24:37 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1Cg7as-0000N5-J5 for netdev@oss.sgi.com; Sun, 19 Dec 2004 15:23:54 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg7ao-0003G0-PA; Sun, 19 Dec 2004 15:23:51 -0500 Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041218170017.GH17998@postel.suug.ch> References: <20041218170017.GH17998@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103487827.1048.188.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 15:23:47 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12877 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev If the qdisc at that level muddies the packet thats fair game - thats what goes out on the wire. So we should leave the code as is. cheers, jamal On Sat, 2004-12-18 at 12:00, Thomas Graf wrote: > Dave, > > Correctly handle shared and cloned skbs by copying them before writing > and dequeue unwriteable skbs unchanged. Assumes that IP/IPv6 header > is always linear so no pulling required. 
> > Signed-off-by: Thomas Graf > > --- linux-2.6.10-rc3-bk12.orig/net/sched/sch_dsmark.c 2004-12-18 15:16:29.000000000 +0100 > +++ linux-2.6.10-rc3-bk12/net/sched/sch_dsmark.c 2004-12-18 16:24:58.000000000 +0100 > @@ -195,7 +195,8 @@ > > D2PRINTK("dsmark_enqueue(skb %p,sch %p,[qdisc %p])\n",skb,sch,p); > if (p->set_tc_index) { > - /* FIXME: Safe with non-linear skbs? --RR */ > + /* Safe with non-linear skbs? --RR > + IP/IPv6 header is always linear --TGR */ > switch (skb->protocol) { > case __constant_htons(ETH_P_IP): > skb->tc_index = ipv4_get_dsfield(skb->nh.iph); > @@ -250,6 +251,34 @@ > return ret; > } > > +static inline int > +dsmark_make_writeable(struct sk_buff **pskb, int offset) > +{ > + struct sk_buff *nskb; > + > + if ((offset + (*pskb)->mac_len) > (*pskb)->len) > + return 0; > + > + if (skb_shared(*pskb) || skb_cloned(*pskb)) > + goto copy_skb; > + > + /* IP/IPv6 header is always linear, no need to pull */ > + return 1; > + > +copy_skb: > + nskb = skb_copy(*pskb, GFP_ATOMIC); > + if (!nskb) > + return 0; > + BUG_ON(skb_is_nonlinear(nskb)); > + > + /* Rest of kernel will get very unhappy if we pass it a > + suddenly-orphaned skbuff */ > + if ((*pskb)->sk) > + skb_set_owner_w(nskb, (*pskb)->sk); > + kfree_skb(*pskb); > + *pskb = nskb; > + return 1; > +} > > static struct sk_buff *dsmark_dequeue(struct Qdisc *sch) > { > @@ -266,10 +295,14 @@ > D2PRINTK("index %d->%d\n",skb->tc_index,index); > switch (skb->protocol) { > case __constant_htons(ETH_P_IP): > + if (!dsmark_make_writeable(&skb, sizeof(struct iphdr))) > + goto unwriteable; > ipv4_change_dsfield(skb->nh.iph, > p->mask[index],p->value[index]); > break; > case __constant_htons(ETH_P_IPV6): > + if (!dsmark_make_writeable(&skb, sizeof(struct ipv6hdr))) > + goto unwriteable; > ipv6_change_dsfield(skb->nh.ipv6h, > p->mask[index],p->value[index]); > break; > @@ -280,12 +313,17 @@ > * and don't need yet another qdisc as a bypass. 
> */ > if (p->mask[index] != 0xff || p->value[index]) > - printk(KERN_WARNING "dsmark_dequeue: " > - "unsupported protocol %d\n", > - htons(skb->protocol)); > + if (net_ratelimit()) > + printk(KERN_WARNING "dsmark_dequeue: " > + "unsupported protocol %d\n", > + htons(skb->protocol)); > break; > }; > return skb; > +unwriteable: > + if (net_ratelimit()) > + printk(KERN_WARNING "dsmark_dequeue: skb not writable\n"); > + return skb; > } > > > > From tgraf@suug.ch Sun Dec 19 12:31:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 12:31:23 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJKUttB021569 for ; Sun, 19 Dec 2004 12:31:16 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 8A2B9F; Sun, 19 Dec 2004 21:30:08 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 5998C1C0EA; Sun, 19 Dec 2004 21:30:50 +0100 (CET) Date: Sun, 19 Dec 2004 21:30:50 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: Fix cls indev validation Message-ID: <20041219203050.GK17998@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12878 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Dave, This patch is actually part of a patchset for 2.6.11 inclusion but the memory corruption possibility might make it worth for 2.6.10. It's not really dangerous since it requires admin capabilities though. 
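To make the memory corruption possibility concrete, here is a minimal userspace sketch of a length-bounded name copy. The helper name `copy_indev()` is hypothetical, `payload`/`len` stand in for `RTA_DATA()`/`RTA_PAYLOAD()`, and `IFNAMSIZ` is defined locally with the kernel's value of 16. Unlike a `sprintf("%s")`, the copy stops at the attribute length rather than at the first NUL byte it happens to find:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define IFNAMSIZ 16  /* same value as the kernel's IFNAMSIZ */

/*
 * Hypothetical helper: copy an interface name out of an attribute payload
 * that may or may not be NUL terminated. Returns 0 on success, -1 if the
 * payload cannot fit in an IFNAMSIZ buffer.
 */
int copy_indev(char indev[IFNAMSIZ], const void *payload, size_t len)
{
    assert(indev != NULL && payload != NULL);

    if (len > IFNAMSIZ)
        return -1;               /* reject before touching the destination */

    memset(indev, 0, IFNAMSIZ);
    memcpy(indev, payload, len);  /* bounded by the payload length, not by a NUL */
    indev[IFNAMSIZ - 1] = '\0';   /* terminate even if the payload was not */
    return 0;
}
```

Note that an unterminated payload of exactly `IFNAMSIZ` bytes is accepted and silently truncated to `IFNAMSIZ - 1` characters by the forced terminator.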
Puts the indev validation into its own function to allow parameter validation before any changes are made. Changes the sanity check from >= IFNAMSIZ to > IFNAMSIZ to correctly handle 0 terminated strings and replaces the dangerous sprintf with a memcpy bound to the TLV size. Providing an indev TLV for kernels without CONFIG_NET_CLS_IND support will now lead to EOPNOTSUPP.

The sprintf could have been abused to overwrite memory after the indev attribute if the TLV data was not 0 terminated. (Bound to a few limitations, e.g. it would have to contain an rta header at RTA_ALIGN(IFNAMSIZ)+1 etc.)

Signed-off-by: Thomas Graf
--- linux-2.6.10-rc3-bk7.orig/include/net/pkt_cls.h 2004-12-14 14:24:04.000000000 +0100 +++ linux-2.6.10-rc3-bk7/include/net/pkt_cls.h 2004-12-14 14:54:41.000000000 +0100 @@ -160,19 +160,25 @@ #ifdef CONFIG_NET_CLS_IND static inline int -tcf_change_indev(struct tcf_proto *tp, char *indev, struct rtattr *indev_tlv) +tcf_validate_indev(struct rtattr *indev_tlv) { - if (RTA_PAYLOAD(indev_tlv) >= IFNAMSIZ) { - printk("cls: bad indev name %s\n", (char *) RTA_DATA(indev_tlv)); + if (indev_tlv && RTA_PAYLOAD(indev_tlv) > IFNAMSIZ) { + if (net_ratelimit()) + printk("cls: bad indev name %s\n", (char *) RTA_DATA(indev_tlv)); return -EINVAL; } - memset(indev, 0, IFNAMSIZ); - sprintf(indev, "%s", (char *) RTA_DATA(indev_tlv)); - return 0; } +static inline void +tcf_change_indev(struct tcf_proto *tp, char *indev, struct rtattr *id_tlv) +{ + memset(indev, 0, IFNAMSIZ); + memcpy(indev, RTA_DATA(id_tlv), RTA_PAYLOAD(id_tlv)); + indev[IFNAMSIZ - 1] = '\0'; +} + static inline int tcf_match_indev(struct sk_buff *skb, char *indev) { @@ -185,6 +191,12 @@ return 1; } +#else /* CONFIG_NET_CLS_IND */ +static inline int +tcf_validate_indev(struct rtattr *indev_tlv) +{ + return indev_tlv ?
-EOPNOTSUPP : 0; +} #endif /* CONFIG_NET_CLS_IND */ #ifdef CONFIG_NET_CLS_POLICE --- linux-2.6.10-rc3-bk7.orig/net/sched/cls_u32.c 2004-12-14 14:24:34.000000000 +0100 +++ linux-2.6.10-rc3-bk7/net/sched/cls_u32.c 2004-12-14 14:55:18.000000000 +0100 @@ -488,6 +488,12 @@ struct tc_u_knode *n, struct rtattr **tb, struct rtattr *est) { + int err; + + err = tcf_validate_indev(tb[TCA_U32_INDEV-1]); + if (err < 0) + return err; + if (tb[TCA_U32_LINK-1]) { u32 handle = *(u32*)RTA_DATA(tb[TCA_U32_LINK-1]); struct tc_u_hnode *ht_down = NULL; @@ -535,12 +541,10 @@ } #endif #endif + #ifdef CONFIG_NET_CLS_IND - if (tb[TCA_U32_INDEV-1]) { - int err = tcf_change_indev(tp, n->indev, tb[TCA_U32_INDEV-1]); - if (err < 0) - return err; - } + if (tb[TCA_U32_INDEV-1]) + tcf_change_indev(tp, n->indev, tb[TCA_U32_INDEV-1]); #endif return 0; --- linux-2.6.10-rc3-bk7.orig/net/sched/cls_fw.c 2004-12-14 14:24:34.000000000 +0100 +++ linux-2.6.10-rc3-bk7/net/sched/cls_fw.c 2004-12-14 14:33:50.000000000 +0100 @@ -208,21 +208,24 @@ fw_change_attrs(struct tcf_proto *tp, struct fw_filter *f, struct rtattr **tb, struct rtattr **tca, unsigned long base) { - int err = -EINVAL; + int err; + + err = tcf_validate_indev(tb[TCA_FW_INDEV-1]); + if (err < 0) + goto errout; if (tb[TCA_FW_CLASSID-1]) { - if (RTA_PAYLOAD(tb[TCA_FW_CLASSID-1]) != sizeof(u32)) + if (RTA_PAYLOAD(tb[TCA_FW_CLASSID-1]) != sizeof(u32)) { + err = -EINVAL; goto errout; + } f->res.classid = *(u32*)RTA_DATA(tb[TCA_FW_CLASSID-1]); tcf_bind_filter(tp, &f->res, base); } #ifdef CONFIG_NET_CLS_IND - if (tb[TCA_FW_INDEV-1]) { - err = tcf_change_indev(tp, f->indev, tb[TCA_FW_INDEV-1]); - if (err < 0) - goto errout; - } + if (tb[TCA_FW_INDEV-1]) + tcf_change_indev(tp, f->indev, tb[TCA_FW_INDEV-1]); #endif /* CONFIG_NET_CLS_IND */ #ifdef CONFIG_NET_CLS_ACT From tgraf@suug.ch Sun Dec 19 12:37:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 12:37:12 -0800 (PST) Received: from b.mx.projectdream.org 
(eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJKaiic022144 for ; Sun, 19 Dec 2004 12:37:05 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E214AF; Sun, 19 Dec 2004 21:35:58 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 91B3A1C0EA; Sun, 19 Dec 2004 21:36:41 +0100 (CET) Date: Sun, 19 Dec 2004 21:36:41 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs Message-ID: <20041219203641.GL17998@postel.suug.ch> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103487827.1048.188.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12879 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103487827.1048.188.camel@jzny.localdomain> 2004-12-19 15:23 > If the qdisc at that level muddies the packet thats fair game - thats > what goes out on the wire. So we should leave the code as is. Agreed for egress but I think it is needed for stuff like IMQ. It's debatable whether we should take care of IMQ and alike though. 
From laforge@gnumonks.org Sun Dec 19 13:42:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 13:42:25 -0800 (PST) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJLfuCg024720 for ; Sun, 19 Dec 2004 13:42:17 -0800 Received: from dsl-082-082-097-002.arcor-ip.net ([82.82.97.2] helo=sunbeam.gnumonks.org) by ganesha.gnumonks.org with asmtp (TLS-1.0:RSA_ARCFOUR_SHA:16) (Exim 4.34) id 1Cg8ns-0004Ae-8v; Sun, 19 Dec 2004 22:41:24 +0100 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1Cg8np-0006Fy-0F; Sun, 19 Dec 2004 22:41:21 +0100 Date: Sun, 19 Dec 2004 22:41:20 +0100 From: Harald Welte To: jamal Cc: Henrik Nordstrom , "David S. Miller" , Andrea G Forte , hasso@estpak.ee, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses Message-ID: <20041219214120.GX17302@sunbeam.de.gnumonks.org> References: <41912F7A.6000408@redhat.com> <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> <1103487517.1047.181.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="xi8lRpXYeGgNnUjs" Content-Disposition: inline In-Reply-To: <1103487517.1047.181.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12880 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@gnumonks.org Precedence: bulk X-list: netdev --xi8lRpXYeGgNnUjs Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Sun, Dec 19, 2004 at 
03:18:37PM -0500, jamal wrote: > Having said the above, I think it would make sense to have a "promotion" > scheme so that in the case a primary address is deleted, one could > promote the next secondary address in line. But that should be optional. Oh yes, please. This would save a lot of headache. I'm much in favour of such a proposal. > Now where is the fireman who wants to do this? I could help cheering > since i know the code. how would you think it fits best into the current netlink messages? > cheers, > jamal -- - Harald Welte http://www.gnumonks.org/ ============================================================================ Programming is like sex: One mistake and you have to support it your lifetime --xi8lRpXYeGgNnUjs Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBxfWAXaXGVTD0i/8RAhzGAJ42/15fWwkV1AeR+SECS1JNOp5AUACgswb1 C+lV9lUxoaVp7VW08PfAqXQ= =eAwb -----END PGP SIGNATURE----- --xi8lRpXYeGgNnUjs-- From tgraf@suug.ch Sun Dec 19 14:02:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 14:02:44 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJM2ENa025692 for ; Sun, 19 Dec 2004 14:02:35 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E33D284; Sun, 19 Dec 2004 23:01:28 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 10AD81C0EA; Sun, 19 Dec 2004 23:02:12 +0100 (CET) Date: Sun, 19 Dec 2004 23:02:11 +0100 From: Thomas Graf To: Harald Welte Cc: jamal , 
Henrik Nordstrom , "David S. Miller" , Andrea G Forte , hasso@estpak.ee, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses Message-ID: <20041219220211.GQ17998@postel.suug.ch> References: <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> <1103487517.1047.181.camel@jzny.localdomain> <20041219214120.GX17302@sunbeam.de.gnumonks.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041219214120.GX17302@sunbeam.de.gnumonks.org> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12881 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Harald Welte <20041219214120.GX17302@sunbeam.de.gnumonks.org> 2004-12-19 22:41 > On Sun, Dec 19, 2004 at 03:18:37PM -0500, jamal wrote: > > > Having said the above, I think it would make sense to have a "promotion" > > scheme so that in the case a primary address is deleted, one could > > promote the next secondary address in line. But that should be optional. > > Oh yes, please. This would save a lot of headache. I'm much in favour > of such a proposal. Agreed, would be nice to have. > > Now where is the fireman who wants to do this? I could help cheering > > since i know the code. > > how would you think it fits best into the current netlink messages? 1) IFA_F_PROM_CAND flag and have inet_del_ifa* iterate over its secondary addresses and elect the first with the flag set. 2) IFA_PROM_PRIO TLV of type u32 holding a priority where 0 means no candidate. 
inet_del_ifa* iterates over its secondary addresses and elects the one with the highest prio as new primary address or deletes all addresses if none is found. * respectively the equivalent function of the other address families. Second variant requires more work but is more flexible so it's definitely my favourite. I'm willing to put some effort into this, I'm not familiar with all address families though. From tommy.christensen@tpack.net Sun Dec 19 14:29:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 14:29:42 -0800 (PST) Received: from mail.tpack.net (ip18.tpack.net [213.173.228.18]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBJMTAsO026933 for ; Sun, 19 Dec 2004 14:29:35 -0800 Received: (qmail 9841 invoked from network); 19 Dec 2004 22:28:40 -0000 Received: from dhcp-210.cph.tpack.net (HELO ?172.17.159.11?) (192.168.0.210) by 0 with SMTP; 19 Dec 2004 22:28:40 -0000 Message-ID: <41C600D7.70005@tpack.net> Date: Sun, 19 Dec 2004 23:29:43 +0100 From: Tommy Christensen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803 X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, Paul Jakma Subject: Re: [patch 4/10] s390: network driver. 
References: <1103484552.1046.155.camel@jzny.localdomain> In-Reply-To: <1103484552.1046.155.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12882 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tommy.christensen@tpack.net Precedence: bulk X-list: netdev jamal wrote: > On Wed, 2004-12-15 at 10:03, Thomas Spatzier wrote: > >>jamal wrote on 15.12.2004 14:50:27: >> >> >>>When you netif_stop_queue you should never receive packets anymore >>>at the device level. If you receive any its a bug and you should drop >>>them and bitch violently. In other words i think what you have at the >>>moment is bandaid not the solution. >> >>When we do a netif_stop_queue, we do not get any more packets. >>So this behaviour is ok. The problem is that the socket write >>queues fill up then and get blocked as soon as they are full. >> > > This is the strange part. Anyone still willing to provide a sample > program that hangs? I haven't tried this myself, but surely it can happen when the socket send-buffer is smaller than what can be queued up for transmission: i.e. in the qdisc queue and device DMA ring. And even more so, when sending to multiple devices from one socket. > Look at e1000 they do it right there. > > On link up: > netif_carrier_on(netdev); > netif_wake_queue(netdev); > On Link Down: > netif_carrier_off(netdev); // wonder if these need swapping > netif_stop_queue(netdev); > Well, this is the same as what started this whole thread. I believe that stopping the queue on link-down events is simply bad behavior of the driver. 
-Tommy From roland@topspin.com Sun Dec 19 14:36:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 14:36:39 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJMaBC6027615 for ; Sun, 19 Dec 2004 14:36:31 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 14:35:26 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 14:35:26 -0800 Received: from roland by eddore with local (Exim 4.34) id 1Cg9eA-0002Qu-B8; Sun, 19 Dec 2004 14:35:26 -0800 To: hadi@cyberus.ca Cc: "David S. Miller" , netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> From: Roland Dreier Date: Sun, 19 Dec 2004 14:35:26 -0800 In-Reply-To: <1103484675.1050.158.camel@jzny.localdomain> (jamal's message of "19 Dec 2004 14:31:15 -0500") Message-ID: <52fz21ncgh.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 19 Dec 2004 22:35:26.0720 (UTC) FILETIME=[09446800:01C4E61B] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12883 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev jamal> How about releasing the qlock only when the 
LLTX transmit jamal> lock is grabbed? That should bring it to par with what it jamal> was originally. This seems a little risky. I can't point to a specific deadlock but it doesn't seem right on general principles to unlock in a different order than you nested the locks when acquiring them -- if I understand correctly, you're suggesting lock(queue_lock), lock(tx_lock), unlock(queue_lock), unlock(tx_lock). Thanks, Roland From jgarzik@pobox.com Sun Dec 19 14:44:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 14:44:27 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJMhxGT028304 for ; Sun, 19 Dec 2004 14:44:19 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1Cg9ll-0003Vv-5Y; Sun, 19 Dec 2004 22:43:17 +0000 Message-ID: <41C603F8.9030705@pobox.com> Date: Sun, 19 Dec 2004 17:43:04 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca, "David S. Miller" , Paul Jakma CC: Thomas Spatzier , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com Subject: Re: [patch 4/10] s390: network driver. 
References: <1103484552.1046.155.camel@jzny.localdomain> In-Reply-To: <1103484552.1046.155.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12884 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev jamal wrote: > On Wed, 2004-12-15 at 10:03, Thomas Spatzier wrote: > > >>jamal wrote on 15.12.2004 14:50:27: >> >> >>>When you netif_stop_queue you should never receive packets anymore >>>at the device level. If you receive any its a bug and you should drop >>>them and bitch violently. In other words i think what you have at the >>>moment is bandaid not the solution. >> >>When we do a netif_stop_queue, we do not get any more packets. >>So this behaviour is ok. The problem is that the socket write >>queues fill up then and get blocked as soon as they are full. >> > > > This is the strange part. Anyone still willing to provide a sample > program that hangs? > > >>>Can you describe how your driver uses the netif_start/stop/wake >>>interfaces? >> >>Before the patch we are talking about: >>When we detect a cable pull (or something like this) we call >>netif_stop_queue and set switch off the IFF_RUNNING flag. >>Then when we detect that the cable is plugged in again, we >>call netif_wake_queue and switch the IFF_RUNNING flag on. >> > > > Not too bad except user space doesnt get async notification. > > >>And with the patch: >>On cable pull we switch off IFF_RUNNING and call >>netif_carrier_off. We still get packets but drop them. >>When the cable is plugged in we switch on IFF_RUNNING and >>call netif_carrier_on. > > > Ok, thats something you need to change. > Why you are setting that IFF_RUNNING flag? > Look at e1000 they do it right there. 
> > On link up: > netif_carrier_on(netdev); > netif_wake_queue(netdev); > On Link Down: > netif_carrier_off(netdev); // wonder if these need swapping > netif_stop_queue(netdev); > > Still, I think we need to resolve the original problem. > And I have a feeling we wont be seeing any program which > reproduces it ;-> I've been watching this thread, and am still waiting to see a good, isolated test case. My initial reaction was based on an observation that (a) the proposed s390 change creates a CPU cycle soaker, a /dev/null for skbs. (b) it really sounds like the userland program is doing something broken Even if (b) is not true, the change is unacceptable due to (a) regardless. Jeff From hadi@cyberus.ca Sun Dec 19 14:54:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 14:54:39 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJMsB1K029148 for ; Sun, 19 Dec 2004 14:54:32 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cg9vv-0007jp-I2 for netdev@oss.sgi.com; Sun, 19 Dec 2004 17:53:47 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cg9vs-0000bC-LA; Sun, 19 Dec 2004 17:53:45 -0500 Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041219203641.GL17998@postel.suug.ch> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103496820.1049.204.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 17:53:41 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12885 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-19 at 15:36, Thomas Graf wrote: > * jamal <1103487827.1048.188.camel@jzny.localdomain> 2004-12-19 15:23 > > If the qdisc at that level muddies the packet thats fair game - thats > > what goes out on the wire. So we should leave the code as is. > > Agreed for egress but I think it is needed for stuff like IMQ. It's > debatable whether we should take care of IMQ and alike though. You are right, it may cause an issue on an IMQ like device when we have one in kernel. Hang on to the patch for now is my opinion; "schedulers" should probably not be mucking with packets once they are queued in (dsmark aint really a scheduler). We should talk to Werner to see if we can move that functionality prequeueing ... 
cheers, jamal From hadi@cyberus.ca Sun Dec 19 15:01:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:01:16 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJN0lln029771 for ; Sun, 19 Dec 2004 15:01:07 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CgA28-0006uV-Cv for netdev@oss.sgi.com; Sun, 19 Dec 2004 18:00:12 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgA1U-00018u-Ra; Sun, 19 Dec 2004 17:59:33 -0500 Subject: Re: primary and secondary ip addresses From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Harald Welte , Henrik Nordstrom , "David S. Miller" , Andrea G Forte , hasso@estpak.ee, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <20041219220211.GQ17998@postel.suug.ch> References: <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> <1103487517.1047.181.camel@jzny.localdomain> <20041219214120.GX17302@sunbeam.de.gnumonks.org> <20041219220211.GQ17998@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103497168.1046.218.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 17:59:28 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12886 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev I had something much simpler in mind. Basically, promote the next one in line. 
This would be cleanly backward compatible and would be an improvement over whats there (however medieval it is). Let me see if i can whip up something that at least compiles. Unfortunately i wont have time to chase it to completion of testing until around xmas when i have time off from work. cheers, jamal On Sun, 2004-12-19 at 17:02, Thomas Graf wrote: > * Harald Welte <20041219214120.GX17302@sunbeam.de.gnumonks.org> 2004-12-19 22:41 > > On Sun, Dec 19, 2004 at 03:18:37PM -0500, jamal wrote: > > > > > Having said the above, I think it would make sense to have a "promotion" > > > scheme so that in the case a primary address is deleted, one could > > > promote the next secondary address in line. But that should be optional. > > > > Oh yes, please. This would save a lot of headache. I'm much in favour > > of such a proposal. > > Agreed, would be nice to have. > > > > Now where is the fireman who wants to do this? I could help cheering > > > since i know the code. > > > > how would you think it fits best into the current netlink messages? > > 1) IFA_F_PROM_CAND flag and have inet_del_ifa* iterate over its > secondary addresses and elect the first with the flag set. > > 2) IFA_PROM_PRIO TLV of type u32 holding a priority where 0 means no > candidate. inet_del_ifa* iterates over its secondary addresses and > elects the one with the highest prio as new primary address or > deletes all addresses if none is found. > > * respectively the equivalent function of the other address families. > > Second variant requires more work but is more flexible so it's > definitely my favourite. I'm willing to put some effort into this, > I'm not familiar with all address families though. 
> From hadi@cyberus.ca Sun Dec 19 15:06:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:06:39 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJN6B4j030324 for ; Sun, 19 Dec 2004 15:06:32 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CgA7V-0007Ls-3U for netdev@oss.sgi.com; Sun, 19 Dec 2004 18:05:45 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgA77-0001nh-0Q; Sun, 19 Dec 2004 18:05:21 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Tommy Christensen Cc: Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, Paul Jakma In-Reply-To: <41C600D7.70005@tpack.net> References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103497516.1046.231.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 18:05:17 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12887 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-19 at 17:29, Tommy Christensen wrote: > I haven't tried this myself, but surely it can happen when the > socket send-buffer is smaller than what can be queued up for > transmission: i.e. in the qdisc queue and device DMA ring. Shouldnt this have to do with socket options? If you wish to block while waiting for send, then you should be allowed. 
For a routing protocol that actually is notified that the link went down, it should probably flush those socket buffers at that point. > And even more so, when sending to multiple devices from one socket. Yes, this looks more within reason. I dont know what the answer to this is. But it would be helpful for someone to create a setup that reproduces this. > > Look at e1000 they do it right there. > > > > On link up: > > netif_carrier_on(netdev); > > netif_wake_queue(netdev); > > On Link Down: > > netif_carrier_off(netdev); // wonder if these need swapping > > netif_stop_queue(netdev); > > > > Well, this is the same as what started this whole thread. > I believe that stopping the queue on link-down events is simply > bad behavior of the driver. From what Thomas was saying, this is not what he was doing. Read his email. cheers, jamal From hadi@cyberus.ca Sun Dec 19 15:08:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:08:15 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJN7l0V030811 for ; Sun, 19 Dec 2004 15:08:08 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CgA8v-0001nc-Hl for netdev@oss.sgi.com; Sun, 19 Dec 2004 18:07:13 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgA8K-0001tD-CN; Sun, 19 Dec 2004 18:06:36 -0500 Subject: Re: LLTX and netif_stop_queue From: jamal Reply-To: hadi@cyberus.ca To: Roland Dreier Cc: "David S. 
Miller" , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <52fz21ncgh.fsf@topspin.com> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <52fz21ncgh.fsf@topspin.com> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103497592.1046.235.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 18:06:32 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12888 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-19 at 17:35, Roland Dreier wrote: > jamal> How about releasing the qlock only when the LLTX transmit > jamal> lock is grabbed? That should bring it to par with what it > jamal> was originally. > > This seems a little risky. I can't point to a specific deadlock but > it doesn't seem right on general principles to unlock in a different > order than you nested the locks when acquiring them -- if I understand > correctly, you're suggesting lock(queue_lock), lock(tx_lock), > unlock(queue_lock), unlock(tx_lock). There is no deadlock. Thats exactly how things work. Try the patches i posted. 
cheers, jamal From roland@topspin.com Sun Dec 19 15:17:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:17:49 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJNHLQS031532 for ; Sun, 19 Dec 2004 15:17:42 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 15:16:58 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 15:16:58 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CgAIM-0002Tm-1x; Sun, 19 Dec 2004 15:16:58 -0800 To: hadi@cyberus.ca Cc: "David S. Miller" , netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <52fz21ncgh.fsf@topspin.com> <1103497592.1046.235.camel@jzny.localdomain> From: Roland Dreier Date: Sun, 19 Dec 2004 15:16:58 -0800 In-Reply-To: <1103497592.1046.235.camel@jzny.localdomain> (jamal's message of "19 Dec 2004 18:06:32 -0500") Message-ID: <527jndnaj9.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 19 Dec 2004 23:16:58.0548 (UTC) FILETIME=[D6832F40:01C4E620] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12889 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com 
Precedence: bulk X-list: netdev Roland> This seems a little risky. I can't point to a specific Roland> deadlock but it doesn't seem right on general principles Roland> to unlock in a different order than you nested the locks Roland> when acquiring them -- if I understand correctly, you're Roland> suggesting lock(queue_lock), lock(tx_lock), Roland> unlock(queue_lock), unlock(tx_lock). jamal> There is no deadlock. Thats exactly how things work. Try jamal> the patches i posted. Thinking about it more, I guess it's OK. I was just thinking about the basic general rule that locks need to be acquired/released in LIFO order to avoid deadlock (e.g. lock(A), lock(B), unlock(B), unlock(A)). However in this case unlocking queue_lock after acquiring the driver's tx_lock seems to be OK because the driver does a trylock on tx_lock in the xmit path, so it can't deadlock. At worst the trylock will just fail to get the lock. Thanks, Roland From tgraf@suug.ch Sun Dec 19 15:18:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:18:57 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJNISlG031699 for ; Sun, 19 Dec 2004 15:18:49 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 310E184; Mon, 20 Dec 2004 00:17:42 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 71FD21C0EA; Mon, 20 Dec 2004 00:18:24 +0100 (CET) Date: Mon, 20 Dec 2004 00:18:24 +0100 From: Thomas Graf To: jamal Cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs Message-ID: <20041219231824.GS17998@postel.suug.ch> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> <1103496820.1049.204.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103496820.1049.204.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12890 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103496820.1049.204.camel@jzny.localdomain> 2004-12-19 17:53 > On Sun, 2004-12-19 at 15:36, Thomas Graf wrote: > > * jamal <1103487827.1048.188.camel@jzny.localdomain> 2004-12-19 15:23 > > > If the qdisc at that level muddies the packet thats fair game - thats > > > what goes out on the wire. So we should leave the code as is. > > > > Agreed for egress but I think it is needed for stuff like IMQ. It's > > debatable whether we should take care of IMQ and alike though. > > You are right, it may cause an issue on an IMQ like device when we have > one in kernel. Hang on to the patch for now is my opinion; "schedulers" > should probably not be mucking with packets once they are queued in > (dsmark aint really a scheduler). We should talk to Werner to see if we > can move that functionality prequeueing ... Fine with me, we could make the dscp mapping an action. I doubt that there are many users out there since the ECN bits are still taken into account which doesn't make much sense. Something like the following patch would be required to bring back the old behaviour which relied on the lower two bits being unused. 
--- linux-2.6.10-rc3-bk12.orig/net/sched/sch_dsmark.c 2004-12-19 01:31:01.000000000 +0100 +++ linux-2.6.10-rc3-bk12/net/sched/sch_dsmark.c 2004-12-19 01:31:27.000000000 +0100 @@ -14,6 +14,7 @@ #include #include #include +#include #include @@ -199,10 +200,12 @@ IP/IPv6 header is always linear --TGR */ switch (skb->protocol) { case __constant_htons(ETH_P_IP): - skb->tc_index = ipv4_get_dsfield(skb->nh.iph); + skb->tc_index = ipv4_get_dsfield(skb->nh.iph) + & ~INET_ECN_MASK; break; case __constant_htons(ETH_P_IPV6): - skb->tc_index = ipv6_get_dsfield(skb->nh.ipv6h); + skb->tc_index = ipv6_get_dsfield(skb->nh.ipv6h) + & ~INET_ECN_MASK; break; default: skb->tc_index = 0; From tommy.christensen@tpack.net Sun Dec 19 15:45:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:46:01 -0800 (PST) Received: from mail.tpack.net (ip18.tpack.net [213.173.228.18]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBJNjXPL001444 for ; Sun, 19 Dec 2004 15:45:54 -0800 Received: (qmail 22338 invoked from network); 19 Dec 2004 23:45:02 -0000 Received: from dhcp-210.cph.tpack.net (HELO ?172.17.159.11?) (192.168.0.210) by 0 with SMTP; 19 Dec 2004 23:45:02 -0000 Message-ID: <41C612BC.5070909@tpack.net> Date: Mon, 20 Dec 2004 00:46:04 +0100 From: Tommy Christensen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803 X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, Paul Jakma Subject: Re: [patch 4/10] s390: network driver. 
References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> In-Reply-To: <1103497516.1046.231.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12891 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tommy.christensen@tpack.net Precedence: bulk X-list: netdev jamal wrote: > On Sun, 2004-12-19 at 17:29, Tommy Christensen wrote: > >>I haven't tried this myself, but surely it can happen when the >>socket send-buffer is smaller than what can be queued up for >>transmission: i.e. in the qdisc queue and device DMA ring. > > Shouldnt this have to do with socket options? If you wish to block while > waiting for send, then you should be allowed. > For a routing protocol that actually is notified that the link went > down, it should probably flush those socket buffer at that point. OK. So is this the recommendation for these poor souls? - Use a socket for each device. - Set the socket buffer (SO_SNDBUF) large enough. E.g. 1 MB ? Or use non-blocking sockets - just in case. - If you care about not sending stale packets, it is the responsibility of the application to flush the socket on link-down events (by down'ing the interface?). >>Well, this is the same as what started this whole thread. >>I believe that stopping the queue on link-down events is simply >>bad behavior of the driver. > > From what Thomas was saying, this is not what he was doing. Read his > email. It was at least to the same effect. The key issue is whether the packets are kept in the queue (qdisc) until the link is back up, or they are drained (and dropped) by the driver. 
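A minimal sketch of the first two recommendations above, assuming Linux and a plain UDP socket. The helper names open_dev_udp_socket() and try_send() are made up for illustration and are not from any posted patch; SO_BINDTODEVICE is Linux-specific and normally needs privileges, so binding to a device is optional here.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open a non-blocking UDP socket with an enlarged
 * send buffer.  If ifname is non-NULL the socket is also bound to that
 * device (one socket per device).  Returns the fd, or -1 with errno set.
 * Note: the kernel may clamp the requested SO_SNDBUF value
 * (net.core.wmem_max on Linux). */
int open_dev_udp_socket(const char *ifname, int sndbuf_bytes)
{
    int flags, saved;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
                   &sndbuf_bytes, sizeof(sndbuf_bytes)) < 0)
        goto fail;

#ifdef SO_BINDTODEVICE
    if (ifname != NULL &&
        setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE,
                   ifname, (socklen_t)(strlen(ifname) + 1)) < 0)
        goto fail;
#else
    (void)ifname;
#endif

    /* Non-blocking "just in case": a full send buffer then yields
     * EAGAIN instead of hanging the caller. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    return fd;

fail:
    saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: returns 1 if the datagram was accepted by the
 * socket layer, 0 if the send buffer is full (would block), and -1 on
 * any other error. */
int try_send(int fd, const void *buf, size_t len,
             const struct sockaddr_in *dst)
{
    ssize_t n = sendto(fd, buf, len, 0,
                       (const struct sockaddr *)dst, sizeof(*dst));

    if (n >= 0)
        return 1;
    if (errno == EAGAIN || errno == EWOULDBLOCK)
        return 0;
    return -1;
}
```

With this shape, a daemon would call try_send() per interface and treat a 0 return as "queue full, link possibly down" instead of blocking. A return of 1 only means the packet was queued, not that it reached the wire, which is the window discussed in this thread.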
-Tommy From paul@clubi.ie Sun Dec 19 15:57:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:58:06 -0800 (PST) Received: from hibernia.jakma.org (IDENT:U2FsdGVkX1/iwvMTikWjfKG75+QOTErErdtn9t3R3Z8@hibernia.jakma.org [212.17.55.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJNvahm002199 for ; Sun, 19 Dec 2004 15:57:58 -0800 Received: from sheen.jakma.org (sheen.jakma.org [212.17.55.53]) by hibernia.jakma.org (8.13.1/8.12.11) with ESMTP id iBJNsnqs012797; Sun, 19 Dec 2004 23:55:01 GMT Date: Sun, 19 Dec 2004 23:54:49 +0000 (UTC) From: Paul Jakma X-X-Sender: paul@sheen.jakma.org To: jamal cc: Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [patch 4/10] s390: network driver. In-Reply-To: <1103484552.1046.155.camel@jzny.localdomain> Message-ID: References: <1103484552.1046.155.camel@jzny.localdomain> Mail-Followup-To: paul@hibernia.jakma.org X-NSA: arafat al aqsar jihad musharef jet-A1 avgas ammonium qran inshallah allah al-akbar martyr iraq saddam hammas hisballah rabin ayatollah korea vietnam revolt mustard gas british airways washington MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 19:53:11 2004 clamav-milter version 0.80j on hibernia.jakma.org X-Virus-Status: Clean X-Virus-Status: Clean X-archive-position: 12893 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paul@clubi.ie Precedence: bulk X-list: netdev On Sun, 19 Dec 2004, jamal wrote: > This is the strange part. Anyone still willing to provide a sample > program that hangs? 
I can provide instructions, or you can wait a wee bit - I didn't have any e1000 hardware to test with (e1000 being one of the drivers which has this behaviour, AFAIK/TTBOMK) - but a computer with an e1000 arrived Friday. So, give me a bit and I'll try to come up with a test case. A simple UDP send should be enough. > Still, I think we need to resolve the original problem. > And I have a feeling we wont be seeing any program which > reproduces it ;-> Just send a UDP packet - AIUI, when the link goes down there is (at least) a window in which the application can send a packet down the file descriptor without error, but it will be queued rather than sent, and the fd eventually blocks/returns EAGAIN. That window is the problem. I'll test tomorrow. > cheers, > jamal regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: You want to know why I kept getting promoted? Because my mouth knows more than my brain. -- W.G. From hadi@cyberus.ca Sun Dec 19 15:57:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 15:57:31 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBJNv2YP002152 for ; Sun, 19 Dec 2004 15:57:22 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CgAul-0002NY-H2 for netdev@oss.sgi.com; Sun, 19 Dec 2004 18:56:39 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgAud-0006R5-QT; Sun, 19 Dec 2004 18:56:32 -0500 Subject: Re: primary and secondary ip addresses From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf , Harald Welte Cc: Henrik Nordstrom , "David S. 
Miller" , Andrea G Forte , hasso@estpak.ee, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <1103497168.1046.218.camel@jzny.localdomain> References: <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> <1103487517.1047.181.camel@jzny.localdomain> <20041219214120.GX17302@sunbeam.de.gnumonks.org> <20041219220211.GQ17998@postel.suug.ch> <1103497168.1046.218.camel@jzny.localdomain> Content-Type: multipart/mixed; boundary="=-kTO5Ol7W07FFL8Jv/nPg" Organization: jamalopolous Message-Id: <1103500587.1048.269.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 19 Dec 2004 18:56:27 -0500 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12892 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-kTO5Ol7W07FFL8Jv/nPg Content-Type: text/plain Content-Transfer-Encoding: 7bit Harald, My (stoopid) ISP doesn't like your email address. In any case, attached is a patch of what i was alluding to. It may be missing some things. It compiles, but it is not tested and is extremely dangerous because of side effects on arp and forwarding - so it needs a lot of testing. damn spent my tim hortons coffee on this patch and too cold to go out. cheers, jamal On Sun, 2004-12-19 at 17:59, jamal wrote: > I had something much simpler in mind. > Basically, promote the next one in line. This would be cleanly backward > compatible and would be an improvement over whats there (however > medieval it is). Let me see if i can whip something that at least > compiles. Unfortunately i wont have time to chase it to completion of > testing until around xmas when i have time off from work. 
> > cheers, > jamal > > On Sun, 2004-12-19 at 17:02, Thomas Graf wrote: > > * Harald Welte <20041219214120.GX17302@sunbeam.de.gnumonks.org> 2004-12-19 22:41 > > > On Sun, Dec 19, 2004 at 03:18:37PM -0500, jamal wrote: > > > > > > > Having said the above, I think it would make sense to have a "promotion" > > > > scheme so that in the case a primary address is deleted, one could > > > > promote the next secondary address in line. But that should be optional. > > > > > > Oh yes, please. This would save a lot of headache. I'm much in favour > > > of such a proposal. > > > > Agreed, would be nice to have. > > > > > > Now where is the fireman who wants to do this? I could help cheering > > > > since i know the code. > > > > > > how would you think it fits best into the current netlink messages? > > > > 1) IFA_F_PROM_CAND flag and have inet_del_ifa* iterate over its > > secondary addresses and elect the first with the flag set. > > > > 2) IFA_PROM_PRIO TLV of type u32 holding a priority where 0 means no > > candidate. inet_del_ifa* iterates over its secondary addresses and > > elects the one with the highest prio as new primary address or > > deletes all addresses if none is found. > > > > * respectively the equivalent function of the other address families. > > > > Second variant requires more work but is more flexible so it's > > definitely my favourite. I'm willing to put some effort into this, > > I'm not familiar with all address families though. 
> > --=-kTO5Ol7W07FFL8Jv/nPg Content-Disposition: attachment; filename=p5 Content-Type: text/plain; name=p5; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- a/include/linux/bak.sysctl.h 2004-12-19 18:26:04.622675432 -0500 +++ b/include/linux/sysctl.h 2004-12-19 18:27:18.471448712 -0500 @@ -395,6 +395,7 @@ NET_IPV4_CONF_FORCE_IGMP_VERSION=17, NET_IPV4_CONF_ARP_ANNOUNCE=18, NET_IPV4_CONF_ARP_IGNORE=19, + NET_IPV4_CONF_PROMOTE_ALIASES=20, }; /* /proc/sys/net/ipv4/netfilter */ --- a/net/ipv4/bak.devinet.c 2004-12-19 17:30:33.787039416 -0500 +++ b/net/ipv4/devinet.c 2004-12-19 18:11:46.931064376 -0500 @@ -230,11 +230,15 @@ static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, int destroy) { + struct in_ifaddr *promote = NULL; struct in_ifaddr *ifa1 = *ifap; ASSERT_RTNL(); - /* 1. Deleting primary ifaddr forces deletion all secondaries */ + /* 1. Deleting primary ifaddr forces deletion all secondaries + * unless sysctl aliaspromo not set + * ipv4_devconf.aliaspromo + **/ if (!(ifa1->ifa_flags & IFA_F_SECONDARY)) { struct in_ifaddr *ifa; @@ -248,11 +252,16 @@ continue; } - *ifap1 = ifa->ifa_next; + if (!IN_DEV_PROMOTE_ALIASES(in_dev)) { + *ifap1 = ifa->ifa_next; - rtmsg_ifa(RTM_DELADDR, ifa); - notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa); - inet_free_ifa(ifa); + rtmsg_ifa(RTM_DELADDR, ifa); + notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa); + inet_free_ifa(ifa); + } else { + promote = ifa; + break; + } } } @@ -278,6 +287,13 @@ if (!in_dev->ifa_list) inetdev_destroy(in_dev); } + + if (promote && IN_DEV_PROMOTE_ALIASES(in_dev)) { + /* not sure if we should send a delete notify first? 
*/ + promote->ifa_flags &= ~IFA_F_SECONDARY; + rtmsg_ifa(RTM_NEWADDR, promote); + notifier_call_chain(&inetaddr_chain, NETDEV_UP, promote); + } } static int inet_insert_ifa(struct in_ifaddr *ifa) @@ -1374,6 +1390,15 @@ .proc_handler = &ipv4_doint_and_flush, .strategy = &ipv4_doint_and_flush_strategy, }, + { + .ctl_name = NET_IPV4_CONF_PROMOTE_ALIASES, + .procname = "promote_aliases", + .data = &ipv4_devconf.promote_aliases, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &ipv4_doint_and_flush, + .strategy = &ipv4_doint_and_flush_strategy, + }, }, .devinet_dev = { { --- a/include/linux/bak.inetdevice.h 2004-12-19 17:54:26.610217184 -0500 +++ b/include/linux/inetdevice.h 2004-12-19 17:58:45.130916064 -0500 @@ -29,6 +29,7 @@ int no_xfrm; int no_policy; int force_igmp_version; + int promote_aliases; void *sysctl; }; @@ -71,6 +72,7 @@ #define IN_DEV_SEC_REDIRECTS(in_dev) (ipv4_devconf.secure_redirects || (in_dev)->cnf.secure_redirects) #define IN_DEV_IDTAG(in_dev) ((in_dev)->cnf.tag) #define IN_DEV_MEDIUM_ID(in_dev) ((in_dev)->cnf.medium_id) +#define IN_DEV_PROMOTE_ALIASES(in_dev) (ipv4_devconf.promote_aliases || (in_dev)->cnf.promote_aliases) #define IN_DEV_RX_REDIRECTS(in_dev) \ ((IN_DEV_FORWARD(in_dev) && \ --=-kTO5Ol7W07FFL8Jv/nPg-- From jgarzik@pobox.com Sun Dec 19 16:16:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 16:16:38 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK0GB4S003637 for ; Sun, 19 Dec 2004 16:16:31 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CgBDD-0005YK-DU; Mon, 20 Dec 2004 00:15:43 +0000 Message-ID: <41C619A7.1060302@pobox.com> Date: Sun, 19 Dec 2004 19:15:35 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Tommy 
Christensen CC: hadi@cyberus.ca, Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma Subject: Re: [patch 4/10] s390: network driver. References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> In-Reply-To: <41C612BC.5070909@tpack.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12894 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Tommy Christensen wrote: > It was at least to the same effect. The key issue is whether the > packets are kept in the queue (qdisc) until the link is back up, > or they are drained (and dropped) by the driver. They should absolutely not be dropped by the driver. No ifs, ands, or buts. This sort of problem is NOT solved by modifying hundreds of drivers. 
Jeff From bunk@stusta.de Sun Dec 19 17:54:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 17:54:33 -0800 (PST) Received: from mailout.stusta.mhn.de (emailhub.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBK1s3Mj010407 for ; Sun, 19 Dec 2004 17:54:24 -0800 Received: (qmail 22540 invoked from network); 20 Dec 2004 01:53:34 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 20 Dec 2004 01:53:34 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id AB652BB5BB; Mon, 20 Dec 2004 02:53:31 +0100 (CET) Date: Mon, 20 Dec 2004 02:53:31 +0100 From: Adrian Bunk To: Margit Schubert-While , prism54-private@prism54.org, netdev@oss.sgi.com, jgarzik@pobox.com, linux-net@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [2.6 patch] prism54: misc cleanups Message-ID: <20041220015331.GP21288@stusta.de> References: <20041030054534.GC4374@stusta.de> <20041031010143.GI7887@ruslug.rutgers.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041031010143.GI7887@ruslug.rutgers.edu> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12896 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev > > As a side effect it turned out that the mgt_unlatch_all function was > > completely unused, and I've therefore removed it. > > mgt_unlatch_all is there as work in progress. We currently set ESSID to > commit but we may need more work than that depending on the mode we're > in. 
Even though we're not using it right now we may use it soon due to > WPA. Please don't remove it. Sorry for the late answer. Below is a version of this patch that #if 0's mgt_unlatch_all instead. > Luis cu Adrian <-- snip --> The patch below makes some functions in prism54 that are only required locally static. As a side effect it turned out that the mgt_unlatch_all function was completely unused, and it's therefore #if 0'ed. I also considered moving display_buffer as static inline into islpci_mgt.h, but I wasn't 100% sure and therefore left it. diffstat output: drivers/net/wireless/prism54/isl_ioctl.c | 32 +++++++++++------- drivers/net/wireless/prism54/isl_ioctl.h | 5 -- drivers/net/wireless/prism54/islpci_dev.c | 5 +- drivers/net/wireless/prism54/islpci_dev.h | 2 - drivers/net/wireless/prism54/islpci_mgt.c | 2 + drivers/net/wireless/prism54/oid_mgt.c | 38 +--------------------- drivers/net/wireless/prism54/oid_mgt.h | 4 -- 7 files changed, 28 insertions(+), 60 deletions(-) Signed-off-by: Adrian Bunk --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/isl_ioctl.h.old 2004-10-30 06:52:18.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/isl_ioctl.h 2004-10-30 07:02:58.000000000 +0200 @@ -41,15 +41,10 @@ void prism54_wpa_ie_init(islpci_private *priv); void prism54_wpa_ie_clean(islpci_private *priv); -void prism54_wpa_ie_add(islpci_private *priv, u8 *bssid, - u8 *wpa_ie, size_t wpa_ie_len); -size_t prism54_wpa_ie_get(islpci_private *priv, u8 *bssid, u8 *wpa_ie); int prism54_set_mac_address(struct net_device *, void *); int prism54_ioctl(struct net_device *, struct ifreq *, int); -int prism54_set_wpa(struct net_device *, struct iw_request_info *, - __u32 *, char *); extern const struct iw_handler_def prism54_handler_def; --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/isl_ioctl.c.old 2004-10-30 06:43:10.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/isl_ioctl.c 2004-10-30 07:11:26.000000000 +0200 
@@ -36,6 +36,14 @@ #include /* New driver API */ + +static void prism54_wpa_ie_add(islpci_private *priv, u8 *bssid, + u8 *wpa_ie, size_t wpa_ie_len); +static size_t prism54_wpa_ie_get(islpci_private *priv, u8 *bssid, u8 *wpa_ie); +static int prism54_set_wpa(struct net_device *, struct iw_request_info *, + __u32 *, char *); + + /** * prism54_mib_mode_helper - MIB change mode helper function * @mib: the &struct islpci_mib object to modify @@ -47,7 +55,7 @@ * Wireless API modes to Device firmware modes. It also checks for * correct valid Linux wireless modes. */ -int +static int prism54_mib_mode_helper(islpci_private *priv, u32 iw_mode) { u32 config = INL_CONFIG_MANUALRUN; @@ -647,7 +655,7 @@ return current_ev; } -int +static int prism54_get_scan(struct net_device *ndev, struct iw_request_info *info, struct iw_point *dwrq, char *extra) { @@ -1582,7 +1590,7 @@ #define MAC2STR(a) (a)[0], (a)[1], (a)[2], (a)[3], (a)[4], (a)[5] #define MACSTR "%02x:%02x:%02x:%02x:%02x:%02x" -void +static void prism54_wpa_ie_add(islpci_private *priv, u8 *bssid, u8 *wpa_ie, size_t wpa_ie_len) { @@ -1649,7 +1657,7 @@ up(&priv->wpa_sem); } -size_t +static size_t prism54_wpa_ie_get(islpci_private *priv, u8 *bssid, u8 *wpa_ie) { struct list_head *ptr; @@ -1736,7 +1744,7 @@ } } -int +static int prism54_process_trap_helper(islpci_private *priv, enum oid_num_t oid, char *data) { @@ -2314,7 +2322,7 @@ return ret; } -int +static int prism54_set_wpa(struct net_device *ndev, struct iw_request_info *info, __u32 * uwrq, char *extra) { @@ -2358,7 +2366,7 @@ return 0; } -int +static int prism54_get_wpa(struct net_device *ndev, struct iw_request_info *info, __u32 * uwrq, char *extra) { @@ -2367,7 +2375,7 @@ return 0; } -int +static int prism54_set_prismhdr(struct net_device *ndev, struct iw_request_info *info, __u32 * uwrq, char *extra) { @@ -2380,7 +2388,7 @@ return 0; } -int +static int prism54_get_prismhdr(struct net_device *ndev, struct iw_request_info *info, __u32 * uwrq, char *extra) { @@ -2389,7 
+2397,7 @@ return 0; } -int +static int prism54_debug_oid(struct net_device *ndev, struct iw_request_info *info, __u32 * uwrq, char *extra) { @@ -2401,7 +2409,7 @@ return 0; } -int +static int prism54_debug_get_oid(struct net_device *ndev, struct iw_request_info *info, struct iw_point *data, char *extra) { @@ -2437,7 +2445,7 @@ return ret; } -int +static int prism54_debug_set_oid(struct net_device *ndev, struct iw_request_info *info, struct iw_point *data, char *extra) { --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_dev.h.old 2004-10-30 06:58:23.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_dev.h 2004-10-30 07:04:02.000000000 +0200 @@ -211,8 +211,6 @@ priv->device_base); } -struct net_device_stats *islpci_statistics(struct net_device *); - int islpci_free_memory(islpci_private *); struct net_device *islpci_setup(struct pci_dev *); #endif /* _ISLPCI_DEV_H */ --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_dev.c.old 2004-10-30 06:46:08.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_dev.c 2004-10-30 07:12:13.000000000 +0200 @@ -44,6 +44,7 @@ static int prism54_bring_down(islpci_private *); static int islpci_alloc_memory(islpci_private *); +static struct net_device_stats *islpci_statistics(struct net_device *); /* Temporary dummy MAC address to use until firmware is loaded. * The idea there is that some tools (such as nameif) may query @@ -52,7 +53,7 @@ * Of course, this is not the final/real MAC address. It doesn't * matter, as you are suppose to be able to change it anytime via * ndev->set_mac_address. 
Jean II */ -const unsigned char dummy_mac[6] = { 0x00, 0x30, 0xB4, 0x00, 0x00, 0x00 }; +static const unsigned char dummy_mac[6] = { 0x00, 0x30, 0xB4, 0x00, 0x00, 0x00 }; static int isl_upload_firmware(islpci_private *priv) @@ -607,7 +608,7 @@ return rc; } -struct net_device_stats * +static struct net_device_stats * islpci_statistics(struct net_device *ndev) { islpci_private *priv = netdev_priv(ndev); --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_mgt.c.old 2004-10-30 06:46:45.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/islpci_mgt.c 2004-10-30 07:15:39.000000000 +0200 @@ -44,6 +44,7 @@ /****************************************************************************** Driver general functions ******************************************************************************/ +#if VERBOSE > SHOW_ERROR_MESSAGES void display_buffer(char *buffer, int length) { @@ -58,6 +59,7 @@ printk("\n"); } +#endif /***************************************************************************** Queue handling for management frames --- linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/oid_mgt.h.old 2004-10-30 07:05:55.000000000 +0200 +++ linux-2.6.10-rc1-mm2-full/drivers/net/wireless/prism54/oid_mgt.h 2004-10-30 07:42:02.000000000 +0200 @@ -28,8 +28,7 @@ void mgt_clean(islpci_private *); -/* I don't know where to put these 3 */ -extern const int frequency_list_bg[]; +/* I don't know where to put these 2 */ extern const int frequency_list_a[]; int channel_of_freq(int); @@ -49,7 +48,6 @@ void mgt_get(islpci_private *, enum oid_num_t, void *); int mgt_commit(islpci_private *); -void mgt_unlatch_all(islpci_private *); int mgt_mlme_answer(islpci_private *); --- linux-2.6.10-rc3-mm1-full/drivers/net/wireless/prism54/oid_mgt.c.old 2004-12-20 01:03:44.000000000 +0100 +++ linux-2.6.10-rc3-mm1-full/drivers/net/wireless/prism54/oid_mgt.c 2004-12-20 01:05:31.000000000 +0100 @@ -24,8 +24,8 @@ #include "isl_ioctl.h" /* to convert between channel and freq 
*/ -const int frequency_list_bg[] = { 2412, 2417, 2422, 2427, 2432, 2437, 2442, - 2447, 2452, 2457, 2462, 2467, 2472, 2484 +static const int frequency_list_bg[] = { 2412, 2417, 2422, 2427, 2432, + 2437, 2442, 2447, 2452, 2457, 2462, 2467, 2472, 2484 }; int @@ -730,6 +730,7 @@ * * The way to do this is to set ESSID. Note though that they may get * unlatch before though by setting another OID. */ +#if 0 void mgt_unlatch_all(islpci_private *priv) { @@ -756,6 +757,7 @@ if (rvalue) printk(KERN_DEBUG "%s: Unlatching OIDs failed\n", priv->ndev->name); } +#endif /* This will tell you if you are allowed to answer a mlme(ex) request .*/ From peterc@gelato.unsw.edu.au Sun Dec 19 19:56:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 19:56:26 -0800 (PST) Received: from note.orchestra.cse.unsw.EDU.AU (root@note.orchestra.cse.unsw.EDU.AU [129.94.242.24]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK3twXp014351 for ; Sun, 19 Dec 2004 19:56:19 -0800 Received: From lemon.gelato.unsw.edu.au ([129.94.173.27]) (for ) By note With Smtp ; Mon, 20 Dec 2004 14:55:33 +1100 Received: from berry.gelato.unsw.edu.au ([129.94.173.230]:60312) by lemon.gelato.unsw.edu.au with esmtp (Exim 4.34) id 1CgEdx-0001KX-Nb for netdev@oss.sgi.com; Mon, 20 Dec 2004 14:55:33 +1100 Received: from peterc by berry.gelato.unsw.EDU.AU with local (Exim 3.36 #1 (Debian)) id 1CgEdx-0003l7-00 for ; Mon, 20 Dec 2004 14:55:33 +1100 From: Peter Chubb To: netdev@oss.sgi.com Date: Mon, 20 Dec 2004 14:55:33 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16838.19765.583047.997351@berry.gelato.unsw.EDU.AU> Reply-to: peterc@gelato.unsw.edu.au Subject: TG3 driver failure on HP 16-way X-Mailer: VM 7.17 under 21.4 (patch 15) "Security Through Obscurity" XEmacs Lucid Comments: Hyperbole mail buttons accepted, v04.18. 
X-Face: GgFg(Z>fx((4\32hvXq<)|jndSniCH~~$D)Ka:P@e@JR1P%Vr}EwUdfwf-4j\rUs#JR{'h# !]])6%Jh~b$VA|ALhnpPiHu[-x~@<"@Iv&|%R)Fq[[,(&Z'O)Q)xCqe1\M[F8#9l8~}#u$S$Rm`S9% \'T@`:&8>Sb*c5d'=eDYI&GF`+t[LfDH="MP5rwOO]w>ALi7'=QJHz&y&C&TE_3j! X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12897 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: peterc@gelato.unsw.edu.au Precedence: bulk X-list: netdev Hi, I'm seeing interesting behaviour with the TG3 driver on an HP 16-way IA64 box. The driver detects the BCM5701 NICs, but then cannot bring them up. (I'm using the current head of tree linux 2.6.10-rc3 or thereabouts; driver version 3.14) lspci says: Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15) I see: # modprobe tg3 tg3.c:v3.14 (November 15, 2004) ACPI: PCI interrupt 0000:00:01.0[A] -> GSI 28 (level, low) -> IRQ 59 eth0: Tigon3 [partno(A7109-60001) rev 0105 PHY(5701)] (PCI:33MHz:64-bit) 10/100/1000BaseT Ethernet 00:0e:7f:ed:91:8f eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0] ACPI: PCI interrupt 0001:00:01.0[A] -> GSI 132 (level, low) -> IRQ 60 eth1: Tigon3 [partno(A7109-60001) rev 0105 PHY(5701)] (PCI:33MHz:64-bit) 10/100/1000BaseT Ethernet 00:0e:7f:ed:71:b9 eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[0] # ifup eth0 # ethtool eth0 Settings for eth0: Supported ports: [ MII ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full Advertised auto-negotiation: Yes Speed: Unknown! (65535) Duplex: Unknown! 
(255) Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: on Supports Wake-on: g Wake-on: d Current message level: 0x000000ff (255) Link detected: no # ethtool -s eth0 autoneg off speed 100 duplex full # ethtool eth0 Settings for eth0: Supported ports: [ MII ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full Supports auto-negotiation: Yes Advertised link modes: Not reported Advertised auto-negotiation: No Speed: 100Mb/s Duplex: Full Port: Twisted Pair PHYAD: 1 Transceiver: internal Auto-negotiation: off Supports Wake-on: g Wake-on: d Current message level: 0x000000ff (255) Link detected: no # mii-tool eth0 eth0: 100 Mbit, full duplex, link ok # ifconfig eth0 eth0 Link encap:Ethernet HWaddr 00:0E:7F:ED:91:8F inet addr:192.168.3.5 Bcast:192.168.3.255 Mask:255.255.255.0 inet6 addr: fe80::20e:7fff:feed:918f/64 Scope:Link UP BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:2 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Interrupt:59 The card is plugged into a 100Mb switch. The BCM5700 driver from broadcom also fails to autonegotiate, but works when I use ethtool to set up the correct parameters. 
From roland@topspin.com Sun Dec 19 22:16:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 22:16:09 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK6Fcqs024751 for ; Sun, 19 Dec 2004 22:15:58 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:14 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:14 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CgGp8-00086o-0t; Sun, 19 Dec 2004 22:15:14 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <200412192215.l62Q9JXNhGg51wOf@topspin.com> X-Mailer: Roland's Patchbomber Date: Sun, 19 Dec 2004 22:15:14 -0800 Message-Id: <200412192215.vegmgBmv5xungHlQ@topspin.com> Mime-Version: 1.0 To: linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v4][17/24] IPoIB IPv4 multicast Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 20 Dec 2004 06:15:14.0379 (UTC) FILETIME=[44CB19B0:01C4E65B] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12899 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add ip_ib_mc_map() to convert IPv4 multicast addresses to IPoIB hardware addresses. Also add <linux/if_infiniband.h> so INFINIBAND_ALEN has a home. 
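For readers following along, the mapping can be illustrated with a small user-space sketch. This is not the kernel code from the patch: the function name ib_mc_map_sketch and the constant IPOIB_HW_ALEN are made up for illustration, and the sketch assumes a POSIX system for ntohl(). It is meant to fill the same 20 octets as ip_ib_mc_map(): a reserved octet, the multicast QPN 0xffffff, and an MGID built from the ff12:401b prefix, a zero P_Key, and the low 28 bits of the IPv4 group address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>	/* ntohl(), htonl() */

/* Hypothetical name; same value as INFINIBAND_ALEN in the patch. */
#define IPOIB_HW_ALEN 20

/*
 * Illustrative user-space mirror of ip_ib_mc_map().
 * addr is an IPv4 multicast address in network byte order.
 */
void ib_mc_map_sketch(uint32_t addr, uint8_t buf[IPOIB_HW_ALEN])
{
	uint32_t host = ntohl(addr);

	assert(buf != NULL);

	/* Reserved octet, P_Key and the padding octets all stay zero. */
	memset(buf, 0, IPOIB_HW_ALEN);

	buf[1] = 0xff;			/* Multicast QPN */
	buf[2] = 0xff;
	buf[3] = 0xff;

	buf[4] = 0xff;			/* MGID starts here */
	buf[5] = 0x12;			/* link local scope */
	buf[6] = 0x40;			/* IPv4 signature */
	buf[7] = 0x1b;

	/* Only the low 28 bits of the group address are carried. */
	buf[16] = (host >> 24) & 0x0f;
	buf[17] = (host >> 16) & 0xff;
	buf[18] = (host >> 8) & 0xff;
	buf[19] = host & 0xff;
}
```

With this sketch, passing htonl(0xEF010203) (that is, 239.1.2.3) leaves 0x0f, 0x01, 0x02, 0x03 in the last four octets, and the two P_Key octets remain zero for the driver to fill in, as the comment in the patch describes.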
The mapping for multicast addresses is described in http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/include/linux/if_infiniband.h 2004-12-19 22:04:16.867213523 -0800 @@ -0,0 +1,29 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * $Id$ + */ + +#ifndef _LINUX_IF_INFINIBAND_H +#define _LINUX_IF_INFINIBAND_H + +#define INFINIBAND_ALEN 20 /* Octets in IPoIB HW addr */ + +#endif /* _LINUX_IF_INFINIBAND_H */ --- linux-bk.orig/include/net/ip.h 2004-12-19 21:09:26.000000000 -0800 +++ linux-bk/include/net/ip.h 2004-12-19 22:04:16.868213376 -0800 @@ -229,6 +229,39 @@ buf[3]=addr&0x7F; } +/* + * Map a multicast IP onto multicast MAC for type IP-over-InfiniBand. + * Leave P_Key as 0 to be filled in by driver. 
+ */ + +static inline void ip_ib_mc_map(u32 addr, char *buf) +{ + buf[0] = 0; /* Reserved */ + buf[1] = 0xff; /* Multicast QPN */ + buf[2] = 0xff; + buf[3] = 0xff; + addr = ntohl(addr); + buf[4] = 0xff; + buf[5] = 0x12; /* link local scope */ + buf[6] = 0x40; /* IPv4 signature */ + buf[7] = 0x1b; + buf[8] = 0; /* P_Key */ + buf[9] = 0; + buf[10] = 0; + buf[11] = 0; + buf[12] = 0; + buf[13] = 0; + buf[14] = 0; + buf[15] = 0; + buf[19] = addr & 0xff; + addr >>= 8; + buf[18] = addr & 0xff; + addr >>= 8; + buf[17] = addr & 0xff; + addr >>= 8; + buf[16] = addr & 0x0f; +} + #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) #include #endif --- linux-bk.orig/net/ipv4/arp.c 2004-12-19 21:09:46.000000000 -0800 +++ linux-bk/net/ipv4/arp.c 2004-12-19 22:04:16.868213376 -0800 @@ -213,6 +213,9 @@ case ARPHRD_IEEE802_TR: ip_tr_mc_map(addr, haddr); return 0; + case ARPHRD_INFINIBAND: + ip_ib_mc_map(addr, haddr); + return 0; default: if (dir) { memcpy(haddr, dev->broadcast, dev->addr_len); From roland@topspin.com Sun Dec 19 22:16:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 22:16:09 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK6Fcqu024751 for ; Sun, 19 Dec 2004 22:16:00 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:15 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:14 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CgGp8-00086w-Cr; Sun, 19 Dec 2004 22:15:14 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <200412192215.vegmgBmv5xungHlQ@topspin.com> X-Mailer: Roland's Patchbomber Date: Sun, 19 Dec 2004 22:15:14 -0800 Message-Id: <200412192215.69tnzAhGIT1vQGLF@topspin.com> Mime-Version: 1.0 To: 
linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v4][18/24] IPoIB IPv6 support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 20 Dec 2004 06:15:14.0926 (UTC) FILETIME=[451E90E0:01C4E65B] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12898 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add ipv6_ib_mc_map() to convert IPv6 multicast addresses to IPoIB hardware addresses, and add support for autoconfiguration for devices with type ARPHRD_INFINIBAND. The mapping for multicast addresses is described in http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Signed-off-by: Nitin Hande Signed-off-by: Roland Dreier --- linux-bk.orig/include/net/if_inet6.h 2004-12-19 21:09:54.000000000 -0800 +++ linux-bk/include/net/if_inet6.h 2004-12-19 22:04:17.213162542 -0800 @@ -266,5 +266,20 @@ { buf[0] = 0x00; } + +static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf) +{ + buf[0] = 0; /* Reserved */ + buf[1] = 0xff; /* Multicast QPN */ + buf[2] = 0xff; + buf[3] = 0xff; + buf[4] = 0xff; + buf[5] = 0x12; /* link local scope */ + buf[6] = 0x60; /* IPv6 signature */ + buf[7] = 0x1b; + buf[8] = 0; /* P_Key */ + buf[9] = 0; + memcpy(buf + 10, addr->s6_addr + 6, 10); +} #endif #endif --- linux-bk.orig/net/ipv6/addrconf.c 2004-12-19 21:09:51.000000000 -0800 +++ linux-bk/net/ipv6/addrconf.c 2004-12-19 22:04:17.215162248 -0800 @@ -48,6 +48,7 @@ #include #include #include +#include #include #include #include @@ -1095,6 +1096,12 @@ memset(eui, 0, 7); eui[7] = *(u8*)dev->dev_addr; return 0; + case 
ARPHRD_INFINIBAND: + if (dev->addr_len != INFINIBAND_ALEN) + return -1; + memcpy(eui, dev->dev_addr + 12, 8); + eui[0] |= 2; + return 0; } return -1; } @@ -1794,7 +1801,8 @@ if ((dev->type != ARPHRD_ETHER) && (dev->type != ARPHRD_FDDI) && (dev->type != ARPHRD_IEEE802_TR) && - (dev->type != ARPHRD_ARCNET)) { + (dev->type != ARPHRD_ARCNET) && + (dev->type != ARPHRD_INFINIBAND)) { /* Alas, we support only Ethernet autoconfiguration. */ return; } --- linux-bk.orig/net/ipv6/ndisc.c 2004-12-19 21:09:20.000000000 -0800 +++ linux-bk/net/ipv6/ndisc.c 2004-12-19 22:04:17.216162100 -0800 @@ -260,6 +260,9 @@ case ARPHRD_ARCNET: ipv6_arcnet_mc_map(addr, buf); return 0; + case ARPHRD_INFINIBAND: + ipv6_ib_mc_map(addr, buf); + return 0; default: if (dir) { memcpy(buf, dev->broadcast, dev->addr_len); From roland@topspin.com Sun Dec 19 22:16:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 22:16:12 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK6Fcr0024751 for ; Sun, 19 Dec 2004 22:16:02 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:18 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:17 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CgGpA-00087E-G5; Sun, 19 Dec 2004 22:15:17 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> X-Mailer: Roland's Patchbomber Date: Sun, 19 Dec 2004 22:15:16 -0800 Message-Id: <200412192215.0RFFzMyJ3V7jnOWs@topspin.com> Mime-Version: 1.0 To: linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v4][20/24] Add IPoIB multicast & partition code Content-Type: text/plain; 
charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 20 Dec 2004 06:15:17.0973 (UTC) FILETIME=[46EF8050:01C4E65B] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12900 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add functions for handling IPoIB multicast and multiple partitions. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2004-12-19 22:04:18.140025955 -0800 @@ -0,0 +1,981 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ipoib_multicast.c 1362 2004-12-18 15:56:29Z roland $ + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ipoib.h" + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int mcast_debug_level; + +module_param(mcast_debug_level, int, 0644); +MODULE_PARM_DESC(mcast_debug_level, + "Enable multicast debug tracing if > 0"); +#endif + +static DECLARE_MUTEX(mcast_mutex); + +/* Used for all multicast joins (broadcast, IPv4 mcast and IPv6 mcast) */ +struct ipoib_mcast { + struct ib_sa_mcmember_rec mcmember; + struct ipoib_ah *ah; + + struct rb_node rb_node; + struct list_head list; + struct completion done; + + int query_id; + struct ib_sa_query *query; + + unsigned long created; + unsigned long backoff; + + unsigned long flags; + unsigned char logcount; + + struct list_head neigh_list; + + struct sk_buff_head pkt_queue; + + struct net_device *dev; +}; + +struct ipoib_mcast_iter { + struct net_device *dev; + union ib_gid mgid; + unsigned long created; + unsigned int queuelen; + unsigned int complete; + unsigned int send_only; +}; + +static void ipoib_mcast_free(struct ipoib_mcast *mcast) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh, *tmp; + unsigned long flags; + + ipoib_dbg_mcast(netdev_priv(dev), + "deleting multicast group " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + spin_lock_irqsave(&priv->lock, flags); + + list_for_each_entry_safe(neigh, tmp, &mcast->neigh_list, list) { + ipoib_put_ah(neigh->ah); + *to_ipoib_neigh(neigh->neighbour) = NULL; + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + } + + spin_unlock_irqrestore(&priv->lock, flags); + + if 
(mcast->ah) + ipoib_put_ah(mcast->ah); + + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + dev_kfree_skb_any(skb); + } + + kfree(mcast); +} + +static struct ipoib_mcast *ipoib_mcast_alloc(struct net_device *dev, + int can_sleep) +{ + struct ipoib_mcast *mcast; + + mcast = kmalloc(sizeof (*mcast), can_sleep ? GFP_KERNEL : GFP_ATOMIC); + if (!mcast) + return NULL; + + memset(mcast, 0, sizeof (*mcast)); + + init_completion(&mcast->done); + + mcast->dev = dev; + mcast->created = jiffies; + mcast->backoff = HZ; + mcast->logcount = 0; + + INIT_LIST_HEAD(&mcast->list); + INIT_LIST_HEAD(&mcast->neigh_list); + skb_queue_head_init(&mcast->pkt_queue); + + mcast->ah = NULL; + mcast->query = NULL; + + return mcast; +} + +static struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node *n = priv->multicast_tree.rb_node; + + while (n) { + struct ipoib_mcast *mcast; + int ret; + + mcast = rb_entry(n, struct ipoib_mcast, rb_node); + + ret = memcmp(mgid->raw, mcast->mcmember.mgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return mcast; + } + + return NULL; +} + +static int __ipoib_mcast_add(struct net_device *dev, struct ipoib_mcast *mcast) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node **n = &priv->multicast_tree.rb_node, *pn = NULL; + + while (*n) { + struct ipoib_mcast *tmcast; + int ret; + + pn = *n; + tmcast = rb_entry(pn, struct ipoib_mcast, rb_node); + + ret = memcmp(mcast->mcmember.mgid.raw, tmcast->mcmember.mgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = &pn->rb_left; + else if (ret > 0) + n = &pn->rb_right; + else + return -EEXIST; + } + + rb_link_node(&mcast->rb_node, pn, n); + rb_insert_color(&mcast->rb_node, &priv->multicast_tree); + + return 0; +} + +static int ipoib_mcast_join_finish(struct 
ipoib_mcast *mcast, + struct ib_sa_mcmember_rec *mcmember) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + mcast->mcmember = *mcmember; + + /* Set the cached Q_Key before we attach if it's the broadcast group */ + if (!memcmp(mcast->mcmember.mgid.raw, priv->dev->broadcast + 4, + sizeof (union ib_gid))) + priv->qkey = be32_to_cpu(priv->broadcast->mcmember.qkey); + + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + if (test_and_set_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) { + ipoib_warn(priv, "multicast group " IPOIB_GID_FMT + " already attached\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + return 0; + } + + ret = ipoib_mcast_attach(dev, be16_to_cpu(mcast->mcmember.mlid), + &mcast->mcmember.mgid); + if (ret < 0) { + ipoib_warn(priv, "couldn't attach QP to multicast group " + IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags); + return ret; + } + } + + { + /* + * For now we set static_rate to 0. This is not + * really correct: we should look at the rate + * component of the MC member record, compare it with + * the rate of our local port (calculated from the + * active link speed and link width) and set an + * inter-packet delay appropriately. 
+ */ + struct ib_ah_attr av = { + .dlid = be16_to_cpu(mcast->mcmember.mlid), + .port_num = priv->port, + .sl = mcast->mcmember.sl, + .static_rate = 0, + .ah_flags = IB_AH_GRH, + .grh = { + .flow_label = be32_to_cpu(mcast->mcmember.flow_label), + .hop_limit = mcast->mcmember.hop_limit, + .sgid_index = 0, + .traffic_class = mcast->mcmember.traffic_class + } + }; + + av.grh.dgid = mcast->mcmember.mgid; + + mcast->ah = ipoib_create_ah(dev, priv->pd, &av); + if (!mcast->ah) { + ipoib_warn(priv, "ib_address_create failed\n"); + } else { + ipoib_dbg_mcast(priv, "MGID " IPOIB_GID_FMT + " AV %p, LID 0x%04x, SL %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + mcast->ah->ah, + be16_to_cpu(mcast->mcmember.mlid), + mcast->mcmember.sl); + } + } + + /* actually send any queued packets */ + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + + if (!skb->dst || !skb->dst->neighbour) { + /* put pseudoheader back on for next time */ + skb_push(skb, sizeof (struct ipoib_pseudoheader)); + } + + if (dev_queue_xmit(skb)) + ipoib_warn(priv, "dev_queue_xmit failed to requeue packet\n"); + } + + return 0; +} + +static void +ipoib_mcast_sendonly_join_complete(int status, + struct ib_sa_mcmember_rec *mcmember, + void *mcast_ptr) +{ + struct ipoib_mcast *mcast = mcast_ptr; + struct net_device *dev = mcast->dev; + + if (!status) + ipoib_mcast_join_finish(mcast, mcmember); + else { + if (mcast->logcount++ < 20) + ipoib_dbg_mcast(netdev_priv(dev), "multicast join failed for " + IPOIB_GID_FMT ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), status); + + /* Flush out any queued packets */ + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + + dev_kfree_skb_any(skb); + } + + /* Clear the busy flag so we try again */ + clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags); + } + + complete(&mcast->done); +} + +static int ipoib_mcast_sendonly_join(struct 
ipoib_mcast *mcast) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { +#if 0 /* Some SMs don't support send-only yet */ + .join_state = 4 +#else + .join_state = 1 +#endif + }; + int ret = 0; + + if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) { + ipoib_dbg_mcast(priv, "device shutting down, no multicast joins\n"); + return -ENODEV; + } + + if (test_and_set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) { + ipoib_dbg_mcast(priv, "multicast entry busy, skipping\n"); + return -EBUSY; + } + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + ret = ib_sa_mcmember_rec_set(priv->ca, priv->port, &rec, + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE, + 1000, GFP_ATOMIC, + ipoib_mcast_sendonly_join_complete, + mcast, &mcast->query); + if (ret < 0) { + ipoib_warn(priv, "ib_sa_mcmember_rec_set failed (ret = %d)\n", + ret); + } else { + ipoib_dbg_mcast(priv, "no multicast record for " IPOIB_GID_FMT + ", starting join\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + mcast->query_id = ret; + } + + return ret; +} + +static void ipoib_mcast_join_complete(int status, + struct ib_sa_mcmember_rec *mcmember, + void *mcast_ptr) +{ + struct ipoib_mcast *mcast = mcast_ptr; + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg_mcast(priv, "join completion for " IPOIB_GID_FMT + " (status %d)\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), status); + + if (!status && !ipoib_mcast_join_finish(mcast, mcmember)) { + mcast->backoff = HZ; + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_work(ipoib_workqueue, &priv->mcast_task); + up(&mcast_mutex); + complete(&mcast->done); + return; + } + + if (status == -EINTR) { + complete(&mcast->done); + return; + } + + if (status && mcast->logcount++ < 20) { + if (status == -ETIMEDOUT 
|| status == -EINTR) { + ipoib_dbg_mcast(priv, "multicast join failed for " IPOIB_GID_FMT + ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + status); + } else { + ipoib_warn(priv, "multicast join failed for " + IPOIB_GID_FMT ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + status); + } + } + + mcast->backoff *= 2; + if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS) + mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS; + + mcast->query = NULL; + + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) { + if (status == -ETIMEDOUT) + queue_work(ipoib_workqueue, &priv->mcast_task); + else + queue_delayed_work(ipoib_workqueue, &priv->mcast_task, + mcast->backoff * HZ); + } else + complete(&mcast->done); + up(&mcast_mutex); + + return; +} + +static void ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast, + int create) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { + .join_state = 1 + }; + ib_sa_comp_mask comp_mask; + int ret = 0; + + ipoib_dbg_mcast(priv, "joining MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + comp_mask = + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE; + + if (create) { + comp_mask |= + IB_SA_MCMEMBER_REC_QKEY | + IB_SA_MCMEMBER_REC_SL | + IB_SA_MCMEMBER_REC_FLOW_LABEL | + IB_SA_MCMEMBER_REC_TRAFFIC_CLASS; + + rec.qkey = priv->broadcast->mcmember.qkey; + rec.sl = priv->broadcast->mcmember.sl; + rec.flow_label = priv->broadcast->mcmember.flow_label; + rec.traffic_class = priv->broadcast->mcmember.traffic_class; + } + + ret = ib_sa_mcmember_rec_set(priv->ca, priv->port, &rec, comp_mask, + mcast->backoff * 1000, GFP_ATOMIC, + ipoib_mcast_join_complete, + mcast, &mcast->query); + + if (ret < 0) { + ipoib_warn(priv, "ib_sa_mcmember_rec_set failed, status %d\n", ret); + + 
mcast->backoff *= 2; + if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS) + mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS; + + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->mcast_task, + mcast->backoff); + up(&mcast_mutex); + } else + mcast->query_id = ret; +} + +void ipoib_mcast_join_task(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (!test_bit(IPOIB_MCAST_RUN, &priv->flags)) + return; + + if (ib_query_gid(priv->ca, priv->port, 0, &priv->local_gid)) + ipoib_warn(priv, "ib_gid_entry_get() failed\n"); + else + memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + + if (!priv->broadcast) { + priv->broadcast = ipoib_mcast_alloc(dev, 1); + if (!priv->broadcast) { + ipoib_warn(priv, "failed to allocate broadcast group\n"); + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->mcast_task, HZ); + up(&mcast_mutex); + return; + } + + memcpy(priv->broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4, + sizeof (union ib_gid)); + + spin_lock_irq(&priv->lock); + __ipoib_mcast_add(dev, priv->broadcast); + spin_unlock_irq(&priv->lock); + } + + if (!test_bit(IPOIB_MCAST_FLAG_ATTACHED, &priv->broadcast->flags)) { + ipoib_mcast_join(dev, priv->broadcast, 0); + return; + } + + while (1) { + struct ipoib_mcast *mcast = NULL; + + spin_lock_irq(&priv->lock); + list_for_each_entry(mcast, &priv->multicast_list, list) { + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) + && !test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags) + && !test_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) { + /* Found the next unjoined group */ + break; + } + } + spin_unlock_irq(&priv->lock); + + if (&mcast->list == &priv->multicast_list) { + /* All done */ + break; + } + + ipoib_mcast_join(dev, mcast, 1); + return; + } + + { + struct ib_port_attr attr; + + if (!ib_query_port(priv->ca, priv->port, 
&attr)) + priv->local_lid = attr.lid; + else + ipoib_warn(priv, "ib_query_port failed\n"); + } + + priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - + IPOIB_ENCAP_LEN; + dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); + + ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); + + clear_bit(IPOIB_MCAST_RUN, &priv->flags); + netif_carrier_on(dev); +} + +int ipoib_mcast_start_thread(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg_mcast(priv, "starting multicast thread\n"); + + down(&mcast_mutex); + if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_work(ipoib_workqueue, &priv->mcast_task); + up(&mcast_mutex); + + return 0; +} + +int ipoib_mcast_stop_thread(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_mcast *mcast; + + ipoib_dbg_mcast(priv, "stopping multicast thread\n"); + + down(&mcast_mutex); + clear_bit(IPOIB_MCAST_RUN, &priv->flags); + cancel_delayed_work(&priv->mcast_task); + up(&mcast_mutex); + + flush_workqueue(ipoib_workqueue); + + if (priv->broadcast && priv->broadcast->query) { + ib_sa_cancel_query(priv->broadcast->query_id, priv->broadcast->query); + priv->broadcast->query = NULL; + ipoib_dbg_mcast(priv, "waiting for bcast\n"); + wait_for_completion(&priv->broadcast->done); + } + + list_for_each_entry(mcast, &priv->multicast_list, list) { + if (mcast->query) { + ib_sa_cancel_query(mcast->query_id, mcast->query); + mcast->query = NULL; + ipoib_dbg_mcast(priv, "waiting for MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + wait_for_completion(&mcast->done); + } + } + + return 0; +} + +int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { + .join_state = 1 + }; + int ret = 0; + + if (!test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) + return 0; + + ipoib_dbg_mcast(priv, "leaving MGID " 
IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + /* Remove ourselves from the multicast group */ + ret = ipoib_mcast_detach(dev, be16_to_cpu(mcast->mcmember.mlid), + &mcast->mcmember.mgid); + if (ret) + ipoib_warn(priv, "ipoib_mcast_detach failed (result = %d)\n", ret); + + /* + * Just make one shot at leaving and don't wait for a reply; + * if we fail, too bad. + */ + ret = ib_sa_mcmember_rec_delete(priv->ca, priv->port, &rec, + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE, + 0, GFP_ATOMIC, NULL, + mcast, &mcast->query); + if (ret < 0) + ipoib_warn(priv, "ib_sa_mcmember_rec_delete failed " + "for leave (result = %d)\n", ret); + + return 0; +} + +void ipoib_mcast_send(struct net_device *dev, union ib_gid *mgid, + struct sk_buff *skb) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_mcast *mcast; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. 
+ */ + spin_lock(&priv->lock); + + mcast = __ipoib_mcast_find(dev, mgid); + if (!mcast) { + /* Let's create a new send only group now */ + ipoib_dbg_mcast(priv, "setting up send only multicast group for " + IPOIB_GID_FMT "\n", IPOIB_GID_ARG(*mgid)); + + mcast = ipoib_mcast_alloc(dev, 0); + if (!mcast) { + ipoib_warn(priv, "unable to allocate memory for " + "multicast structure\n"); + dev_kfree_skb_any(skb); + goto out; + } + + set_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags); + mcast->mcmember.mgid = *mgid; + __ipoib_mcast_add(dev, mcast); + list_add_tail(&mcast->list, &priv->multicast_list); + } + + if (!mcast->ah) { + if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE) + skb_queue_tail(&mcast->pkt_queue, skb); + else + dev_kfree_skb_any(skb); + + if (mcast->query) + ipoib_dbg_mcast(priv, "no address vector, " + "but multicast join already started\n"); + else if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) + ipoib_mcast_sendonly_join(mcast); + + /* + * If lookup completes between here and out:, don't + * want to send packet twice. 
+ */ + mcast = NULL; + } + +out: + if (mcast && mcast->ah) { + if (skb->dst && + skb->dst->neighbour && + !*to_ipoib_neigh(skb->dst->neighbour)) { + struct ipoib_neigh *neigh = kmalloc(sizeof *neigh, GFP_ATOMIC); + + if (neigh) { + kref_get(&mcast->ah->ref); + neigh->ah = mcast->ah; + neigh->neighbour = skb->dst->neighbour; + *to_ipoib_neigh(skb->dst->neighbour) = neigh; + list_add_tail(&neigh->list, &mcast->neigh_list); + } + } + + ipoib_send(dev, skb, mcast->ah, IB_MULTICAST_QPN); + } + + spin_unlock(&priv->lock); +} + +void ipoib_mcast_dev_flush(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + LIST_HEAD(remove_list); + struct ipoib_mcast *mcast, *tmcast, *nmcast; + unsigned long flags; + + ipoib_dbg_mcast(priv, "flushing multicast list\n"); + + spin_lock_irqsave(&priv->lock, flags); + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + nmcast = ipoib_mcast_alloc(dev, 0); + if (nmcast) { + nmcast->flags = + mcast->flags & (1 << IPOIB_MCAST_FLAG_SENDONLY); + + nmcast->mcmember.mgid = mcast->mcmember.mgid; + + /* Add the new group in before the to-be-destroyed group */ + list_add_tail(&nmcast->list, &mcast->list); + list_del_init(&mcast->list); + + rb_replace_node(&mcast->rb_node, &nmcast->rb_node, + &priv->multicast_tree); + + list_add_tail(&mcast->list, &remove_list); + } else { + ipoib_warn(priv, "could not reallocate multicast group " + IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + } + } + + if (priv->broadcast) { + nmcast = ipoib_mcast_alloc(dev, 0); + if (nmcast) { + nmcast->mcmember.mgid = priv->broadcast->mcmember.mgid; + + rb_replace_node(&priv->broadcast->rb_node, + &nmcast->rb_node, + &priv->multicast_tree); + + list_add_tail(&priv->broadcast->list, &remove_list); + } + + priv->broadcast = nmcast; + } + + spin_unlock_irqrestore(&priv->lock, flags); + + list_for_each_entry(mcast, &remove_list, list) { + ipoib_mcast_leave(dev, mcast); + ipoib_mcast_free(mcast); + } +} + +void 
ipoib_mcast_dev_down(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + unsigned long flags; + + /* Delete broadcast since it will be recreated */ + if (priv->broadcast) { + ipoib_dbg_mcast(priv, "deleting broadcast group\n"); + + spin_lock_irqsave(&priv->lock, flags); + rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree); + spin_unlock_irqrestore(&priv->lock, flags); + ipoib_mcast_leave(dev, priv->broadcast); + ipoib_mcast_free(priv->broadcast); + priv->broadcast = NULL; + } +} + +void ipoib_mcast_restart_task(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct dev_mc_list *mclist; + struct ipoib_mcast *mcast, *tmcast; + LIST_HEAD(remove_list); + unsigned long flags; + + ipoib_dbg_mcast(priv, "restarting multicast task\n"); + + ipoib_mcast_stop_thread(dev); + + spin_lock_irqsave(&priv->lock, flags); + + /* + * Unfortunately, the networking core only gives us a list of all of + * the multicast hardware addresses. 
We need to figure out which ones + * are new and which ones have been removed + */ + + /* Clear out the found flag */ + list_for_each_entry(mcast, &priv->multicast_list, list) + clear_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); + + /* Mark all of the entries that are found or don't exist */ + for (mclist = dev->mc_list; mclist; mclist = mclist->next) { + union ib_gid mgid; + + memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid); + + /* Add in the P_Key */ + mgid.raw[4] = (priv->pkey >> 8) & 0xff; + mgid.raw[5] = priv->pkey & 0xff; + + mcast = __ipoib_mcast_find(dev, &mgid); + if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + struct ipoib_mcast *nmcast; + + /* Not found or send-only group, let's add a new entry */ + ipoib_dbg_mcast(priv, "adding multicast entry for mgid " + IPOIB_GID_FMT "\n", IPOIB_GID_ARG(mgid)); + + nmcast = ipoib_mcast_alloc(dev, 0); + if (!nmcast) { + ipoib_warn(priv, "unable to allocate memory for multicast structure\n"); + continue; + } + + set_bit(IPOIB_MCAST_FLAG_FOUND, &nmcast->flags); + + nmcast->mcmember.mgid = mgid; + + if (mcast) { + /* Destroy the send only entry */ + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); + + rb_replace_node(&mcast->rb_node, + &nmcast->rb_node, + &priv->multicast_tree); + } else + __ipoib_mcast_add(dev, nmcast); + + list_add_tail(&nmcast->list, &priv->multicast_list); + } + + if (mcast) + set_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); + } + + /* Remove all of the entries that don't exist anymore */ + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + if (!test_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags) && + !test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + ipoib_dbg_mcast(priv, "deleting multicast group " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rb_erase(&mcast->rb_node, &priv->multicast_tree); + + /* Move to the remove list */ + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); + } + } + 
spin_unlock_irqrestore(&priv->lock, flags); + + /* We have to cancel outside of the spinlock */ + list_for_each_entry(mcast, &remove_list, list) { + ipoib_mcast_leave(mcast->dev, mcast); + ipoib_mcast_free(mcast); + } + + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + ipoib_mcast_start_thread(dev); +} + +struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev) +{ + struct ipoib_mcast_iter *iter; + + iter = kmalloc(sizeof *iter, GFP_KERNEL); + if (!iter) + return NULL; + + iter->dev = dev; + memset(iter->mgid.raw, 0, sizeof iter->mgid); + + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + + return iter; +} + +void ipoib_mcast_iter_free(struct ipoib_mcast_iter *iter) +{ + kfree(iter); +} + +int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter) +{ + struct ipoib_dev_priv *priv = netdev_priv(iter->dev); + struct rb_node *n; + struct ipoib_mcast *mcast; + int ret = 1; + + spin_lock_irq(&priv->lock); + + n = rb_first(&priv->multicast_tree); + + while (n) { + mcast = rb_entry(n, struct ipoib_mcast, rb_node); + + if (memcmp(iter->mgid.raw, mcast->mcmember.mgid.raw, + sizeof (union ib_gid)) < 0) { + iter->mgid = mcast->mcmember.mgid; + iter->created = mcast->created; + iter->queuelen = skb_queue_len(&mcast->pkt_queue); + iter->complete = !!mcast->ah; + iter->send_only = !!(mcast->flags & (1 << IPOIB_MCAST_FLAG_SENDONLY)); + + ret = 0; + + break; + } + + n = rb_next(n); + } + + spin_unlock_irq(&priv->lock); + + return ret; +} + +void ipoib_mcast_iter_read(struct ipoib_mcast_iter *iter, + union ib_gid *mgid, + unsigned long *created, + unsigned int *queuelen, + unsigned int *complete, + unsigned int *send_only) +{ + *mgid = iter->mgid; + *created = iter->created; + *queuelen = iter->queuelen; + *complete = iter->complete; + *send_only = iter->send_only; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_vlan.c 2004-12-19 22:04:18.168021829 -0800 @@ -0,0 +1,177 @@ +/* + 
* Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_vlan.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include + +#include +#include +#include + +#include + +#include "ipoib.h" + +static ssize_t show_parent(struct class_device *class_dev, char *buf) +{ + struct net_device *dev = + container_of(class_dev, struct net_device, class_dev); + struct ipoib_dev_priv *priv = netdev_priv(dev); + + return sprintf(buf, "%s\n", priv->parent->name); +} +static CLASS_DEVICE_ATTR(parent, S_IRUGO, show_parent, NULL); + +int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey) +{ + struct ipoib_dev_priv *ppriv, *priv; + char intf_name[IFNAMSIZ]; + int result; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + ppriv = netdev_priv(pdev); + + down(&ppriv->vlan_mutex); + + /* + * First ensure this isn't a duplicate. We check the parent device and + * then all of the child interfaces to make sure the Pkey doesn't match. + */ + if (ppriv->pkey == pkey) { + result = -ENOTUNIQ; + goto err; + } + + list_for_each_entry(priv, &ppriv->child_intfs, list) { + if (priv->pkey == pkey) { + result = -ENOTUNIQ; + goto err; + } + } + + snprintf(intf_name, sizeof intf_name, "%s.%04x", + ppriv->dev->name, pkey); + priv = ipoib_intf_alloc(intf_name); + if (!priv) { + result = -ENOMEM; + goto err; + } + + set_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags); + + priv->pkey = pkey; + + memcpy(priv->dev->dev_addr, ppriv->dev->dev_addr, INFINIBAND_ALEN); + priv->dev->broadcast[8] = pkey >> 8; + priv->dev->broadcast[9] = pkey & 0xff; + + result = ipoib_dev_init(priv->dev, ppriv->ca, ppriv->port); + if (result < 0) { + ipoib_warn(ppriv, "failed to initialize subinterface: " + "device %s, port %d", + ppriv->ca->name, ppriv->port); + goto device_init_failed; + } + + result = register_netdev(priv->dev); + if (result) { + ipoib_warn(priv, "failed to initialize; error %i", result); + goto register_failed; + } + + priv->parent = ppriv->dev; + + if (ipoib_create_debug_file(priv->dev)) + goto debug_failed; + + if 
(ipoib_add_pkey_attr(priv->dev)) + goto sysfs_failed; + + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_parent)) + goto sysfs_failed; + + list_add_tail(&priv->list, &ppriv->child_intfs); + + up(&ppriv->vlan_mutex); + + return 0; + +sysfs_failed: + ipoib_delete_debug_file(priv->dev); + +debug_failed: + unregister_netdev(priv->dev); + +register_failed: + ipoib_dev_cleanup(priv->dev); + +device_init_failed: + free_netdev(priv->dev); + +err: + up(&ppriv->vlan_mutex); + return result; +} + +int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey) +{ + struct ipoib_dev_priv *ppriv, *priv, *tpriv; + int ret = -ENOENT; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + ppriv = netdev_priv(pdev); + + down(&ppriv->vlan_mutex); + list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) { + if (priv->pkey == pkey) { + unregister_netdev(priv->dev); + ipoib_dev_cleanup(priv->dev); + + list_del(&priv->list); + + kfree(priv); + + ret = 0; + break; + } + } + up(&ppriv->vlan_mutex); + + return ret; +} From roland@topspin.com Sun Dec 19 22:16:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 22:16:16 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK6Fcqw024751 for ; Sun, 19 Dec 2004 22:16:00 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:16 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Sun, 19 Dec 2004 22:15:16 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CgGp8-000876-Ue; Sun, 19 Dec 2004 22:15:16 -0800 Cc: openib-general@openib.org, netdev@oss.sgi.com In-Reply-To: <200412192215.69tnzAhGIT1vQGLF@topspin.com> X-Mailer: Roland's Patchbomber Date: Sun, 19 Dec 2004 22:15:14 -0800 Message-Id: 
<200412192215.fZX1ZQqQD4QGkKcF@topspin.com> Mime-Version: 1.0 To: linux-kernel@vger.kernel.org From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v4][19/24] Add IPoIB (IP-over-InfiniBand) driver Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 20 Dec 2004 06:15:16.0489 (UTC) FILETIME=[460D0F90:01C4E65B] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12901 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add a driver that implements the (IPoIB) IP-over-InfiniBand protocol. This is a network device driver of type ARPHRD_INFINIBAND (and addr_len INFINIBAND_ALEN bytes). The ARP/ND implementation for this driver is not completely straightforward, because InfiniBand requires an additional path lookup be performed (through an IB-specific mechanism) after a remote hardware address has been resolved. We are very open to suggestions of a better way to handle this than the current implementation. Although IB has a special multicast group join mode intended to support IP multicast routing (non member join), no means to identify different multicast styles has yet been determined, so all joins by the driver are currently full member joins. We are looking for guidance in how to solve this. 
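As a rough illustration of the multicast addressing the driver works with, the following userspace sketch (not part of the patch) mirrors what ipoib_mcast_restart_task() does when it turns a hardware address from the networking core into a multicast GID: it skips the first 4 bytes of the address, copies the remaining 16 bytes as the GID, and writes the P_Key into GID bytes 4 and 5. The helper names mgid_from_hwaddr() and format_gid() are hypothetical, and INFINIBAND_ALEN is assumed here to be 20 bytes.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define INFINIBAND_ALEN 20	/* assumed: 4 leading bytes + 16-byte GID */
#define GID_LEN         16

/*
 * Hypothetical helper: same steps as the memcpy() and P_Key stamping
 * in ipoib_mcast_restart_task(), but on plain byte arrays.
 */
static void mgid_from_hwaddr(const uint8_t hwaddr[INFINIBAND_ALEN],
			     uint16_t pkey, uint8_t mgid[GID_LEN])
{
	memcpy(mgid, hwaddr + 4, GID_LEN);	/* drop the 4 leading bytes */
	mgid[4] = (pkey >> 8) & 0xff;		/* P_Key, high byte */
	mgid[5] = pkey & 0xff;			/* P_Key, low byte */
}

/* Read one big-endian 16-bit group from the GID. */
static unsigned int be16_at(const uint8_t *p)
{
	return ((unsigned int) p[0] << 8) | p[1];
}

/* Hypothetical helper: print a GID as eight hex groups, like IPOIB_GID_FMT. */
static void format_gid(const uint8_t mgid[GID_LEN], char *buf, size_t len)
{
	snprintf(buf, len, "%x:%x:%x:%x:%x:%x:%x:%x",
		 be16_at(mgid + 0),  be16_at(mgid + 2),
		 be16_at(mgid + 4),  be16_at(mgid + 6),
		 be16_at(mgid + 8),  be16_at(mgid + 10),
		 be16_at(mgid + 12), be16_at(mgid + 14));
}
```

For a sample address of 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff and an illustrative P_Key of 0x8001, mgid_from_hwaddr() followed by format_gid() gives ff12:401b:8001:0:0:0:ffff:ffff.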
The IPoIB protocol/encapsulation is described in the Internet-Drafts http://www.ietf.org/internet-drafts/draft-ietf-ipoib-architecture-04.txt http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/infiniband/Kconfig 2004-12-19 22:04:14.496562875 -0800 +++ linux-bk/drivers/infiniband/Kconfig 2004-12-19 22:04:17.510118781 -0800 @@ -9,4 +9,6 @@ source "drivers/infiniband/hw/mthca/Kconfig" +source "drivers/infiniband/ulp/ipoib/Kconfig" + endmenu --- linux-bk.orig/drivers/infiniband/Makefile 2004-12-19 22:04:14.472566412 -0800 +++ linux-bk/drivers/infiniband/Makefile 2004-12-19 22:04:17.485122465 -0800 @@ -1,2 +1,3 @@ obj-$(CONFIG_INFINIBAND) += core/ obj-$(CONFIG_INFINIBAND_MTHCA) += hw/mthca/ +obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/Kconfig 2004-12-19 22:04:17.559111562 -0800 @@ -0,0 +1,33 @@ +config INFINIBAND_IPOIB + tristate "IP-over-InfiniBand" + depends on INFINIBAND && NETDEVICES && INET + ---help--- + Support for the IP-over-InfiniBand protocol (IPoIB). This + transports IP packets over InfiniBand so you can use your IB + device as a fancy NIC. + + The IPoIB protocol is defined by the IETF ipoib working + group: . + +config INFINIBAND_IPOIB_DEBUG + bool "IP-over-InfiniBand debugging" + depends on INFINIBAND_IPOIB + ---help--- + This option causes debugging code to be compiled into the + IPoIB driver. The output can be turned on via the + debug_level and mcast_debug_level module parameters (which + can also be set after the driver is loaded through sysfs). + + This option also creates an "ipoib_debugfs," which can be + mounted to expose debugging information about IB multicast + groups used by the IPoIB driver. 
+ +config INFINIBAND_IPOIB_DEBUG_DATA + bool "IP-over-InfiniBand data path debugging" + depends on INFINIBAND_IPOIB_DEBUG + ---help--- + This option compiles debugging code into the data path + of the IPoIB driver. The output can be turned on via the + data_debug_level module parameter; however, even with output + turned off, this debugging code will have some performance + impact. --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/Makefile 2004-12-19 22:04:17.534115245 -0800 @@ -0,0 +1,11 @@ +EXTRA_CFLAGS += -Idrivers/infiniband/include + +obj-$(CONFIG_INFINIBAND_IPOIB) += ib_ipoib.o + +ib_ipoib-y := ipoib_main.o \ + ipoib_ib.o \ + ipoib_multicast.o \ + ipoib_verbs.o \ + ipoib_vlan.o +ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_DEBUG) += ipoib_fs.o + --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib.h 2004-12-19 22:04:17.584107878 -0800 @@ -0,0 +1,350 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ipoib.h 1358 2004-12-17 22:00:11Z roland $ + */ + +#ifndef _IPOIB_H +#define _IPOIB_H + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include +#include + +#include +#include +#include + +/* constants */ + +enum { + IPOIB_PACKET_SIZE = 2048, + IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, + + IPOIB_ENCAP_LEN = 4, + + IPOIB_RX_RING_SIZE = 128, + IPOIB_TX_RING_SIZE = 64, + + IPOIB_NUM_WC = 4, + + IPOIB_MAX_PATH_REC_QUEUE = 3, + IPOIB_MAX_MCAST_QUEUE = 3, + + IPOIB_FLAG_OPER_UP = 0, + IPOIB_FLAG_ADMIN_UP = 1, + IPOIB_PKEY_ASSIGNED = 2, + IPOIB_PKEY_STOP = 3, + IPOIB_FLAG_SUBINTERFACE = 4, + IPOIB_MCAST_RUN = 5, + IPOIB_STOP_REAPER = 6, + + IPOIB_MAX_BACKOFF_SECONDS = 16, + + IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */ + IPOIB_MCAST_FLAG_SENDONLY = 1, + IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */ + IPOIB_MCAST_FLAG_ATTACHED = 3, +}; + +/* structs */ + +struct ipoib_header { + u16 proto; + u16 reserved; +}; + +struct ipoib_pseudoheader { + u8 hwaddr[INFINIBAND_ALEN]; +}; + +struct ipoib_mcast; + +struct ipoib_buf { + struct sk_buff *skb; + DECLARE_PCI_UNMAP_ADDR(mapping) +}; + +/* + * Device private locking: tx_lock protects members used in TX fast + * path (and we use LLTX so upper layers don't do extra locking). + * lock protects everything else. lock nests inside of tx_lock (ie + * tx_lock must be acquired first if needed). 
+ */ +struct ipoib_dev_priv { + spinlock_t lock; + + struct net_device *dev; + + unsigned long flags; + + struct semaphore mcast_mutex; + struct semaphore vlan_mutex; + + struct rb_root path_tree; + struct list_head path_list; + + struct ipoib_mcast *broadcast; + struct list_head multicast_list; + struct rb_root multicast_tree; + + struct work_struct pkey_task; + struct work_struct mcast_task; + struct work_struct flush_task; + struct work_struct restart_task; + struct work_struct ah_reap_task; + + struct ib_device *ca; + u8 port; + u16 pkey; + struct ib_pd *pd; + struct ib_mr *mr; + struct ib_cq *cq; + struct ib_qp *qp; + u32 qkey; + + union ib_gid local_gid; + u16 local_lid; + + unsigned int admin_mtu; + unsigned int mcast_mtu; + + struct ipoib_buf *rx_ring; + + spinlock_t tx_lock; + struct ipoib_buf *tx_ring; + unsigned tx_head; + unsigned tx_tail; + + struct ib_wc ibwc[IPOIB_NUM_WC]; + + struct list_head dead_ahs; + + struct ib_event_handler event_handler; + + struct net_device_stats stats; + + struct net_device *parent; + struct list_head child_intfs; + struct list_head list; + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG + struct list_head fs_list; + struct dentry *mcg_dentry; +#endif +}; + +struct ipoib_ah { + struct net_device *dev; + struct ib_ah *ah; + struct list_head list; + struct kref ref; + unsigned last_send; +}; + +struct ipoib_path { + struct net_device *dev; + struct ib_sa_path_rec pathrec; + struct ipoib_ah *ah; + struct sk_buff_head queue; + + struct list_head neigh_list; + + int query_id; + struct ib_sa_query *query; + struct completion done; + + struct rb_node rb_node; + struct list_head list; +}; + +struct ipoib_neigh { + struct ipoib_ah *ah; + struct sk_buff_head queue; + + struct neighbour *neighbour; + + struct list_head list; +}; + +static inline struct ipoib_neigh **to_ipoib_neigh(struct neighbour *neigh) +{ + return (struct ipoib_neigh **) (neigh->ha + 24 - + (offsetof(struct neighbour, ha) & 4)); +} + +extern struct workqueue_struct 
*ipoib_workqueue; + +/* functions */ + +void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr); + +struct ipoib_ah *ipoib_create_ah(struct net_device *dev, + struct ib_pd *pd, struct ib_ah_attr *attr); +void ipoib_free_ah(struct kref *kref); +static inline void ipoib_put_ah(struct ipoib_ah *ah) +{ + kref_put(&ah->ref, ipoib_free_ah); +} + +int ipoib_add_pkey_attr(struct net_device *dev); + +void ipoib_send(struct net_device *dev, struct sk_buff *skb, + struct ipoib_ah *address, u32 qpn); +void ipoib_reap_ah(void *dev_ptr); + +void ipoib_flush_paths(struct net_device *dev); +struct ipoib_dev_priv *ipoib_intf_alloc(const char *format); + +int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port); +void ipoib_ib_dev_flush(void *dev); +void ipoib_ib_dev_cleanup(struct net_device *dev); + +int ipoib_ib_dev_open(struct net_device *dev); +int ipoib_ib_dev_up(struct net_device *dev); +int ipoib_ib_dev_down(struct net_device *dev); +int ipoib_ib_dev_stop(struct net_device *dev); + +int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port); +void ipoib_dev_cleanup(struct net_device *dev); + +void ipoib_mcast_join_task(void *dev_ptr); +void ipoib_mcast_send(struct net_device *dev, union ib_gid *mgid, + struct sk_buff *skb); + +void ipoib_mcast_restart_task(void *dev_ptr); +int ipoib_mcast_start_thread(struct net_device *dev); +int ipoib_mcast_stop_thread(struct net_device *dev); + +void ipoib_mcast_dev_down(struct net_device *dev); +void ipoib_mcast_dev_flush(struct net_device *dev); + +struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev); +void ipoib_mcast_iter_free(struct ipoib_mcast_iter *iter); +int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter); +void ipoib_mcast_iter_read(struct ipoib_mcast_iter *iter, + union ib_gid *gid, + unsigned long *created, + unsigned int *queuelen, + unsigned int *complete, + unsigned int *send_only); + +int ipoib_mcast_attach(struct net_device *dev, u16 mlid, + union ib_gid 
*mgid); +int ipoib_mcast_detach(struct net_device *dev, u16 mlid, + union ib_gid *mgid); + +int ipoib_qp_create(struct net_device *dev); +int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca); +void ipoib_transport_dev_cleanup(struct net_device *dev); + +void ipoib_event(struct ib_event_handler *handler, + struct ib_event *record); + +int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey); +int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey); + +void ipoib_pkey_poll(void *dev); +int ipoib_pkey_dev_delay_open(struct net_device *dev); + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int ipoib_create_debug_file(struct net_device *dev); +void ipoib_delete_debug_file(struct net_device *dev); +int ipoib_register_debugfs(void); +void ipoib_unregister_debugfs(void); +#else +static inline int ipoib_create_debug_file(struct net_device *dev) { return 0; } +static inline void ipoib_delete_debug_file(struct net_device *dev) { } +static inline int ipoib_register_debugfs(void) { return 0; } +static inline void ipoib_unregister_debugfs(void) { } +#endif + + +#define ipoib_printk(level, priv, format, arg...) \ + printk(level "%s: " format, ((struct ipoib_dev_priv *) priv)->dev->name , ## arg) +#define ipoib_warn(priv, format, arg...) \ + ipoib_printk(KERN_WARNING, priv, format , ## arg) + + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +extern int debug_level; + +#define ipoib_dbg(priv, format, arg...) \ + do { \ + if (debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#define ipoib_dbg_mcast(priv, format, arg...) \ + do { \ + if (mcast_debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#else /* CONFIG_INFINIBAND_IPOIB_DEBUG */ +#define ipoib_dbg(priv, format, arg...) \ + do { (void) (priv); } while (0) +#define ipoib_dbg_mcast(priv, format, arg...) 
\ + do { (void) (priv); } while (0) +#endif /* CONFIG_INFINIBAND_IPOIB_DEBUG */ + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA +#define ipoib_dbg_data(priv, format, arg...) \ + do { \ + if (data_debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#else /* CONFIG_INFINIBAND_IPOIB_DEBUG_DATA */ +#define ipoib_dbg_data(priv, format, arg...) \ + do { (void) (priv); } while (0) +#endif /* CONFIG_INFINIBAND_IPOIB_DEBUG_DATA */ + + +#define IPOIB_GID_FMT "%x:%x:%x:%x:%x:%x:%x:%x" + +#define IPOIB_GID_ARG(gid) be16_to_cpup((__be16 *) ((gid).raw + 0)), \ + be16_to_cpup((__be16 *) ((gid).raw + 2)), \ + be16_to_cpup((__be16 *) ((gid).raw + 4)), \ + be16_to_cpup((__be16 *) ((gid).raw + 6)), \ + be16_to_cpup((__be16 *) ((gid).raw + 8)), \ + be16_to_cpup((__be16 *) ((gid).raw + 10)), \ + be16_to_cpup((__be16 *) ((gid).raw + 12)), \ + be16_to_cpup((__be16 *) ((gid).raw + 14)) + +#endif /* _IPOIB_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_fs.c 2004-12-19 22:04:17.608104342 -0800 @@ -0,0 +1,287 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id$ + */ + +#include +#include + +#include "ipoib.h" + +enum { + IPOIB_MAGIC = 0x49504942 /* "IPIB" */ +}; + +static DECLARE_MUTEX(ipoib_fs_mutex); +static struct dentry *ipoib_root; +static struct super_block *ipoib_sb; +static LIST_HEAD(ipoib_device_list); + +static void *ipoib_mcg_seq_start(struct seq_file *file, loff_t *pos) +{ + struct ipoib_mcast_iter *iter; + loff_t n = *pos; + + iter = ipoib_mcast_iter_init(file->private); + if (!iter) + return NULL; + + while (n--) { + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + } + + return iter; +} + +static void *ipoib_mcg_seq_next(struct seq_file *file, void *iter_ptr, + loff_t *pos) +{ + struct ipoib_mcast_iter *iter = iter_ptr; + + (*pos)++; + + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + + return iter; +} + +static void ipoib_mcg_seq_stop(struct seq_file *file, void *iter_ptr) +{ + /* nothing for now */ +} + +static int ipoib_mcg_seq_show(struct seq_file *file, void *iter_ptr) +{ + struct ipoib_mcast_iter *iter = iter_ptr; + char gid_buf[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"]; + union ib_gid mgid; + int i, n; + unsigned long created; + unsigned int queuelen, complete, send_only; + + if (iter) { + ipoib_mcast_iter_read(iter, &mgid, &created, &queuelen, + &complete, &send_only); + + for (n = 0, i = 0; i < sizeof mgid / 2; ++i) { + n += sprintf(gid_buf + n, "%x", + be16_to_cpu(((u16 *)mgid.raw)[i])); + if (i < sizeof mgid / 2 - 
1) + gid_buf[n++] = ':'; + } + } + + seq_printf(file, "GID: %*s", -(1 + (int) sizeof gid_buf), gid_buf); + + seq_printf(file, + " created: %10ld queuelen: %4d complete: %d send_only: %d\n", + created, queuelen, complete, send_only); + + return 0; +} + +static struct seq_operations ipoib_seq_ops = { + .start = ipoib_mcg_seq_start, + .next = ipoib_mcg_seq_next, + .stop = ipoib_mcg_seq_stop, + .show = ipoib_mcg_seq_show, +}; + +static int ipoib_mcg_open(struct inode *inode, struct file *file) +{ + struct seq_file *seq; + int ret; + + ret = seq_open(file, &ipoib_seq_ops); + if (ret) + return ret; + + seq = file->private_data; + seq->private = inode->u.generic_ip; + + return 0; +} + +static struct file_operations ipoib_fops = { + .owner = THIS_MODULE, + .open = ipoib_mcg_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release +}; + +static struct inode *ipoib_get_inode(void) +{ + struct inode *inode = new_inode(ipoib_sb); + + if (inode) { + inode->i_mode = S_IFREG | S_IRUGO; + inode->i_uid = 0; + inode->i_gid = 0; + inode->i_blksize = PAGE_CACHE_SIZE; + inode->i_blocks = 0; + inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + inode->i_fop = &ipoib_fops; + } + + return inode; +} + +static int __ipoib_create_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct dentry *dentry; + struct inode *inode; + char name[IFNAMSIZ + sizeof "_mcg"]; + + snprintf(name, sizeof name, "%s_mcg", dev->name); + + dentry = d_alloc_name(ipoib_root, name); + if (!dentry) + return -ENOMEM; + + inode = ipoib_get_inode(); + if (!inode) { + dput(dentry); + return -ENOMEM; + } + + inode->u.generic_ip = dev; + priv->mcg_dentry = dentry; + + d_add(dentry, inode); + + return 0; +} + +int ipoib_create_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + down(&ipoib_fs_mutex); + + list_add_tail(&priv->fs_list, &ipoib_device_list); + + if (!ipoib_sb) { + up(&ipoib_fs_mutex); + return 0; + } 
+ + up(&ipoib_fs_mutex); + + return __ipoib_create_debug_file(dev); +} + +void ipoib_delete_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + down(&ipoib_fs_mutex); + list_del(&priv->fs_list); + if (!ipoib_sb) { + up(&ipoib_fs_mutex); + return; + } + up(&ipoib_fs_mutex); + + if (priv->mcg_dentry) { + d_drop(priv->mcg_dentry); + simple_unlink(ipoib_root->d_inode, priv->mcg_dentry); + } +} + +static int ipoib_fill_super(struct super_block *sb, void *data, int silent) +{ + static struct tree_descr ipoib_files[] = { + { "" } + }; + struct ipoib_dev_priv *priv; + int ret; + + ret = simple_fill_super(sb, IPOIB_MAGIC, ipoib_files); + if (ret) + return ret; + + ipoib_root = sb->s_root; + + down(&ipoib_fs_mutex); + + ipoib_sb = sb; + + list_for_each_entry(priv, &ipoib_device_list, fs_list) { + ret = __ipoib_create_debug_file(priv->dev); + if (ret) + break; + } + + up(&ipoib_fs_mutex); + + return ret; +} + +static struct super_block *ipoib_get_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *data) +{ + return get_sb_single(fs_type, flags, data, ipoib_fill_super); +} + +static void ipoib_kill_sb(struct super_block *sb) +{ + down(&ipoib_fs_mutex); + ipoib_sb = NULL; + up(&ipoib_fs_mutex); + + kill_litter_super(sb); +} + +static struct file_system_type ipoib_fs_type = { + .owner = THIS_MODULE, + .name = "ipoib_debugfs", + .get_sb = ipoib_get_sb, + .kill_sb = ipoib_kill_sb, +}; + +int ipoib_register_debugfs(void) +{ + return register_filesystem(&ipoib_fs_type); +} + +void ipoib_unregister_debugfs(void) +{ + unregister_filesystem(&ipoib_fs_type); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2004-12-19 22:04:17.633100658 -0800 @@ -0,0 +1,632 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_ib.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include + +#include + +#include "ipoib.h" + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA +int data_debug_level; + +module_param(data_debug_level, int, 0644); +MODULE_PARM_DESC(data_debug_level, + "Enable data path debug tracing if > 0"); +#endif + +#define IPOIB_OP_RECV (1ul << 31) + +static DECLARE_MUTEX(pkey_sem); + +struct ipoib_ah *ipoib_create_ah(struct net_device *dev, + struct ib_pd *pd, struct ib_ah_attr *attr) +{ + struct ipoib_ah *ah; + + ah = kmalloc(sizeof *ah, GFP_KERNEL); + if (!ah) + return NULL; + + ah->dev = dev; + ah->last_send = 0; + kref_init(&ah->ref); + + ah->ah = ib_create_ah(pd, attr); + if (IS_ERR(ah->ah)) { + kfree(ah); + ah = NULL; + } else + ipoib_dbg(netdev_priv(dev), "Created ah %p\n", ah->ah); + + return ah; +} + +void ipoib_free_ah(struct kref *kref) +{ + struct ipoib_ah *ah = container_of(kref, struct ipoib_ah, ref); + struct ipoib_dev_priv *priv = netdev_priv(ah->dev); + + unsigned long flags; + + if (ah->last_send <= priv->tx_tail) { + ipoib_dbg(priv, "Freeing ah %p\n", ah->ah); + ib_destroy_ah(ah->ah); + kfree(ah); + } else { + spin_lock_irqsave(&priv->lock, flags); + list_add_tail(&ah->list, &priv->dead_ahs); + spin_unlock_irqrestore(&priv->lock, flags); + } +} + +static inline int ipoib_ib_receive(struct ipoib_dev_priv *priv, + unsigned int wr_id, + dma_addr_t addr) +{ + struct ib_sge list = { + .addr = addr, + .length = IPOIB_BUF_SIZE, + .lkey = priv->mr->lkey, + }; + struct ib_recv_wr param = { + .wr_id = wr_id | IPOIB_OP_RECV, + .sg_list = &list, + .num_sge = 1, + .recv_flags = IB_RECV_SIGNALED + }; + struct ib_recv_wr *bad_wr; + + return ib_post_recv(priv->qp, &param, &bad_wr); +} + +static int ipoib_ib_post_receive(struct net_device *dev, int id) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct sk_buff *skb; + dma_addr_t addr; + int ret; + + skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); + if (!skb) { + ipoib_warn(priv, "failed to allocate 
receive buffer\n"); + + priv->rx_ring[id].skb = NULL; + return -ENOMEM; + } + skb_reserve(skb, 4); /* 16 byte align IP header */ + priv->rx_ring[id].skb = skb; + addr = dma_map_single(priv->ca->dma_device, + skb->data, IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + pci_unmap_addr_set(&priv->rx_ring[id], mapping, addr); + + ret = ipoib_ib_receive(priv, id, addr); + if (ret) { + ipoib_warn(priv, "ipoib_ib_receive failed for buf %d (%d)\n", + id, ret); + priv->rx_ring[id].skb = NULL; + } + + return ret; +} + +static int ipoib_ib_post_receives(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) { + if (ipoib_ib_post_receive(dev, i)) { + ipoib_warn(priv, "ipoib_ib_post_receive failed for buf %d\n", i); + return -EIO; + } + } + + return 0; +} + +static void ipoib_ib_handle_wc(struct net_device *dev, + struct ib_wc *wc) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + unsigned int wr_id = wc->wr_id; + + ipoib_dbg_data(priv, "called: id %d, op %d, status: %d\n", + wr_id, wc->opcode, wc->status); + + if (wr_id & IPOIB_OP_RECV) { + wr_id &= ~IPOIB_OP_RECV; + + if (wr_id < IPOIB_RX_RING_SIZE) { + struct sk_buff *skb = priv->rx_ring[wr_id].skb; + + priv->rx_ring[wr_id].skb = NULL; + + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(&priv->rx_ring[wr_id], + mapping), + IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + + if (wc->status != IB_WC_SUCCESS) { + if (wc->status != IB_WC_WR_FLUSH_ERR) + ipoib_warn(priv, "failed recv event " + "(status=%d, wrid=%d vend_err %x)\n", + wc->status, wr_id, wc->vendor_err); + dev_kfree_skb_any(skb); + return; + } + + ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", + wc->byte_len, wc->slid); + + skb_put(skb, wc->byte_len); + skb_pull(skb, IB_GRH_BYTES); + + if (wc->slid != priv->local_lid || + wc->src_qp != priv->qp->qp_num) { + skb->protocol = ((struct ipoib_header *) skb->data)->proto; + + skb_pull(skb, IPOIB_ENCAP_LEN); + + dev->last_rx = jiffies; + 
++priv->stats.rx_packets; + priv->stats.rx_bytes += skb->len; + + skb->dev = dev; + /* XXX get correct PACKET_ type here */ + skb->pkt_type = PACKET_HOST; + netif_rx_ni(skb); + } else { + ipoib_dbg_data(priv, "dropping loopback packet\n"); + dev_kfree_skb_any(skb); + } + + /* repost receive */ + if (ipoib_ib_post_receive(dev, wr_id)) + ipoib_warn(priv, "ipoib_ib_post_receive failed " + "for buf %d\n", wr_id); + } else + ipoib_warn(priv, "completion event with wrid %d\n", + wr_id); + + } else { + struct ipoib_buf *tx_req; + unsigned long flags; + + if (wr_id >= IPOIB_TX_RING_SIZE) { + ipoib_warn(priv, "completion event with wrid %d (> %d)\n", + wr_id, IPOIB_TX_RING_SIZE); + return; + } + + ipoib_dbg_data(priv, "send complete, wrid %d\n", wr_id); + + tx_req = &priv->tx_ring[wr_id]; + + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(tx_req, mapping), + tx_req->skb->len, + DMA_TO_DEVICE); + + ++priv->stats.tx_packets; + priv->stats.tx_bytes += tx_req->skb->len; + + dev_kfree_skb_any(tx_req->skb); + + spin_lock_irqsave(&priv->tx_lock, flags); + ++priv->tx_tail; + if (priv->tx_head - priv->tx_tail <= IPOIB_TX_RING_SIZE / 2) + netif_wake_queue(dev); + spin_unlock_irqrestore(&priv->tx_lock, flags); + + if (wc->status != IB_WC_SUCCESS && + wc->status != IB_WC_WR_FLUSH_ERR) + ipoib_warn(priv, "failed send event " + "(status=%d, wrid=%d vend_err %x)\n", + wc->status, wr_id, wc->vendor_err); + } +} + +void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) +{ + struct net_device *dev = (struct net_device *) dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + int n, i; + + ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); + do { + n = ib_poll_cq(cq, IPOIB_NUM_WC, priv->ibwc); + for (i = 0; i < n; ++i) + ipoib_ib_handle_wc(dev, priv->ibwc + i); + } while (n == IPOIB_NUM_WC); +} + +static inline int post_send(struct ipoib_dev_priv *priv, + unsigned int wr_id, + struct ib_ah *address, u32 qpn, + dma_addr_t addr, int len) +{ + struct ib_sge list = { + .addr = addr, + 
.length = len, + .lkey = priv->mr->lkey, + }; + struct ib_send_wr param = { + .wr_id = wr_id, + .opcode = IB_WR_SEND, + .sg_list = &list, + .num_sge = 1, + .wr = { + .ud = { + .remote_qpn = qpn, + .remote_qkey = priv->qkey, + .ah = address + }, + }, + .send_flags = IB_SEND_SIGNALED, + }; + struct ib_send_wr *bad_wr; + + return ib_post_send(priv->qp, &param, &bad_wr); +} + +void ipoib_send(struct net_device *dev, struct sk_buff *skb, + struct ipoib_ah *address, u32 qpn) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_buf *tx_req; + dma_addr_t addr; + + if (skb->len > dev->mtu + INFINIBAND_ALEN) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, dev->mtu + INFINIBAND_ALEN); + ++priv->stats.tx_dropped; + ++priv->stats.tx_errors; + dev_kfree_skb_any(skb); + return; + } + + ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", + skb->len, address, qpn); + + /* + * We put the skb into the tx_ring _before_ we call post_send() + * because it's entirely possible that the completion handler will + * run before we execute anything after the post_send(). That + * means we have to make sure everything is properly recorded and + * our state is consistent before we call post_send(). 
+ */ + tx_req = &priv->tx_ring[priv->tx_head & (IPOIB_TX_RING_SIZE - 1)]; + tx_req->skb = skb; + addr = dma_map_single(priv->ca->dma_device, skb->data, skb->len, + DMA_TO_DEVICE); + pci_unmap_addr_set(tx_req, mapping, addr); + + if (unlikely(post_send(priv, priv->tx_head & (IPOIB_TX_RING_SIZE - 1), + address->ah, qpn, addr, skb->len))) { + ipoib_warn(priv, "post_send failed\n"); + ++priv->stats.tx_errors; + dma_unmap_single(priv->ca->dma_device, addr, skb->len, + DMA_TO_DEVICE); + dev_kfree_skb_any(skb); + } else { + dev->trans_start = jiffies; + + address->last_send = priv->tx_head; + ++priv->tx_head; + + if (priv->tx_head - priv->tx_tail == IPOIB_TX_RING_SIZE) { + ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n"); + netif_stop_queue(dev); + } + } +} + +void __ipoib_reap_ah(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_ah *ah, *tah; + LIST_HEAD(remove_list); + + spin_lock_irq(&priv->lock); + list_for_each_entry_safe(ah, tah, &priv->dead_ahs, list) + if (ah->last_send <= priv->tx_tail) { + list_del(&ah->list); + list_add_tail(&ah->list, &remove_list); + } + spin_unlock_irq(&priv->lock); + + list_for_each_entry_safe(ah, tah, &remove_list, list) { + ipoib_dbg(priv, "Reaping ah %p\n", ah->ah); + ib_destroy_ah(ah->ah); + kfree(ah); + } +} + +void ipoib_reap_ah(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + __ipoib_reap_ah(dev); + + if (!test_bit(IPOIB_STOP_REAPER, &priv->flags)) + queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ); +} + +int ipoib_ib_dev_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + ret = ipoib_qp_create(dev); + if (ret) { + ipoib_warn(priv, "ipoib_qp_create returned %d\n", ret); + return -1; + } + + ret = ipoib_ib_post_receives(dev); + if (ret) { + ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret); + return -1; + } + + clear_bit(IPOIB_STOP_REAPER, &priv->flags); + 
queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ); + + return 0; +} + +int ipoib_ib_dev_up(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + set_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + + return ipoib_mcast_start_thread(dev); +} + +int ipoib_ib_dev_down(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "downing ib_dev\n"); + + clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + netif_carrier_off(dev); + + /* Shutdown the P_Key thread if still active */ + if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) { + down(&pkey_sem); + set_bit(IPOIB_PKEY_STOP, &priv->flags); + cancel_delayed_work(&priv->pkey_task); + up(&pkey_sem); + flush_workqueue(ipoib_workqueue); + } + + ipoib_mcast_stop_thread(dev); + + /* + * Flush the multicast groups first so we stop any multicast joins. The + * completion thread may have already died and we may deadlock waiting + * for the completion thread to finish some multicast joins. 
+ */ + ipoib_mcast_dev_flush(dev); + + /* Delete broadcast and local addresses since they will be recreated */ + ipoib_mcast_dev_down(dev); + + ipoib_flush_paths(dev); + + return 0; +} + +static int recvs_pending(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) + if (priv->rx_ring[i].skb) + return 1; + + return 0; +} + +int ipoib_ib_dev_stop(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_attr qp_attr; + int attr_mask; + int i; + + /* Kill the existing QP and allocate a new one */ + qp_attr.qp_state = IB_QPS_ERR; + attr_mask = IB_QP_STATE; + if (ib_modify_qp(priv->qp, &qp_attr, attr_mask)) + ipoib_warn(priv, "Failed to modify QP to ERROR state\n"); + + /* Wait for all sends and receives to complete */ + while (priv->tx_head != priv->tx_tail || recvs_pending(dev)) + yield(); + + ipoib_dbg(priv, "All sends and receives done.\n"); + + qp_attr.qp_state = IB_QPS_RESET; + attr_mask = IB_QP_STATE; + if (ib_modify_qp(priv->qp, &qp_attr, attr_mask)) + ipoib_warn(priv, "Failed to modify QP to RESET state\n"); + + /* Wait for all AHs to be reaped */ + set_bit(IPOIB_STOP_REAPER, &priv->flags); + cancel_delayed_work(&priv->ah_reap_task); + flush_workqueue(ipoib_workqueue); + while (!list_empty(&priv->dead_ahs)) { + __ipoib_reap_ah(dev); + yield(); + } + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) + if (priv->rx_ring[i].skb) + ipoib_warn(priv, "Recv skb still around @ %d\n", i); + + return 0; +} + +int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + priv->ca = ca; + priv->port = port; + priv->qp = NULL; + + if (ipoib_transport_dev_init(dev, ca)) { + printk(KERN_WARNING "%s: ipoib_transport_dev_init failed\n", ca->name); + return -ENODEV; + } + + if (dev->flags & IFF_UP) { + if (ipoib_ib_dev_open(dev)) { + ipoib_transport_dev_cleanup(dev); + return -ENODEV; + } + } + + return 
0; +} + +void ipoib_ib_dev_flush(void *_dev) +{ + struct net_device *dev = (struct net_device *)_dev; + struct ipoib_dev_priv *priv = netdev_priv(dev), *cpriv; + + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + return; + + ipoib_dbg(priv, "flushing\n"); + + ipoib_ib_dev_down(dev); + + /* + * The device could have been brought down between the start and when + * we get here, don't bring it back up if it's not configured up + */ + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + ipoib_ib_dev_up(dev); + + /* Flush any child interfaces too */ + list_for_each_entry(cpriv, &priv->child_intfs, list) + ipoib_ib_dev_flush(cpriv->dev); +} + +void ipoib_ib_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "cleaning up ib_dev\n"); + + ipoib_mcast_stop_thread(dev); + + /* Delete the broadcast address and the local address */ + ipoib_mcast_dev_down(dev); + + ipoib_transport_dev_cleanup(dev); +} + +/* + * Delayed P_Key Assignment Interim Support + * + * The following is an initial implementation of the delayed P_Key assignment + * mechanism. It uses the same approach implemented for the multicast + * group join. The single goal of this implementation is to quickly address + * Bug #2507. This implementation will probably be removed when the P_Key + * change async notification is available. 
+ */ +int ipoib_open(struct net_device *dev); + +static void ipoib_pkey_dev_check_presence(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + u16 pkey_index = 0; + + if (ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index)) + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + else + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); +} + +void ipoib_pkey_poll(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_pkey_dev_check_presence(dev); + + if (test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) + ipoib_open(dev); + else { + down(&pkey_sem); + if (!test_bit(IPOIB_PKEY_STOP, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->pkey_task, + HZ); + up(&pkey_sem); + } +} + +int ipoib_pkey_dev_delay_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + /* Look for the interface pkey value in the IB Port P_Key table and */ + /* set the interface pkey assignment flag */ + ipoib_pkey_dev_check_presence(dev); + + /* P_Key value not assigned yet - start polling */ + if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) { + down(&pkey_sem); + clear_bit(IPOIB_PKEY_STOP, &priv->flags); + queue_delayed_work(ipoib_workqueue, + &priv->pkey_task, + HZ); + up(&pkey_sem); + return 1; + } + + return 0; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_main.c 2004-12-19 22:04:17.658096974 -0800 @@ -0,0 +1,1084 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_main.c 1362 2004-12-18 15:56:29Z roland $ + */ + +#include "ipoib.h" + +#include +#include + +#include +#include +#include + +#include /* For ARPHRD_xxx */ + +#include +#include + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("IP-over-InfiniBand net driver"); +MODULE_LICENSE("Dual BSD/GPL"); + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int debug_level; + +module_param(debug_level, int, 0644); +MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0"); +#endif + +static const u8 ipv4_bcast_addr[] = { + 0x00, 0xff, 0xff, 0xff, + 0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff +}; + +struct workqueue_struct *ipoib_workqueue; + +static void ipoib_add_one(struct ib_device *device); +static void ipoib_remove_one(struct ib_device *device); + +static struct ib_client ipoib_client = { + .name = "ipoib", + .add = ipoib_add_one, + .remove = ipoib_remove_one +}; + +int ipoib_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "bringing up interface\n"); + + set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); + + if (ipoib_pkey_dev_delay_open(dev)) + return 0; + + if (ipoib_ib_dev_open(dev)) + return -EINVAL; + + if (ipoib_ib_dev_up(dev)) + return -EINVAL; + + if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { + struct ipoib_dev_priv *cpriv; + + /* Bring up any child interfaces too */ + down(&priv->vlan_mutex); + list_for_each_entry(cpriv, &priv->child_intfs, list) { + int flags; + + flags = cpriv->dev->flags; + if (flags & IFF_UP) + continue; + + dev_change_flags(cpriv->dev, flags | IFF_UP); + } + up(&priv->vlan_mutex); + } + + netif_start_queue(dev); + + return 0; +} + +static int ipoib_stop(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "stopping interface\n"); + + clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); + + netif_stop_queue(dev); + + ipoib_ib_dev_down(dev); + ipoib_ib_dev_stop(dev); + + if 
(!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { + struct ipoib_dev_priv *cpriv; + + /* Bring down any child interfaces too */ + down(&priv->vlan_mutex); + list_for_each_entry(cpriv, &priv->child_intfs, list) { + int flags; + + flags = cpriv->dev->flags; + if (!(flags & IFF_UP)) + continue; + + dev_change_flags(cpriv->dev, flags & ~IFF_UP); + } + up(&priv->vlan_mutex); + } + + return 0; +} + +static int ipoib_change_mtu(struct net_device *dev, int new_mtu) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + return -EINVAL; + + priv->admin_mtu = new_mtu; + + dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); + + return 0; +} + +static struct ipoib_path *__path_find(struct net_device *dev, + union ib_gid *gid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node *n = priv->path_tree.rb_node; + struct ipoib_path *path; + int ret; + + while (n) { + path = rb_entry(n, struct ipoib_path, rb_node); + + ret = memcmp(gid->raw, path->pathrec.dgid.raw, + sizeof (union ib_gid)); + + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return path; + } + + return NULL; +} + +static int __path_add(struct net_device *dev, struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node **n = &priv->path_tree.rb_node; + struct rb_node *pn = NULL; + struct ipoib_path *tpath; + int ret; + + while (*n) { + pn = *n; + tpath = rb_entry(pn, struct ipoib_path, rb_node); + + ret = memcmp(path->pathrec.dgid.raw, tpath->pathrec.dgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = &pn->rb_left; + else if (ret > 0) + n = &pn->rb_right; + else + return -EEXIST; + } + + rb_link_node(&path->rb_node, pn, n); + rb_insert_color(&path->rb_node, &priv->path_tree); + + list_add_tail(&path->list, &priv->path_list); + + return 0; +} + +static void __path_free(struct net_device *dev, struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct 
ipoib_neigh *neigh, *tn; + struct sk_buff *skb; + + while ((skb = __skb_dequeue(&path->queue))) + dev_kfree_skb_irq(skb); + + list_for_each_entry_safe(neigh, tn, &path->neigh_list, list) { + if (neigh->ah) + ipoib_put_ah(neigh->ah); + *to_ipoib_neigh(neigh->neighbour) = NULL; + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + } + + if (path->ah) + ipoib_put_ah(path->ah); + + rb_erase(&path->rb_node, &priv->path_tree); + list_del(&path->list); + kfree(path); +} + +void ipoib_flush_paths(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path, *tp; + LIST_HEAD(remove_list); + unsigned long flags; + + spin_lock_irqsave(&priv->lock, flags); + list_splice(&priv->path_list, &remove_list); + INIT_LIST_HEAD(&priv->path_list); + spin_unlock_irqrestore(&priv->lock, flags); + + list_for_each_entry_safe(path, tp, &remove_list, list) { + if (path->query) + ib_sa_cancel_query(path->query_id, path->query); + wait_for_completion(&path->done); + __path_free(dev, path); + } +} + +static void path_rec_completion(int status, + struct ib_sa_path_rec *pathrec, + void *path_ptr) +{ + struct ipoib_path *path = path_ptr; + struct net_device *dev = path->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_ah *ah = NULL; + struct ipoib_neigh *neigh; + struct sk_buff_head skqueue; + struct sk_buff *skb; + unsigned long flags; + + if (pathrec) + ipoib_dbg(priv, "PathRec LID 0x%04x for GID " IPOIB_GID_FMT "\n", + be16_to_cpu(pathrec->dlid), IPOIB_GID_ARG(pathrec->dgid)); + else + ipoib_dbg(priv, "PathRec status %d for GID " IPOIB_GID_FMT "\n", + status, IPOIB_GID_ARG(path->pathrec.dgid)); + + skb_queue_head_init(&skqueue); + + if (!status) { + /* + * For now we set static_rate to 0. 
This is not + * really correct: we should look at the rate + * component of the path member record, compare it + * with the rate of our local port (calculated from + * the active link speed and link width) and set an + * inter-packet delay appropriately. + */ + struct ib_ah_attr av = { + .dlid = be16_to_cpu(pathrec->dlid), + .sl = pathrec->sl, + .static_rate = 0, + .port_num = priv->port + }; + + ah = ipoib_create_ah(dev, priv->pd, &av); + } + + spin_lock_irqsave(&priv->lock, flags); + + path->ah = ah; + + if (ah) { + path->pathrec = *pathrec; + + ipoib_dbg(priv, "created address handle %p for LID 0x%04x, SL %d\n", + ah, be16_to_cpu(pathrec->dlid), pathrec->sl); + + while ((skb = __skb_dequeue(&path->queue))) + __skb_queue_tail(&skqueue, skb); + + list_for_each_entry(neigh, &path->neigh_list, list) { + kref_get(&path->ah->ref); + neigh->ah = path->ah; + + while ((skb = __skb_dequeue(&neigh->queue))) + __skb_queue_tail(&skqueue, skb); + } + } else + path->query = NULL; + + complete(&path->done); + + spin_unlock_irqrestore(&priv->lock, flags); + + while ((skb = __skb_dequeue(&skqueue))) { + skb->dev = dev; + if (dev_queue_xmit(skb)) + ipoib_warn(priv, "dev_queue_xmit failed " + "to requeue packet\n"); + } +} + +static struct ipoib_path *path_rec_create(struct net_device *dev, + union ib_gid *gid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + + path = kmalloc(sizeof *path, GFP_ATOMIC); + if (!path) + return NULL; + + path->dev = dev; + path->pathrec.dlid = 0; + + skb_queue_head_init(&path->queue); + + INIT_LIST_HEAD(&path->neigh_list); + path->query = NULL; + init_completion(&path->done); + + memcpy(path->pathrec.dgid.raw, gid->raw, sizeof (union ib_gid)); + path->pathrec.sgid = priv->local_gid; + path->pathrec.pkey = cpu_to_be16(priv->pkey); + path->pathrec.numb_path = 1; + + __path_add(dev, path); + + return path; +} + +static int path_rec_start(struct net_device *dev, + struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = 
netdev_priv(dev); + + ipoib_dbg(priv, "Start path record lookup for " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(path->pathrec.dgid)); + + path->query_id = + ib_sa_path_rec_get(priv->ca, priv->port, + &path->pathrec, + IB_SA_PATH_REC_DGID | + IB_SA_PATH_REC_SGID | + IB_SA_PATH_REC_NUMB_PATH | + IB_SA_PATH_REC_PKEY, + 1000, GFP_ATOMIC, + path_rec_completion, + path, &path->query); + if (path->query_id < 0) { + ipoib_warn(priv, "ib_sa_path_rec_get failed\n"); + path->query = NULL; + return path->query_id; + } + + return 0; +} + +static void neigh_add_path(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + struct ipoib_neigh *neigh; + + neigh = kmalloc(sizeof *neigh, GFP_ATOMIC); + if (!neigh) { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + return; + } + + skb_queue_head_init(&neigh->queue); + neigh->neighbour = skb->dst->neighbour; + *to_ipoib_neigh(skb->dst->neighbour) = neigh; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. 
+ */ + spin_lock(&priv->lock); + + path = __path_find(dev, (union ib_gid *) (skb->dst->neighbour->ha + 4)); + if (!path) { + path = path_rec_create(dev, + (union ib_gid *) (skb->dst->neighbour->ha + 4)); + if (!path) + goto err; + } + + list_add_tail(&neigh->list, &path->neigh_list); + + if (path->pathrec.dlid) { + kref_get(&path->ah->ref); + neigh->ah = path->ah; + + ipoib_send(dev, skb, path->ah, + be32_to_cpup((__be32 *) skb->dst->neighbour->ha)); + } else { + neigh->ah = NULL; + if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) { + __skb_queue_tail(&neigh->queue, skb); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + if (!path->query && path_rec_start(dev, path)) + goto err; + } + + spin_unlock(&priv->lock); + return; + +err: + *to_ipoib_neigh(skb->dst->neighbour) = NULL; + list_del(&neigh->list); + /* clear the destructor before freeing neigh to avoid a use-after-free */ + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + + spin_unlock(&priv->lock); +} + +static void path_lookup(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(skb->dev); + + /* Look up path record for unicasts */ + if (skb->dst->neighbour->ha[4] != 0xff) { + neigh_add_path(skb, dev); + return; + } + + /* Add in the P_Key for multicasts */ + skb->dst->neighbour->ha[8] = (priv->pkey >> 8) & 0xff; + skb->dst->neighbour->ha[9] = priv->pkey & 0xff; + ipoib_mcast_send(dev, (union ib_gid *) (skb->dst->neighbour->ha + 4), skb); +} + +static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev, + struct ipoib_pseudoheader *phdr) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. 
+ */ + spin_lock(&priv->lock); + + path = __path_find(dev, (union ib_gid *) (phdr->hwaddr + 4)); + if (!path) { + path = path_rec_create(dev, + (union ib_gid *) (phdr->hwaddr + 4)); + if (path) { + /* put pseudoheader back on for next time */ + skb_push(skb, sizeof *phdr); + __skb_queue_tail(&path->queue, skb); + + if (path_rec_start(dev, path)) + __path_free(dev, path); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + spin_unlock(&priv->lock); + return; + } + + if (path->pathrec.dlid) { + ipoib_dbg(priv, "Send unicast ARP to %04x\n", + be16_to_cpu(path->pathrec.dlid)); + + ipoib_send(dev, skb, path->ah, + be32_to_cpup((__be32 *) phdr->hwaddr)); + } else if ((path->query || !path_rec_start(dev, path)) && + skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) { + /* put pseudoheader back on for next time */ + skb_push(skb, sizeof *phdr); + __skb_queue_tail(&path->queue, skb); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + spin_unlock(&priv->lock); +} + +static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh; + unsigned long flags; + + local_irq_save(flags); + if (!spin_trylock(&priv->tx_lock)) { + local_irq_restore(flags); + return NETDEV_TX_LOCKED; + } + + /* + * Check if our queue is full. Since we have the LLTX feature + * bit set, we can't rely on netif_stop_queue() preventing our + * xmit function from being called with a full queue. + * + * This is a temporary workaround until LLTX is fixed so that + * hard_start_xmit does not get called after netif_stop_queue(). 
+ */ + if (unlikely(priv->tx_head - priv->tx_tail >= IPOIB_TX_RING_SIZE)) { + ipoib_dbg(priv, "TX ring full in xmit, stopping kernel net queue\n"); + netif_stop_queue(dev); + spin_unlock_irqrestore(&priv->tx_lock, flags); + return NETDEV_TX_BUSY; + } + + if (skb->dst && skb->dst->neighbour) { + if (unlikely(!*to_ipoib_neigh(skb->dst->neighbour))) { + path_lookup(skb, dev); + goto out; + } + + neigh = *to_ipoib_neigh(skb->dst->neighbour); + + if (likely(neigh->ah)) { + ipoib_send(dev, skb, neigh->ah, + be32_to_cpup((__be32 *) skb->dst->neighbour->ha)); + goto out; + } + + if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) { + spin_lock(&priv->lock); + __skb_queue_tail(&neigh->queue, skb); + spin_unlock(&priv->lock); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + } else { + struct ipoib_pseudoheader *phdr = + (struct ipoib_pseudoheader *) skb->data; + skb_pull(skb, sizeof *phdr); + + if (phdr->hwaddr[4] == 0xff) { + /* Add in the P_Key for multicast*/ + phdr->hwaddr[8] = (priv->pkey >> 8) & 0xff; + phdr->hwaddr[9] = priv->pkey & 0xff; + + ipoib_mcast_send(dev, (union ib_gid *) (phdr->hwaddr + 4), skb); + } else { + /* unicast GID -- should be ARP reply */ + + if (be16_to_cpup((u16 *) skb->data) != ETH_P_ARP) { + ipoib_warn(priv, "Unicast, no %s: type %04x, QPN %06x " + IPOIB_GID_FMT "\n", + skb->dst ? 
"neigh" : "dst", + be16_to_cpup((u16 *) skb->data), + be32_to_cpup((u32 *) phdr->hwaddr), + IPOIB_GID_ARG(*(union ib_gid *) (phdr->hwaddr + 4))); + dev_kfree_skb_any(skb); + ++priv->stats.tx_dropped; + goto out; + } + + unicast_arp_send(skb, dev, phdr); + } + } + +out: + spin_unlock_irqrestore(&priv->tx_lock, flags); + + return NETDEV_TX_OK; +} + +struct net_device_stats *ipoib_get_stats(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + return &priv->stats; +} + +static void ipoib_timeout(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_warn(priv, "transmit timeout: latency %ld\n", + jiffies - dev->trans_start); + /* XXX reset QP, etc. */ +} + +static int ipoib_hard_header(struct sk_buff *skb, + struct net_device *dev, + unsigned short type, + void *daddr, void *saddr, unsigned len) +{ + struct ipoib_header *header; + + header = (struct ipoib_header *) skb_push(skb, sizeof *header); + + header->proto = htons(type); + header->reserved = 0; + + /* + * If we don't have a neighbour structure, stuff the + * destination address onto the front of the skb so we can + * figure out where to send the packet later. 
+ */ + if (!skb->dst || !skb->dst->neighbour) { + struct ipoib_pseudoheader *phdr = + (struct ipoib_pseudoheader *) skb_push(skb, sizeof *phdr); + memcpy(phdr->hwaddr, daddr, INFINIBAND_ALEN); + } + + return 0; +} + +static void ipoib_set_mcast_list(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + schedule_work(&priv->restart_task); +} + +static void ipoib_neigh_destructor(struct neighbour *n) +{ + struct ipoib_neigh *neigh = *to_ipoib_neigh(n); + struct ipoib_dev_priv *priv = netdev_priv(n->dev); + unsigned long flags; + + ipoib_dbg(priv, + "neigh_destructor for %06x " IPOIB_GID_FMT "\n", + be32_to_cpup((__be32 *) n->ha), + IPOIB_GID_ARG(*((union ib_gid *) (n->ha + 4)))); + + spin_lock_irqsave(&priv->lock, flags); + + if (neigh) { + if (neigh->ah) + ipoib_put_ah(neigh->ah); + list_del(&neigh->list); + *to_ipoib_neigh(n) = NULL; + kfree(neigh); + } + + spin_unlock_irqrestore(&priv->lock, flags); +} + +static int ipoib_neigh_setup(struct neighbour *neigh) +{ + /* + * Is this kosher? I can't find anybody in the kernel that + * sets neigh->destructor, so we should be able to set it here + * without trouble. 
+ */ + neigh->ops->destructor = ipoib_neigh_destructor; + + return 0; +} + +static int ipoib_neigh_setup_dev(struct net_device *dev, struct neigh_parms *parms) +{ + parms->neigh_setup = ipoib_neigh_setup; + + return 0; +} + +int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + /* Allocate RX/TX "rings" to hold queued skbs */ + + priv->rx_ring = kmalloc(IPOIB_RX_RING_SIZE * sizeof (struct ipoib_buf), + GFP_KERNEL); + if (!priv->rx_ring) { + printk(KERN_WARNING "%s: failed to allocate RX ring (%d entries)\n", + ca->name, IPOIB_RX_RING_SIZE); + goto out; + } + memset(priv->rx_ring, 0, + IPOIB_RX_RING_SIZE * sizeof (struct ipoib_buf)); + + priv->tx_ring = kmalloc(IPOIB_TX_RING_SIZE * sizeof (struct ipoib_buf), + GFP_KERNEL); + if (!priv->tx_ring) { + printk(KERN_WARNING "%s: failed to allocate TX ring (%d entries)\n", + ca->name, IPOIB_TX_RING_SIZE); + goto out_rx_ring_cleanup; + } + memset(priv->tx_ring, 0, + IPOIB_TX_RING_SIZE * sizeof (struct ipoib_buf)); + + /* priv->tx_head & tx_tail are already 0 */ + + if (ipoib_ib_dev_init(dev, ca, port)) + goto out_tx_ring_cleanup; + + return 0; + +out_tx_ring_cleanup: + kfree(priv->tx_ring); + +out_rx_ring_cleanup: + kfree(priv->rx_ring); + +out: + return -ENOMEM; +} + +void ipoib_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev), *cpriv, *tcpriv; + + ipoib_delete_debug_file(dev); + + /* Delete any child interfaces first */ + list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list) { + unregister_netdev(cpriv->dev); + ipoib_dev_cleanup(cpriv->dev); + free_netdev(cpriv->dev); + } + + ipoib_ib_dev_cleanup(dev); + + if (priv->rx_ring) { + kfree(priv->rx_ring); + priv->rx_ring = NULL; + } + + if (priv->tx_ring) { + kfree(priv->tx_ring); + priv->tx_ring = NULL; + } +} + +static void ipoib_setup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + dev->open = ipoib_open; + 
dev->stop = ipoib_stop; + dev->change_mtu = ipoib_change_mtu; + dev->hard_start_xmit = ipoib_start_xmit; + dev->get_stats = ipoib_get_stats; + dev->tx_timeout = ipoib_timeout; + dev->hard_header = ipoib_hard_header; + dev->set_multicast_list = ipoib_set_mcast_list; + dev->neigh_setup = ipoib_neigh_setup_dev; + + dev->watchdog_timeo = HZ; + + dev->rebuild_header = NULL; + dev->set_mac_address = NULL; + dev->header_cache_update = NULL; + + dev->flags |= IFF_BROADCAST | IFF_MULTICAST; + + /* + * We add in INFINIBAND_ALEN to allow for the destination + * address "pseudoheader" for skbs without neighbour struct. + */ + dev->hard_header_len = IPOIB_ENCAP_LEN + INFINIBAND_ALEN; + dev->addr_len = INFINIBAND_ALEN; + dev->type = ARPHRD_INFINIBAND; + dev->tx_queue_len = IPOIB_TX_RING_SIZE * 2; + dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; + + /* MTU will be reset when mcast join happens */ + dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; + priv->mcast_mtu = priv->admin_mtu = dev->mtu; + + memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); + + netif_carrier_off(dev); + + SET_MODULE_OWNER(dev); + + priv->dev = dev; + + spin_lock_init(&priv->lock); + spin_lock_init(&priv->tx_lock); + + init_MUTEX(&priv->mcast_mutex); + init_MUTEX(&priv->vlan_mutex); + + INIT_LIST_HEAD(&priv->path_list); + INIT_LIST_HEAD(&priv->child_intfs); + INIT_LIST_HEAD(&priv->dead_ahs); + INIT_LIST_HEAD(&priv->multicast_list); + + INIT_WORK(&priv->pkey_task, ipoib_pkey_poll, priv->dev); + INIT_WORK(&priv->mcast_task, ipoib_mcast_join_task, priv->dev); + INIT_WORK(&priv->flush_task, ipoib_ib_dev_flush, priv->dev); + INIT_WORK(&priv->restart_task, ipoib_mcast_restart_task, priv->dev); + INIT_WORK(&priv->ah_reap_task, ipoib_reap_ah, priv->dev); +} + +struct ipoib_dev_priv *ipoib_intf_alloc(const char *name) +{ + struct net_device *dev; + + dev = alloc_netdev((int) sizeof (struct ipoib_dev_priv), name, + ipoib_setup); + if (!dev) + return NULL; + + return netdev_priv(dev); +} + +static 
ssize_t show_pkey(struct class_device *cdev, char *buf) +{ + struct ipoib_dev_priv *priv = + netdev_priv(container_of(cdev, struct net_device, class_dev)); + + return sprintf(buf, "0x%04x\n", priv->pkey); +} +static CLASS_DEVICE_ATTR(pkey, S_IRUGO, show_pkey, NULL); + +static ssize_t create_child(struct class_device *cdev, + const char *buf, size_t count) +{ + int pkey; + int ret; + + if (sscanf(buf, "%i", &pkey) != 1) + return -EINVAL; + + if (pkey < 0 || pkey > 0xffff) + return -EINVAL; + + ret = ipoib_vlan_add(container_of(cdev, struct net_device, class_dev), + pkey); + + return ret ? ret : count; +} +static CLASS_DEVICE_ATTR(create_child, S_IWUGO, NULL, create_child); + +static ssize_t delete_child(struct class_device *cdev, + const char *buf, size_t count) +{ + int pkey; + int ret; + + if (sscanf(buf, "%i", &pkey) != 1) + return -EINVAL; + + if (pkey < 0 || pkey > 0xffff) + return -EINVAL; + + ret = ipoib_vlan_delete(container_of(cdev, struct net_device, class_dev), + pkey); + + return ret ? 
ret : count; + +} +static CLASS_DEVICE_ATTR(delete_child, S_IWUGO, NULL, delete_child); + +int ipoib_add_pkey_attr(struct net_device *dev) +{ + return class_device_create_file(&dev->class_dev, + &class_device_attr_pkey); +} + +static struct net_device *ipoib_add_port(const char *format, + struct ib_device *hca, u8 port) +{ + struct ipoib_dev_priv *priv; + int result = -ENOMEM; + + priv = ipoib_intf_alloc(format); + if (!priv) + goto alloc_mem_failed; + + SET_NETDEV_DEV(priv->dev, hca->dma_device); + + result = ib_query_pkey(hca, port, 0, &priv->pkey); + if (result) { + printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", + hca->name, port, result); + goto alloc_mem_failed; + } + + priv->dev->broadcast[8] = priv->pkey >> 8; + priv->dev->broadcast[9] = priv->pkey & 0xff; + + result = ib_query_gid(hca, port, 0, &priv->local_gid); + if (result) { + printk(KERN_WARNING "%s: ib_query_gid port %d failed (ret = %d)\n", + hca->name, port, result); + goto alloc_mem_failed; + } else + memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + + + result = ipoib_dev_init(priv->dev, hca, port); + if (result < 0) { + printk(KERN_WARNING "%s: failed to initialize port %d (ret = %d)\n", + hca->name, port, result); + goto device_init_failed; + } + + INIT_IB_EVENT_HANDLER(&priv->event_handler, + priv->ca, ipoib_event); + result = ib_register_event_handler(&priv->event_handler); + if (result < 0) { + printk(KERN_WARNING "%s: ib_register_event_handler failed for " + "port %d (ret = %d)\n", + hca->name, port, result); + goto event_failed; + } + + result = register_netdev(priv->dev); + if (result) { + printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", + hca->name, port, result); + goto register_failed; + } + + if (ipoib_create_debug_file(priv->dev)) + goto debug_failed; + + if (ipoib_add_pkey_attr(priv->dev)) + goto sysfs_failed; + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_create_child)) + goto 
sysfs_failed; + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_delete_child)) + goto sysfs_failed; + + return priv->dev; + +sysfs_failed: + ipoib_delete_debug_file(priv->dev); + +debug_failed: + unregister_netdev(priv->dev); + +register_failed: + ib_unregister_event_handler(&priv->event_handler); + +event_failed: + ipoib_dev_cleanup(priv->dev); + +device_init_failed: + free_netdev(priv->dev); + +alloc_mem_failed: + return ERR_PTR(result); +} + +static void ipoib_add_one(struct ib_device *device) +{ + struct list_head *dev_list; + struct net_device *dev; + struct ipoib_dev_priv *priv; + int s, e, p; + + dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL); + if (!dev_list) + return; + + INIT_LIST_HEAD(dev_list); + + if (device->node_type == IB_NODE_SWITCH) { + s = 0; + e = 0; + } else { + s = 1; + e = device->phys_port_cnt; + } + + for (p = s; p <= e; ++p) { + dev = ipoib_add_port("ib%d", device, p); + if (!IS_ERR(dev)) { + priv = netdev_priv(dev); + list_add_tail(&priv->list, dev_list); + } + } + + ib_set_client_data(device, &ipoib_client, dev_list); +} + +static void ipoib_remove_one(struct ib_device *device) +{ + struct ipoib_dev_priv *priv, *tmp; + struct list_head *dev_list; + + dev_list = ib_get_client_data(device, &ipoib_client); + + list_for_each_entry_safe(priv, tmp, dev_list, list) { + ib_unregister_event_handler(&priv->event_handler); + + unregister_netdev(priv->dev); + ipoib_dev_cleanup(priv->dev); + free_netdev(priv->dev); + } +} + +static int __init ipoib_init_module(void) +{ + int ret; + + ret = ipoib_register_debugfs(); + if (ret) + return ret; + + /* + * We create our own workqueue mainly because we want to be + * able to flush it when devices are being removed. We can't + * use schedule_work()/flush_scheduled_work() because both + * unregister_netdev() and linkwatch_event take the rtnl lock, + * so flush_scheduled_work() can deadlock during device + * removal. 
+ */ + ipoib_workqueue = create_singlethread_workqueue("ipoib"); + if (!ipoib_workqueue) { + ret = -ENOMEM; + goto err_fs; + } + + ret = ib_register_client(&ipoib_client); + if (ret) + goto err_wq; + + return 0; + +err_wq: + destroy_workqueue(ipoib_workqueue); + +err_fs: + ipoib_unregister_debugfs(); + + return ret; +} + +static void __exit ipoib_cleanup_module(void) +{ + ipoib_unregister_debugfs(); + ib_unregister_client(&ipoib_client); + destroy_workqueue(ipoib_workqueue); +} + +module_init(ipoib_init_module); +module_exit(ipoib_cleanup_module); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2004-12-19 22:04:17.682093438 -0800 @@ -0,0 +1,254 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ipoib_verbs.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include + +#include "ipoib.h" + +int ipoib_mcast_attach(struct net_device *dev, u16 mlid, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_attr *qp_attr; + int attr_mask; + int ret; + u16 pkey_index; + + ret = -ENOMEM; + qp_attr = kmalloc(sizeof *qp_attr, GFP_KERNEL); + if (!qp_attr) + goto out; + + if (ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index)) { + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + ret = -ENXIO; + goto out; + } + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + + /* set correct QKey for QP */ + qp_attr->qkey = priv->qkey; + attr_mask = IB_QP_QKEY; + ret = ib_modify_qp(priv->qp, qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP, ret = %d\n", ret); + goto out; + } + + /* attach QP to multicast group */ + down(&priv->mcast_mutex); + ret = ib_attach_mcast(priv->qp, mgid, mlid); + up(&priv->mcast_mutex); + if (ret) + ipoib_warn(priv, "failed to attach to multicast group, ret = %d\n", ret); + +out: + kfree(qp_attr); + return ret; +} + +int ipoib_mcast_detach(struct net_device *dev, u16 mlid, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + down(&priv->mcast_mutex); + ret = ib_detach_mcast(priv->qp, mgid, mlid); + up(&priv->mcast_mutex); + if (ret) + ipoib_warn(priv, "ib_detach_mcast failed (result = %d)\n", ret); + + return ret; +} + +int ipoib_qp_create(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + u16 pkey_index; + struct ib_qp_attr qp_attr; + int attr_mask; + + /* + * Search through the port P_Key table for the requested pkey value. 
+ * The port has to be assigned to the respective IB partition in + * advance. + */ + ret = ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index); + if (ret) { + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + return ret; + } + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + + qp_attr.qp_state = IB_QPS_INIT; + qp_attr.qkey = 0; + qp_attr.port_num = priv->port; + qp_attr.pkey_index = pkey_index; + attr_mask = + IB_QP_QKEY | + IB_QP_PORT | + IB_QP_PKEY_INDEX | + IB_QP_STATE; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to init, ret = %d\n", ret); + goto out_fail; + } + + qp_attr.qp_state = IB_QPS_RTR; + /* Can't set this in a INIT->RTR transition */ + attr_mask &= ~IB_QP_PORT; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to RTR, ret = %d\n", ret); + goto out_fail; + } + + qp_attr.qp_state = IB_QPS_RTS; + qp_attr.sq_psn = 0; + attr_mask |= IB_QP_SQ_PSN; + attr_mask &= ~IB_QP_PKEY_INDEX; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to RTS, ret = %d\n", ret); + goto out_fail; + } + + return 0; + +out_fail: + ib_destroy_qp(priv->qp); + priv->qp = NULL; + + return -EINVAL; +} + +int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_init_attr init_attr = { + .cap = { + .max_send_wr = IPOIB_TX_RING_SIZE, + .max_recv_wr = IPOIB_RX_RING_SIZE, + .max_send_sge = 1, + .max_recv_sge = 1 + }, + .sq_sig_type = IB_SIGNAL_ALL_WR, + .rq_sig_type = IB_SIGNAL_ALL_WR, + .qp_type = IB_QPT_UD + }; + + priv->pd = ib_alloc_pd(priv->ca); + if (IS_ERR(priv->pd)) { + printk(KERN_WARNING "%s: failed to allocate PD\n", ca->name); + return -ENODEV; + } + + priv->cq = ib_create_cq(priv->ca, ipoib_ib_completion, NULL, dev, + IPOIB_TX_RING_SIZE + IPOIB_RX_RING_SIZE + 1); + if (IS_ERR(priv->cq)) { + printk(KERN_WARNING "%s: 
failed to create CQ\n", ca->name); + goto out_free_pd; + } + + if (ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP)) + goto out_free_cq; + + priv->mr = ib_get_dma_mr(priv->pd, IB_ACCESS_LOCAL_WRITE); + if (IS_ERR(priv->mr)) { + printk(KERN_WARNING "%s: ib_get_dma_mr failed\n", ca->name); + goto out_free_cq; + } + + init_attr.send_cq = priv->cq; + init_attr.recv_cq = priv->cq; + + priv->qp = ib_create_qp(priv->pd, &init_attr); + if (IS_ERR(priv->qp)) { + printk(KERN_WARNING "%s: failed to create QP\n", ca->name); + goto out_free_mr; + } + + priv->dev->dev_addr[1] = (priv->qp->qp_num >> 16) & 0xff; + priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; + priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; + + return 0; + +out_free_mr: + ib_dereg_mr(priv->mr); + +out_free_cq: + ib_destroy_cq(priv->cq); + +out_free_pd: + ib_dealloc_pd(priv->pd); + return -ENODEV; +} + +void ipoib_transport_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (priv->qp) { + if (ib_destroy_qp(priv->qp)) + ipoib_warn(priv, "ib_qp_destroy failed\n"); + + priv->qp = NULL; + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + } + + if (ib_dereg_mr(priv->mr)) + ipoib_warn(priv, "ib_dereg_mr failed\n"); + + if (ib_destroy_cq(priv->cq)) + ipoib_warn(priv, "ib_cq_destroy failed\n"); + + if (ib_dealloc_pd(priv->pd)) + ipoib_warn(priv, "ib_dealloc_pd failed\n"); +} + +void ipoib_event(struct ib_event_handler *handler, + struct ib_event *record) +{ + struct ipoib_dev_priv *priv = + container_of(handler, struct ipoib_dev_priv, event_handler); + + if (record->event == IB_EVENT_PORT_ACTIVE || + record->event == IB_EVENT_LID_CHANGE || + record->event == IB_EVENT_SM_CHANGE) { + ipoib_dbg(priv, "Port active event\n"); + schedule_work(&priv->flush_task); + } +} From yoshfuji@linux-ipv6.org Sun Dec 19 22:39:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 22:39:34 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) 
by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK6d6aS027145 for ; Sun, 19 Dec 2004 22:39:26 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 5038533CC2; Mon, 20 Dec 2004 15:38:46 +0900 (JST) Date: Mon, 20 Dec 2004 15:38:45 +0900 (JST) Message-Id: <20041220.153845.70996857.yoshfuji@linux-ipv6.org> To: davem@davemloft.net, roland@topspin.com Cc: linux-kernel@vger.kernel.org, openib-general@openib.org, akpm@osdl.org, torvalds@osdl.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH][v4][0/24] Second InfiniBand merge candidate patch set From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200412192214.KlDxQ7icOmxHYIf0@topspin.com> References: <200412192214.KlDxQ7icOmxHYIf0@topspin.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12902 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <200412192214.KlDxQ7icOmxHYIf0@topspin.com> (at Sun, 19 Dec 2004 22:14:43 -0800), Roland Dreier says: > The following series of patches is the latest version of the OpenIB > InfiniBand drivers. 
We believe that this version is suitable for > merging when 2.6.11 opens (or into -mm immediately), although of > course we are willing to go through as many more iterations as > required to fix any remaining issues. Maybe, via the net queue. David? --yoshfuji From yoshfuji@linux-ipv6.org Sun Dec 19 22:59:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 22:59:24 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK6wtAO028053 for ; Sun, 19 Dec 2004 22:59:16 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 7166333CC2; Mon, 20 Dec 2004 15:58:36 +0900 (JST) Date: Mon, 20 Dec 2004 15:58:36 +0900 (JST) Message-Id: <20041220.155836.75677852.yoshfuji@linux-ipv6.org> To: roland@topspin.com Cc: linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH][v4][19/24] Add IPoIB (IP-over-InfiniBand) driver From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> References: <200412192215.69tnzAhGIT1vQGLF@topspin.com> <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12903 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> (at Sun, 19 Dec 2004 22:15:14 -0800), Roland Dreier says: > +enum { > + IPOIB_PACKET_SIZE = 2048, > + IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, > + > + IPOIB_ENCAP_LEN = 4, > + > + IPOIB_RX_RING_SIZE = 128, > + IPOIB_TX_RING_SIZE = 64, > + > + IPOIB_NUM_WC = 4, > + > + IPOIB_MAX_PATH_REC_QUEUE = 3, > + IPOIB_MAX_MCAST_QUEUE = 3, The above entries do not seem appropriate for an enum (rather than #define). > + > + IPOIB_FLAG_OPER_UP = 0, > + IPOIB_FLAG_ADMIN_UP = 1, > + IPOIB_PKEY_ASSIGNED = 2, > + IPOIB_PKEY_STOP = 3, > + IPOIB_FLAG_SUBINTERFACE = 4, > + IPOIB_MCAST_RUN = 5, > + IPOIB_STOP_REAPER = 6, This seems OK, but are the "xxx_FLAG_xxx" entries really flags? > + IPOIB_MAX_BACKOFF_SECONDS = 16, Ditto, as with the first group. > + IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */ > + IPOIB_MCAST_FLAG_SENDONLY = 1, > + IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */ > + IPOIB_MCAST_FLAG_ATTACHED = 3, These seem fine, but are they really flags? --yoshfuji From linux_lover2004@yahoo.com Sun Dec 19 23:46:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 19 Dec 2004 23:46:32 -0800 (PST) Received: from web52208.mail.yahoo.com (web52208.mail.yahoo.com [206.190.39.90]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBK7k3kU029768 for ; Sun, 19 Dec 2004 23:46:24 -0800 Received: (qmail 64219 invoked by uid 60001); 20 Dec 2004 07:45:35 -0000 Comment: DomainKeys? 
See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=Fo2VcbMzdGSostowbkSo7PwpTHpoPaWeaV0oVdbAlTQeKPVLzSRSp8Jc5o34JKrUf5QN21XZfRFced3yqPWhjQ/oSN+6bK1Fhx7ScHo+7b9nYIjdAVi47E6yS9KHUv6kbQEQG7u7+SpVv0wLCaw4cDYMvhOg12nMeCwF8Ss+vU4= ; Message-ID: <20041220074535.64217.qmail@web52208.mail.yahoo.com> Received: from [203.199.141.99] by web52208.mail.yahoo.com via HTTP; Sun, 19 Dec 2004 23:45:35 PST Date: Sun, 19 Dec 2004 23:45:35 -0800 (PST) From: linux lover Subject: Required resources on network driver programming To: linux-net@vger.kernel.org Cc: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12904 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux_lover2004@yahoo.com Precedence: bulk X-list: netdev Hi all, I am newbie to Linux kernel programming. I want to write my own virtual network device driver that take every packets from IP layer just print the contents of packet(header part with its starting addresses only) and send it to actual device driver for packet transmission and at receiving end receive packet from NIC card again print the header addresses and send it to upper layer for normal packet processing. I require help about where can i get resources or any book for writing virtual network driver with SAMPLE EXAMPLES? regards, linux_lover __________________________________ Do you Yahoo!? Send holiday email and support a worthy cause. Do good. 
http://celebrity.mail.yahoo.com From kaber@trash.net Mon Dec 20 00:08:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 00:08:12 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK87igd030829 for ; Mon, 20 Dec 2004 00:08:05 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CgIZ5-0003tB-83; Mon, 20 Dec 2004 09:06:47 +0100 Message-ID: <41C687EE.1090205@trash.net> Date: Mon, 20 Dec 2004 09:06:06 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: jamal , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> In-Reply-To: <20041219203641.GL17998@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12905 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * jamal <1103487827.1048.188.camel@jzny.localdomain> 2004-12-19 15:23 > >>If the qdisc at that level muddies the packet thats fair game - thats >>what goes out on the wire. So we should leave the code as is. > > > Agreed for egress but I think it is needed for stuff like IMQ. It's > debatable whether we should take care of IMQ and alike though. > You shouldn't care about IMQ, but we still need to copy the packet before modifying it if the data is shared. 
Otherwise we have a race on SMP with AF_PACKET sockets, depending on when the packet is read it can be either modified or not. Converting dsmark to an action sounds like the best long-term solution. Regards Patrick From kaber@trash.net Mon Dec 20 00:29:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 00:29:31 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK8T4iS003474 for ; Mon, 20 Dec 2004 00:29:24 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CgItk-0003vJ-A4; Mon, 20 Dec 2004 09:28:08 +0100 Message-ID: <41C68CEF.3030803@trash.net> Date: Mon, 20 Dec 2004 09:27:27 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S. Miller" , Jamal Hadi Salim , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation References: <20041219203050.GK17998@postel.suug.ch> In-Reply-To: <20041219203050.GK17998@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12906 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > Dave, > > This patch is actually part of a patchset for 2.6.11 inclusion > but the memory corruption possibility might make it worth > for 2.6.10. It's not really dangerous since it requires > admin capabilities though. 
There are lots of places where this is possible, just look at all the silly checks in the action code: sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { > Puts the indev validation into its own function to allow > parameter validation before any changes are made. Changes > the sanity check from >= IFNAMSIZ to > IFNAMSIZ to correctly > handle 0 terminated strings and replaces the dangerous sprintf > with a memcpy bound to the TLV size. Providing an indev TLV > for kernels without CONFIG_NET_CLS_IND support will now lead > to EOPNOTSUPP. Why special-case indev? Neither u32 nor fw make any attempt to clean up after errors in their init functions, so instead of fixing a single attribute they need to do proper cleanup, then we can just continue to validate indev when changing it. Returning EOPNOTSUPP makes sense, of course. I'm also against keeping all those printks when touching the code. It's OK when writing new code, but I don't see why this code, unlike everything else, needs to report errors in the ringbuffer instead of returning meaningful error codes. > +static inline void > +tcf_change_indev(struct tcf_proto *tp, char *indev, struct rtattr *id_tlv) > +{ > + memset(indev, 0, IFNAMSIZ); > + memcpy(indev, RTA_DATA(id_tlv), RTA_PAYLOAD(id_tlv)); > + indev[IFNAMSIZ - 1] = '\0'; > +} And this should just use strlcpy. 
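To illustrate the strlcpy suggestion, here is a minimal userspace sketch, not the actual kernel change. It assumes the attribute payload is a NUL-terminated string, which the length check on the TLV would have to guarantee. The names sketch_strlcpy(), sketch_change_indev() and SKETCH_IFNAMSIZ are hypothetical stand-ins for the kernel's strlcpy(), tcf_change_indev() and IFNAMSIZ.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_IFNAMSIZ 16	/* assumed to match the kernel's IFNAMSIZ */

/*
 * Hypothetical userspace stand-in for the kernel's strlcpy(): copy at
 * most size - 1 bytes, always NUL-terminate when size > 0, and return
 * strlen(src) so the caller can detect truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len >= size ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/*
 * Hypothetical replacement for the memset/memcpy/terminate sequence.
 * Returns 0 on success, -1 if the name did not fit and was truncated.
 */
static int sketch_change_indev(char *indev, const char *name)
{
	if (sketch_strlcpy(indev, name, SKETCH_IFNAMSIZ) >= SKETCH_IFNAMSIZ)
		return -1;
	return 0;
}
```

The point of the single call is that the copy can never run past the destination and the result is always terminated, so the separate memset and the manual '\0' store are no longer needed. The return value also gives the caller a way to reject over-long names instead of silently truncating them.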
Regards Patrick From mrenzmann@web.de Mon Dec 20 01:07:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 01:07:10 -0800 (PST) Received: from smtp06.web.de (smtp06.web.de [217.72.192.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK96g4R004815 for ; Mon, 20 Dec 2004 01:07:02 -0800 Received: from [213.168.104.65] (helo=[192.168.2.42]) by smtp06.web.de with asmtp (TLSv1:RC4-MD5:128) (WEB.DE 4.103 #184) id 1CgJUY-00032L-00; Mon, 20 Dec 2004 10:06:10 +0100 Message-ID: <41C69601.7000202@web.de> Date: Mon, 20 Dec 2004 10:06:09 +0100 From: Michael Renzmann User-Agent: Mozilla Thunderbird 0.8 (X11/20040916) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux lover CC: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Required resources on network driver programming References: <20041220074535.64217.qmail@web52208.mail.yahoo.com> In-Reply-To: <20041220074535.64217.qmail@web52208.mail.yahoo.com> X-Enigmail-Version: 0.86.0.0 X-Enigmail-Supports: pgp-inline, pgp-mime Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Sender: mrenzmann@web.de X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12907 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mrenzmann@web.de Precedence: bulk X-list: netdev Hi. linux lover wrote: > I require help about where can i get > resources or any book for writing virtual network > driver with SAMPLE EXAMPLES? 
I think the following resources might be useful for your task - also there might be alternatives for the implementation (tcpdump/tethereal-like programs that capture the information you need from a packet socket): Linux Device Drivers, 2nd Edition http://www.xml.com/ldd/chapter/book/index.html The Linux Kernel Module Programming Guide http://en.tldp.org/LDP/lkmpg/ Virtual Network Interfaces http://www.linux.it/~rubini/docs/vinter/vinter.html Linux Kernel Module Programming Guide http://oopweb.com/OS/Documents/LKMPG/VolumeFrames.html?/OS/Documents/LKMPG/Volume/node11.html The Linux Kernel API http://kernelnewbies.org/documents/kdoc/kernel-api/linuxkernelapi.html Generally it's also a good idea to have a look at http://www.kernelnewbies.org/ and visit the IRC-channel #kernelnewbies at oftc.net. Bye, Mike From markb@wetlettuce.com Mon Dec 20 01:43:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 01:43:46 -0800 (PST) Received: from piglet.wetlettuce.com (piglet.wetlettuce.com [82.68.149.69]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK9hCpL006233 for ; Mon, 20 Dec 2004 01:43:33 -0800 Received: from robin ([127.0.0.1] helo=wetlettuce.com ident=www-data) by piglet.wetlettuce.com with smtp (Exim 3.35 #1 (Debian)) id 1CgK3M-0000fm-00; Mon, 20 Dec 2004 09:42:08 +0000 Received: from 192.102.214.6 (SquirrelMail authenticated user lists) by webmail.wetlettuce.com with HTTP; Mon, 20 Dec 2004 09:42:08 -0000 (GMT) Message-ID: <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> Date: Mon, 20 Dec 2004 09:42:08 -0000 (GMT) Subject: Re: Lockup with 2.6.9-ac15 related to netconsole From: "Mark Broadbent" To: In-Reply-To: <20041217233524.GA11202@electric-eye.fr.zoreil.com> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> X-Priority: 3 
Importance: Normal X-MSMail-Priority: Normal Cc: , , Reply-To: markb@wetlettuce.com X-Mailer: SquirrelMail (version 1.2.6) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MailScanner: Mail is clear of Viree X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12908 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: markb@wetlettuce.com Precedence: bulk X-list: netdev Francois Romieu said: > Matt Mackall : > [...] >> Please try the attached untested, uncompiled patch to add polling to >> r8169: > [...] >> @@ -1839,6 +1842,15 @@ >> } >> #endif >> >> +#ifdef CONFIG_NET_POLL_CONTROLLER >> +static void rtl8169_netpoll(struct net_device *dev) >> +{ >> + disable_irq(dev->irq); >> + rtl8169_interrupt(dev->irq, netdev, NULL); > ^^^^^^ -> should be "dev" > > The r8169 driver in -mm offers netpoll. A patch which syncs the r8169 > driver from 2.6.10-rc3 with current -mm is available at: > http://www.fr.zoreil.com/people/francois/misc/20041218-2.6.10-rc3-r8169.c-test.patch> > Please report success/failure. Cc: netdev@oss.sgi.com is welcome. Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP' with the r8169 device using the above patch on top of 2.6.10-rc3-bk10. 
Thanks Mark -- Mark Broadbent Web: http://www.wetlettuce.com From roarbr@tihlde.org Mon Dec 20 01:45:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 01:46:03 -0800 (PST) Received: from mail.thales-communications.no (mail.thales-communications.no [80.239.27.10]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBK9jZ96006641 for ; Mon, 20 Dec 2004 01:45:56 -0800 Received: by tcnosr02.tcno.thales (Postfix, from userid 10) id 7F1E86FC82; Mon, 20 Dec 2004 10:45:05 +0100 (CET) Received: from [10.1.3.57] (unknown [80.239.6.50]) by outpost.thales-communications.no (Postfix) with ESMTP id 9F9836FC82; Mon, 20 Dec 2004 10:44:06 +0100 (CET) Message-ID: <41C69EE5.7070907@tihlde.org> Date: Mon, 20 Dec 2004 10:44:05 +0100 From: =?ISO-8859-1?Q?Roar_Bj=F8rgum_Rotvik?= User-Agent: Mozilla Thunderbird 0.7.1 (X11/20040626) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux lover Cc: linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Required resources on network driver programming References: <20041220074535.64217.qmail@web52208.mail.yahoo.com> In-Reply-To: <20041220074535.64217.qmail@web52208.mail.yahoo.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12909 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roarbr@tihlde.org Precedence: bulk X-list: netdev linux lover wrote: > > Hi all, > I am newbie to Linux kernel > programming. 
I want to write my own virtual network > device driver that take every packets from IP layer > just print the contents of packet(header part with its > starting addresses only) and send it to actual device > driver for packet transmission and at receiving end > receive packet from NIC card again print the header > addresses and send it to upper layer for normal packet > processing. > I require help about where can i get > resources or any book for writing virtual network > driver with SAMPLE EXAMPLES? Another mail gave you pointers to driver programming. Here are links to TUN/TAP, a virtual ethernet (or point-to-point) device driver in the Linux kernel. With TAP you have a virtual ethernet device, where the device driver is replaced with a userspace application (that you write) that receives all data sent to the tap device (for instance tap0) from the Linux kernel network stack. This data contains the ethernet header plus the payload (for example an IP header and IP data carrying TCP/UDP, or ARP packets). Your userspace app may also send data to a tap device, and this data will be passed to the Linux network stack as if it came from a real ethernet device. So your userspace app may do whatever it wants with the data: it may print out the contents of the IP layer and forward the data to another PC using the real ethernet device (eth0, or another network medium). The other PC may do the opposite, i.e. receive the data from the ethernet device and send it to the tap device. Your application may even encapsulate the data sent between the two PCs in any protocol you like (that gives you tunneling). If you wanted to do everything in kernel mode this is not what you want, but the tun/tap code may give you some tips on how to create a virtual ethernet device. See these links: http://lxr.linux.no/source/drivers/net/tun.c?v=2.6.8.1 http://vtun.sourceforge.net/tun/ http://vtun.sourceforge.net/tun/faq.html -- Roar B.
Rotvik From kaber@trash.net Mon Dec 20 02:19:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 02:20:05 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKAJTRd008344 for ; Mon, 20 Dec 2004 02:19:50 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CgKcY-00043V-Rq; Mon, 20 Dec 2004 11:18:31 +0100 Message-ID: <41C6A6CC.1050105@trash.net> Date: Mon, 20 Dec 2004 11:17:48 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Thomas Graf , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> In-Reply-To: <1103484249.1046.143.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12910 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > On Wed, 2004-12-15 at 10:42, Patrick McHardy wrote: > >>Since >>this problem is not related to the policer oops fix it doesn't >>convince me that my time would have been well invested doing the >>tests you described. > > > But it is _absolutely_ related to the policer oops. > If those tests were run to begin with there would be no oops neither > this latest problem. 
I agree that this problem would have been avoided if the regression tests were run when the change was made, and it made sense to run them at that time. Unfortunately I missed the patch when it went in, otherwise I would have objected to using a field called "priv" and making assumptions about the layout of the structure it points to in a file called act_api anyway.

But reshuffling structure members of a structure not exposed to userspace doesn't require testing tc, and doesn't require testing old kernels. This was the only point I was trying to make: I run tests when they make sense, but not because I might find something unrelated by accident. As an exception to this, I am willing to run unrelated tests if it is little overhead (== fully automatic).

Even following your logic (We cant compromise quality by handwaving on instinct. Famous last words: "that couldnt have possibly caused a bug down there") I need to either test the entire kernel after each change ("down there" could be anywhere), or judge for myself which parts to test. Blindly running regression tests isn't going to do much good.

> Hopefully with the regression tests in place this will get better.

On a side note, you both seem to be inventing your own testing framework and regression tests. tcng already includes lots of regression tests for tc, tcng and the kernel. Unfortunately, last time I checked, it didn't work with 2.6.

> [You fear Murphy less than i - and thats a style difference. Your style
> is actually more effective in Linux because you can distribute the
> burden onto users. As a matter of fact it is within Daves tolerance
> range (but not mine[1]). So you should do just fine]

I don't feel like I'm distributing burden onto anyone. As I said, I run the tests I deem necessary, and I never send out patches whose correctness I'm not convinced of. So far, my track record has been pretty good.
Regards
Patrick

From lkml@einar-lueck.de Mon Dec 20 03:14:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 03:14:14 -0800 (PST) Received: from smtprelay03.ispgateway.de (smtprelay03.ispgateway.de [80.67.18.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKBDjri012383 for ; Mon, 20 Dec 2004 03:14:06 -0800 Received: (qmail 25938 invoked from network); 20 Dec 2004 11:13:20 -0000 Received: from unknown (HELO [192.168.30.10]) (008508@[217.231.189.59]) (envelope-sender ) by smtprelay03.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 20 Dec 2004 11:13:20 -0000 Message-ID: <41C6B3D4.6060207@einar-lueck.de> Date: Mon, 20 Dec 2004 12:13:24 +0100 From: =?ISO-8859-1?Q?Einar_L=FCck?= User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [PATCH 1/2] ipv4 routing: splitting of ip_route_[in|out]put_slow, 2.6.10-rc3 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12911 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lkml@einar-lueck.de Precedence: bulk X-list: netdev

[PATCH 1/2] ipv4 routing: splitting of ip_route_[in|out]put_slow, 2.6.10-rc3
From: Einar Lueck

This patch splits ip_route_[in|out]put_slow up into inline functions.

Basic idea:
* improve overall comprehensibility
* allow for an easier application of the patch for improved multipath support (refer to the subsequent patch)

Please consider for application.

Regards
Einar.
Signed-off-by: Einar Lueck diff -ruN linux-2.6.9/net/ipv4/route.c linux-2.6.9.split/net/ipv4/route.c --- linux-2.6.9/net/ipv4/route.c 2004-12-15 12:03:59.000000000 +0100 +++ linux-2.6.9.split/net/ipv4/route.c 2004-12-15 12:05:32.000000000 +0100 @@ -104,6 +104,9 @@ #include #endif +#define RT_FL_TOS(oldflp) \ + ((u32)(oldflp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK))) + #define IP_MAX_MTU 0xFFF0 #define RT_GC_TIMEOUT (300*HZ) @@ -143,6 +146,7 @@ static void ipv4_link_failure(struct sk_buff *skb); static void ip_rt_update_pmtu(struct dst_entry *dst, u32 mtu); static int rt_garbage_collect(void); +static inline int compare_keys(struct flowi *fl1, struct flowi *fl2); static struct dst_ops ipv4_dst_ops = { @@ -1533,6 +1537,169 @@ return -EINVAL; } + +static void ip_handle_martian_source(struct net_device *dev, + struct in_device *in_dev, + struct sk_buff *skb, + u32 daddr, + u32 saddr) +{ + RT_CACHE_STAT_INC(in_martian_src); +#ifdef CONFIG_IP_ROUTE_VERBOSE + if (IN_DEV_LOG_MARTIANS(in_dev) && net_ratelimit()) { + /* + * RFC1812 recommendation, if source is martian, + * the only hint is MAC header. 
+ */ + printk(KERN_WARNING "martian source %u.%u.%u.%u from " + "%u.%u.%u.%u, on dev %s\n", + NIPQUAD(daddr), NIPQUAD(saddr), dev->name); + if (dev->hard_header_len) { + int i; + unsigned char *p = skb->mac.raw; + printk(KERN_WARNING "ll header: "); + for (i = 0; i < dev->hard_header_len; i++, p++) { + printk("%02x", *p); + if (i < (dev->hard_header_len - 1)) + printk(":"); + } + printk("\n"); + } + } +#endif +} + +static inline int __mkroute_input(struct sk_buff *skb, + struct fib_result* res, + struct in_device *in_dev, + u32 daddr, u32 saddr, u32 tos, + struct rtable **result) +{ + + struct rtable *rth; + int err; + struct in_device *out_dev; + unsigned flags = 0; + u32 spec_dst, itag; + + /* get a working reference to the output device */ + out_dev = in_dev_get(FIB_RES_DEV(*res)); + if (out_dev == NULL) { + if (net_ratelimit()) + printk(KERN_CRIT "Bug in ip_route_input" \ + "_slow(). Please, report\n"); + return -EINVAL; + } + + + err = fib_validate_source(saddr, daddr, tos, FIB_RES_OIF(*res), + in_dev->dev, &spec_dst, &itag); + if (err < 0) { + ip_handle_martian_source(in_dev->dev, in_dev, skb, daddr, + saddr); + + err = -EINVAL; + goto cleanup; + } + + if (err) + flags |= RTCF_DIRECTSRC; + + if (out_dev == in_dev && err && !(flags & (RTCF_NAT | RTCF_MASQ)) && + (IN_DEV_SHARED_MEDIA(out_dev) || + inet_addr_onlink(out_dev, saddr, FIB_RES_GW(*res)))) + flags |= RTCF_DOREDIRECT; + + if (skb->protocol != htons(ETH_P_IP)) { + /* Not IP (i.e. ARP). Do not create route, if it is + * invalid for proxy arp. DNAT routes are always valid. 
+ */ + if (out_dev == in_dev && !(flags & RTCF_DNAT)) { + err = -EINVAL; + goto cleanup; + } + } + + + rth = dst_alloc(&ipv4_dst_ops); + if (!rth) { + err = -ENOBUFS; + goto cleanup; + } + + rth->u.dst.flags= DST_HOST; + if (in_dev->cnf.no_policy) + rth->u.dst.flags |= DST_NOPOLICY; + if (in_dev->cnf.no_xfrm) + rth->u.dst.flags |= DST_NOXFRM; + rth->fl.fl4_dst = daddr; + rth->rt_dst = daddr; + rth->fl.fl4_tos = tos; +#ifdef CONFIG_IP_ROUTE_FWMARK + rth->fl.fl4_fwmark= skb->nfmark; +#endif + rth->fl.fl4_src = saddr; + rth->rt_src = saddr; + rth->rt_gateway = daddr; + rth->rt_iif = + rth->fl.iif = in_dev->dev->ifindex; + rth->u.dst.dev = (out_dev)->dev; + dev_hold(rth->u.dst.dev); + rth->idev = in_dev_get(rth->u.dst.dev); + rth->fl.oif = 0; + rth->rt_spec_dst= spec_dst; + + rth->u.dst.input = ip_forward; + rth->u.dst.output = ip_output; + + rt_set_nexthop(rth, res, itag); + + rth->rt_flags = flags; + + *result = rth; + err = 0; + cleanup: + /* release the working reference to the output device */ + in_dev_put(out_dev); + return err; +} + +static inline int ip_mkroute_input_def(struct sk_buff *skb, + struct fib_result* res, + const struct flowi *fl, + struct in_device *in_dev, + u32 daddr, u32 saddr, u32 tos) +{ + struct rtable* rth; + int err; + unsigned hash; + +#ifdef CONFIG_IP_ROUTE_MULTIPATH + if (res->fi->fib_nhs > 1 && fl->oif == 0) + fib_select_multipath(fl, res); +#endif + + /* create a routing cache entry */ + err = __mkroute_input( skb, res, in_dev, daddr, saddr, tos, &rth ); + if ( err ) + return err; + atomic_set(&rth->u.dst.__refcnt, 1); + + /* put it into the cache */ + hash = rt_hash_code(daddr, saddr ^ (fl->iif << 5), tos); + return rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); +} + +static inline int ip_mkroute_input(struct sk_buff *skb, + struct fib_result* res, + const struct flowi *fl, + struct in_device *in_dev, + u32 daddr, u32 saddr, u32 tos) +{ + return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, saddr, tos); +} + + /* * NOTE. 
We drop all the packets that has local source * addresses, because every properly looped back packet @@ -1544,11 +1711,10 @@ */ static int ip_route_input_slow(struct sk_buff *skb, u32 daddr, u32 saddr, - u8 tos, struct net_device *dev) + u8 tos, struct net_device *dev) { struct fib_result res; struct in_device *in_dev = in_dev_get(dev); - struct in_device *out_dev = NULL; struct flowi fl = { .nl_u = { .ip4_u = { .daddr = daddr, .saddr = saddr, @@ -1572,8 +1738,6 @@ if (!in_dev) goto out; - hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); - /* Check for the most weird martians, which can be not detected by fib_lookup. */ @@ -1626,79 +1790,14 @@ if (res.type != RTN_UNICAST) goto martian_destination; -#ifdef CONFIG_IP_ROUTE_MULTIPATH - if (res.fi->fib_nhs > 1 && fl.oif == 0) - fib_select_multipath(&fl, &res); -#endif - out_dev = in_dev_get(FIB_RES_DEV(res)); - if (out_dev == NULL) { - if (net_ratelimit()) - printk(KERN_CRIT "Bug in ip_route_input_slow(). " - "Please, report\n"); - goto e_inval; - } - - err = fib_validate_source(saddr, daddr, tos, FIB_RES_OIF(res), dev, - &spec_dst, &itag); - if (err < 0) - goto martian_source; - - if (err) - flags |= RTCF_DIRECTSRC; - - if (out_dev == in_dev && err && !(flags & (RTCF_NAT | RTCF_MASQ)) && - (IN_DEV_SHARED_MEDIA(out_dev) || - inet_addr_onlink(out_dev, saddr, FIB_RES_GW(res)))) - flags |= RTCF_DOREDIRECT; - - if (skb->protocol != htons(ETH_P_IP)) { - /* Not IP (i.e. ARP). Do not create route, if it is - * invalid for proxy arp. DNAT routes are always valid. 
- */ - if (out_dev == in_dev && !(flags & RTCF_DNAT)) - goto e_inval; - } - - rth = dst_alloc(&ipv4_dst_ops); - if (!rth) + err = ip_mkroute_input(skb, &res, &fl, in_dev, daddr, saddr, tos); + if ( err == -ENOBUFS ) goto e_nobufs; - - atomic_set(&rth->u.dst.__refcnt, 1); - rth->u.dst.flags= DST_HOST; - if (in_dev->cnf.no_policy) - rth->u.dst.flags |= DST_NOPOLICY; - if (in_dev->cnf.no_xfrm) - rth->u.dst.flags |= DST_NOXFRM; - rth->fl.fl4_dst = daddr; - rth->rt_dst = daddr; - rth->fl.fl4_tos = tos; -#ifdef CONFIG_IP_ROUTE_FWMARK - rth->fl.fl4_fwmark= skb->nfmark; -#endif - rth->fl.fl4_src = saddr; - rth->rt_src = saddr; - rth->rt_gateway = daddr; - rth->rt_iif = - rth->fl.iif = dev->ifindex; - rth->u.dst.dev = out_dev->dev; - dev_hold(rth->u.dst.dev); - rth->idev = in_dev_get(rth->u.dst.dev); - rth->fl.oif = 0; - rth->rt_spec_dst= spec_dst; - - rth->u.dst.input = ip_forward; - rth->u.dst.output = ip_output; - - rt_set_nexthop(rth, &res, itag); - - rth->rt_flags = flags; - -intern: - err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + if ( err == -EINVAL ) + goto e_inval; + done: in_dev_put(in_dev); - if (out_dev) - in_dev_put(out_dev); if (free_res) fib_res_put(&res); out: return err; @@ -1758,7 +1857,9 @@ rth->rt_flags &= ~RTCF_LOCAL; } rth->rt_type = res.type; - goto intern; + hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); + err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + goto done; no_route: RT_CACHE_STAT_INC(in_no_route); @@ -1786,30 +1887,7 @@ goto done; martian_source: - - RT_CACHE_STAT_INC(in_martian_src); -#ifdef CONFIG_IP_ROUTE_VERBOSE - if (IN_DEV_LOG_MARTIANS(in_dev) && net_ratelimit()) { - /* - * RFC1812 recommendation, if source is martian, - * the only hint is MAC header. 
- */ - printk(KERN_WARNING "martian source %u.%u.%u.%u from " - "%u.%u.%u.%u, on dev %s\n", - NIPQUAD(daddr), NIPQUAD(saddr), dev->name); - if (dev->hard_header_len) { - int i; - unsigned char *p = skb->mac.raw; - printk(KERN_WARNING "ll header: "); - for (i = 0; i < dev->hard_header_len; i++, p++) { - printk("%02x", *p); - if (i < (dev->hard_header_len - 1)) - printk(":"); - } - printk("\n"); - } - } -#endif + ip_handle_martian_source(dev, in_dev, skb, daddr, saddr); goto e_inval; } @@ -1880,13 +1958,166 @@ return ip_route_input_slow(skb, daddr, saddr, tos, dev); } +static inline int __mkroute_output(struct rtable **result, + struct fib_result* res, + const struct flowi *fl, + const struct flowi *oldflp, + struct net_device *dev_out, + unsigned flags) +{ + struct rtable *rth; + struct in_device *in_dev; + u32 tos = RT_FL_TOS(oldflp); + int err = 0; + + if (LOOPBACK(fl->fl4_src) && !(dev_out->flags&IFF_LOOPBACK)) + return -EINVAL; + + if (fl->fl4_dst == 0xFFFFFFFF) + res->type = RTN_BROADCAST; + else if (MULTICAST(fl->fl4_dst)) + res->type = RTN_MULTICAST; + else if (BADCLASS(fl->fl4_dst) || ZERONET(fl->fl4_dst)) + return -EINVAL; + + if (dev_out->flags & IFF_LOOPBACK) + flags |= RTCF_LOCAL; + + /* get work reference to inet device */ + in_dev = in_dev_get(dev_out); + if (!in_dev) + return -EINVAL; + + if (res->type == RTN_BROADCAST) { + flags |= RTCF_BROADCAST | RTCF_LOCAL; + if (res->fi) { + fib_info_put(res->fi); + res->fi = NULL; + } + } else if (res->type == RTN_MULTICAST) { + flags |= RTCF_MULTICAST|RTCF_LOCAL; + if (!ip_check_mc(in_dev, oldflp->fl4_dst, oldflp->fl4_src, + oldflp->proto)) + flags &= ~RTCF_LOCAL; + /* If multicast route do not exist use + default one, but do not gateway in this case. + Yes, it is hack. 
+ */ + if (res->fi && res->prefixlen < 4) { + fib_info_put(res->fi); + res->fi = NULL; + } + } + + + rth = dst_alloc(&ipv4_dst_ops); + if (!rth) { + err = -ENOBUFS; + goto cleanup; + } + + rth->u.dst.flags= DST_HOST; + if (in_dev->cnf.no_xfrm) + rth->u.dst.flags |= DST_NOXFRM; + if (in_dev->cnf.no_policy) + rth->u.dst.flags |= DST_NOPOLICY; + + rth->fl.fl4_dst = oldflp->fl4_dst; + rth->fl.fl4_tos = tos; + rth->fl.fl4_src = oldflp->fl4_src; + rth->fl.oif = oldflp->oif; +#ifdef CONFIG_IP_ROUTE_FWMARK + rth->fl.fl4_fwmark= oldflp->fl4_fwmark; +#endif + rth->rt_dst = fl->fl4_dst; + rth->rt_src = fl->fl4_src; + rth->rt_iif = oldflp->oif ? : dev_out->ifindex; + /* get references to the devices that are to be hold by the routing + cache entry */ + rth->u.dst.dev = dev_out; + dev_hold(dev_out); + rth->idev = in_dev_get(dev_out); + rth->rt_gateway = fl->fl4_dst; + rth->rt_spec_dst= fl->fl4_src; + + rth->u.dst.output=ip_output; + + RT_CACHE_STAT_INC(out_slow_tot); + + if (flags & RTCF_LOCAL) { + rth->u.dst.input = ip_local_deliver; + rth->rt_spec_dst = fl->fl4_dst; + } + if (flags & (RTCF_BROADCAST | RTCF_MULTICAST)) { + rth->rt_spec_dst = fl->fl4_src; + if (flags & RTCF_LOCAL && + !(dev_out->flags & IFF_LOOPBACK)) { + rth->u.dst.output = ip_mc_output; + RT_CACHE_STAT_INC(out_slow_mc); + } +#ifdef CONFIG_IP_MROUTE + if (res->type == RTN_MULTICAST) { + if (IN_DEV_MFORWARD(in_dev) && + !LOCAL_MCAST(oldflp->fl4_dst)) { + rth->u.dst.input = ip_mr_input; + rth->u.dst.output = ip_mc_output; + } + } +#endif + } + + rt_set_nexthop(rth, res, 0); + + rth->rt_flags = flags; + + *result = rth; + cleanup: + /* release work reference to inet device */ + in_dev_put(in_dev); + + return err; +} + +static inline int ip_mkroute_output_def(struct rtable **rp, + struct fib_result* res, + const struct flowi *fl, + const struct flowi *oldflp, + struct net_device *dev_out, + unsigned flags) +{ + struct rtable *rth; + int err = __mkroute_output(&rth, res, fl, oldflp, dev_out, flags); + unsigned 
hash; + if ( err == 0 ) { + u32 tos = RT_FL_TOS(oldflp); + + atomic_set(&rth->u.dst.__refcnt, 1); + + hash = rt_hash_code(oldflp->fl4_dst, + oldflp->fl4_src ^ (oldflp->oif << 5), tos); + err = rt_intern_hash(hash, rth, rp); + } + + return err; +} + +static inline int ip_mkroute_output(struct rtable** rp, + struct fib_result* res, + const struct flowi *fl, + const struct flowi *oldflp, + struct net_device *dev_out, + unsigned flags) +{ + return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, flags); +} + /* * Major route resolver routine. */ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) { - u32 tos = oldflp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK); + u32 tos = RT_FL_TOS(oldflp); struct flowi fl = { .nl_u = { .ip4_u = { .daddr = oldflp->fl4_dst, .saddr = oldflp->fl4_src, @@ -1902,10 +2133,7 @@ .oif = oldflp->oif }; struct fib_result res; unsigned flags = 0; - struct rtable *rth; struct net_device *dev_out = NULL; - struct in_device *in_dev = NULL; - unsigned hash; int free_res = 0; int err; @@ -2065,116 +2293,13 @@ fl.oif = dev_out->ifindex; make_route: - if (LOOPBACK(fl.fl4_src) && !(dev_out->flags&IFF_LOOPBACK)) - goto e_inval; + err = ip_mkroute_output(rp, &res, &fl, oldflp, dev_out, flags); - if (fl.fl4_dst == 0xFFFFFFFF) - res.type = RTN_BROADCAST; - else if (MULTICAST(fl.fl4_dst)) - res.type = RTN_MULTICAST; - else if (BADCLASS(fl.fl4_dst) || ZERONET(fl.fl4_dst)) - goto e_inval; - - if (dev_out->flags & IFF_LOOPBACK) - flags |= RTCF_LOCAL; - - in_dev = in_dev_get(dev_out); - if (!in_dev) - goto e_inval; - - if (res.type == RTN_BROADCAST) { - flags |= RTCF_BROADCAST | RTCF_LOCAL; - if (res.fi) { - fib_info_put(res.fi); - res.fi = NULL; - } - } else if (res.type == RTN_MULTICAST) { - flags |= RTCF_MULTICAST|RTCF_LOCAL; - if (!ip_check_mc(in_dev, oldflp->fl4_dst, oldflp->fl4_src, oldflp->proto)) - flags &= ~RTCF_LOCAL; - /* If multicast route do not exist use - default one, but do not gateway in this case. - Yes, it is hack. 
- */ - if (res.fi && res.prefixlen < 4) { - fib_info_put(res.fi); - res.fi = NULL; - } - } - - rth = dst_alloc(&ipv4_dst_ops); - if (!rth) - goto e_nobufs; - - atomic_set(&rth->u.dst.__refcnt, 1); - rth->u.dst.flags= DST_HOST; - if (in_dev->cnf.no_xfrm) - rth->u.dst.flags |= DST_NOXFRM; - if (in_dev->cnf.no_policy) - rth->u.dst.flags |= DST_NOPOLICY; - rth->fl.fl4_dst = oldflp->fl4_dst; - rth->fl.fl4_tos = tos; - rth->fl.fl4_src = oldflp->fl4_src; - rth->fl.oif = oldflp->oif; -#ifdef CONFIG_IP_ROUTE_FWMARK - rth->fl.fl4_fwmark= oldflp->fl4_fwmark; -#endif - rth->rt_dst = fl.fl4_dst; - rth->rt_src = fl.fl4_src; - rth->rt_iif = oldflp->oif ? : dev_out->ifindex; - rth->u.dst.dev = dev_out; - dev_hold(dev_out); - rth->idev = in_dev_get(dev_out); - rth->rt_gateway = fl.fl4_dst; - rth->rt_spec_dst= fl.fl4_src; - - rth->u.dst.output=ip_output; - - RT_CACHE_STAT_INC(out_slow_tot); - - if (flags & RTCF_LOCAL) { - rth->u.dst.input = ip_local_deliver; - rth->rt_spec_dst = fl.fl4_dst; - } - if (flags & (RTCF_BROADCAST | RTCF_MULTICAST)) { - rth->rt_spec_dst = fl.fl4_src; - if (flags & RTCF_LOCAL && !(dev_out->flags & IFF_LOOPBACK)) { - rth->u.dst.output = ip_mc_output; - RT_CACHE_STAT_INC(out_slow_mc); - } -#ifdef CONFIG_IP_MROUTE - if (res.type == RTN_MULTICAST) { - if (IN_DEV_MFORWARD(in_dev) && - !LOCAL_MCAST(oldflp->fl4_dst)) { - rth->u.dst.input = ip_mr_input; - rth->u.dst.output = ip_mc_output; - } - } -#endif - } - - rt_set_nexthop(rth, &res, 0); - - - rth->rt_flags = flags; - - hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos); - err = rt_intern_hash(hash, rth, rp); -done: if (free_res) fib_res_put(&res); if (dev_out) dev_put(dev_out); - if (in_dev) - in_dev_put(in_dev); out: return err; - -e_inval: - err = -EINVAL; - goto done; -e_nobufs: - err = -ENOBUFS; - goto done; } int __ip_route_output_key(struct rtable **rp, const struct flowi *flp) From lkml@einar-lueck.de Mon Dec 20 03:16:15 2004 Received: with ECARTIS (v1.0.0; list netdev); 
Mon, 20 Dec 2004 03:16:24 -0800 (PST) Received: from smtprelay03.ispgateway.de (smtprelay03.ispgateway.de [80.67.18.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKBFrKE012921 for ; Mon, 20 Dec 2004 03:16:14 -0800 Received: (qmail 26016 invoked from network); 20 Dec 2004 11:13:39 -0000 Received: from unknown (HELO [192.168.30.10]) (008508@[217.231.189.59]) (envelope-sender ) by smtprelay03.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 20 Dec 2004 11:13:39 -0000 Message-ID: <41C6B3E8.60401@einar-lueck.de> Date: Mon, 20 Dec 2004 12:13:44 +0100 From: =?ISO-8859-1?Q?Einar_L=FCck?= User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: [PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12912 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lkml@einar-lueck.de Precedence: bulk X-list: netdev

[PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3
From: Einar Lueck

This patch is an approach towards solving some problems with the current multipath implementation:
* the routing cache allows only one route to be cached for a certain search key
* a multipath/load balancing decision is only made in case a corresponding route is not yet cached

In the scenarios that are relevant to us (high amount of outgoing connection requests), this is a serious problem.
Our approach:
* a new flag (DST_BALANCED) in the flags field of "struct dst_entry" to indicate that further routes having the same key follow in the routing cache chain
* at cache lookup time: we recognize this flag and may apply different load balancing policies to the available routes having the same key
* Garbage Collection: modified in a way that ensures that all routes having the same target are removed together from the cache

Motivation for the approach:
* keep the overall routing cache entry size constant
* separate load balancing state from the overall routing cache state as well as possible while keeping all routes in the same place
* keep changes to the overall routing code/behaviour at a minimum

We implemented the following routing policies:
* _weighted_ random routing policy utilizing the weights configurable via "ip route", which allows many other policies (e.g. round robin) to be approximated
* random
* round-robin
* interface round robin policy that selects among relevant routes in a way that tries to ensure an overall round robin among the available _interfaces_

Applied tests:
* all policies functionally tested with up to 3 NICs (scripted iperf and ifconfig)
* garbage collection within the routing cache functionally tested
* reference counting on devices functionally tested
* Platforms: s390, x86

Please consider for application!
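
For readers unfamiliar with the policies listed above, the following is a minimal userspace sketch of weighted random and plain round robin selection among routes that share one key. It is an illustration only and is not the code from this patch: struct route_candidate, select_weighted() and select_round_robin() are hypothetical names, and the random number is passed in by the caller so that the behaviour is reproducible.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one cached route alternative; not a kernel type. */
struct route_candidate {
	int ifindex;      /* outgoing interface index */
	unsigned weight;  /* relative weight, e.g. as configured via "ip route" */
};

/*
 * Weighted random policy: candidate i is chosen with probability
 * weight[i] / sum(weights). The random value r is supplied by the caller.
 * Returns the index of the chosen candidate, or -1 if there are no
 * candidates or all weights are zero.
 */
static int select_weighted(const struct route_candidate *c, size_t n,
			   uint32_t r)
{
	uint64_t total = 0;
	uint64_t point;
	size_t i;

	for (i = 0; i < n; i++)
		total += c[i].weight;
	if (total == 0)
		return -1;

	/* Map r onto [0, total) and walk the cumulative weights. */
	point = r % total;
	for (i = 0; i < n; i++) {
		if (point < c[i].weight)
			return (int)i;
		point -= c[i].weight;
	}
	return -1; /* not reached when total > 0 */
}

/*
 * Round robin policy: return the candidate at *cursor and advance the
 * cursor, wrapping around after the last candidate.
 */
static int select_round_robin(size_t n, size_t *cursor)
{
	size_t pick;

	if (n == 0)
		return -1;
	pick = *cursor % n;
	*cursor = (pick + 1) % n;
	return (int)pick;
}
```

With equal weights the weighted variant degenerates to the plain random policy, which is one sense in which it can approximate the others. Note that reducing r with a modulo introduces a slight bias when the weight total does not divide the range of r; a real implementation would also need locking around any shared cursor.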
Regards, Einar Signed-off-by: Einar Lueck diff -ruN linux-2.6.9.split/include/net/dst.h linux-2.6.9.nicbalancing/include/net/dst.h --- linux-2.6.9.split/include/net/dst.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/dst.h 2004-12-15 12:07:15.000000000 +0100 @@ -48,6 +48,7 @@ #define DST_NOXFRM 2 #define DST_NOPOLICY 4 #define DST_NOHASH 8 +#define DST_BALANCED 0x10 unsigned long lastuse; unsigned long expires; diff -ruN linux-2.6.9.split/include/net/flow.h linux-2.6.9.nicbalancing/include/net/flow.h --- linux-2.6.9.split/include/net/flow.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/flow.h 2004-12-15 12:07:15.000000000 +0100 @@ -51,6 +51,7 @@ __u8 proto; __u8 flags; +#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 union { struct { __u16 sport; diff -ruN linux-2.6.9.split/include/net/ip_fib.h linux-2.6.9.nicbalancing/include/net/ip_fib.h --- linux-2.6.9.split/include/net/ip_fib.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/ip_fib.h 2004-12-15 12:07:15.000000000 +0100 @@ -95,6 +95,10 @@ unsigned char nh_sel; unsigned char type; unsigned char scope; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + __u32 network; + __u32 netmask; +#endif struct fib_info *fi; #ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_rule *r; @@ -119,6 +123,14 @@ #define FIB_RES_DEV(res) (FIB_RES_NH(res).nh_dev) #define FIB_RES_OIF(res) (FIB_RES_NH(res).nh_oif) +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +#define FIB_RES_NETWORK(res) ((res).network) +#define FIB_RES_NETMASK(res) ((res).netmask) +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +#define FIB_RES_NETWORK(res) (0) +#define FIB_RES_NETMASK(res) (0) +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + struct fib_table { unsigned char tb_id; unsigned tb_stamp; diff -ruN linux-2.6.9.split/include/net/route.h linux-2.6.9.nicbalancing/include/net/route.h --- linux-2.6.9.split/include/net/route.h 2004-12-15 12:04:24.000000000 +0100 +++ 
linux-2.6.9.nicbalancing/include/net/route.h 2004-12-15 12:07:15.000000000 +0100 @@ -46,6 +46,7 @@ #define RT_CONN_FLAGS(sk) (RT_TOS(inet_sk(sk)->tos) | sk->sk_localroute) +struct fib_nh; struct inet_peer; struct rtable { @@ -179,6 +180,9 @@ memcpy(&fl, &(*rp)->fl, sizeof(fl)); fl.fl_ip_sport = sport; fl.fl_ip_dport = dport; +#if defined(CONFIG_IP_ROUTE_MULTIPATH_CACHED) + fl.flags |= FLOWI_FLAG_MULTIPATHOLDROUTE; +#endif ip_rt_put(*rp); *rp = NULL; return ip_route_output_flow(rp, &fl, sk, 0); @@ -197,4 +201,69 @@ return rt->peer; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_flush(void); +static inline void multipath_flush(void) { + __multipath_flush(); +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +static inline void multipath_flush(void){} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh); +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { + __multipath_set_nhinfo(network, netmask, prefixlen, nh); +} +#else +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { +} +#endif + + + +#if defined(CONFIG_IP_ROUTE_MULTIPATH_RR) || defined(CONFIG_IP_ROUTE_MULTIPATH_DRR) +extern void __multipath_remove(struct rtable *rt); +static inline void multipath_remove(struct rtable *rt) { + if ( rt->u.dst.flags & DST_BALANCED ) { + __multipath_remove( rt ); + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ +static inline void multipath_remove(struct rtable *rt) {} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ + + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +extern void __multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp); +static inline int 
multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + if ( rth->u.dst.flags & DST_BALANCED ) { + __multipath_selectroute( flp, rth, rp ); + return 1; + } + else { + return 0; + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + return 0; +} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + #endif /* _ROUTE_H */ diff -ruN linux-2.6.9.split/net/ipv4/Kconfig linux-2.6.9.nicbalancing/net/ipv4/Kconfig --- linux-2.6.9.split/net/ipv4/Kconfig 2004-12-15 12:04:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Kconfig 2004-12-15 12:07:15.000000000 +0100 @@ -90,6 +90,54 @@ equal "cost" and chooses one of them in a non-deterministic fashion if a matching packet arrives. +config IP_ROUTE_MULTIPATH_CACHED + bool "IP: equal cost multipath with caching support (EXPERIMENTAL)" + depends on: IP_ROUTE_MULTIPATH + help + Normally, equal cost multipath routing is not supported by the + routing cache. If you say Y here, alternative routes are cached + and on cache lookup a route is chosen in a configurable fashion. + + If unsure, say N. + +# +# multipath policy configuration +# +choice + prompt "Multipath policy" + depends on IP_ROUTE_MULTIPATH_CACHED + default IP_ROUTE_MULTIPATH_RANDOM + +config IP_ROUTE_MULTIPATH_RR + bool "round robin (EXPERIMENTAL)" + help + Mulitpath routes are chosen according to Round Robin + +config IP_ROUTE_MULTIPATH_RANDOM + bool "random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a random fashion. Actually, + there is no weight for a route. The advantage of this policy + is that it is implemented stateless and therefore introduces only + a very small delay. +config IP_ROUTE_MULTIPATH_WRANDOM + bool "weighted random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a weighted random fashion. + The per route weights are the weights visible via ip route 2. 
As the + corresponding state management introduces some overhead routing delay + is increased. +config IP_ROUTE_MULTIPATH_DRR + bool "interface round robin (EXPERIMENTAL)" + help + Connections are distributed in a round robin fashion over the + available interfaces. This policy makes sense if the connections + should be primarily distributed on interfaces and not on routes. +endchoice +# +# END OF multipath policy configuration +# + config IP_ROUTE_VERBOSE bool "IP: verbose route monitoring" depends on IP_ADVANCED_ROUTER diff -ruN linux-2.6.9.split/net/ipv4/Makefile linux-2.6.9.nicbalancing/net/ipv4/Makefile --- linux-2.6.9.split/net/ipv4/Makefile 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Makefile 2004-12-15 12:07:15.000000000 +0100 @@ -20,6 +20,10 @@ obj-$(CONFIG_INET_IPCOMP) += ipcomp.o obj-$(CONFIG_INET_TUNNEL) += xfrm4_tunnel.o obj-$(CONFIG_IP_PNP) += ipconfig.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RR) += multipath_rr.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RANDOM) += multipath_random.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_WRANDOM) += multipath_wrandom.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_DRR) += multipath_drr.o obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_IP_VS) += ipvs/ obj-$(CONFIG_IP_TCPDIAG) += tcp_diag.o diff -ruN linux-2.6.9.split/net/ipv4/fib_hash.c linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c --- linux-2.6.9.split/net/ipv4/fib_hash.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c 2004-12-15 12:07:15.000000000 +0100 @@ -261,6 +261,7 @@ err = fib_semantic_match(&f->fn_alias, flp, res, + f->fn_key, fz->fz_mask, fz->fz_order); if (err <= 0) goto out; diff -ruN linux-2.6.9.split/net/ipv4/fib_lookup.h linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h --- linux-2.6.9.split/net/ipv4/fib_lookup.h 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h 2004-12-15 12:07:15.000000000 +0100 @@ -19,7 +19,8 @@ /* Exported by fib_semantics.c */ extern int fib_semantic_match(struct 
list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen); + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen); extern void fib_release_info(struct fib_info *); extern struct fib_info *fib_create_info(const struct rtmsg *r, struct kern_rta *rta, diff -ruN linux-2.6.9.split/net/ipv4/fib_semantics.c linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c --- linux-2.6.9.split/net/ipv4/fib_semantics.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c 2004-12-15 12:07:15.000000000 +0100 @@ -763,7 +763,8 @@ } int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen) + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen) { struct fib_alias *fa; int nh_sel = 0; @@ -827,6 +828,11 @@ res->type = fa->fa_type; res->scope = fa->fa_scope; res->fi = fa->fa_info; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + res->netmask = mask; + res->network = zone & + (0xFFFFFFFF >> (32 - prefixlen)); +#endif atomic_inc(&res->fi->fib_clntref); return 0; } diff -ruN linux-2.6.9.split/net/ipv4/multipath_drr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c --- linux-2.6.9.split/net/ipv4/multipath_drr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,292 @@ +/* + * Device round robin policy for multipath. + * + * + * Version: $Id: multipath_drr.c,v 1.1.2.1 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct multipath_device +{ + int ifi; /* interface index of device */ + atomic_t usecount; + int allocated; +}; + +#define MULTIPATH_MAX_DEVICECANDIDATES 10 + +static struct multipath_device state[MULTIPATH_MAX_DEVICECANDIDATES]; +static spinlock_t state_lock = SPIN_LOCK_UNLOCKED; +static int registered_dev_notifier = 0; +static struct rtable *last_selection = NULL; + +#define RTprint(a...) // printk(KERN_DEBUG a) + + + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +static int __inline__ __multipath_findslot(void) { + int i; + for (i=0; iifindex); + if (devidx != -1) { + state[devidx].allocated = 0; + state[devidx].ifi = 0; + atomic_set(&state[devidx].usecount, 0); + RTprint(KERN_DEBUG"%s: successfully removed device " \ + "with index %d\n",__FUNCTION__, devidx); + } + else { + RTprint(KERN_DEBUG"%s: Device not relevant for " \ + " multipath: %d\n", + __FUNCTION__, devidx); + } + + spin_unlock_bh(&state_lock); + break; + } + return NOTIFY_DONE; +} + +struct notifier_block multipath_dev_notifier = { + .notifier_call = multipath_dev_event, +}; + +void __multipath_remove(struct rtable* rt) { + if (last_selection == rt) { + last_selection = NULL; + } +} + +void __multipath_safe_inc(atomic_t* usecount) +{ + int n; + atomic_inc(usecount); + n = atomic_read(usecount); + if (n<=0) { + int i; + RTprint("%s: detected overflow, 
now ill will reset all "\ + "usecounts\n", __FUNCTION__); + + spin_lock_bh(&state_lock); + for (i=0; iflags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_selection != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", __FUNCTION__ ); + result = last_selection; + *rp = result; + return; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. determine the new candidate to be returned */ + result = NULL; + cur_min = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + int nh_ifidx = nh->u.dst.dev->ifindex; + nh->u.dst.lastuse = jiffies; + nh->u.dst.__use++; + if (result != NULL) { + continue; + } + + /* search for the output interface */ + /* this is not SMP safe, only add/remove are + SMP safe as wrong usecount updates have no big + impact */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add the interface to the array + SMP safe */ + spin_lock_bh(&state_lock); + + /* due to SMP: search again */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add entry for device */ + devidx = __multipath_findslot(); + if (devidx==-1) { + /* unlikely but possible */ + RTprint(KERN_DEBUG"%s: " \ + "out of space\n", + __FUNCTION__); + continue; + } + + state[devidx].allocated = 1; + state[devidx].ifi = nh_ifidx; + atomic_set(&state[devidx].usecount, 0); + min_usecount = 0; + RTprint(KERN_DEBUG"%s: created " \ + " for " \ + "device %d and " \ + "min_usecount " \ + " == -1\n", + __FUNCTION__, + nh_ifidx ); + } + + spin_unlock_bh(&state_lock); + } + + if (min_usecount == 0) { + /* if the device has not been used it is + the primary target */ + RTprint(KERN_DEBUG"%s: now setting " \ + "result to device %d\n", + __FUNCTION__, nh_ifidx ); + + __multipath_safe_inc(&state[devidx].usecount); + result = nh; + } + else { + int count = + atomic_read(&state[devidx].usecount); + + if (min_usecount == 
-1 || + count < min_usecount) { + cur_min = nh; + cur_min_devidx = devidx; + min_usecount = count; + + RTprint(KERN_DEBUG"%s: found " \ + "device " \ + "%d with usecount == %d\n", + __FUNCTION__, + nh_ifidx, + min_usecount); + } + } + } + } + + if (!result) { + if (cur_min) { + RTprint( KERN_DEBUG"%s: index of device in state "\ + "array: %d\n", + __FUNCTION__, cur_min_devidx ); + __multipath_safe_inc(&state[cur_min_devidx].usecount); + result = cur_min; + } + else { + RTprint( KERN_DEBUG"%s: utilized first\n", + __FUNCTION__); + result = first; + } + } + else { + RTprint(KERN_DEBUG"%s: utilize result: found device " \ + "%d with usecount == %d\n", + __FUNCTION__, result->u.dst.dev->ifindex, + min_usecount); + + } + + *rp = result; + last_selection = result; +} diff -ruN linux-2.6.9.split/net/ipv4/multipath_random.c linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c --- linux-2.6.9.split/net/ipv4/multipath_random.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,124 @@ +/* + * Random policy for multipath. + * + * + * Version: $Id: multipath_random.c,v 1.1.2.3 2004/09/21 08:42:11 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + unsigned char candidate_count = 0; + + /* count all candidate */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + ++candidate_count; + } + } + + /* choose a random candidate */ + decision = first; + if ( candidate_count > 1 ) { + unsigned char i = 0; + unsigned char candidate_no = (unsigned char) + random(candidate_count); + RTprint( "%s: randomly chosen candidate: %d (count: %d)\n", + __FUNCTION__, candidate_no, candidate_count ); + + + /* find chosen candidate and adjust GC data for all candidates + to ensure they stay in cache */ + for (rt = first; rt; rt = rt->u.rt_next) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + rt->u.dst.lastuse = jiffies; + if (i == candidate_no) { + decision = rt; + } + if (i >= candidate_count) { + break; + } + i++; + } + } + } + + decision->u.dst.__use++; + *rp = decision; +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return RANDOM_SEED % ubound; +} + diff -ruN 
linux-2.6.9.split/net/ipv4/multipath_rr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c --- linux-2.6.9.split/net/ipv4/multipath_rr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,118 @@ +/* + * Round robin policy for multipath. + * + * + * Version: $Id: multipath_rr.c,v 1.1.2.2 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +static struct rtable* last_used = NULL; + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_remove(struct rtable *rt) +{ + if (last_used==rt) { + last_used = NULL; + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, struct rtable **rp) +{ + struct rtable *nh, *result, *min_use_cand = NULL; + int min_use = -1; + + /* if necessary and possible utilize the old alternative */ + if ( ( flp->flags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_used != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", + __FUNCTION__ ); + result = last_used; + goto out; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. 
determine the new candidate to be returned */ + result = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + nh->u.dst.lastuse = jiffies; + + if (min_use == -1 || nh->u.dst.__use < min_use) { + min_use = nh->u.dst.__use; + min_use_cand = nh; + } + RTprint( KERN_CRIT"%s: found balanced entry\n", + __FUNCTION__ ); + } + } + result = min_use_cand; + if (!result) { + result = first; + } + + out: + last_used = result; + result->u.dst.__use++; + *rp = result; +} + + diff -ruN linux-2.6.9.split/net/ipv4/multipath_wrandom.c linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c --- linux-2.6.9.split/net/ipv4/multipath_wrandom.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,374 @@ +/* + * Weighted random policy for multipath. + * + * + * Version: $Id: multipath_wrandom.c,v 1.1.2.3 2004/09/22 07:51:40 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MPprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_STATE_SIZE 15 + +struct multipath_candidate { + struct multipath_candidate *next; + int power; + struct rtable *rt; +}; + +struct multipath_dest { + struct list_head list; + + const struct fib_nh *nh_info; + __u32 netmask; + __u32 network; + unsigned char prefixlen; + + struct rcu_head rcu; +}; + +struct multipath_bucket { + struct list_head head; + spinlock_t lock; +}; + +struct multipath_route { + struct list_head list; + + int oif; + __u32 gw; + struct list_head dests; + + struct rcu_head rcu; +}; + + +/* state: primarily weight per route information */ +static int multipath_state_initialized = 0; +static spinlock_t state_big_lock = SPIN_LOCK_UNLOCKED; +static struct multipath_bucket state[MULTIPATH_STATE_SIZE]; + + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + + +static unsigned char __multipath_lookup_weight(const struct flowi *fl, + const struct rtable *rt) { + const int state_idx = rt->idev->dev->ifindex % MULTIPATH_STATE_SIZE; + struct multipath_route *r; + struct multipath_route *target_route = NULL; + struct multipath_dest *d; + int weight = 1; + + /* lookup the weight information for a certain route */ + rcu_read_lock(); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == rt->rt_gateway && + r->oif == rt->idev->dev->ifindex) { + target_route = r; + break; + } + } + if (!target_route) { + /* this should not happen... 
but we are prepared */ + printk( KERN_CRIT"%s: missing state for gateway: %u and " \ + "device %d\n", __FUNCTION__, rt->rt_gateway, + rt->idev->dev->ifindex); + goto out; + } + + /* find state entry for destination */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + __u32 targetnetwork = fl->fl4_dst & + (0xFFFFFFFF >> (32 - d->prefixlen)); + + if ((targetnetwork & d->netmask) == d->network) { + weight = d->nh_info->nh_weight; + MPprint("%s: found weight %d for gateway %u\n", + __FUNCTION__, weight, rt->rt_gateway); + goto out; + } + } + + out: + rcu_read_unlock(); + return weight; +} + +static void __multipath_init_state(void) +{ + spin_lock(&state_big_lock); + + /* check again due to SMP and to prevent contention */ + if (!multipath_state_initialized) { + int i; + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + INIT_LIST_HEAD(&state[i].head); + state[i].lock = SPIN_LOCK_UNLOCKED; + } + } + + /* now mark initialized */ + multipath_state_initialized = 1; + + spin_unlock(&state_big_lock); +} + +static void inline __multipath_init(void) +{ + /* do not spinlock to reduce unnecessary contention */ + if (!multipath_state_initialized) { + __multipath_init_state(); + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + struct multipath_candidate *first_mpc = NULL; + struct multipath_candidate *mpc, *last_mpc = NULL; + int power = 0; + int last_power; + int selector; + const size_t size_mpc = sizeof(struct multipath_candidate); + + /* init state if necessary */ + __multipath_init(); + + + /* collect all candidates and identify their weights */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ((rt->u.dst.flags & DST_BALANCED) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + struct multipath_candidate* mpc = + (struct multipath_candidate*) + kmalloc(size_mpc, GFP_KERNEL); + + power += __multipath_lookup_weight(flp, rt) * 
10000; + + mpc->power = power; + mpc->rt = rt; + mpc->next = NULL; + + if (!first_mpc) + first_mpc = mpc; + else + last_mpc->next = mpc; + + last_mpc = mpc; + } + } + + /* choose a weighted random candidate */ + decision = first; + selector = random(power); + MPprint("%s: random number %d in range %d\n", __FUNCTION__, selector, + power); + last_power = 0; + + + /* select candidate, adjust GC data and cleanup local state */ + decision = first; + last_mpc = NULL; + for (mpc = first_mpc; mpc; mpc = mpc->next) { + mpc->rt->u.dst.lastuse = jiffies; + MPprint("%s: last_power = %d, selector: %d, mpc->power: %d\n", + __FUNCTION__, last_power, selector, mpc->power); + if (last_power <= selector && selector < mpc->power) { + decision = mpc->rt; + MPprint("%s: selected %u\n", __FUNCTION__, + decision->rt_gateway); + } + last_power = mpc->power; + if (last_mpc) { + kfree(last_mpc); + } + last_mpc = mpc; + } + if (last_mpc) { + /* concurrent __multipath_flush may lead to !last_mpc */ + kfree(last_mpc); + } + + decision->u.dst.__use++; + *rp = decision; +} + +void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) +{ + const int state_idx = nh->nh_oif % MULTIPATH_STATE_SIZE; + struct multipath_route *r, *target_route = NULL; + struct multipath_dest *d, *target_dest = NULL; + + /* init state if necessary */ + __multipath_init(); + + /* store the weight information for a certain route */ + spin_lock(&state[state_idx].lock); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == nh->nh_gw && r->oif == nh->nh_oif) { + target_route = r; + break; + } + } + if (!target_route) { + const size_t size_rt = sizeof(struct multipath_route); + target_route = (struct multipath_route *) + kmalloc(size_rt, GFP_KERNEL); + + target_route->gw = nh->nh_gw; + target_route->oif = nh->nh_oif; + memset(&target_route->rcu, sizeof(struct rcu_head), 0); + 
INIT_LIST_HEAD(&target_route->dests); + + list_add_rcu(&target_route->list, &state[state_idx].head); + } + + /* find state entry for destination or add one if necessary */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + if (d->nh_info == nh) { + target_dest = d; + break; + } + } + if (!target_dest) { + const size_t size_dst = sizeof(struct multipath_dest); + target_dest = (struct multipath_dest*) + kmalloc(size_dst, GFP_KERNEL); + + target_dest->nh_info = nh; + target_dest->network = network; + target_dest->netmask = netmask; + target_dest->prefixlen = prefixlen; + memset(&target_dest->rcu, sizeof(struct rcu_head), 0); + + list_add_rcu(&target_dest->list, &target_route->dests); + } + /* else: we already stored this info for another destination => + we are finished */ + + spin_unlock(&state[state_idx].lock); +} + + +static void __multipath_free(struct rcu_head *head) +{ + struct multipath_route *rt = container_of(head, struct multipath_route, + rcu); + kfree(rt); +} + +static void __multipath_free_dst(struct rcu_head *head) +{ + struct multipath_dest *dst = container_of(head, + struct multipath_dest, + rcu); + kfree(dst); +} + +void __multipath_flush() +{ + int i; + + MPprint("%s: called\n", __FUNCTION__); + + /* init state if necessary */ + __multipath_init(); + + /* defere delete to all entries */ + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + struct multipath_route *r; + spin_lock(&state[i].lock); + + list_for_each_entry_rcu(r, &state[i].head, list) { + struct multipath_dest *d; + list_for_each_entry_rcu(d, &r->dests, list) { + list_del_rcu(&d->list); + call_rcu(&d->rcu, + __multipath_free_dst); + + } + list_del_rcu(&r->list); + call_rcu(&r->rcu, + __multipath_free); + } + + spin_unlock(&state[i].lock); + } + + MPprint("%s: finished\n", __FUNCTION__); +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return 
RANDOM_SEED % ubound; +} diff -ruN linux-2.6.9.split/net/ipv4/route.c linux-2.6.9.nicbalancing/net/ipv4/route.c --- linux-2.6.9.split/net/ipv4/route.c 2004-12-15 12:05:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/route.c 2004-12-15 12:07:15.000000000 +0100 @@ -129,7 +129,7 @@ int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; -#define RTprint(a...) printk(KERN_DEBUG a) +#define RTprint(a...) // printk(KERN_DEBUG a) static struct timer_list rt_flush_timer; static struct timer_list rt_periodic_timer; @@ -450,11 +450,13 @@ static __inline__ void rt_free(struct rtable *rt) { + multipath_remove( rt ); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } static __inline__ void rt_drop(struct rtable *rt) { + multipath_remove( rt ); ip_rt_put(rt); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } @@ -516,6 +518,54 @@ return score; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +static struct rtable **rt_remove_balanced_route(struct rtable **chain_head, + struct rtable *expentry, + int* removed_count) +{ + int passedexpired = 0; + struct rtable **nextstep = NULL; + struct rtable **rthp = chain_head; + struct rtable *rth; + if (removed_count) + *removed_count = 0; + while ((rth = *rthp) != NULL) { + if ( rth == expentry ) { + passedexpired = 1; + } + + if (((*rthp)->u.dst.flags & DST_BALANCED) != 0 && + compare_keys(&(*rthp)->fl, &expentry->fl)) { + if (*rthp == expentry) { + *rthp = rth->u.rt_next; + continue; + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + if (removed_count) + ++(*removed_count); + } + } + else { + if ( !((*rthp)->u.dst.flags & DST_BALANCED) && + passedexpired && !nextstep ) { + nextstep = &rth->u.rt_next; + } + rthp = &rth->u.rt_next; + } + } + + rt_free(expentry); + if (removed_count) + ++(*removed_count); + + return nextstep; +} + +#endif + + /* This runs via a timer and thus is always in BH context. */ static void rt_check_expire(unsigned long dummy) { @@ -547,8 +597,24 @@ } /* Cleanup aged off entries. 
*/ - *rthp = rth->u.rt_next; - rt_free(rth); +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, NULL); + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + *rthp = rth->u.rt_next; + rt_free(rth); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock(&rt_hash_table[i].lock); @@ -596,6 +662,9 @@ if (delay < 0) delay = ip_rt_min_delay; + /* flush existing multipath state*/ + multipath_flush(); + spin_lock_bh(&rt_flush_lock); if (del_timer(&rt_flush_timer) && delay > 0 && rt_deadline) { @@ -714,9 +783,29 @@ rthp = &rth->u.rt_next; continue; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + int r; + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, + &r); + goal -= r; + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + goal--; + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ *rthp = rth->u.rt_next; rt_free(rth); goal--; +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock_bh(&rt_hash_table[k].lock); if (goal <= 0) @@ -797,7 +886,12 @@ spin_lock_bh(&rt_hash_table[hash].lock); while ((rth = *rthp) != NULL) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (!(rth->u.dst.flags & DST_BALANCED) && + compare_keys(&rth->fl, &rt->fl)) { +#else if (compare_keys(&rth->fl, &rt->fl)) { +#endif /* Put it first */ *rthp = rth->u.rt_next; /* @@ -1628,6 +1722,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if ( res->fi->fib_nhs > 1 ) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_policy) rth->u.dst.flags |= DST_NOPOLICY; if (in_dev->cnf.no_xfrm) @@ -1675,7 +1773,7 @@ unsigned hash; #ifdef CONFIG_IP_ROUTE_MULTIPATH - if (res->fi->fib_nhs > 1 && 
fl->oif == 0) + if (res->fi && res->fi->fib_nhs > 1 && fl->oif == 0) fib_select_multipath(fl, res); #endif @@ -1696,7 +1794,65 @@ struct in_device *in_dev, u32 daddr, u32 saddr, u32 tos) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + struct rtable* rth; + unsigned char hop, hopcount, lasthop; + int err = -EINVAL; + unsigned hash; + if (res->fi) { + hopcount = res->fi->fib_nhs; + } + else { + hopcount = 1; + } + lasthop = hopcount - 1; + + /* distinguish between multipath and singlepath */ + if ( hopcount < 2 ) + return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, + saddr, tos); + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", __FUNCTION__, + hopcount); + + /* add all alternatives to the routing cache */ + for ( hop = 0; hop < hopcount; ++hop ) { + res->nh_sel = hop; + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", + __FUNCTION__, hopcount); + + /* create a routing cache entry */ + err = __mkroute_input( skb, res, in_dev, daddr, saddr, tos, + &rth ); + if ( err ) + return err; + + + /* put it into the cache */ + hash = rt_hash_code(daddr, saddr ^ (fl->iif << 5), tos); + err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + if ( err ) + return err; + + /* forward hop information to multipath impl. 
*/ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + + + /* only for the last hop the reference count is handled + outside */ + RTprint( KERN_DEBUG"%s: balanced entry created: %d\n", + __FUNCTION__, rth ); + if ( hop == lasthop ) + atomic_set(&(skb->dst->__refcnt), 1); + } + return err; +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, saddr, tos); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } @@ -2017,6 +2173,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (res->fi && res->fi->fib_nhs > 1) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_xfrm) rth->u.dst.flags |= DST_NOXFRM; if (in_dev->cnf.no_policy) @@ -2108,7 +2268,77 @@ struct net_device *dev_out, unsigned flags) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + u32 tos = RT_FL_TOS(oldflp); + unsigned char hop; + unsigned hash; + int err = -EINVAL; + struct rtable* rth; + + if (res->fi && res->fi->fib_nhs > 1) { + unsigned char hopcount = res->fi->fib_nhs; + RTprint( KERN_DEBUG"%s: entered (hopcount: %d, fl->oif: %d)\n", + __FUNCTION__, hopcount, fl->oif); + + for ( hop = 0; hop < hopcount; ++hop ) { + struct net_device *dev2nexthop; + RTprint( KERN_DEBUG"%s: hop %d of %d\n", __FUNCTION__, + hop, hopcount ); + + res->nh_sel = hop; + + /* hold a work reference to the output device */ + dev2nexthop = FIB_RES_DEV(*res); + dev_hold(dev2nexthop); + + err = __mkroute_output(&rth, res, fl, oldflp, + dev2nexthop, flags); + + /** FIXME remove debug code */ + RTprint( "%s: balanced entry created: %d " \ + " (GW: %u)\n", + __FUNCTION__, + &rth->u.dst, + FIB_RES_GW(*res) ); + + if ( err != 0 ) { + goto cleanup; + } + + RTprint( KERN_DEBUG"%s: created successfully %d\n", + __FUNCTION__, hop ); + + hash = rt_hash_code(oldflp->fl4_dst, + oldflp->fl4_src ^ + (oldflp->oif << 5), tos); + err = rt_intern_hash(hash, rth, rp); + RTprint( KERN_DEBUG"%s: hashed %d\n", + 
__FUNCTION__, hop ); + + /* forward hop information to multipath impl. */ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + cleanup: + /* release work reference to output device */ + dev_put(dev2nexthop); + + if ( err != 0 ) { + return err; + } + } + RTprint( "%s: exited loop\n", __FUNCTION__ ); + atomic_set(&(*rp)->u.dst.__refcnt, 1); + return err; + } + else { + return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, + flags); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, flags); +#endif } /* @@ -2137,6 +2367,7 @@ int free_res = 0; int err; + res.fi = NULL; #ifdef CONFIG_IP_MULTIPLE_TABLES res.r = NULL; @@ -2186,6 +2417,8 @@ dev_put(dev_out); dev_out = NULL; } + + if (oldflp->oif) { dev_out = dev_get_by_index(oldflp->oif); err = -ENODEV; @@ -2292,9 +2525,11 @@ dev_hold(dev_out); fl.oif = dev_out->ifindex; + make_route: err = ip_mkroute_output(rp, &res, &fl, oldflp, dev_out, flags); + if (free_res) fib_res_put(&res); if (dev_out) @@ -2321,6 +2556,15 @@ #endif !((rth->fl.fl4_tos ^ flp->fl4_tos) & (IPTOS_RT_MASK | RTO_ONLINK))) { + /* check for multipath routes and choose one if + necessary */ + if (multipath_selectroute(flp, rth, rp)) { + dst_hold(&(*rp)->u.dst); + RT_CACHE_STAT_INC(out_hit); + rcu_read_unlock_bh(); + return 0; + } + rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; From lkml@einar-lueck.de Mon Dec 20 03:18:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 03:18:52 -0800 (PST) Received: from smtprelay03.ispgateway.de (smtprelay03.ispgateway.de [80.67.18.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKBIM1n013472 for ; Mon, 20 Dec 2004 03:18:42 -0800 Received: (qmail 26469 invoked from network); 20 Dec 2004 11:15:39 -0000 Received: from unknown (HELO [192.168.30.10]) (008508@[217.231.189.59]) (envelope-sender ) by smtprelay03.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 
20 Dec 2004 11:15:39 -0000
Message-ID: <41C6B45F.9050403@einar-lueck.de>
Date: Mon, 20 Dec 2004 12:15:43 +0100
From: =?ISO-8859-1?Q?Einar_L=FCck?=
User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: netdev@oss.sgi.com, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12913
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: lkml@einar-lueck.de
Precedence: bulk
X-list: netdev

[PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3
From: Einar Lueck

This patch is an approach towards solving some problems with the current
multipath implementation:
* the routing cache allows only one route to be cached for a given
  search key
* a multipath/load balancing decision is made only when a corresponding
  route is not yet cached

In the scenarios that are relevant to us (a high number of outgoing
connection requests), this is a serious problem.

Our approach:
* a new flag in the flags field of "struct dst_entry" indicates that
  further routes with the same key follow in the routing cache chain
* at cache lookup time, we recognize this flag and may apply different
  load balancing policies to the available routes with the same key
* garbage collection is modified to ensure that all routes with the
  same target are removed from the cache together

Motivation for the approach:
* keep the overall routing cache entry size constant
* separate load balancing state from the overall routing cache state
  as well as possible while keeping all routes in the same place
* keep changes to the overall routing code/behaviour to a minimum
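The lookup-time selection described above can be sketched in user-space C.
This is only a simplified illustration under stated assumptions, not code
taken from the patch: struct cache_entry, ENTRY_BALANCED and select_route()
are hypothetical stand-ins for struct rtable, DST_BALANCED and the policy
hook, and the search key is reduced to a single integer. The selection rule
shown (pick the least-used entry among those sharing the key) roughly
mirrors the round-robin policy.

```c
#include <stddef.h>

/* Hypothetical stand-in for the DST_BALANCED flag introduced by the patch. */
#define ENTRY_BALANCED 0x10

/* Simplified model of one routing cache entry in a hash chain. */
struct cache_entry {
    unsigned int key;          /* simplified search key (assumption) */
    unsigned int flags;        /* ENTRY_BALANCED if alternatives follow */
    unsigned long use;         /* how often this entry has been selected */
    int oif;                   /* output interface index */
    struct cache_entry *next;  /* next entry in the same chain */
};

/*
 * Walk the chain starting at 'first' and return the least-used entry that
 * is marked balanced and shares the key of 'first'. If no balanced entry
 * matches, fall back to 'first'. The use count of the result is bumped so
 * that repeated calls rotate over the alternatives.
 */
struct cache_entry *select_route(struct cache_entry *first)
{
    struct cache_entry *e, *best = NULL;

    for (e = first; e != NULL; e = e->next) {
        if (!(e->flags & ENTRY_BALANCED) || e->key != first->key)
            continue;
        if (best == NULL || e->use < best->use)
            best = e;
    }
    if (best == NULL)
        best = first;

    best->use++;
    return best;
}
```

With three balanced entries sharing one key, successive calls return the
first, second and third entry in turn and then start over; an entry that
is not marked balanced is simply returned as-is.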
We implemented the following routing policies:
* _weighted_ random routing policy that uses the weights configurable via "ip route" and makes it possible to approximate many other policies (e.g. round robin)
* random
* round-robin
* interface round robin policy that selects among the relevant routes in a way that tries to achieve an overall round robin among the available _interfaces_

Applied tests:
* all policies functionally tested with up to 3 NICs (scripted iperf and ifconfig)
* garbage collection within the routing cache functionally tested
* reference counting on devices functionally tested
* Platforms: s390, x86

Please consider for application!

Regards,
Einar

Signed-off-by: Einar Lueck

diff -ruN linux-2.6.9.split/include/net/dst.h linux-2.6.9.nicbalancing/include/net/dst.h --- linux-2.6.9.split/include/net/dst.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/dst.h 2004-12-15 12:07:15.000000000 +0100 @@ -48,6 +48,7 @@ #define DST_NOXFRM 2 #define DST_NOPOLICY 4 #define DST_NOHASH 8 +#define DST_BALANCED 0x10 unsigned long lastuse; unsigned long expires; diff -ruN linux-2.6.9.split/include/net/flow.h linux-2.6.9.nicbalancing/include/net/flow.h --- linux-2.6.9.split/include/net/flow.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/flow.h 2004-12-15 12:07:15.000000000 +0100 @@ -51,6 +51,7 @@ __u8 proto; __u8 flags; +#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 union { struct { __u16 sport; diff -ruN linux-2.6.9.split/include/net/ip_fib.h linux-2.6.9.nicbalancing/include/net/ip_fib.h --- linux-2.6.9.split/include/net/ip_fib.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/ip_fib.h 2004-12-15 12:07:15.000000000 +0100 @@ -95,6 +95,10 @@ unsigned char nh_sel; unsigned char type; unsigned char scope; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + __u32 network; + __u32 netmask; +#endif struct fib_info *fi; #ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_rule *r; @@ -119,6 +123,14 @@ 
#define FIB_RES_DEV(res) (FIB_RES_NH(res).nh_dev) #define FIB_RES_OIF(res) (FIB_RES_NH(res).nh_oif) +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +#define FIB_RES_NETWORK(res) ((res).network) +#define FIB_RES_NETMASK(res) ((res).netmask) +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +#define FIB_RES_NETWORK(res) (0) +#define FIB_RES_NETMASK(res) (0) +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + struct fib_table { unsigned char tb_id; unsigned tb_stamp; diff -ruN linux-2.6.9.split/include/net/route.h linux-2.6.9.nicbalancing/include/net/route.h --- linux-2.6.9.split/include/net/route.h 2004-12-15 12:04:24.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/route.h 2004-12-15 12:07:15.000000000 +0100 @@ -46,6 +46,7 @@ #define RT_CONN_FLAGS(sk) (RT_TOS(inet_sk(sk)->tos) | sk->sk_localroute) +struct fib_nh; struct inet_peer; struct rtable { @@ -179,6 +180,9 @@ memcpy(&fl, &(*rp)->fl, sizeof(fl)); fl.fl_ip_sport = sport; fl.fl_ip_dport = dport; +#if defined(CONFIG_IP_ROUTE_MULTIPATH_CACHED) + fl.flags |= FLOWI_FLAG_MULTIPATHOLDROUTE; +#endif ip_rt_put(*rp); *rp = NULL; return ip_route_output_flow(rp, &fl, sk, 0); @@ -197,4 +201,69 @@ return rt->peer; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_flush(void); +static inline void multipath_flush(void) { + __multipath_flush(); +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +static inline void multipath_flush(void){} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh); +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { + __multipath_set_nhinfo(network, netmask, prefixlen, nh); +} +#else +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { +} +#endif + + + +#if 
defined(CONFIG_IP_ROUTE_MULTIPATH_RR) || defined(CONFIG_IP_ROUTE_MULTIPATH_DRR) +extern void __multipath_remove(struct rtable *rt); +static inline void multipath_remove(struct rtable *rt) { + if ( rt->u.dst.flags & DST_BALANCED ) { + __multipath_remove( rt ); + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ +static inline void multipath_remove(struct rtable *rt) {} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ + + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +extern void __multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp); +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + if ( rth->u.dst.flags & DST_BALANCED ) { + __multipath_selectroute( flp, rth, rp ); + return 1; + } + else { + return 0; + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + return 0; +} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + #endif /* _ROUTE_H */ diff -ruN linux-2.6.9.split/net/ipv4/Kconfig linux-2.6.9.nicbalancing/net/ipv4/Kconfig --- linux-2.6.9.split/net/ipv4/Kconfig 2004-12-15 12:04:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Kconfig 2004-12-15 12:07:15.000000000 +0100 @@ -90,6 +90,54 @@ equal "cost" and chooses one of them in a non-deterministic fashion if a matching packet arrives. +config IP_ROUTE_MULTIPATH_CACHED + bool "IP: equal cost multipath with caching support (EXPERIMENTAL)" + depends on: IP_ROUTE_MULTIPATH + help + Normally, equal cost multipath routing is not supported by the + routing cache. If you say Y here, alternative routes are cached + and on cache lookup a route is chosen in a configurable fashion. + + If unsure, say N. 
+ +# +# multipath policy configuration +# +choice + prompt "Multipath policy" + depends on IP_ROUTE_MULTIPATH_CACHED + default IP_ROUTE_MULTIPATH_RANDOM + +config IP_ROUTE_MULTIPATH_RR + bool "round robin (EXPERIMENTAL)" + help + Multipath routes are chosen according to Round Robin. + +config IP_ROUTE_MULTIPATH_RANDOM + bool "random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a random fashion. Actually, + there is no weight for a route. The advantage of this policy + is that it is implemented stateless and therefore introduces only + a very small delay. +config IP_ROUTE_MULTIPATH_WRANDOM + bool "weighted random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a weighted random fashion. + The per route weights are the weights visible via ip route 2. As the + corresponding state management introduces some overhead routing delay + is increased. +config IP_ROUTE_MULTIPATH_DRR + bool "interface round robin (EXPERIMENTAL)" + help + Connections are distributed in a round robin fashion over the + available interfaces. This policy makes sense if the connections + should be primarily distributed on interfaces and not on routes. 
+endchoice +# +# END OF multipath policy configuration +# + config IP_ROUTE_VERBOSE bool "IP: verbose route monitoring" depends on IP_ADVANCED_ROUTER diff -ruN linux-2.6.9.split/net/ipv4/Makefile linux-2.6.9.nicbalancing/net/ipv4/Makefile --- linux-2.6.9.split/net/ipv4/Makefile 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Makefile 2004-12-15 12:07:15.000000000 +0100 @@ -20,6 +20,10 @@ obj-$(CONFIG_INET_IPCOMP) += ipcomp.o obj-$(CONFIG_INET_TUNNEL) += xfrm4_tunnel.o obj-$(CONFIG_IP_PNP) += ipconfig.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RR) += multipath_rr.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RANDOM) += multipath_random.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_WRANDOM) += multipath_wrandom.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_DRR) += multipath_drr.o obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_IP_VS) += ipvs/ obj-$(CONFIG_IP_TCPDIAG) += tcp_diag.o diff -ruN linux-2.6.9.split/net/ipv4/fib_hash.c linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c --- linux-2.6.9.split/net/ipv4/fib_hash.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c 2004-12-15 12:07:15.000000000 +0100 @@ -261,6 +261,7 @@ err = fib_semantic_match(&f->fn_alias, flp, res, + f->fn_key, fz->fz_mask, fz->fz_order); if (err <= 0) goto out; diff -ruN linux-2.6.9.split/net/ipv4/fib_lookup.h linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h --- linux-2.6.9.split/net/ipv4/fib_lookup.h 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h 2004-12-15 12:07:15.000000000 +0100 @@ -19,7 +19,8 @@ /* Exported by fib_semantics.c */ extern int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen); + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen); extern void fib_release_info(struct fib_info *); extern struct fib_info *fib_create_info(const struct rtmsg *r, struct kern_rta *rta, diff -ruN linux-2.6.9.split/net/ipv4/fib_semantics.c 
linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c --- linux-2.6.9.split/net/ipv4/fib_semantics.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c 2004-12-15 12:07:15.000000000 +0100 @@ -763,7 +763,8 @@ } int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen) + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen) { struct fib_alias *fa; int nh_sel = 0; @@ -827,6 +828,11 @@ res->type = fa->fa_type; res->scope = fa->fa_scope; res->fi = fa->fa_info; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + res->netmask = mask; + res->network = zone & + (0xFFFFFFFF >> (32 - prefixlen)); +#endif atomic_inc(&res->fi->fib_clntref); return 0; } diff -ruN linux-2.6.9.split/net/ipv4/multipath_drr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c --- linux-2.6.9.split/net/ipv4/multipath_drr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,292 @@ +/* + * Device round robin policy for multipath. + * + * + * Version: $Id: multipath_drr.c,v 1.1.2.1 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct multipath_device +{ + int ifi; /* interface index of device */ + atomic_t usecount; + int allocated; +}; + +#define MULTIPATH_MAX_DEVICECANDIDATES 10 + +static struct multipath_device state[MULTIPATH_MAX_DEVICECANDIDATES]; +static spinlock_t state_lock = SPIN_LOCK_UNLOCKED; +static int registered_dev_notifier = 0; +static struct rtable *last_selection = NULL; + +#define RTprint(a...) // printk(KERN_DEBUG a) + + + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +static int __inline__ __multipath_findslot(void) { + int i; + for (i=0; iifindex); + if (devidx != -1) { + state[devidx].allocated = 0; + state[devidx].ifi = 0; + atomic_set(&state[devidx].usecount, 0); + RTprint(KERN_DEBUG"%s: successfully removed device " \ + "with index %d\n",__FUNCTION__, devidx); + } + else { + RTprint(KERN_DEBUG"%s: Device not relevant for " \ + " multipath: %d\n", + __FUNCTION__, devidx); + } + + spin_unlock_bh(&state_lock); + break; + } + return NOTIFY_DONE; +} + +struct notifier_block multipath_dev_notifier = { + .notifier_call = multipath_dev_event, +}; + +void __multipath_remove(struct rtable* rt) { + if (last_selection == rt) { + last_selection = NULL; + } +} + +void __multipath_safe_inc(atomic_t* usecount) +{ + int n; + atomic_inc(usecount); + n = atomic_read(usecount); + if (n<=0) { + int i; + RTprint("%s: detected overflow, 
now ill will reset all "\ + "usecounts\n", __FUNCTION__); + + spin_lock_bh(&state_lock); + for (i=0; iflags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_selection != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", __FUNCTION__ ); + result = last_selection; + *rp = result; + return; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. determine the new candidate to be returned */ + result = NULL; + cur_min = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + int nh_ifidx = nh->u.dst.dev->ifindex; + nh->u.dst.lastuse = jiffies; + nh->u.dst.__use++; + if (result != NULL) { + continue; + } + + /* search for the output interface */ + /* this is not SMP safe, only add/remove are + SMP safe as wrong usecount updates have no big + impact */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add the interface to the array + SMP safe */ + spin_lock_bh(&state_lock); + + /* due to SMP: search again */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add entry for device */ + devidx = __multipath_findslot(); + if (devidx==-1) { + /* unlikely but possible */ + RTprint(KERN_DEBUG"%s: " \ + "out of space\n", + __FUNCTION__); + continue; + } + + state[devidx].allocated = 1; + state[devidx].ifi = nh_ifidx; + atomic_set(&state[devidx].usecount, 0); + min_usecount = 0; + RTprint(KERN_DEBUG"%s: created " \ + " for " \ + "device %d and " \ + "min_usecount " \ + " == -1\n", + __FUNCTION__, + nh_ifidx ); + } + + spin_unlock_bh(&state_lock); + } + + if (min_usecount == 0) { + /* if the device has not been used it is + the primary target */ + RTprint(KERN_DEBUG"%s: now setting " \ + "result to device %d\n", + __FUNCTION__, nh_ifidx ); + + __multipath_safe_inc(&state[devidx].usecount); + result = nh; + } + else { + int count = + atomic_read(&state[devidx].usecount); + + if (min_usecount == 
-1 || + count < min_usecount) { + cur_min = nh; + cur_min_devidx = devidx; + min_usecount = count; + + RTprint(KERN_DEBUG"%s: found " \ + "device " \ + "%d with usecount == %d\n", + __FUNCTION__, + nh_ifidx, + min_usecount); + } + } + } + } + + if (!result) { + if (cur_min) { + RTprint( KERN_DEBUG"%s: index of device in state "\ + "array: %d\n", + __FUNCTION__, cur_min_devidx ); + __multipath_safe_inc(&state[cur_min_devidx].usecount); + result = cur_min; + } + else { + RTprint( KERN_DEBUG"%s: utilized first\n", + __FUNCTION__); + result = first; + } + } + else { + RTprint(KERN_DEBUG"%s: utilize result: found device " \ + "%d with usecount == %d\n", + __FUNCTION__, result->u.dst.dev->ifindex, + min_usecount); + + } + + *rp = result; + last_selection = result; +} diff -ruN linux-2.6.9.split/net/ipv4/multipath_random.c linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c --- linux-2.6.9.split/net/ipv4/multipath_random.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,124 @@ +/* + * Random policy for multipath. + * + * + * Version: $Id: multipath_random.c,v 1.1.2.3 2004/09/21 08:42:11 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + unsigned char candidate_count = 0; + + /* count all candidate */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + ++candidate_count; + } + } + + /* choose a random candidate */ + decision = first; + if ( candidate_count > 1 ) { + unsigned char i = 0; + unsigned char candidate_no = (unsigned char) + random(candidate_count); + RTprint( "%s: randomly chosen candidate: %d (count: %d)\n", + __FUNCTION__, candidate_no, candidate_count ); + + + /* find chosen candidate and adjust GC data for all candidates + to ensure they stay in cache */ + for (rt = first; rt; rt = rt->u.rt_next) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + rt->u.dst.lastuse = jiffies; + if (i == candidate_no) { + decision = rt; + } + if (i >= candidate_count) { + break; + } + i++; + } + } + } + + decision->u.dst.__use++; + *rp = decision; +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return RANDOM_SEED % ubound; +} + diff -ruN 
linux-2.6.9.split/net/ipv4/multipath_rr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c --- linux-2.6.9.split/net/ipv4/multipath_rr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,118 @@ +/* + * Round robin policy for multipath. + * + * + * Version: $Id: multipath_rr.c,v 1.1.2.2 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +static struct rtable* last_used = NULL; + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_remove(struct rtable *rt) +{ + if (last_used==rt) { + last_used = NULL; + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, struct rtable **rp) +{ + struct rtable *nh, *result, *min_use_cand = NULL; + int min_use = -1; + + /* if necessary and possible utilize the old alternative */ + if ( ( flp->flags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_used != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", + __FUNCTION__ ); + result = last_used; + goto out; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. 
determine the new candidate to be returned */ + result = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + nh->u.dst.lastuse = jiffies; + + if (min_use == -1 || nh->u.dst.__use < min_use) { + min_use = nh->u.dst.__use; + min_use_cand = nh; + } + RTprint( KERN_CRIT"%s: found balanced entry\n", + __FUNCTION__ ); + } + } + result = min_use_cand; + if (!result) { + result = first; + } + + out: + last_used = result; + result->u.dst.__use++; + *rp = result; +} + + diff -ruN linux-2.6.9.split/net/ipv4/multipath_wrandom.c linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c --- linux-2.6.9.split/net/ipv4/multipath_wrandom.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,374 @@ +/* + * Weighted random policy for multipath. + * + * + * Version: $Id: multipath_wrandom.c,v 1.1.2.3 2004/09/22 07:51:40 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MPprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_STATE_SIZE 15 + +struct multipath_candidate { + struct multipath_candidate *next; + int power; + struct rtable *rt; +}; + +struct multipath_dest { + struct list_head list; + + const struct fib_nh *nh_info; + __u32 netmask; + __u32 network; + unsigned char prefixlen; + + struct rcu_head rcu; +}; + +struct multipath_bucket { + struct list_head head; + spinlock_t lock; +}; + +struct multipath_route { + struct list_head list; + + int oif; + __u32 gw; + struct list_head dests; + + struct rcu_head rcu; +}; + + +/* state: primarily weight per route information */ +static int multipath_state_initialized = 0; +static spinlock_t state_big_lock = SPIN_LOCK_UNLOCKED; +static struct multipath_bucket state[MULTIPATH_STATE_SIZE]; + + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + + +static unsigned char __multipath_lookup_weight(const struct flowi *fl, + const struct rtable *rt) { + const int state_idx = rt->idev->dev->ifindex % MULTIPATH_STATE_SIZE; + struct multipath_route *r; + struct multipath_route *target_route = NULL; + struct multipath_dest *d; + int weight = 1; + + /* lookup the weight information for a certain route */ + rcu_read_lock(); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == rt->rt_gateway && + r->oif == rt->idev->dev->ifindex) { + target_route = r; + break; + } + } + if (!target_route) { + /* this should not happen... 
but we are prepared */ + printk( KERN_CRIT"%s: missing state for gateway: %u and " \ + "device %d\n", __FUNCTION__, rt->rt_gateway, + rt->idev->dev->ifindex); + goto out; + } + + /* find state entry for destination */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + __u32 targetnetwork = fl->fl4_dst & + (0xFFFFFFFF >> (32 - d->prefixlen)); + + if ((targetnetwork & d->netmask) == d->network) { + weight = d->nh_info->nh_weight; + MPprint("%s: found weight %d for gateway %u\n", + __FUNCTION__, weight, rt->rt_gateway); + goto out; + } + } + + out: + rcu_read_unlock(); + return weight; +} + +static void __multipath_init_state(void) +{ + spin_lock(&state_big_lock); + + /* check again due to SMP and to prevent contention */ + if (!multipath_state_initialized) { + int i; + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + INIT_LIST_HEAD(&state[i].head); + state[i].lock = SPIN_LOCK_UNLOCKED; + } + } + + /* now mark initialized */ + multipath_state_initialized = 1; + + spin_unlock(&state_big_lock); +} + +static void inline __multipath_init(void) +{ + /* do not spinlock to reduce unnecessary contention */ + if (!multipath_state_initialized) { + __multipath_init_state(); + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + struct multipath_candidate *first_mpc = NULL; + struct multipath_candidate *mpc, *last_mpc = NULL; + int power = 0; + int last_power; + int selector; + const size_t size_mpc = sizeof(struct multipath_candidate); + + /* init state if necessary */ + __multipath_init(); + + + /* collect all candidates and identify their weights */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ((rt->u.dst.flags & DST_BALANCED) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + struct multipath_candidate* mpc = + (struct multipath_candidate*) + kmalloc(size_mpc, GFP_KERNEL); + + power += __multipath_lookup_weight(flp, rt) * 
10000; + + mpc->power = power; + mpc->rt = rt; + mpc->next = NULL; + + if (!first_mpc) + first_mpc = mpc; + else + last_mpc->next = mpc; + + last_mpc = mpc; + } + } + + /* choose a weighted random candidate */ + decision = first; + selector = random(power); + MPprint("%s: random number %d in range %d\n", __FUNCTION__, selector, + power); + last_power = 0; + + + /* select candidate, adjust GC data and cleanup local state */ + decision = first; + last_mpc = NULL; + for (mpc = first_mpc; mpc; mpc = mpc->next) { + mpc->rt->u.dst.lastuse = jiffies; + MPprint("%s: last_power = %d, selector: %d, mpc->power: %d\n", + __FUNCTION__, last_power, selector, mpc->power); + if (last_power <= selector && selector < mpc->power) { + decision = mpc->rt; + MPprint("%s: selected %u\n", __FUNCTION__, + decision->rt_gateway); + } + last_power = mpc->power; + if (last_mpc) { + kfree(last_mpc); + } + last_mpc = mpc; + } + if (last_mpc) { + /* concurrent __multipath_flush may lead to !last_mpc */ + kfree(last_mpc); + } + + decision->u.dst.__use++; + *rp = decision; +} + +void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) +{ + const int state_idx = nh->nh_oif % MULTIPATH_STATE_SIZE; + struct multipath_route *r, *target_route = NULL; + struct multipath_dest *d, *target_dest = NULL; + + /* init state if necessary */ + __multipath_init(); + + /* store the weight information for a certain route */ + spin_lock(&state[state_idx].lock); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == nh->nh_gw && r->oif == nh->nh_oif) { + target_route = r; + break; + } + } + if (!target_route) { + const size_t size_rt = sizeof(struct multipath_route); + target_route = (struct multipath_route *) + kmalloc(size_rt, GFP_KERNEL); + + target_route->gw = nh->nh_gw; + target_route->oif = nh->nh_oif; + memset(&target_route->rcu, 0, sizeof(struct rcu_head)); + 
INIT_LIST_HEAD(&target_route->dests); + + list_add_rcu(&target_route->list, &state[state_idx].head); + } + + /* find state entry for destination or add one if necessary */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + if (d->nh_info == nh) { + target_dest = d; + break; + } + } + if (!target_dest) { + const size_t size_dst = sizeof(struct multipath_dest); + target_dest = (struct multipath_dest*) + kmalloc(size_dst, GFP_KERNEL); + + target_dest->nh_info = nh; + target_dest->network = network; + target_dest->netmask = netmask; + target_dest->prefixlen = prefixlen; + memset(&target_dest->rcu, 0, sizeof(struct rcu_head)); + + list_add_rcu(&target_dest->list, &target_route->dests); + } + /* else: we already stored this info for another destination => + we are finished */ + + spin_unlock(&state[state_idx].lock); +} + + +static void __multipath_free(struct rcu_head *head) +{ + struct multipath_route *rt = container_of(head, struct multipath_route, + rcu); + kfree(rt); +} + +static void __multipath_free_dst(struct rcu_head *head) +{ + struct multipath_dest *dst = container_of(head, + struct multipath_dest, + rcu); + kfree(dst); +} + +void __multipath_flush() +{ + int i; + + MPprint("%s: called\n", __FUNCTION__); + + /* init state if necessary */ + __multipath_init(); + + /* defer deletion of all entries */ + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + struct multipath_route *r; + spin_lock(&state[i].lock); + + list_for_each_entry_rcu(r, &state[i].head, list) { + struct multipath_dest *d; + list_for_each_entry_rcu(d, &r->dests, list) { + list_del_rcu(&d->list); + call_rcu(&d->rcu, + __multipath_free_dst); + + } + list_del_rcu(&r->list); + call_rcu(&r->rcu, + __multipath_free); + } + + spin_unlock(&state[i].lock); + } + + MPprint("%s: finished\n", __FUNCTION__); +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return 
RANDOM_SEED % ubound; +} diff -ruN linux-2.6.9.split/net/ipv4/route.c linux-2.6.9.nicbalancing/net/ipv4/route.c --- linux-2.6.9.split/net/ipv4/route.c 2004-12-15 12:05:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/route.c 2004-12-15 12:07:15.000000000 +0100 @@ -129,7 +129,7 @@ int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; -#define RTprint(a...) printk(KERN_DEBUG a) +#define RTprint(a...) // printk(KERN_DEBUG a) static struct timer_list rt_flush_timer; static struct timer_list rt_periodic_timer; @@ -450,11 +450,13 @@ static __inline__ void rt_free(struct rtable *rt) { + multipath_remove( rt ); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } static __inline__ void rt_drop(struct rtable *rt) { + multipath_remove( rt ); ip_rt_put(rt); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } @@ -516,6 +518,54 @@ return score; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +static struct rtable **rt_remove_balanced_route(struct rtable **chain_head, + struct rtable *expentry, + int* removed_count) +{ + int passedexpired = 0; + struct rtable **nextstep = NULL; + struct rtable **rthp = chain_head; + struct rtable *rth; + if (removed_count) + *removed_count = 0; + while ((rth = *rthp) != NULL) { + if ( rth == expentry ) { + passedexpired = 1; + } + + if (((*rthp)->u.dst.flags & DST_BALANCED) != 0 && + compare_keys(&(*rthp)->fl, &expentry->fl)) { + if (*rthp == expentry) { + *rthp = rth->u.rt_next; + continue; + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + if (removed_count) + ++(*removed_count); + } + } + else { + if ( !((*rthp)->u.dst.flags & DST_BALANCED) && + passedexpired && !nextstep ) { + nextstep = &rth->u.rt_next; + } + rthp = &rth->u.rt_next; + } + } + + rt_free(expentry); + if (removed_count) + ++(*removed_count); + + return nextstep; +} + +#endif + + /* This runs via a timer and thus is always in BH context. */ static void rt_check_expire(unsigned long dummy) { @@ -547,8 +597,24 @@ } /* Cleanup aged off entries. 
*/ - *rthp = rth->u.rt_next; - rt_free(rth); +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, NULL); + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + *rthp = rth->u.rt_next; + rt_free(rth); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock(&rt_hash_table[i].lock); @@ -596,6 +662,9 @@ if (delay < 0) delay = ip_rt_min_delay; + /* flush existing multipath state*/ + multipath_flush(); + spin_lock_bh(&rt_flush_lock); if (del_timer(&rt_flush_timer) && delay > 0 && rt_deadline) { @@ -714,9 +783,29 @@ rthp = &rth->u.rt_next; continue; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + int r; + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, + &r); + goal -= r; + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + goal--; + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ *rthp = rth->u.rt_next; rt_free(rth); goal--; +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock_bh(&rt_hash_table[k].lock); if (goal <= 0) @@ -797,7 +886,12 @@ spin_lock_bh(&rt_hash_table[hash].lock); while ((rth = *rthp) != NULL) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (!(rth->u.dst.flags & DST_BALANCED) && + compare_keys(&rth->fl, &rt->fl)) { +#else if (compare_keys(&rth->fl, &rt->fl)) { +#endif /* Put it first */ *rthp = rth->u.rt_next; /* @@ -1628,6 +1722,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if ( res->fi->fib_nhs > 1 ) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_policy) rth->u.dst.flags |= DST_NOPOLICY; if (in_dev->cnf.no_xfrm) @@ -1675,7 +1773,7 @@ unsigned hash; #ifdef CONFIG_IP_ROUTE_MULTIPATH - if (res->fi->fib_nhs > 1 && 
fl->oif == 0) + if (res->fi && res->fi->fib_nhs > 1 && fl->oif == 0) fib_select_multipath(fl, res); #endif @@ -1696,7 +1794,65 @@ struct in_device *in_dev, u32 daddr, u32 saddr, u32 tos) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + struct rtable* rth; + unsigned char hop, hopcount, lasthop; + int err = -EINVAL; + unsigned hash; + if (res->fi) { + hopcount = res->fi->fib_nhs; + } + else { + hopcount = 1; + } + lasthop = hopcount - 1; + + /* distinguish between multipath and singlepath */ + if ( hopcount < 2 ) + return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, + saddr, tos); + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", __FUNCTION__, + hopcount); + + /* add all alternatives to the routing cache */ + for ( hop = 0; hop < hopcount; ++hop ) { + res->nh_sel = hop; + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", + __FUNCTION__, hopcount); + + /* create a routing cache entry */ + err = __mkroute_input( skb, res, in_dev, daddr, saddr, tos, + &rth ); + if ( err ) + return err; + + + /* put it into the cache */ + hash = rt_hash_code(daddr, saddr ^ (fl->iif << 5), tos); + err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + if ( err ) + return err; + + /* forward hop information to multipath impl. 
*/ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + + + /* only for the last hop the reference count is handled + outside */ + RTprint( KERN_DEBUG"%s: balanced entry created: %d\n", + __FUNCTION__, rth ); + if ( hop == lasthop ) + atomic_set(&(skb->dst->__refcnt), 1); + } + return err; +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, saddr, tos); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } @@ -2017,6 +2173,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (res->fi && res->fi->fib_nhs > 1) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_xfrm) rth->u.dst.flags |= DST_NOXFRM; if (in_dev->cnf.no_policy) @@ -2108,7 +2268,77 @@ struct net_device *dev_out, unsigned flags) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + u32 tos = RT_FL_TOS(oldflp); + unsigned char hop; + unsigned hash; + int err = -EINVAL; + struct rtable* rth; + + if (res->fi && res->fi->fib_nhs > 1) { + unsigned char hopcount = res->fi->fib_nhs; + RTprint( KERN_DEBUG"%s: entered (hopcount: %d, fl->oif: %d)\n", + __FUNCTION__, hopcount, fl->oif); + + for ( hop = 0; hop < hopcount; ++hop ) { + struct net_device *dev2nexthop; + RTprint( KERN_DEBUG"%s: hop %d of %d\n", __FUNCTION__, + hop, hopcount ); + + res->nh_sel = hop; + + /* hold a work reference to the output device */ + dev2nexthop = FIB_RES_DEV(*res); + dev_hold(dev2nexthop); + + err = __mkroute_output(&rth, res, fl, oldflp, + dev2nexthop, flags); + + /** FIXME remove debug code */ + RTprint( "%s: balanced entry created: %d " \ + " (GW: %u)\n", + __FUNCTION__, + &rth->u.dst, + FIB_RES_GW(*res) ); + + if ( err != 0 ) { + goto cleanup; + } + + RTprint( KERN_DEBUG"%s: created successfully %d\n", + __FUNCTION__, hop ); + + hash = rt_hash_code(oldflp->fl4_dst, + oldflp->fl4_src ^ + (oldflp->oif << 5), tos); + err = rt_intern_hash(hash, rth, rp); + RTprint( KERN_DEBUG"%s: hashed %d\n", + 
__FUNCTION__, hop ); + + /* forward hop information to multipath impl. */ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + cleanup: + /* release work reference to output device */ + dev_put(dev2nexthop); + + if ( err != 0 ) { + return err; + } + } + RTprint( "%s: exited loop\n", __FUNCTION__ ); + atomic_set(&(*rp)->u.dst.__refcnt, 1); + return err; + } + else { + return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, + flags); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, flags); +#endif } /* @@ -2137,6 +2367,7 @@ int free_res = 0; int err; + res.fi = NULL; #ifdef CONFIG_IP_MULTIPLE_TABLES res.r = NULL; @@ -2186,6 +2417,8 @@ dev_put(dev_out); dev_out = NULL; } + + if (oldflp->oif) { dev_out = dev_get_by_index(oldflp->oif); err = -ENODEV; @@ -2292,9 +2525,11 @@ dev_hold(dev_out); fl.oif = dev_out->ifindex; + make_route: err = ip_mkroute_output(rp, &res, &fl, oldflp, dev_out, flags); + if (free_res) fib_res_put(&res); if (dev_out) @@ -2321,6 +2556,15 @@ #endif !((rth->fl.fl4_tos ^ flp->fl4_tos) & (IPTOS_RT_MASK | RTO_ONLINK))) { + /* check for multipath routes and choose one if + necessary */ + if (multipath_selectroute(flp, rth, rp)) { + dst_hold(&(*rp)->u.dst); + RT_CACHE_STAT_INC(out_hit); + rcu_read_unlock_bh(); + return 0; + } + rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; From lkml@einar-lueck.de Mon Dec 20 03:21:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 03:21:44 -0800 (PST) Received: from smtprelay03.ispgateway.de (smtprelay03.ispgateway.de [80.67.18.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKBLD9g014026 for ; Mon, 20 Dec 2004 03:21:34 -0800 Received: (qmail 27510 invoked from network); 20 Dec 2004 11:19:42 -0000 Received: from unknown (HELO [192.168.30.10]) (008508@[217.231.189.59]) (envelope-sender ) by smtprelay03.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 
20 Dec 2004 11:19:42 -0000 Message-ID: <41C6B54F.2020604@einar-lueck.de> Date: Mon, 20 Dec 2004 12:19:43 +0100 From: =?ISO-8859-1?Q?Einar_L=FCck?= User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12914 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lkml@einar-lueck.de Precedence: bulk X-list: netdev [PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3 From: Einar Lueck This patch attempts to solve some problems with the current multipath implementation: * the routing cache allows only one route to be cached for a given search key * a multipath/load balancing decision is only made when no corresponding route is cached yet In the scenarios that are relevant to us (a large number of outgoing connection requests), this is a serious problem. Our approach: * a new flag in the flags field of "struct dst_entry" indicates that further routes having the same key follow in the routing cache chain * at cache lookup time, we recognize this flag and may apply different load balancing policies to the available routes having the same key * garbage collection is modified to ensure that all routes having the same target are removed from the cache together Motivation for the approach: * keep the overall routing cache entry size constant * separate load balancing state from the overall routing cache state as well as possible while keeping all routes in the same place * keep changes to the overall routing code/behaviour to a minimum.
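The lookup idea described above can be sketched in plain userspace C. This is only an illustration under simplifying assumptions, not the code from the patch: "struct entry", its fields and select_route() are hypothetical stand-ins for the routing cache entry and the policy hook, the flow key is reduced to a single integer, and the policy shown is a simple least-used selection.

```c
#include <stddef.h>

#define DST_BALANCED 0x10	/* same value the patch adds to dst.h */

/* Hypothetical, simplified stand-in for a routing cache entry. */
struct entry {
	unsigned key;		/* stands in for the flow key */
	unsigned flags;		/* DST_BALANCED marks a multipath alternative */
	unsigned use;		/* how often this entry has been selected */
	struct entry *next;	/* next entry in the same hash chain */
};

/*
 * Called with the first cache entry that matched the lookup key.
 * If that entry is not flagged as balanced, it is returned unchanged.
 * Otherwise the chain is walked from that entry onwards and the
 * least used balanced entry with the same key is selected; its use
 * counter is incremented so that later lookups prefer the others.
 */
static struct entry *select_route(struct entry *first)
{
	struct entry *e, *best = first;

	if (!(first->flags & DST_BALANCED))
		return first;

	for (e = first; e != NULL; e = e->next) {
		if ((e->flags & DST_BALANCED) && e->key == first->key &&
		    e->use < best->use)
			best = e;
	}
	best->use++;
	return best;
}
```

Because the alternatives live in the ordinary hash chain, the entry layout stays unchanged; only the flag tells the lookup path that it has to look further than the first match.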
We implemented the following routing policies: * a _weighted_ random policy that uses the weights configurable via "ip route"; it can approximate many other policies (e.g. round robin) * random * round robin * an interface round robin policy that selects among the relevant routes so as to achieve an overall round robin among the available _interfaces_ Applied tests: * all policies functionally tested with up to 3 NICs (scripted iperf and ifconfig) * garbage collection within the routing cache functionally tested * reference counting on devices functionally tested * platforms: s390, x86 Please consider applying! Regards, Einar Signed-off-by: Einar Lueck diff -ruN linux-2.6.9.split/include/net/dst.h linux-2.6.9.nicbalancing/include/net/dst.h --- linux-2.6.9.split/include/net/dst.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/dst.h 2004-12-15 12:07:15.000000000 +0100 @@ -48,6 +48,7 @@ #define DST_NOXFRM 2 #define DST_NOPOLICY 4 #define DST_NOHASH 8 +#define DST_BALANCED 0x10 unsigned long lastuse; unsigned long expires; diff -ruN linux-2.6.9.split/include/net/flow.h linux-2.6.9.nicbalancing/include/net/flow.h --- linux-2.6.9.split/include/net/flow.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/flow.h 2004-12-15 12:07:15.000000000 +0100 @@ -51,6 +51,7 @@ __u8 proto; __u8 flags; +#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 union { struct { __u16 sport; diff -ruN linux-2.6.9.split/include/net/ip_fib.h linux-2.6.9.nicbalancing/include/net/ip_fib.h --- linux-2.6.9.split/include/net/ip_fib.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/ip_fib.h 2004-12-15 12:07:15.000000000 +0100 @@ -95,6 +95,10 @@ unsigned char nh_sel; unsigned char type; unsigned char scope; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + __u32 network; + __u32 netmask; +#endif struct fib_info *fi; #ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_rule *r; @@ -119,6 +123,14 @@ 
#define FIB_RES_DEV(res) (FIB_RES_NH(res).nh_dev) #define FIB_RES_OIF(res) (FIB_RES_NH(res).nh_oif) +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +#define FIB_RES_NETWORK(res) ((res).network) +#define FIB_RES_NETMASK(res) ((res).netmask) +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +#define FIB_RES_NETWORK(res) (0) +#define FIB_RES_NETMASK(res) (0) +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + struct fib_table { unsigned char tb_id; unsigned tb_stamp; diff -ruN linux-2.6.9.split/include/net/route.h linux-2.6.9.nicbalancing/include/net/route.h --- linux-2.6.9.split/include/net/route.h 2004-12-15 12:04:24.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/route.h 2004-12-15 12:07:15.000000000 +0100 @@ -46,6 +46,7 @@ #define RT_CONN_FLAGS(sk) (RT_TOS(inet_sk(sk)->tos) | sk->sk_localroute) +struct fib_nh; struct inet_peer; struct rtable { @@ -179,6 +180,9 @@ memcpy(&fl, &(*rp)->fl, sizeof(fl)); fl.fl_ip_sport = sport; fl.fl_ip_dport = dport; +#if defined(CONFIG_IP_ROUTE_MULTIPATH_CACHED) + fl.flags |= FLOWI_FLAG_MULTIPATHOLDROUTE; +#endif ip_rt_put(*rp); *rp = NULL; return ip_route_output_flow(rp, &fl, sk, 0); @@ -197,4 +201,69 @@ return rt->peer; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_flush(void); +static inline void multipath_flush(void) { + __multipath_flush(); +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +static inline void multipath_flush(void){} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh); +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { + __multipath_set_nhinfo(network, netmask, prefixlen, nh); +} +#else +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { +} +#endif + + + +#if 
defined(CONFIG_IP_ROUTE_MULTIPATH_RR) || defined(CONFIG_IP_ROUTE_MULTIPATH_DRR) +extern void __multipath_remove(struct rtable *rt); +static inline void multipath_remove(struct rtable *rt) { + if ( rt->u.dst.flags & DST_BALANCED ) { + __multipath_remove( rt ); + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ +static inline void multipath_remove(struct rtable *rt) {} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ + + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +extern void __multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp); +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + if ( rth->u.dst.flags & DST_BALANCED ) { + __multipath_selectroute( flp, rth, rp ); + return 1; + } + else { + return 0; + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + return 0; +} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + #endif /* _ROUTE_H */ diff -ruN linux-2.6.9.split/net/ipv4/Kconfig linux-2.6.9.nicbalancing/net/ipv4/Kconfig --- linux-2.6.9.split/net/ipv4/Kconfig 2004-12-15 12:04:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Kconfig 2004-12-15 12:07:15.000000000 +0100 @@ -90,6 +90,54 @@ equal "cost" and chooses one of them in a non-deterministic fashion if a matching packet arrives. +config IP_ROUTE_MULTIPATH_CACHED + bool "IP: equal cost multipath with caching support (EXPERIMENTAL)" + depends on: IP_ROUTE_MULTIPATH + help + Normally, equal cost multipath routing is not supported by the + routing cache. If you say Y here, alternative routes are cached + and on cache lookup a route is chosen in a configurable fashion. + + If unsure, say N. 
+ +# +# multipath policy configuration +# +choice + prompt "Multipath policy" + depends on IP_ROUTE_MULTIPATH_CACHED + default IP_ROUTE_MULTIPATH_RANDOM + +config IP_ROUTE_MULTIPATH_RR + bool "round robin (EXPERIMENTAL)" + help + Multipath routes are chosen according to Round Robin. + +config IP_ROUTE_MULTIPATH_RANDOM + bool "random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a random fashion. Route weights + are not taken into account. The advantage of this policy + is that it is stateless and therefore introduces only + a very small delay. +config IP_ROUTE_MULTIPATH_WRANDOM + bool "weighted random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a weighted random fashion. + The per route weights are the weights visible via ip route 2. As the + corresponding state management introduces some overhead, routing delay + is increased. +config IP_ROUTE_MULTIPATH_DRR + bool "interface round robin (EXPERIMENTAL)" + help + Connections are distributed in a round robin fashion over the + available interfaces. This policy makes sense if the connections + should be distributed primarily over interfaces and not over routes. 
+endchoice +# +# END OF multipath policy configuration +# + config IP_ROUTE_VERBOSE bool "IP: verbose route monitoring" depends on IP_ADVANCED_ROUTER diff -ruN linux-2.6.9.split/net/ipv4/Makefile linux-2.6.9.nicbalancing/net/ipv4/Makefile --- linux-2.6.9.split/net/ipv4/Makefile 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Makefile 2004-12-15 12:07:15.000000000 +0100 @@ -20,6 +20,10 @@ obj-$(CONFIG_INET_IPCOMP) += ipcomp.o obj-$(CONFIG_INET_TUNNEL) += xfrm4_tunnel.o obj-$(CONFIG_IP_PNP) += ipconfig.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RR) += multipath_rr.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RANDOM) += multipath_random.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_WRANDOM) += multipath_wrandom.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_DRR) += multipath_drr.o obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_IP_VS) += ipvs/ obj-$(CONFIG_IP_TCPDIAG) += tcp_diag.o diff -ruN linux-2.6.9.split/net/ipv4/fib_hash.c linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c --- linux-2.6.9.split/net/ipv4/fib_hash.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c 2004-12-15 12:07:15.000000000 +0100 @@ -261,6 +261,7 @@ err = fib_semantic_match(&f->fn_alias, flp, res, + f->fn_key, fz->fz_mask, fz->fz_order); if (err <= 0) goto out; diff -ruN linux-2.6.9.split/net/ipv4/fib_lookup.h linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h --- linux-2.6.9.split/net/ipv4/fib_lookup.h 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h 2004-12-15 12:07:15.000000000 +0100 @@ -19,7 +19,8 @@ /* Exported by fib_semantics.c */ extern int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen); + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen); extern void fib_release_info(struct fib_info *); extern struct fib_info *fib_create_info(const struct rtmsg *r, struct kern_rta *rta, diff -ruN linux-2.6.9.split/net/ipv4/fib_semantics.c 
linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c --- linux-2.6.9.split/net/ipv4/fib_semantics.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c 2004-12-15 12:07:15.000000000 +0100 @@ -763,7 +763,8 @@ } int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen) + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen) { struct fib_alias *fa; int nh_sel = 0; @@ -827,6 +828,11 @@ res->type = fa->fa_type; res->scope = fa->fa_scope; res->fi = fa->fa_info; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + res->netmask = mask; + res->network = zone & + (0xFFFFFFFF >> (32 - prefixlen)); +#endif atomic_inc(&res->fi->fib_clntref); return 0; } diff -ruN linux-2.6.9.split/net/ipv4/multipath_drr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c --- linux-2.6.9.split/net/ipv4/multipath_drr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,292 @@ +/* + * Device round robin policy for multipath. + * + * + * Version: $Id: multipath_drr.c,v 1.1.2.1 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct multipath_device +{ + int ifi; /* interface index of device */ + atomic_t usecount; + int allocated; +}; + +#define MULTIPATH_MAX_DEVICECANDIDATES 10 + +static struct multipath_device state[MULTIPATH_MAX_DEVICECANDIDATES]; +static spinlock_t state_lock = SPIN_LOCK_UNLOCKED; +static int registered_dev_notifier = 0; +static struct rtable *last_selection = NULL; + +#define RTprint(a...) // printk(KERN_DEBUG a) + + + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +static int __inline__ __multipath_findslot(void) { + int i; + for (i=0; iifindex); + if (devidx != -1) { + state[devidx].allocated = 0; + state[devidx].ifi = 0; + atomic_set(&state[devidx].usecount, 0); + RTprint(KERN_DEBUG"%s: successfully removed device " \ + "with index %d\n",__FUNCTION__, devidx); + } + else { + RTprint(KERN_DEBUG"%s: Device not relevant for " \ + " multipath: %d\n", + __FUNCTION__, devidx); + } + + spin_unlock_bh(&state_lock); + break; + } + return NOTIFY_DONE; +} + +struct notifier_block multipath_dev_notifier = { + .notifier_call = multipath_dev_event, +}; + +void __multipath_remove(struct rtable* rt) { + if (last_selection == rt) { + last_selection = NULL; + } +} + +void __multipath_safe_inc(atomic_t* usecount) +{ + int n; + atomic_inc(usecount); + n = atomic_read(usecount); + if (n<=0) { + int i; + RTprint("%s: detected overflow, 
now ill will reset all "\ + "usecounts\n", __FUNCTION__); + + spin_lock_bh(&state_lock); + for (i=0; iflags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_selection != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", __FUNCTION__ ); + result = last_selection; + *rp = result; + return; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. determine the new candidate to be returned */ + result = NULL; + cur_min = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + int nh_ifidx = nh->u.dst.dev->ifindex; + nh->u.dst.lastuse = jiffies; + nh->u.dst.__use++; + if (result != NULL) { + continue; + } + + /* search for the output interface */ + /* this is not SMP safe, only add/remove are + SMP safe as wrong usecount updates have no big + impact */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add the interface to the array + SMP safe */ + spin_lock_bh(&state_lock); + + /* due to SMP: search again */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add entry for device */ + devidx = __multipath_findslot(); + if (devidx==-1) { + /* unlikely but possible */ + RTprint(KERN_DEBUG"%s: " \ + "out of space\n", + __FUNCTION__); + continue; + } + + state[devidx].allocated = 1; + state[devidx].ifi = nh_ifidx; + atomic_set(&state[devidx].usecount, 0); + min_usecount = 0; + RTprint(KERN_DEBUG"%s: created " \ + " for " \ + "device %d and " \ + "min_usecount " \ + " == -1\n", + __FUNCTION__, + nh_ifidx ); + } + + spin_unlock_bh(&state_lock); + } + + if (min_usecount == 0) { + /* if the device has not been used it is + the primary target */ + RTprint(KERN_DEBUG"%s: now setting " \ + "result to device %d\n", + __FUNCTION__, nh_ifidx ); + + __multipath_safe_inc(&state[devidx].usecount); + result = nh; + } + else { + int count = + atomic_read(&state[devidx].usecount); + + if (min_usecount == 
-1 || + count < min_usecount) { + cur_min = nh; + cur_min_devidx = devidx; + min_usecount = count; + + RTprint(KERN_DEBUG"%s: found " \ + "device " \ + "%d with usecount == %d\n", + __FUNCTION__, + nh_ifidx, + min_usecount); + } + } + } + } + + if (!result) { + if (cur_min) { + RTprint( KERN_DEBUG"%s: index of device in state "\ + "array: %d\n", + __FUNCTION__, cur_min_devidx ); + __multipath_safe_inc(&state[cur_min_devidx].usecount); + result = cur_min; + } + else { + RTprint( KERN_DEBUG"%s: utilized first\n", + __FUNCTION__); + result = first; + } + } + else { + RTprint(KERN_DEBUG"%s: utilize result: found device " \ + "%d with usecount == %d\n", + __FUNCTION__, result->u.dst.dev->ifindex, + min_usecount); + + } + + *rp = result; + last_selection = result; +} diff -ruN linux-2.6.9.split/net/ipv4/multipath_random.c linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c --- linux-2.6.9.split/net/ipv4/multipath_random.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,124 @@ +/* + * Random policy for multipath. + * + * + * Version: $Id: multipath_random.c,v 1.1.2.3 2004/09/21 08:42:11 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + unsigned char candidate_count = 0; + + /* count all candidate */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + ++candidate_count; + } + } + + /* choose a random candidate */ + decision = first; + if ( candidate_count > 1 ) { + unsigned char i = 0; + unsigned char candidate_no = (unsigned char) + random(candidate_count); + RTprint( "%s: randomly chosen candidate: %d (count: %d)\n", + __FUNCTION__, candidate_no, candidate_count ); + + + /* find chosen candidate and adjust GC data for all candidates + to ensure they stay in cache */ + for (rt = first; rt; rt = rt->u.rt_next) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + rt->u.dst.lastuse = jiffies; + if (i == candidate_no) { + decision = rt; + } + if (i >= candidate_count) { + break; + } + i++; + } + } + } + + decision->u.dst.__use++; + *rp = decision; +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return RANDOM_SEED % ubound; +} + diff -ruN 
linux-2.6.9.split/net/ipv4/multipath_rr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c --- linux-2.6.9.split/net/ipv4/multipath_rr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,118 @@ +/* + * Round robin policy for multipath. + * + * + * Version: $Id: multipath_rr.c,v 1.1.2.2 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +static struct rtable* last_used = NULL; + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_remove(struct rtable *rt) +{ + if (last_used==rt) { + last_used = NULL; + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, struct rtable **rp) +{ + struct rtable *nh, *result, *min_use_cand = NULL; + int min_use = -1; + + /* if necessary and possible utilize the old alternative */ + if ( ( flp->flags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_used != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", + __FUNCTION__ ); + result = last_used; + goto out; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. 
determine the new candidate to be returned */ + result = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + nh->u.dst.lastuse = jiffies; + + if (min_use == -1 || nh->u.dst.__use < min_use) { + min_use = nh->u.dst.__use; + min_use_cand = nh; + } + RTprint( KERN_CRIT"%s: found balanced entry\n", + __FUNCTION__ ); + } + } + result = min_use_cand; + if (!result) { + result = first; + } + + out: + last_used = result; + result->u.dst.__use++; + *rp = result; +} + + diff -ruN linux-2.6.9.split/net/ipv4/multipath_wrandom.c linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c --- linux-2.6.9.split/net/ipv4/multipath_wrandom.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,374 @@ +/* + * Weighted random policy for multipath. + * + * + * Version: $Id: multipath_wrandom.c,v 1.1.2.3 2004/09/22 07:51:40 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MPprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_STATE_SIZE 15 + +struct multipath_candidate { + struct multipath_candidate *next; + int power; + struct rtable *rt; +}; + +struct multipath_dest { + struct list_head list; + + const struct fib_nh *nh_info; + __u32 netmask; + __u32 network; + unsigned char prefixlen; + + struct rcu_head rcu; +}; + +struct multipath_bucket { + struct list_head head; + spinlock_t lock; +}; + +struct multipath_route { + struct list_head list; + + int oif; + __u32 gw; + struct list_head dests; + + struct rcu_head rcu; +}; + + +/* state: primarily weight per route information */ +static int multipath_state_initialized = 0; +static spinlock_t state_big_lock = SPIN_LOCK_UNLOCKED; +static struct multipath_bucket state[MULTIPATH_STATE_SIZE]; + + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + + +static unsigned char __multipath_lookup_weight(const struct flowi *fl, + const struct rtable *rt) { + const int state_idx = rt->idev->dev->ifindex % MULTIPATH_STATE_SIZE; + struct multipath_route *r; + struct multipath_route *target_route = NULL; + struct multipath_dest *d; + int weight = 1; + + /* lookup the weight information for a certain route */ + rcu_read_lock(); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == rt->rt_gateway && + r->oif == rt->idev->dev->ifindex) { + target_route = r; + break; + } + } + if (!target_route) { + /* this should not happen... 
but we are prepared */ + printk( KERN_CRIT"%s: missing state for gateway: %u and " \ + "device %d\n", __FUNCTION__, rt->rt_gateway, + rt->idev->dev->ifindex); + goto out; + } + + /* find state entry for destination */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + __u32 targetnetwork = fl->fl4_dst & + (0xFFFFFFFF >> (32 - d->prefixlen)); + + if ((targetnetwork & d->netmask) == d->network) { + weight = d->nh_info->nh_weight; + MPprint("%s: found weight %d for gateway %u\n", + __FUNCTION__, weight, rt->rt_gateway); + goto out; + } + } + + out: + rcu_read_unlock(); + return weight; +} + +static void __multipath_init_state(void) +{ + spin_lock(&state_big_lock); + + /* check again due to SMP and to prevent contention */ + if (!multipath_state_initialized) { + int i; + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + INIT_LIST_HEAD(&state[i].head); + state[i].lock = SPIN_LOCK_UNLOCKED; + } + } + + /* now mark initialized */ + multipath_state_initialized = 1; + + spin_unlock(&state_big_lock); +} + +static void inline __multipath_init(void) +{ + /* do not spinlock to reduce unnecessary contention */ + if (!multipath_state_initialized) { + __multipath_init_state(); + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + struct multipath_candidate *first_mpc = NULL; + struct multipath_candidate *mpc, *last_mpc = NULL; + int power = 0; + int last_power; + int selector; + const size_t size_mpc = sizeof(struct multipath_candidate); + + /* init state if necessary */ + __multipath_init(); + + + /* collect all candidates and identify their weights */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ((rt->u.dst.flags & DST_BALANCED) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + struct multipath_candidate* mpc = + (struct multipath_candidate*) + kmalloc(size_mpc, GFP_KERNEL); + + power += __multipath_lookup_weight(flp, rt) * 
10000;
+
+			mpc->power = power;
+			mpc->rt = rt;
+			mpc->next = NULL;
+
+			if (!first_mpc)
+				first_mpc = mpc;
+			else
+				last_mpc->next = mpc;
+
+			last_mpc = mpc;
+		}
+	}
+
+	/* choose a weighted random candidate */
+	decision = first;
+	selector = random(power);
+	MPprint("%s: random number %d in range %d\n", __FUNCTION__, selector,
+		power);
+	last_power = 0;
+
+
+	/* select candidate, adjust GC data and cleanup local state */
+	decision = first;
+	last_mpc = NULL;
+	for (mpc = first_mpc; mpc; mpc = mpc->next) {
+		mpc->rt->u.dst.lastuse = jiffies;
+		MPprint("%s: last_power = %d, selector: %d, mpc->power: %d\n",
+			__FUNCTION__, last_power, selector, mpc->power);
+		if (last_power <= selector && selector < mpc->power) {
+			decision = mpc->rt;
+			MPprint("%s: selected %u\n", __FUNCTION__,
+				decision->rt_gateway);
+		}
+		last_power = mpc->power;
+		if (last_mpc) {
+			kfree(last_mpc);
+		}
+		last_mpc = mpc;
+	}
+	if (last_mpc) {
+		/* concurrent __multipath_flush may lead to !last_mpc */
+		kfree(last_mpc);
+	}
+
+	decision->u.dst.__use++;
+	*rp = decision;
+}
+
+void __multipath_set_nhinfo(__u32 network,
+			    __u32 netmask,
+			    unsigned char prefixlen,
+			    const struct fib_nh* nh)
+{
+	const int state_idx = nh->nh_oif % MULTIPATH_STATE_SIZE;
+	struct multipath_route *r, *target_route = NULL;
+	struct multipath_dest *d, *target_dest = NULL;
+
+	/* init state if necessary */
+	__multipath_init();
+
+	/* store the weight information for a certain route */
+	spin_lock(&state[state_idx].lock);
+
+	/* find state entry for gateway or add one if necessary */
+	list_for_each_entry_rcu(r, &state[state_idx].head, list) {
+		if (r->gw == nh->nh_gw && r->oif == nh->nh_oif) {
+			target_route = r;
+			break;
+		}
+	}
+	if (!target_route) {
+		const size_t size_rt = sizeof(struct multipath_route);
+		target_route = (struct multipath_route *)
+			kmalloc(size_rt, GFP_KERNEL);
+
+		target_route->gw = nh->nh_gw;
+		target_route->oif = nh->nh_oif;
+		memset(&target_route->rcu, 0, sizeof(struct rcu_head));
+
INIT_LIST_HEAD(&target_route->dests);
+
+		list_add_rcu(&target_route->list, &state[state_idx].head);
+	}
+
+	/* find state entry for destination or add one if necessary */
+	list_for_each_entry_rcu(d, &target_route->dests, list) {
+		if (d->nh_info == nh) {
+			target_dest = d;
+			break;
+		}
+	}
+	if (!target_dest) {
+		const size_t size_dst = sizeof(struct multipath_dest);
+		target_dest = (struct multipath_dest*)
+			kmalloc(size_dst, GFP_KERNEL);
+
+		target_dest->nh_info = nh;
+		target_dest->network = network;
+		target_dest->netmask = netmask;
+		target_dest->prefixlen = prefixlen;
+		memset(&target_dest->rcu, 0, sizeof(struct rcu_head));
+
+		list_add_rcu(&target_dest->list, &target_route->dests);
+	}
+	/* else: we already stored this info for another destination =>
+	   we are finished */
+
+	spin_unlock(&state[state_idx].lock);
+}
+
+
+static void __multipath_free(struct rcu_head *head)
+{
+	struct multipath_route *rt = container_of(head, struct multipath_route,
+						  rcu);
+	kfree(rt);
+}
+
+static void __multipath_free_dst(struct rcu_head *head)
+{
+	struct multipath_dest *dst = container_of(head,
+						  struct multipath_dest,
+						  rcu);
+	kfree(dst);
+}
+
+void __multipath_flush()
+{
+	int i;
+
+	MPprint("%s: called\n", __FUNCTION__);
+
+	/* init state if necessary */
+	__multipath_init();
+
+	/* defer deletion of all entries */
+	for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) {
+		struct multipath_route *r;
+		spin_lock(&state[i].lock);
+
+		list_for_each_entry_rcu(r, &state[i].head, list) {
+			struct multipath_dest *d;
+			list_for_each_entry_rcu(d, &r->dests, list) {
+				list_del_rcu(&d->list);
+				call_rcu(&d->rcu,
+					 __multipath_free_dst);
+
+			}
+			list_del_rcu(&r->list);
+			call_rcu(&r->rcu,
+				 __multipath_free);
+		}
+
+		spin_unlock(&state[i].lock);
+	}
+
+	MPprint("%s: finished\n", __FUNCTION__);
+}
+
+static __inline__ unsigned int random(unsigned int ubound)
+{
+	static unsigned int a = 1588635695,
+		q = 2,
+		r = 1117695901;
+	RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q);
+	return
RANDOM_SEED % ubound; +} diff -ruN linux-2.6.9.split/net/ipv4/route.c linux-2.6.9.nicbalancing/net/ipv4/route.c --- linux-2.6.9.split/net/ipv4/route.c 2004-12-15 12:05:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/route.c 2004-12-15 12:07:15.000000000 +0100 @@ -129,7 +129,7 @@ int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; -#define RTprint(a...) printk(KERN_DEBUG a) +#define RTprint(a...) // printk(KERN_DEBUG a) static struct timer_list rt_flush_timer; static struct timer_list rt_periodic_timer; @@ -450,11 +450,13 @@ static __inline__ void rt_free(struct rtable *rt) { + multipath_remove( rt ); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } static __inline__ void rt_drop(struct rtable *rt) { + multipath_remove( rt ); ip_rt_put(rt); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } @@ -516,6 +518,54 @@ return score; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +static struct rtable **rt_remove_balanced_route(struct rtable **chain_head, + struct rtable *expentry, + int* removed_count) +{ + int passedexpired = 0; + struct rtable **nextstep = NULL; + struct rtable **rthp = chain_head; + struct rtable *rth; + if (removed_count) + *removed_count = 0; + while ((rth = *rthp) != NULL) { + if ( rth == expentry ) { + passedexpired = 1; + } + + if (((*rthp)->u.dst.flags & DST_BALANCED) != 0 && + compare_keys(&(*rthp)->fl, &expentry->fl)) { + if (*rthp == expentry) { + *rthp = rth->u.rt_next; + continue; + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + if (removed_count) + ++(*removed_count); + } + } + else { + if ( !((*rthp)->u.dst.flags & DST_BALANCED) && + passedexpired && !nextstep ) { + nextstep = &rth->u.rt_next; + } + rthp = &rth->u.rt_next; + } + } + + rt_free(expentry); + if (removed_count) + ++(*removed_count); + + return nextstep; +} + +#endif + + /* This runs via a timer and thus is always in BH context. */ static void rt_check_expire(unsigned long dummy) { @@ -547,8 +597,24 @@ } /* Cleanup aged off entries. 
*/ - *rthp = rth->u.rt_next; - rt_free(rth); +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, NULL); + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + *rthp = rth->u.rt_next; + rt_free(rth); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock(&rt_hash_table[i].lock); @@ -596,6 +662,9 @@ if (delay < 0) delay = ip_rt_min_delay; + /* flush existing multipath state*/ + multipath_flush(); + spin_lock_bh(&rt_flush_lock); if (del_timer(&rt_flush_timer) && delay > 0 && rt_deadline) { @@ -714,9 +783,29 @@ rthp = &rth->u.rt_next; continue; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + int r; + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, + &r); + goal -= r; + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + goal--; + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ *rthp = rth->u.rt_next; rt_free(rth); goal--; +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock_bh(&rt_hash_table[k].lock); if (goal <= 0) @@ -797,7 +886,12 @@ spin_lock_bh(&rt_hash_table[hash].lock); while ((rth = *rthp) != NULL) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (!(rth->u.dst.flags & DST_BALANCED) && + compare_keys(&rth->fl, &rt->fl)) { +#else if (compare_keys(&rth->fl, &rt->fl)) { +#endif /* Put it first */ *rthp = rth->u.rt_next; /* @@ -1628,6 +1722,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if ( res->fi->fib_nhs > 1 ) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_policy) rth->u.dst.flags |= DST_NOPOLICY; if (in_dev->cnf.no_xfrm) @@ -1675,7 +1773,7 @@ unsigned hash; #ifdef CONFIG_IP_ROUTE_MULTIPATH - if (res->fi->fib_nhs > 1 && 
fl->oif == 0) + if (res->fi && res->fi->fib_nhs > 1 && fl->oif == 0) fib_select_multipath(fl, res); #endif @@ -1696,7 +1794,65 @@ struct in_device *in_dev, u32 daddr, u32 saddr, u32 tos) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + struct rtable* rth; + unsigned char hop, hopcount, lasthop; + int err = -EINVAL; + unsigned hash; + if (res->fi) { + hopcount = res->fi->fib_nhs; + } + else { + hopcount = 1; + } + lasthop = hopcount - 1; + + /* distinguish between multipath and singlepath */ + if ( hopcount < 2 ) + return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, + saddr, tos); + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", __FUNCTION__, + hopcount); + + /* add all alternatives to the routing cache */ + for ( hop = 0; hop < hopcount; ++hop ) { + res->nh_sel = hop; + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", + __FUNCTION__, hopcount); + + /* create a routing cache entry */ + err = __mkroute_input( skb, res, in_dev, daddr, saddr, tos, + &rth ); + if ( err ) + return err; + + + /* put it into the cache */ + hash = rt_hash_code(daddr, saddr ^ (fl->iif << 5), tos); + err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + if ( err ) + return err; + + /* forward hop information to multipath impl. 
*/ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + + + /* only for the last hop the reference count is handled + outside */ + RTprint( KERN_DEBUG"%s: balanced entry created: %d\n", + __FUNCTION__, rth ); + if ( hop == lasthop ) + atomic_set(&(skb->dst->__refcnt), 1); + } + return err; +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, saddr, tos); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } @@ -2017,6 +2173,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (res->fi && res->fi->fib_nhs > 1) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_xfrm) rth->u.dst.flags |= DST_NOXFRM; if (in_dev->cnf.no_policy) @@ -2108,7 +2268,77 @@ struct net_device *dev_out, unsigned flags) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + u32 tos = RT_FL_TOS(oldflp); + unsigned char hop; + unsigned hash; + int err = -EINVAL; + struct rtable* rth; + + if (res->fi && res->fi->fib_nhs > 1) { + unsigned char hopcount = res->fi->fib_nhs; + RTprint( KERN_DEBUG"%s: entered (hopcount: %d, fl->oif: %d)\n", + __FUNCTION__, hopcount, fl->oif); + + for ( hop = 0; hop < hopcount; ++hop ) { + struct net_device *dev2nexthop; + RTprint( KERN_DEBUG"%s: hop %d of %d\n", __FUNCTION__, + hop, hopcount ); + + res->nh_sel = hop; + + /* hold a work reference to the output device */ + dev2nexthop = FIB_RES_DEV(*res); + dev_hold(dev2nexthop); + + err = __mkroute_output(&rth, res, fl, oldflp, + dev2nexthop, flags); + + /** FIXME remove debug code */ + RTprint( "%s: balanced entry created: %d " \ + " (GW: %u)\n", + __FUNCTION__, + &rth->u.dst, + FIB_RES_GW(*res) ); + + if ( err != 0 ) { + goto cleanup; + } + + RTprint( KERN_DEBUG"%s: created successfully %d\n", + __FUNCTION__, hop ); + + hash = rt_hash_code(oldflp->fl4_dst, + oldflp->fl4_src ^ + (oldflp->oif << 5), tos); + err = rt_intern_hash(hash, rth, rp); + RTprint( KERN_DEBUG"%s: hashed %d\n", + 
__FUNCTION__, hop ); + + /* forward hop information to multipath impl. */ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + cleanup: + /* release work reference to output device */ + dev_put(dev2nexthop); + + if ( err != 0 ) { + return err; + } + } + RTprint( "%s: exited loop\n", __FUNCTION__ ); + atomic_set(&(*rp)->u.dst.__refcnt, 1); + return err; + } + else { + return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, + flags); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, flags); +#endif } /* @@ -2137,6 +2367,7 @@ int free_res = 0; int err; + res.fi = NULL; #ifdef CONFIG_IP_MULTIPLE_TABLES res.r = NULL; @@ -2186,6 +2417,8 @@ dev_put(dev_out); dev_out = NULL; } + + if (oldflp->oif) { dev_out = dev_get_by_index(oldflp->oif); err = -ENODEV; @@ -2292,9 +2525,11 @@ dev_hold(dev_out); fl.oif = dev_out->ifindex; + make_route: err = ip_mkroute_output(rp, &res, &fl, oldflp, dev_out, flags); + if (free_res) fib_res_put(&res); if (dev_out) @@ -2321,6 +2556,15 @@ #endif !((rth->fl.fl4_tos ^ flp->fl4_tos) & (IPTOS_RT_MASK | RTO_ONLINK))) { + /* check for multipath routes and choose one if + necessary */ + if (multipath_selectroute(flp, rth, rp)) { + dst_hold(&(*rp)->u.dst); + RT_CACHE_STAT_INC(out_hit); + rcu_read_unlock_bh(); + return 0; + } + rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; From arnd@arndb.de Mon Dec 20 04:21:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 04:22:00 -0800 (PST) Received: from e2.ny.us.ibm.com (e2.ny.us.ibm.com [32.97.182.142]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKCLXsU020516 for ; Mon, 20 Dec 2004 04:21:54 -0800 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e2.ny.us.ibm.com (8.12.10/8.12.10) with ESMTP id iBKCL48F027541 for ; Mon, 20 Dec 2004 07:21:04 -0500 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com 
[9.56.224.216]) by d01relay04.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBKCL1Vv268982 for ; Mon, 20 Dec 2004 07:21:04 -0500
Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id iBKCL0aI024715 for ; Mon, 20 Dec 2004 07:21:01 -0500
Received: from dyn-9-152-210-124.boeblingen.de.ibm.com (dyn-9-152-210-124.boeblingen.de.ibm.com [9.152.210.124]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id iBKCL0Bt024686; Mon, 20 Dec 2004 07:21:00 -0500
From: Arnd Bergmann
To: YOSHIFUJI Hideaki / =?utf-8?q?=E5=90=89=E8=97=A4=E8=8B=B1=E6=98=8E?=
Subject: Re: [PATCH][v4][19/24] Add IPoIB (IP-over-InfiniBand) driver
Date: Mon, 20 Dec 2004 13:14:35 +0100
User-Agent: KMail/1.6.2
Cc: roland@topspin.com, linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com
References: <200412192215.69tnzAhGIT1vQGLF@topspin.com> <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> <20041220.155836.75677852.yoshfuji@linux-ipv6.org>
In-Reply-To: <20041220.155836.75677852.yoshfuji@linux-ipv6.org>
MIME-Version: 1.0
Content-Disposition: inline
Message-Id: <200412201314.35502.arnd@arndb.de>
Content-Type: text/plain; charset="utf-8"
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by oss.sgi.com id iBKCLXsU020516
X-archive-position: 12915
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: arnd@arndb.de
Precedence: bulk
X-list: netdev

On Monday 20 December 2004 07:58, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> Roland Dreier says:
>
> > +enum {
> > +     IPOIB_PACKET_SIZE         = 2048,
> > +     IPOIB_BUF_SIZE            = IPOIB_PACKET_SIZE + IB_GRH_BYTES,
> > +
> > +     IPOIB_ENCAP_LEN           = 4,
> > +
> > +     IPOIB_RX_RING_SIZE        = 128,
> > +     IPOIB_TX_RING_SIZE        = 64,
> > +
>
> above entries
do not seem appropriate for an enum (rather than #define).

According to Documentation/CodingStyle, it actually is preferred like this.
See also include/linux/ide.h for another example where this is done.

	Arnd <><

From kaber@trash.net Mon Dec 20 04:42:46 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 04:43:02 -0800 (PST)
Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKCgPYp021563 for ; Mon, 20 Dec 2004 04:42:46 -0800
Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CgMrP-00048j-QN; Mon, 20 Dec 2004 13:42:01 +0100
Message-ID: <41C6DB68.30607@trash.net>
Date: Mon, 20 Dec 2004 15:02:16 +0100
From: Patrick McHardy
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5
X-Accept-Language: en
MIME-Version: 1.0
To: =?ISO-8859-1?Q?Einar_L=FCck?=
CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com
Subject: Re: [PATCH 1/2] ipv4 routing: splitting of ip_route_[in|out]put_slow, 2.6.10-rc3
References: <41C6B3D4.6060207@einar-lueck.de>
In-Reply-To: <41C6B3D4.6060207@einar-lueck.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12916
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev

Einar Lück wrote:
> [PATCH 1/2] ipv4 routing: splitting of ip_route_[in|out]put_slow,
> 2.6.10-rc3
>
> From: Einar Lueck
>
> This patch splits up ip_route_[in|out]put_slow into inlined functions.
> Basic idea:
> * improve overall comprehensibility
> * allow for an easier application of patch for improved multipath
>   support (refer to the subsequent patch)
>
> Please consider for application.
Your patches have once again been made unreadable by your email-client. Please send them again as attachments. Regards Patrick From tgraf@suug.ch Mon Dec 20 05:03:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 05:03:45 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKD3HZq022479 for ; Mon, 20 Dec 2004 05:03:39 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id AD54C84; Mon, 20 Dec 2004 14:02:31 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id AD5FD1C0EA; Mon, 20 Dec 2004 14:03:14 +0100 (CET) Date: Mon, 20 Dec 2004 14:03:14 +0100 From: Thomas Graf To: Patrick McHardy Cc: "David S. Miller" , Jamal Hadi Salim , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation Message-ID: <20041220130314.GT17998@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41C68CEF.3030803@trash.net> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12917 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41C68CEF.3030803@trash.net> 2004-12-20 09:27 > Thomas Graf wrote: > >This patch is actually part of a patchset for 2.6.11 inclusion > >but the memory corruption possibility might make it worth > >for 2.6.10. It's not really dangerous since it requires > >admin capabilities though. 
>
> There are lots of places where this is possible, just look
> at all the silly checks in the action code:
>
> sprintf(act_name, "%s", (char*)RTA_DATA(kind));
> if (RTA_PAYLOAD(kind) >= IFNAMSIZ) {

Agreed, also the recursive algorithms are vulnerable, but this doesn't
mean we should just ignore it. I'm fine with dropping this for 2.6.10
and waiting for my complete patchset.

> Why special-case indev ? Neither u32 nor fw make any attempt
> to clean up after errors in their init functions, so instead
> of fixing a single attribute they need to do proper cleanup,
> then we can just continue to validate indev when changing it.

As I said, the patch is part of a patchset which cleans up all
the init paths. I will introduce a new abstraction layer called
tcf_attrs which hides action/police configuration, removes all the
ifdefs without polluting the cls config data structures and enforces
the policy "change everything or nothing" to make things consistent
again. (They were once.)

It basically works by having a struct tcf_attrs which ifdefs
tc_action, tcf_police and is empty if none of them are configured.
The classifiers include it in their private data and call
tcf_attrs_validate() and tcf_attrs_change() to validate and commit
the configuration requests, respectively, or tcf_attrs_match() to
match them. The TLV mapping is done via tcf_attr_map which must be
provided by the classifiers:

static struct tcf_attr_map u32_map = {
	.action = TCA_U32_ACT,
	.police = TCA_U32_POLICE,
};

Speaking of it, would everyone agree to put indev matching into
the abstraction layer as well so it is available for all classifiers?
It doesn't cost us anything except a few iproute2 additions.

> I'm also against keeping all those printks when touching the
> code. It's ok when writing new code, but I don't see why this
> code, unlike everything else, needs to report errors in the
> ringbuffer instead of returning meaningful error codes.
I didn't want to change behaviour because there might be someone checking for it, it's not that I'd like it. > >+static inline void > >+tcf_change_indev(struct tcf_proto *tp, char *indev, struct rtattr *id_tlv) > >+{ > >+ memset(indev, 0, IFNAMSIZ); > >+ memcpy(indev, RTA_DATA(id_tlv), RTA_PAYLOAD(id_tlv)); > >+ indev[IFNAMSIZ - 1] = '\0'; > >+} > > And this should just use strlcpy. No, RTA_DATA(id_tlv) is not guaranteed to be NUL terminated and internal calls to strlen might crash. Even if it would be safe, I don't like using str functions if NUL termination is not guaranteed. From yoshfuji@linux-ipv6.org Mon Dec 20 05:17:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 05:17:56 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKDHTru023406 for ; Mon, 20 Dec 2004 05:17:49 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 9DE0E33CC2; Mon, 20 Dec 2004 22:17:09 +0900 (JST) Date: Mon, 20 Dec 2004 22:17:09 +0900 (JST) Message-Id: <20041220.221709.99112884.yoshfuji@linux-ipv6.org> To: arnd@arndb.de Cc: roland@topspin.com, linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH][v4][19/24] Add IPoIB (IP-over-InfiniBand) driver From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200412201314.35502.arnd@arndb.de> References: <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> <20041220.155836.75677852.yoshfuji@linux-ipv6.org> <200412201314.35502.arnd@arndb.de> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 
4.1 (AOI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=utf-8
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by oss.sgi.com id iBKDHTru023406
X-archive-position: 12918
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: yoshfuji@linux-ipv6.org
Precedence: bulk
X-list: netdev

In article <200412201314.35502.arnd@arndb.de> (at Mon, 20 Dec 2004 13:14:35 +0100), Arnd Bergmann says:

> On Monday 20 December 2004 07:58, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> > Roland Dreier says:
> >
> > > +enum {
> > > +     IPOIB_PACKET_SIZE         = 2048,
> > > +     IPOIB_BUF_SIZE            = IPOIB_PACKET_SIZE + IB_GRH_BYTES,
> > > +
> > > +     IPOIB_ENCAP_LEN           = 4,
> > > +
> > > +     IPOIB_RX_RING_SIZE        = 128,
> > > +     IPOIB_TX_RING_SIZE        = 64,
> > > +
> >
> > above entries do not seem appropriate for an enum (rather than #define).
>
> According to Documentation/CodingStyle, it actually is preferred like this.
> See also include/linux/ide.h for another example where this is done.

No, it is not a similar case.
--yoshfuji From hadi@cyberus.ca Mon Dec 20 05:56:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 05:56:08 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKDtdXo026325 for ; Mon, 20 Dec 2004 05:56:00 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CgO0F-0001Qk-8s for netdev@oss.sgi.com; Mon, 20 Dec 2004 08:55:11 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgO08-0003Y2-IS; Mon, 20 Dec 2004 08:55:04 -0500 Subject: Re: primary and secondary ip addresses From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf , Harald Welte Cc: Henrik Nordstrom , "David S. Miller" , Andrea G Forte , hasso@estpak.ee, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <1103500587.1048.269.camel@jzny.localdomain> References: <200412161153.51251.hasso@estpak.ee> <200412161302.42357.hasso@estpak.ee> <41C2F6E5.5010607@cs.columbia.edu> <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> <1103487517.1047.181.camel@jzny.localdomain> <20041219214120.GX17302@sunbeam.de.gnumonks.org> <20041219220211.GQ17998@postel.suug.ch> <1103497168.1046.218.camel@jzny.localdomain> <1103500587.1048.269.camel@jzny.localdomain> Content-Type: multipart/mixed; boundary="=-+5XeGy0BAYdkSAlQxb0L" Organization: jamalopolous Message-Id: <1103550901.1050.292.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 08:55:02 -0500 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12919 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev 
--=-+5XeGy0BAYdkSAlQxb0L Content-Type: text/plain Content-Transfer-Encoding: 7bit On Sun, 2004-12-19 at 18:56, jamal wrote: > Harald, > > My (stoopid) ISP doesnt like your email address. In any case, attached > patch of what i was alluding to. Maybe be missing some things. > Compiles - not tested Didnt boot ;-> A small silly magic number i missed. Now boots - but doesnt mean it works. And i dont have much time to spare chasing it. cheers, jamal --=-+5XeGy0BAYdkSAlQxb0L Content-Disposition: attachment; filename=p5-take2 Content-Type: text/plain; name=p5-take2; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- 2610-rc3-bk12/net/ipv4/bak.devinet.c 2004-12-19 17:30:33.000000000 -0500 +++ 2610-rc3-bk12/net/ipv4/devinet.c 2004-12-20 08:48:30.038520264 -0500 @@ -230,11 +230,14 @@ static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, int destroy) { + struct in_ifaddr *promote = NULL; struct in_ifaddr *ifa1 = *ifap; ASSERT_RTNL(); - /* 1. Deleting primary ifaddr forces deletion all secondaries */ + /* 1. Deleting primary ifaddr forces deletion all secondaries + * unless alias promotion is set + **/ if (!(ifa1->ifa_flags & IFA_F_SECONDARY)) { struct in_ifaddr *ifa; @@ -248,11 +251,16 @@ continue; } - *ifap1 = ifa->ifa_next; + if (!IN_DEV_PROMOTE_ALIASES(in_dev)) { + *ifap1 = ifa->ifa_next; - rtmsg_ifa(RTM_DELADDR, ifa); - notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa); - inet_free_ifa(ifa); + rtmsg_ifa(RTM_DELADDR, ifa); + notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa); + inet_free_ifa(ifa); + } else { + promote = ifa; + break; + } } } @@ -278,6 +286,13 @@ if (!in_dev->ifa_list) inetdev_destroy(in_dev); } + + if (promote && IN_DEV_PROMOTE_ALIASES(in_dev)) { + /* not sure if we should send a delete notify first? 
*/ + promote->ifa_flags &= ~IFA_F_SECONDARY; + rtmsg_ifa(RTM_NEWADDR, promote); + notifier_call_chain(&inetaddr_chain, NETDEV_UP, promote); + } } static int inet_insert_ifa(struct in_ifaddr *ifa) @@ -1209,10 +1224,10 @@ return 1; } - +#define DEVINET_SIZE 21 static struct devinet_sysctl_table { struct ctl_table_header *sysctl_header; - ctl_table devinet_vars[20]; + ctl_table devinet_vars[DEVINET_SIZE]; ctl_table devinet_dev[2]; ctl_table devinet_conf_dir[2]; ctl_table devinet_proto_dir[2]; @@ -1374,6 +1389,15 @@ .proc_handler = &ipv4_doint_and_flush, .strategy = &ipv4_doint_and_flush_strategy, }, + { + .ctl_name = NET_IPV4_CONF_PROMOTE_ALIASES, + .procname = "promote_aliases", + .data = &ipv4_devconf.promote_aliases, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &ipv4_doint_and_flush, + .strategy = &ipv4_doint_and_flush_strategy, + }, }, .devinet_dev = { { --- a/include/linux/bak.sysctl.h 2004-12-19 18:26:04.622675432 -0500 +++ b/include/linux/sysctl.h 2004-12-19 18:27:18.471448712 -0500 @@ -395,6 +395,7 @@ NET_IPV4_CONF_FORCE_IGMP_VERSION=17, NET_IPV4_CONF_ARP_ANNOUNCE=18, NET_IPV4_CONF_ARP_IGNORE=19, + NET_IPV4_CONF_PROMOTE_ALIASES=20, }; /* /proc/sys/net/ipv4/netfilter */ --- a/include/linux/bak.inetdevice.h 2004-12-19 17:54:26.610217184 -0500 +++ b/include/linux/inetdevice.h 2004-12-19 17:58:45.130916064 -0500 @@ -29,6 +29,7 @@ int no_xfrm; int no_policy; int force_igmp_version; + int promote_aliases; void *sysctl; }; @@ -71,6 +72,7 @@ #define IN_DEV_SEC_REDIRECTS(in_dev) (ipv4_devconf.secure_redirects || (in_dev)->cnf.secure_redirects) #define IN_DEV_IDTAG(in_dev) ((in_dev)->cnf.tag) #define IN_DEV_MEDIUM_ID(in_dev) ((in_dev)->cnf.medium_id) +#define IN_DEV_PROMOTE_ALIASES(in_dev) (ipv4_devconf.promote_aliases || (in_dev)->cnf.promote_aliases) #define IN_DEV_RX_REDIRECTS(in_dev) \ ((IN_DEV_FORWARD(in_dev) && \ --=-+5XeGy0BAYdkSAlQxb0L-- From tgraf@suug.ch Mon Dec 20 06:03:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 
06:04:01 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKE3TG9027063 for ; Mon, 20 Dec 2004 06:03:50 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id BA81884; Mon, 20 Dec 2004 15:02:43 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 1EDB31C0EA; Mon, 20 Dec 2004 15:03:26 +0100 (CET) Date: Mon, 20 Dec 2004 15:03:25 +0100 From: Thomas Graf To: Patrick McHardy Cc: hadi@cyberus.ca, "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer Message-ID: <20041220140325.GW17998@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41C6A6CC.1050105@trash.net> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12920 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41C6A6CC.1050105@trash.net> 2004-12-20 11:17 > I agree that this problem would have been avoided if the > regression tests were run when the change was made, and it > made sense to run them at that time. Unfortunately I missed > the patch when it went in, otherwise I would have objected > to using a field called "priv" and making assumptions about > the layout of the structure it points to in a file called > act_api anyway. Ifs and buts, it was solely and purely my fault. period. 
> On a side-note, you both seem to be inventing your own testing > framework and regression tests. tcng already includes lots of > regression tests for tc, tcng and the kernel. Unfortunately, > last time I checked, it didn't work with 2.6. The tests are based on tcsim which might behave differently than the kernel itself. My test framework primarily tries to cover iproute - kernel incompatibilities and tries to trigger bugs by running every bit of code in every possible combination. > I don't feel like I'm distributing burden onto anyone. As I > said, I run the tests I deem necessary, and I never send out > patches of whichs correctness I'm not convinced. So far, my > history of mistakes has been pretty good. Agreed, the only bug which would have been easily found with the testframework was the CBQ slab corruption bug due to deleting filters twice, which was added with your generic filter deletion simplification. Unfortunately, I'm more error-prone than most so I need a testframework. From hadi@cyberus.ca Mon Dec 20 06:12:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:12:22 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKEBtWM027845 for ; Mon, 20 Dec 2004 06:12:15 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CgOFn-0000hp-AL for netdev@oss.sgi.com; Mon, 20 Dec 2004 09:11:15 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgOF7-0005wv-DI; Mon, 20 Dec 2004 09:10:33 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Tommy Christensen Cc: Thomas Spatzier , "David S. 
Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com, Paul Jakma In-Reply-To: <41C612BC.5070909@tpack.net> References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103551830.1047.316.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 09:10:31 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12921 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-19 at 18:46, Tommy Christensen wrote: > OK. So is this the recommendation for these pour souls? > > - Use a socket for each device. Sounds sensible (each device with IP enabled is more like it) > - Set the socket buffer (SO_SNDBUF) large enough. E.g. 1 MB ? > Or use non-blocking sockets - just in case. I think we may need a socket "flush socket buffer" signal > - If you care about not sending stale packets, it is the > responsibility of the application to flush the socket on > link-down events (by down'ing the interface?). Sigh. I am beginning to think this is too complex an approach. It requires there be a way to automagically clean up the buffers when things like this happen. I am beginning to think that the simplest way to achieve this is not to stop the queue, but rather to let the packets continue showing up and drop them at the driver when the link is down. The netlink async carrier signal to the app is to be used to reroute instead of being a signal to flush buffers. In other words, the other Thomas got it right (with the exception of setting the IFF_RUNNING flag). Jeff? 
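For reference, the application-side recommendations quoted above (one socket per device, a large SO_SNDBUF, non-blocking sends) could look roughly like the sketch below. This is only an illustration under stated assumptions: the helper names open_routing_socket() and send_or_drop() are hypothetical and not from this thread, the 1 MB figure is just the example value mentioned above, and the kernel may clamp or adjust the requested buffer size.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a non-blocking UDP socket and ask for a
 * larger send buffer. Returns the descriptor, or -1 on failure.
 */
int open_routing_socket(int sndbuf_bytes)
{
	int fd, flags;

	assert(sndbuf_bytes > 0);

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	/* Only a request: the kernel may clamp or adjust the size. */
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
		       &sndbuf_bytes, sizeof(sndbuf_bytes)) < 0)
		goto fail;

	/* Non-blocking, so a stuck device queue cannot hang the daemon. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		goto fail;

	return fd;
fail:
	close(fd);
	return -1;
}

/*
 * Hypothetical helper: returns 1 if the datagram was queued, 0 if it
 * was dropped because the send path is full, -1 on any other error.
 */
int send_or_drop(int fd, const void *buf, size_t len,
		 const struct sockaddr *to, socklen_t tolen)
{
	ssize_t n = sendto(fd, buf, len, 0, to, tolen);

	if (n >= 0)
		return 1;
	if (errno == EAGAIN || errno == EWOULDBLOCK || errno == ENOBUFS)
		return 0;	/* treat as a drop and let the protocol retry */
	return -1;
}
```

A caller would create one such socket per device, e.g. open_routing_socket(1 << 20), and treat a 0 return from send_or_drop() as a lost packet rather than blocking. It does not address stale packets already sitting in the buffer, which is the harder problem discussed here.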
cheers, jamal From hadi@cyberus.ca Mon Dec 20 06:13:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:13:23 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKECtLn028059 for ; Mon, 20 Dec 2004 06:13:16 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CgOGo-0001QK-3t for netdev@oss.sgi.com; Mon, 20 Dec 2004 09:12:18 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgOGD-00065S-LK; Mon, 20 Dec 2004 09:11:41 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Paul Jakma Cc: Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com In-Reply-To: References: <1103484552.1046.155.camel@jzny.localdomain> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103551899.1048.319.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 09:11:39 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12922 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Sun, 2004-12-19 at 18:54, Paul Jakma wrote: > On Sun, 19 Dec 2004, jamal wrote: > > > This is the strange part. Anyone still willing to provide a sample > > program that hangs? > > I can provide instructions, or you can wait a wee bit - I didnt have > any e1000 hardware to test with (e1000 being one of the drivers which > has this behavious, AFAIK/TTBOMK) - but a computer with an e1000 > arrived Friday. So, give me a bit and I'll try come up with a test > case. 
A simple UDP send should be enough. When you are ready let me know, we could repeat what i said in my last email. cheers, jamal From hadi@cyberus.ca Mon Dec 20 06:15:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:15:29 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKEF2LM028845 for ; Mon, 20 Dec 2004 06:15:22 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CgOIr-0002i6-QT for netdev@oss.sgi.com; Mon, 20 Dec 2004 09:14:25 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgOIG-0006O0-W3; Mon, 20 Dec 2004 09:13:49 -0500 Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <41C687EE.1090205@trash.net> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> <41C687EE.1090205@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103552026.1048.324.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 09:13:46 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12923 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-20 at 03:06, Patrick McHardy wrote: > You shouldn't care about IMQ, but we still need to copy the packet > before modifying it if the data is shared. 
Otherwise we have a race > on SMP with AF_PACKET sockets, depending on when the packet is read > it can be either modified or not. Certainly not a big deal; shouldnt care if once in a while tcpdump actually gets to see the real packet that went out the wire. > Converting dsmark to an action sounds like the best long-term solution. Indeed; hence my comment to talk to Werner. cheers, jamal From hadi@cyberus.ca Mon Dec 20 06:18:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:18:37 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKEIAMP029398 for ; Mon, 20 Dec 2004 06:18:30 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CgOLt-0004JP-QT for netdev@oss.sgi.com; Mon, 20 Dec 2004 09:17:33 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgOLJ-0006sh-DN; Mon, 20 Dec 2004 09:16:57 -0500 Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <41C68CEF.3030803@trash.net> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103552215.1048.333.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 09:16:55 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12924 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-20 at 03:27, Patrick McHardy wrote: > There are lots of places where this is possible, just look > at all the silly checks in the action code: > > sprintf(act_name, "%s", (char*)RTA_DATA(kind)); > if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { > Hehe. I am sure thats a cutnpaste(LinuxWay) from some code in the kernel probably sch_api.c (or maybe the code it was cutnpasted has been fixed in the last 3 years ;->). That needs fixing. Who is sending the patch? > > Puts the indev validation into its own function to allow > > parameter validation before any changes are made. Changes > > the sanity check from >= IFNAMSIZ to > IFNAMSIZ to correctly > > handle 0 terminated strings and replaces the dangerous sprintf > > with a memcpy bound to the TLV size. Providing a indev TLV > > for kernels without CONFIG_NET_CLS_IND support will now lead > > to EOPPNOTSUPP. > > Why special-case indev ? Neither u32 nor fw make any attempt > to clean up after errors in their init functions, so instead > of fixing a single attribute they need to do proper cleanup, > than we can just continue to validate indev when changing it. > Returning EOPNOTSUPP makes sense, of course. > Indev is going to die - no need in investing a lot of effort in it. 
> I'm also against keeping all those printks when touching the > code. Its ok when writing new code, but I don't see why this > code, unlike everything else, needs to report errors in the > ringbuffer instead of returning meaningful error codes. > Its not that expensive since done on config path. But agree when proper codes exist its not needed. cheers, jamal From paul@clubi.ie Mon Dec 20 06:19:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:19:26 -0800 (PST) Received: from hibernia.jakma.org (IDENT:U2FsdGVkX19Jmtc16o4dkiZyE5P5ZT5Cf3LAW9tp36o@hibernia.jakma.org [212.17.55.49]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKEIs5l029547 for ; Mon, 20 Dec 2004 06:19:18 -0800 Received: from sheen.jakma.org (sheen.jakma.org [212.17.55.53]) by hibernia.jakma.org (8.13.1/8.12.11) with ESMTP id iBKEGldx019456; Mon, 20 Dec 2004 14:16:50 GMT Date: Mon, 20 Dec 2004 14:16:47 +0000 (UTC) From: Paul Jakma X-X-Sender: paul@sheen.jakma.org To: Tommy Christensen cc: hadi@cyberus.ca, Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [patch 4/10] s390: network driver. 
In-Reply-To: <41C612BC.5070909@tpack.net> Message-ID: References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> Mail-Followup-To: paul@hibernia.jakma.org X-NSA: arafat al aqsar jihad musharef jet-A1 avgas ammonium qran inshallah allah al-akbar martyr iraq saddam hammas hisballah rabin ayatollah korea vietnam revolt mustard gas british airways washington MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 19:53:11 2004 clamav-milter version 0.80j on hibernia.jakma.org X-Virus-Status: Clean X-Virus-Status: Clean X-archive-position: 12925 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: paul@clubi.ie Precedence: bulk X-list: netdev On Mon, 20 Dec 2004, Tommy Christensen wrote: >> For a routing protocol that actually is notified that the link >> went down, it should probably flush those socket buffer at that >> point. Or why not return an error, as soon as possible on the socket, eg ENOBUFS, and discard anything in the queue before that. Make it configurable via a sockopt if you think it'd harm ordinary apps (though, anything that cant deal with ENOBUFS is broken already, really..) or make it apply only to nonblock sockets. > responsibility of the application to flush the socket on > link-down events (by down'ing the interface?). That seems more complex than needs be, for userspace at least. regards, -- Paul Jakma paul@clubi.ie paul@jakma.org Key ID: 64A2FF6A Fortune: "Paul Lynde to block..." 
-- a contestant on "Hollywood Squares" From hadi@cyberus.ca Mon Dec 20 06:28:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:28:10 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKERfgY030620 for ; Mon, 20 Dec 2004 06:28:02 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CgOVH-00054s-0J for netdev@oss.sgi.com; Mon, 20 Dec 2004 09:27:15 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgOVE-00009l-HL; Mon, 20 Dec 2004 09:27:12 -0500 Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <41C6A6CC.1050105@trash.net> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> Content-Type: multipart/mixed; boundary="=-amGTD0YycTBPfRwuI0HL" Organization: jamalopolous Message-Id: <1103552830.1049.355.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 09:27:10 -0500 X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12926 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-amGTD0YycTBPfRwuI0HL Content-Type: text/plain Content-Transfer-Encoding: 7bit On Mon, 2004-12-20 at 05:17, Patrick McHardy wrote: > > Hopefully with the regression tests in place this will get better. 
> > On a side-note, you both seem to be inventing your own testing > framework and regression tests. tcng already includes lots of > regression tests for tc, tcng and the kernel. Unfortunately, > last time I checked, it didn't work with 2.6. I haven't looked closely at tcng although Werner has shown it to me a few times (may be under influence). We need to pick one or the other test setup. I don't care if it's what I have, tcng or what Thomas has. I just looked quickly at what Thomas has and realized it's not really automated. In my case it is easier because I can click on the proverbial one-button and run 20 tests (including a subset of the policer ones) and even capture tcpdumps. I have attached a sample testcase. They are harder to create and require the environment I have. But once you create them, you should be saying "go" - go do something and come back and get results. Whatever we end up having, my preference would be something along those lines. > > [You fear Murphy less than i - and thats a style difference. Your style > > is actually more effective in Linux because you can distribute the > > burden onto users. As a matter of fact it is within Daves tolerance > > range (but not mine[1]). So you should do just fine] > > I don't feel like I'm distributing burden onto anyone. As I > said, I run the tests I deem necessary, and I never send out > patches of whichs correctness I'm not convinced. So far, my > history of mistakes has been pretty good. You have been excellent. My statements are generic rather than a reflection on you personally. We probably have the highest number of visible bugs in a long time, so I hope you understand my concerns. cheers, jamal --=-amGTD0YycTBPfRwuI0HL Content-Disposition: attachment; filename=M-005-set.tcl Content-Type: text/x-tcl; name=M-005-set.tcl; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit #!/usr/bin/expect # # Pedit Main Test 5 # # Purpose: # To test the pedit/munge/set capabilities. 
# # What it does: # -Run tcpdump beforehand, and capture a hex dump of a packet. # -Install a rule stating that any packet that arrives on loopback # (ingress) and is destined for 127.0.0.1 will be edited. Offset 0 # will be rewritten to 0x12345678 # -Run tcpdump afterwards, and check the differences between packets # # Expected result: # -The ping will fail, and offset 0 of the packet will contain 0x12345678 # # Set description for use with libtest set libtest_description "Pedit action: setting a u32 range at offset 0." # Source common test procedures and run set-up commands source ./libtest.tcl # Delete any ingress qdiscs associated with loopback in order to clear state send "tc qdisc del dev lo ingress\r" # Send initial pings to establish a base case send "sleep 1; ping 127.0.0.1 -c 3 &\r " # Get a tcpdump hex output for a packet send "tcpdump -i lo -s 96 -vvv -x host 127.0.0.1 and icmp\[icmptype\]=icmp-echo -c 1\r" if [expect_tcpdump_results before_list] { output_test_failure "Error while expecting tcpdump results" return -1 } # Add an ingress qdisc send "tc qdisc add dev lo ingress\r" # Match IP destination 127.0.0.1, and edit these packets to set the 4 octets # at offset 0 to be 0x12345678 send "tc filter add dev lo parent ffff: protocol ip prio 10 u32 match ip dst 127.0.0.1/32 flowid 1:1 action pedit munge offset 0 u32 set 0x12345678 pipe action mirred egress redirect dev dummy0\r" # Send a few pings to 127.0.0.1 while tcpdump is running on dummy0 and capture # the data. expect -re ".*" send "sleep 1; ping 127.0.0.1 -c 2 &\r" # Get a tcpdump hex output for a packet send "tcpdump -i dummy0 -s 96 -vvv -x -c 1\r" if [expect_tcpdump_results after_list] { output_test_failure "Error while expecting tcpdump results" return -1 } # Extra logging output_test_message "Before: $before_list" output_test_message "After: $after_list" # Check the results. In this case, data at offsets 0-3 was changed. 
# Offsets 0-1 should be 0x1234 if {[lindex $before_list 0] == [lindex $after_list 0]} { output_test_failure "Test failed at offsets 0-1" return -1 } else { if {[lindex $after_list 0] != "1234"} { output_test_failure "Test failed at offsets 0-1." return -1 } else { output_test_message "Offsets 0-1 changed to 0x1234 (success)" } } # Offsets 2-3 should be 0x5678 if {[lindex $before_list 1] == [lindex $after_list 1]} { output_test_failure "Test failed at offsets 2-3" return -1 } else { if {[lindex $after_list 1] != "5678"} { output_test_failure "Test failed at offsets 2-3" return -1 } else { output_test_message "Offsets 2-3 changed to 0x5678 (success)" } } # Ensure that all expected matching portions of the packet are identical set exemptions [concat $DEFAULT_TCPDUMP_ICMP_EXEMPTION 0 1] if [compare_lists $before_list $after_list $exemptions $DEFAULT_TCPDUMP_LENGTH mismatch_index] { output_test_failure "Test failed at $mismatch_index" return -1 } expect -re ".*" output_test_success return 0 --=-amGTD0YycTBPfRwuI0HL-- From laforge@gnumonks.org Mon Dec 20 06:30:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:30:56 -0800 (PST) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKEUSoj031194 for ; Mon, 20 Dec 2004 06:30:48 -0800 Received: from dsl-082-082-097-002.arcor-ip.net ([82.82.97.2] helo=sunbeam.gnumonks.org) by ganesha.gnumonks.org with asmtp (TLS-1.0:RSA_ARCFOUR_SHA:16) (Exim 4.34) id 1CgOXz-0005QB-2W; Mon, 20 Dec 2004 15:30:03 +0100 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1CgOXv-000783-8S; Mon, 20 Dec 2004 15:29:59 +0100 Date: Mon, 20 Dec 2004 15:29:59 +0100 From: Harald Welte To: jamal Cc: Thomas Graf , Henrik Nordstrom , "David S. 
Miller" , Andrea G Forte , hasso@estpak.ee, nhorman@redhat.com, linux-net@vger.kernel.org, netdev@oss.sgi.com Subject: Re: primary and secondary ip addresses Message-ID: <20041220142959.GU24142@sunbeam.de.gnumonks.org> References: <41C30212.6000906@cs.columbia.edu> <20041217112025.27688eb6.davem@davemloft.net> <1103487517.1047.181.camel@jzny.localdomain> <20041219214120.GX17302@sunbeam.de.gnumonks.org> <20041219220211.GQ17998@postel.suug.ch> <1103497168.1046.218.camel@jzny.localdomain> <1103500587.1048.269.camel@jzny.localdomain> <1103550901.1050.292.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="NY6JkbSqL3W9mApi" Content-Disposition: inline In-Reply-To: <1103550901.1050.292.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12927 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@gnumonks.org Precedence: bulk X-list: netdev --NY6JkbSqL3W9mApi Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Dec 20, 2004 at 08:55:02AM -0500, jamal wrote: > Didnt boot ;-> > A small silly magic number i missed. Now boots - but doesnt mean it > works. And i dont have much time to spare chasing it. Don't bother, I'll give it some testing here. Thanks again. 
> cheers, > jamal -- - Harald Welte http://www.gnumonks.org/ ============================================================================ Programming is like sex: One mistake and you have to support it your lifetime --NY6JkbSqL3W9mApi Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFBxuHnXaXGVTD0i/8RAgUwAJ0RozEMTOXsiI7CGdSOi7jP59UkXgCfTknM R4uQ3PpvBi+CNXZLPUDFWtU= =uu8k -----END PGP SIGNATURE----- --NY6JkbSqL3W9mApi-- From hadi@cyberus.ca Mon Dec 20 06:33:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 06:33:42 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKEXEqF031706 for ; Mon, 20 Dec 2004 06:33:34 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CgOae-0008QB-T6 for netdev@oss.sgi.com; Mon, 20 Dec 2004 09:32:48 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CgOac-0000un-W1; Mon, 20 Dec 2004 09:32:47 -0500 Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041220140325.GW17998@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> <20041220140325.GW17998@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103553164.1050.365.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 20 Dec 2004 09:32:44 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12928 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-20 at 09:03, Thomas Graf wrote: > > Ifs and buts, it was solely and purely my fault. period. > No fault on anyone - it's a process of improving what we do. > Agreed, the only bug which would have been easly found with the > testframework was the CBQ slab corruption bug due deleting > filters twice added with your generic filter deletion simplification. > Unfortunately, I'm more error-prone than most so I need a testframework. I need it too ;-> It will help me be less conservative ;-> Thomas, we need automation of these tests - I just realized what you have is not really in that category. Refer to the sample example I posted. It doesn't have to be exactly like that, but somewhere along those lines. 
cheers, jamal From lkml@einar-lueck.de Mon Dec 20 07:09:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 07:09:42 -0800 (PST) Received: from smtprelay01.ispgateway.de (smtprelay01.ispgateway.de [80.67.18.13]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKF9BBo000523 for ; Mon, 20 Dec 2004 07:09:35 -0800 Received: (qmail 3985 invoked from network); 20 Dec 2004 15:08:44 -0000 Received: from unknown (HELO [192.168.30.10]) (008508@[217.231.190.51]) (envelope-sender ) by smtprelay01.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 20 Dec 2004 15:08:44 -0000 Message-ID: <41C6EB04.7080306@einar-lueck.de> Date: Mon, 20 Dec 2004 16:08:52 +0100 From: =?ISO-8859-1?Q?Einar_L=FCck?= User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH 1/2] ipv4 routing: splitting of ip_route_[in|out]put_slow, 2.6.10-rc3 References: <41C6B3D4.6060207@einar-lueck.de> <41C6DB68.30607@trash.net> In-Reply-To: <41C6DB68.30607@trash.net> Content-Type: multipart/mixed; boundary="------------010809070706070601060701" X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12929 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lkml@einar-lueck.de Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------010809070706070601060701 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Patrick McHardy wrote: > Your patches have once again been made unreadable by > your email-client. Please send them again as attachments. > > Regards > Patrick Thank you for the hint! 
Regards Einar --------------010809070706070601060701 Content-Type: text/plain; name="routingsplit_rc3.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="routingsplit_rc3.diff" diff -ruN linux-2.6.9/net/ipv4/route.c linux-2.6.9.split/net/ipv4/route.c --- linux-2.6.9/net/ipv4/route.c 2004-12-15 12:03:59.000000000 +0100 +++ linux-2.6.9.split/net/ipv4/route.c 2004-12-15 12:05:32.000000000 +0100 @@ -104,6 +104,9 @@ #include #endif +#define RT_FL_TOS(oldflp) \ + ((u32)(oldflp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK))) + #define IP_MAX_MTU 0xFFF0 #define RT_GC_TIMEOUT (300*HZ) @@ -143,6 +146,7 @@ static void ipv4_link_failure(struct sk_buff *skb); static void ip_rt_update_pmtu(struct dst_entry *dst, u32 mtu); static int rt_garbage_collect(void); +static inline int compare_keys(struct flowi *fl1, struct flowi *fl2); static struct dst_ops ipv4_dst_ops = { @@ -1533,6 +1537,169 @@ return -EINVAL; } + +static void ip_handle_martian_source(struct net_device *dev, + struct in_device *in_dev, + struct sk_buff *skb, + u32 daddr, + u32 saddr) +{ + RT_CACHE_STAT_INC(in_martian_src); +#ifdef CONFIG_IP_ROUTE_VERBOSE + if (IN_DEV_LOG_MARTIANS(in_dev) && net_ratelimit()) { + /* + * RFC1812 recommendation, if source is martian, + * the only hint is MAC header. 
+ */ + printk(KERN_WARNING "martian source %u.%u.%u.%u from " + "%u.%u.%u.%u, on dev %s\n", + NIPQUAD(daddr), NIPQUAD(saddr), dev->name); + if (dev->hard_header_len) { + int i; + unsigned char *p = skb->mac.raw; + printk(KERN_WARNING "ll header: "); + for (i = 0; i < dev->hard_header_len; i++, p++) { + printk("%02x", *p); + if (i < (dev->hard_header_len - 1)) + printk(":"); + } + printk("\n"); + } + } +#endif +} + +static inline int __mkroute_input(struct sk_buff *skb, + struct fib_result* res, + struct in_device *in_dev, + u32 daddr, u32 saddr, u32 tos, + struct rtable **result) +{ + + struct rtable *rth; + int err; + struct in_device *out_dev; + unsigned flags = 0; + u32 spec_dst, itag; + + /* get a working reference to the output device */ + out_dev = in_dev_get(FIB_RES_DEV(*res)); + if (out_dev == NULL) { + if (net_ratelimit()) + printk(KERN_CRIT "Bug in ip_route_input" \ + "_slow(). Please, report\n"); + return -EINVAL; + } + + + err = fib_validate_source(saddr, daddr, tos, FIB_RES_OIF(*res), + in_dev->dev, &spec_dst, &itag); + if (err < 0) { + ip_handle_martian_source(in_dev->dev, in_dev, skb, daddr, + saddr); + + err = -EINVAL; + goto cleanup; + } + + if (err) + flags |= RTCF_DIRECTSRC; + + if (out_dev == in_dev && err && !(flags & (RTCF_NAT | RTCF_MASQ)) && + (IN_DEV_SHARED_MEDIA(out_dev) || + inet_addr_onlink(out_dev, saddr, FIB_RES_GW(*res)))) + flags |= RTCF_DOREDIRECT; + + if (skb->protocol != htons(ETH_P_IP)) { + /* Not IP (i.e. ARP). Do not create route, if it is + * invalid for proxy arp. DNAT routes are always valid. 
+ */ + if (out_dev == in_dev && !(flags & RTCF_DNAT)) { + err = -EINVAL; + goto cleanup; + } + } + + + rth = dst_alloc(&ipv4_dst_ops); + if (!rth) { + err = -ENOBUFS; + goto cleanup; + } + + rth->u.dst.flags= DST_HOST; + if (in_dev->cnf.no_policy) + rth->u.dst.flags |= DST_NOPOLICY; + if (in_dev->cnf.no_xfrm) + rth->u.dst.flags |= DST_NOXFRM; + rth->fl.fl4_dst = daddr; + rth->rt_dst = daddr; + rth->fl.fl4_tos = tos; +#ifdef CONFIG_IP_ROUTE_FWMARK + rth->fl.fl4_fwmark= skb->nfmark; +#endif + rth->fl.fl4_src = saddr; + rth->rt_src = saddr; + rth->rt_gateway = daddr; + rth->rt_iif = + rth->fl.iif = in_dev->dev->ifindex; + rth->u.dst.dev = (out_dev)->dev; + dev_hold(rth->u.dst.dev); + rth->idev = in_dev_get(rth->u.dst.dev); + rth->fl.oif = 0; + rth->rt_spec_dst= spec_dst; + + rth->u.dst.input = ip_forward; + rth->u.dst.output = ip_output; + + rt_set_nexthop(rth, res, itag); + + rth->rt_flags = flags; + + *result = rth; + err = 0; + cleanup: + /* release the working reference to the output device */ + in_dev_put(out_dev); + return err; +} + +static inline int ip_mkroute_input_def(struct sk_buff *skb, + struct fib_result* res, + const struct flowi *fl, + struct in_device *in_dev, + u32 daddr, u32 saddr, u32 tos) +{ + struct rtable* rth; + int err; + unsigned hash; + +#ifdef CONFIG_IP_ROUTE_MULTIPATH + if (res->fi->fib_nhs > 1 && fl->oif == 0) + fib_select_multipath(fl, res); +#endif + + /* create a routing cache entry */ + err = __mkroute_input( skb, res, in_dev, daddr, saddr, tos, &rth ); + if ( err ) + return err; + atomic_set(&rth->u.dst.__refcnt, 1); + + /* put it into the cache */ + hash = rt_hash_code(daddr, saddr ^ (fl->iif << 5), tos); + return rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); +} + +static inline int ip_mkroute_input(struct sk_buff *skb, + struct fib_result* res, + const struct flowi *fl, + struct in_device *in_dev, + u32 daddr, u32 saddr, u32 tos) +{ + return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, saddr, tos); +} + + /* * NOTE. 
We drop all the packets that has local source * addresses, because every properly looped back packet @@ -1544,11 +1711,10 @@ */ static int ip_route_input_slow(struct sk_buff *skb, u32 daddr, u32 saddr, - u8 tos, struct net_device *dev) + u8 tos, struct net_device *dev) { struct fib_result res; struct in_device *in_dev = in_dev_get(dev); - struct in_device *out_dev = NULL; struct flowi fl = { .nl_u = { .ip4_u = { .daddr = daddr, .saddr = saddr, @@ -1572,8 +1738,6 @@ if (!in_dev) goto out; - hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); - /* Check for the most weird martians, which can be not detected by fib_lookup. */ @@ -1626,79 +1790,14 @@ if (res.type != RTN_UNICAST) goto martian_destination; -#ifdef CONFIG_IP_ROUTE_MULTIPATH - if (res.fi->fib_nhs > 1 && fl.oif == 0) - fib_select_multipath(&fl, &res); -#endif - out_dev = in_dev_get(FIB_RES_DEV(res)); - if (out_dev == NULL) { - if (net_ratelimit()) - printk(KERN_CRIT "Bug in ip_route_input_slow(). " - "Please, report\n"); - goto e_inval; - } - - err = fib_validate_source(saddr, daddr, tos, FIB_RES_OIF(res), dev, - &spec_dst, &itag); - if (err < 0) - goto martian_source; - - if (err) - flags |= RTCF_DIRECTSRC; - - if (out_dev == in_dev && err && !(flags & (RTCF_NAT | RTCF_MASQ)) && - (IN_DEV_SHARED_MEDIA(out_dev) || - inet_addr_onlink(out_dev, saddr, FIB_RES_GW(res)))) - flags |= RTCF_DOREDIRECT; - - if (skb->protocol != htons(ETH_P_IP)) { - /* Not IP (i.e. ARP). Do not create route, if it is - * invalid for proxy arp. DNAT routes are always valid. 
- */ - if (out_dev == in_dev && !(flags & RTCF_DNAT)) - goto e_inval; - } - - rth = dst_alloc(&ipv4_dst_ops); - if (!rth) + err = ip_mkroute_input(skb, &res, &fl, in_dev, daddr, saddr, tos); + if ( err == -ENOBUFS ) goto e_nobufs; - - atomic_set(&rth->u.dst.__refcnt, 1); - rth->u.dst.flags= DST_HOST; - if (in_dev->cnf.no_policy) - rth->u.dst.flags |= DST_NOPOLICY; - if (in_dev->cnf.no_xfrm) - rth->u.dst.flags |= DST_NOXFRM; - rth->fl.fl4_dst = daddr; - rth->rt_dst = daddr; - rth->fl.fl4_tos = tos; -#ifdef CONFIG_IP_ROUTE_FWMARK - rth->fl.fl4_fwmark= skb->nfmark; -#endif - rth->fl.fl4_src = saddr; - rth->rt_src = saddr; - rth->rt_gateway = daddr; - rth->rt_iif = - rth->fl.iif = dev->ifindex; - rth->u.dst.dev = out_dev->dev; - dev_hold(rth->u.dst.dev); - rth->idev = in_dev_get(rth->u.dst.dev); - rth->fl.oif = 0; - rth->rt_spec_dst= spec_dst; - - rth->u.dst.input = ip_forward; - rth->u.dst.output = ip_output; - - rt_set_nexthop(rth, &res, itag); - - rth->rt_flags = flags; - -intern: - err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + if ( err == -EINVAL ) + goto e_inval; + done: in_dev_put(in_dev); - if (out_dev) - in_dev_put(out_dev); if (free_res) fib_res_put(&res); out: return err; @@ -1758,7 +1857,9 @@ rth->rt_flags &= ~RTCF_LOCAL; } rth->rt_type = res.type; - goto intern; + hash = rt_hash_code(daddr, saddr ^ (fl.iif << 5), tos); + err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + goto done; no_route: RT_CACHE_STAT_INC(in_no_route); @@ -1786,30 +1887,7 @@ goto done; martian_source: - - RT_CACHE_STAT_INC(in_martian_src); -#ifdef CONFIG_IP_ROUTE_VERBOSE - if (IN_DEV_LOG_MARTIANS(in_dev) && net_ratelimit()) { - /* - * RFC1812 recommendation, if source is martian, - * the only hint is MAC header. 
- */ - printk(KERN_WARNING "martian source %u.%u.%u.%u from " - "%u.%u.%u.%u, on dev %s\n", - NIPQUAD(daddr), NIPQUAD(saddr), dev->name); - if (dev->hard_header_len) { - int i; - unsigned char *p = skb->mac.raw; - printk(KERN_WARNING "ll header: "); - for (i = 0; i < dev->hard_header_len; i++, p++) { - printk("%02x", *p); - if (i < (dev->hard_header_len - 1)) - printk(":"); - } - printk("\n"); - } - } -#endif + ip_handle_martian_source(dev, in_dev, skb, daddr, saddr); goto e_inval; } @@ -1880,13 +1958,166 @@ return ip_route_input_slow(skb, daddr, saddr, tos, dev); } +static inline int __mkroute_output(struct rtable **result, + struct fib_result* res, + const struct flowi *fl, + const struct flowi *oldflp, + struct net_device *dev_out, + unsigned flags) +{ + struct rtable *rth; + struct in_device *in_dev; + u32 tos = RT_FL_TOS(oldflp); + int err = 0; + + if (LOOPBACK(fl->fl4_src) && !(dev_out->flags&IFF_LOOPBACK)) + return -EINVAL; + + if (fl->fl4_dst == 0xFFFFFFFF) + res->type = RTN_BROADCAST; + else if (MULTICAST(fl->fl4_dst)) + res->type = RTN_MULTICAST; + else if (BADCLASS(fl->fl4_dst) || ZERONET(fl->fl4_dst)) + return -EINVAL; + + if (dev_out->flags & IFF_LOOPBACK) + flags |= RTCF_LOCAL; + + /* get work reference to inet device */ + in_dev = in_dev_get(dev_out); + if (!in_dev) + return -EINVAL; + + if (res->type == RTN_BROADCAST) { + flags |= RTCF_BROADCAST | RTCF_LOCAL; + if (res->fi) { + fib_info_put(res->fi); + res->fi = NULL; + } + } else if (res->type == RTN_MULTICAST) { + flags |= RTCF_MULTICAST|RTCF_LOCAL; + if (!ip_check_mc(in_dev, oldflp->fl4_dst, oldflp->fl4_src, + oldflp->proto)) + flags &= ~RTCF_LOCAL; + /* If multicast route do not exist use + default one, but do not gateway in this case. + Yes, it is hack. 
+ */ + if (res->fi && res->prefixlen < 4) { + fib_info_put(res->fi); + res->fi = NULL; + } + } + + + rth = dst_alloc(&ipv4_dst_ops); + if (!rth) { + err = -ENOBUFS; + goto cleanup; + } + + rth->u.dst.flags= DST_HOST; + if (in_dev->cnf.no_xfrm) + rth->u.dst.flags |= DST_NOXFRM; + if (in_dev->cnf.no_policy) + rth->u.dst.flags |= DST_NOPOLICY; + + rth->fl.fl4_dst = oldflp->fl4_dst; + rth->fl.fl4_tos = tos; + rth->fl.fl4_src = oldflp->fl4_src; + rth->fl.oif = oldflp->oif; +#ifdef CONFIG_IP_ROUTE_FWMARK + rth->fl.fl4_fwmark= oldflp->fl4_fwmark; +#endif + rth->rt_dst = fl->fl4_dst; + rth->rt_src = fl->fl4_src; + rth->rt_iif = oldflp->oif ? : dev_out->ifindex; + /* get references to the devices that are to be hold by the routing + cache entry */ + rth->u.dst.dev = dev_out; + dev_hold(dev_out); + rth->idev = in_dev_get(dev_out); + rth->rt_gateway = fl->fl4_dst; + rth->rt_spec_dst= fl->fl4_src; + + rth->u.dst.output=ip_output; + + RT_CACHE_STAT_INC(out_slow_tot); + + if (flags & RTCF_LOCAL) { + rth->u.dst.input = ip_local_deliver; + rth->rt_spec_dst = fl->fl4_dst; + } + if (flags & (RTCF_BROADCAST | RTCF_MULTICAST)) { + rth->rt_spec_dst = fl->fl4_src; + if (flags & RTCF_LOCAL && + !(dev_out->flags & IFF_LOOPBACK)) { + rth->u.dst.output = ip_mc_output; + RT_CACHE_STAT_INC(out_slow_mc); + } +#ifdef CONFIG_IP_MROUTE + if (res->type == RTN_MULTICAST) { + if (IN_DEV_MFORWARD(in_dev) && + !LOCAL_MCAST(oldflp->fl4_dst)) { + rth->u.dst.input = ip_mr_input; + rth->u.dst.output = ip_mc_output; + } + } +#endif + } + + rt_set_nexthop(rth, res, 0); + + rth->rt_flags = flags; + + *result = rth; + cleanup: + /* release work reference to inet device */ + in_dev_put(in_dev); + + return err; +} + +static inline int ip_mkroute_output_def(struct rtable **rp, + struct fib_result* res, + const struct flowi *fl, + const struct flowi *oldflp, + struct net_device *dev_out, + unsigned flags) +{ + struct rtable *rth; + int err = __mkroute_output(&rth, res, fl, oldflp, dev_out, flags); + unsigned 
hash; + if ( err == 0 ) { + u32 tos = RT_FL_TOS(oldflp); + + atomic_set(&rth->u.dst.__refcnt, 1); + + hash = rt_hash_code(oldflp->fl4_dst, + oldflp->fl4_src ^ (oldflp->oif << 5), tos); + err = rt_intern_hash(hash, rth, rp); + } + + return err; +} + +static inline int ip_mkroute_output(struct rtable** rp, + struct fib_result* res, + const struct flowi *fl, + const struct flowi *oldflp, + struct net_device *dev_out, + unsigned flags) +{ + return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, flags); +} + /* * Major route resolver routine. */ static int ip_route_output_slow(struct rtable **rp, const struct flowi *oldflp) { - u32 tos = oldflp->fl4_tos & (IPTOS_RT_MASK | RTO_ONLINK); + u32 tos = RT_FL_TOS(oldflp); struct flowi fl = { .nl_u = { .ip4_u = { .daddr = oldflp->fl4_dst, .saddr = oldflp->fl4_src, @@ -1902,10 +2133,7 @@ .oif = oldflp->oif }; struct fib_result res; unsigned flags = 0; - struct rtable *rth; struct net_device *dev_out = NULL; - struct in_device *in_dev = NULL; - unsigned hash; int free_res = 0; int err; @@ -2065,116 +2293,13 @@ fl.oif = dev_out->ifindex; make_route: - if (LOOPBACK(fl.fl4_src) && !(dev_out->flags&IFF_LOOPBACK)) - goto e_inval; + err = ip_mkroute_output(rp, &res, &fl, oldflp, dev_out, flags); - if (fl.fl4_dst == 0xFFFFFFFF) - res.type = RTN_BROADCAST; - else if (MULTICAST(fl.fl4_dst)) - res.type = RTN_MULTICAST; - else if (BADCLASS(fl.fl4_dst) || ZERONET(fl.fl4_dst)) - goto e_inval; - - if (dev_out->flags & IFF_LOOPBACK) - flags |= RTCF_LOCAL; - - in_dev = in_dev_get(dev_out); - if (!in_dev) - goto e_inval; - - if (res.type == RTN_BROADCAST) { - flags |= RTCF_BROADCAST | RTCF_LOCAL; - if (res.fi) { - fib_info_put(res.fi); - res.fi = NULL; - } - } else if (res.type == RTN_MULTICAST) { - flags |= RTCF_MULTICAST|RTCF_LOCAL; - if (!ip_check_mc(in_dev, oldflp->fl4_dst, oldflp->fl4_src, oldflp->proto)) - flags &= ~RTCF_LOCAL; - /* If multicast route do not exist use - default one, but do not gateway in this case. - Yes, it is hack. 
- */ - if (res.fi && res.prefixlen < 4) { - fib_info_put(res.fi); - res.fi = NULL; - } - } - - rth = dst_alloc(&ipv4_dst_ops); - if (!rth) - goto e_nobufs; - - atomic_set(&rth->u.dst.__refcnt, 1); - rth->u.dst.flags= DST_HOST; - if (in_dev->cnf.no_xfrm) - rth->u.dst.flags |= DST_NOXFRM; - if (in_dev->cnf.no_policy) - rth->u.dst.flags |= DST_NOPOLICY; - rth->fl.fl4_dst = oldflp->fl4_dst; - rth->fl.fl4_tos = tos; - rth->fl.fl4_src = oldflp->fl4_src; - rth->fl.oif = oldflp->oif; -#ifdef CONFIG_IP_ROUTE_FWMARK - rth->fl.fl4_fwmark= oldflp->fl4_fwmark; -#endif - rth->rt_dst = fl.fl4_dst; - rth->rt_src = fl.fl4_src; - rth->rt_iif = oldflp->oif ? : dev_out->ifindex; - rth->u.dst.dev = dev_out; - dev_hold(dev_out); - rth->idev = in_dev_get(dev_out); - rth->rt_gateway = fl.fl4_dst; - rth->rt_spec_dst= fl.fl4_src; - - rth->u.dst.output=ip_output; - - RT_CACHE_STAT_INC(out_slow_tot); - - if (flags & RTCF_LOCAL) { - rth->u.dst.input = ip_local_deliver; - rth->rt_spec_dst = fl.fl4_dst; - } - if (flags & (RTCF_BROADCAST | RTCF_MULTICAST)) { - rth->rt_spec_dst = fl.fl4_src; - if (flags & RTCF_LOCAL && !(dev_out->flags & IFF_LOOPBACK)) { - rth->u.dst.output = ip_mc_output; - RT_CACHE_STAT_INC(out_slow_mc); - } -#ifdef CONFIG_IP_MROUTE - if (res.type == RTN_MULTICAST) { - if (IN_DEV_MFORWARD(in_dev) && - !LOCAL_MCAST(oldflp->fl4_dst)) { - rth->u.dst.input = ip_mr_input; - rth->u.dst.output = ip_mc_output; - } - } -#endif - } - - rt_set_nexthop(rth, &res, 0); - - - rth->rt_flags = flags; - - hash = rt_hash_code(oldflp->fl4_dst, oldflp->fl4_src ^ (oldflp->oif << 5), tos); - err = rt_intern_hash(hash, rth, rp); -done: if (free_res) fib_res_put(&res); if (dev_out) dev_put(dev_out); - if (in_dev) - in_dev_put(in_dev); out: return err; - -e_inval: - err = -EINVAL; - goto done; -e_nobufs: - err = -ENOBUFS; - goto done; } int __ip_route_output_key(struct rtable **rp, const struct flowi *flp) --------------010809070706070601060701-- From lkml@einar-lueck.de Mon Dec 20 07:10:48 2004 
Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 07:10:57 -0800 (PST) Received: from smtprelay01.ispgateway.de (smtprelay01.ispgateway.de [80.67.18.13]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKFAR6b000761 for ; Mon, 20 Dec 2004 07:10:47 -0800 Received: (qmail 4368 invoked from network); 20 Dec 2004 15:09:58 -0000 Received: from unknown (HELO [192.168.30.10]) (008508@[217.231.190.51]) (envelope-sender ) by smtprelay01.ispgateway.de (qmail-ldap-1.03) with SMTP for ; 20 Dec 2004 15:09:58 -0000 Message-ID: <41C6EB4E.10402@einar-lueck.de> Date: Mon, 20 Dec 2004 16:10:06 +0100 From: =?ISO-8859-1?Q?Einar_L=FCck?= User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: =?ISO-8859-1?Q?Einar_L=FCck?= CC: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: [PATCH 2/2] ipv4 routing: multipath with cache support, 2.6.10-rc3 References: <41C6B54F.2020604@einar-lueck.de> In-Reply-To: <41C6B54F.2020604@einar-lueck.de> Content-Type: multipart/mixed; boundary="------------080301040700050000030800" X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12930 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: lkml@einar-lueck.de Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------080301040700050000030800 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Attached the patch as file due to problems with my email-client. Regards Einar. 
--------------080301040700050000030800 Content-Type: text/plain; name="nicbalancing_rc3.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="nicbalancing_rc3.diff" diff -ruN linux-2.6.9.split/include/net/dst.h linux-2.6.9.nicbalancing/include/net/dst.h --- linux-2.6.9.split/include/net/dst.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/dst.h 2004-12-15 12:07:15.000000000 +0100 @@ -48,6 +48,7 @@ #define DST_NOXFRM 2 #define DST_NOPOLICY 4 #define DST_NOHASH 8 +#define DST_BALANCED 0x10 unsigned long lastuse; unsigned long expires; diff -ruN linux-2.6.9.split/include/net/flow.h linux-2.6.9.nicbalancing/include/net/flow.h --- linux-2.6.9.split/include/net/flow.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/flow.h 2004-12-15 12:07:15.000000000 +0100 @@ -51,6 +51,7 @@ __u8 proto; __u8 flags; +#define FLOWI_FLAG_MULTIPATHOLDROUTE 0x01 union { struct { __u16 sport; diff -ruN linux-2.6.9.split/include/net/ip_fib.h linux-2.6.9.nicbalancing/include/net/ip_fib.h --- linux-2.6.9.split/include/net/ip_fib.h 2004-12-15 12:04:25.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/ip_fib.h 2004-12-15 12:07:15.000000000 +0100 @@ -95,6 +95,10 @@ unsigned char nh_sel; unsigned char type; unsigned char scope; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + __u32 network; + __u32 netmask; +#endif struct fib_info *fi; #ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_rule *r; @@ -119,6 +123,14 @@ #define FIB_RES_DEV(res) (FIB_RES_NH(res).nh_dev) #define FIB_RES_OIF(res) (FIB_RES_NH(res).nh_oif) +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +#define FIB_RES_NETWORK(res) ((res).network) +#define FIB_RES_NETMASK(res) ((res).netmask) +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +#define FIB_RES_NETWORK(res) (0) +#define FIB_RES_NETMASK(res) (0) +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + struct fib_table { unsigned char tb_id; unsigned tb_stamp; diff -ruN linux-2.6.9.split/include/net/route.h 
linux-2.6.9.nicbalancing/include/net/route.h --- linux-2.6.9.split/include/net/route.h 2004-12-15 12:04:24.000000000 +0100 +++ linux-2.6.9.nicbalancing/include/net/route.h 2004-12-15 12:07:15.000000000 +0100 @@ -46,6 +46,7 @@ #define RT_CONN_FLAGS(sk) (RT_TOS(inet_sk(sk)->tos) | sk->sk_localroute) +struct fib_nh; struct inet_peer; struct rtable { @@ -179,6 +180,9 @@ memcpy(&fl, &(*rp)->fl, sizeof(fl)); fl.fl_ip_sport = sport; fl.fl_ip_dport = dport; +#if defined(CONFIG_IP_ROUTE_MULTIPATH_CACHED) + fl.flags |= FLOWI_FLAG_MULTIPATHOLDROUTE; +#endif ip_rt_put(*rp); *rp = NULL; return ip_route_output_flow(rp, &fl, sk, 0); @@ -197,4 +201,69 @@ return rt->peer; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_flush(void); +static inline void multipath_flush(void) { + __multipath_flush(); +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ +static inline void multipath_flush(void){} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM +extern void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh); +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { + __multipath_set_nhinfo(network, netmask, prefixlen, nh); +} +#else +static inline void multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) { +} +#endif + + + +#if defined(CONFIG_IP_ROUTE_MULTIPATH_RR) || defined(CONFIG_IP_ROUTE_MULTIPATH_DRR) +extern void __multipath_remove(struct rtable *rt); +static inline void multipath_remove(struct rtable *rt) { + if ( rt->u.dst.flags & DST_BALANCED ) { + __multipath_remove( rt ); + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ +static inline void multipath_remove(struct rtable *rt) {} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_RR || CONFIG_IP_ROUTE_MULTIPATH_DRR */ + + +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED 
+extern void __multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp); +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + if ( rth->u.dst.flags & DST_BALANCED ) { + __multipath_selectroute( flp, rth, rp ); + return 1; + } + else { + return 0; + } +} +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ +static inline int multipath_selectroute(const struct flowi *flp, + struct rtable *rth, + struct rtable **rp) { + return 0; +} +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + #endif /* _ROUTE_H */ diff -ruN linux-2.6.9.split/net/ipv4/Kconfig linux-2.6.9.nicbalancing/net/ipv4/Kconfig --- linux-2.6.9.split/net/ipv4/Kconfig 2004-12-15 12:04:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Kconfig 2004-12-15 12:07:15.000000000 +0100 @@ -90,6 +90,54 @@ equal "cost" and chooses one of them in a non-deterministic fashion if a matching packet arrives. +config IP_ROUTE_MULTIPATH_CACHED + bool "IP: equal cost multipath with caching support (EXPERIMENTAL)" + depends on IP_ROUTE_MULTIPATH + help + Normally, equal cost multipath routing is not supported by the + routing cache. If you say Y here, alternative routes are cached + and on cache lookup a route is chosen in a configurable fashion. + + If unsure, say N. + +# +# multipath policy configuration +# +choice + prompt "Multipath policy" + depends on IP_ROUTE_MULTIPATH_CACHED + default IP_ROUTE_MULTIPATH_RANDOM + +config IP_ROUTE_MULTIPATH_RR + bool "round robin (EXPERIMENTAL)" + help + Multipath routes are chosen according to round robin. + +config IP_ROUTE_MULTIPATH_RANDOM + bool "random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a random fashion. Actually, + there is no weight for a route. The advantage of this policy + is that it is implemented stateless and therefore introduces only + a very small delay. 
+config IP_ROUTE_MULTIPATH_WRANDOM + bool "weighted random multipath (EXPERIMENTAL)" + help + Multipath routes are chosen in a weighted random fashion. + The per route weights are the weights visible via ip route 2. As the + corresponding state management introduces some overhead routing delay + is increased. +config IP_ROUTE_MULTIPATH_DRR + bool "interface round robin (EXPERIMENTAL)" + help + Connections are distributed in a round robin fashion over the + available interfaces. This policy makes sense if the connections + should be primarily distributed on interfaces and not on routes. +endchoice +# +# END OF multipath policy configuration +# + config IP_ROUTE_VERBOSE bool "IP: verbose route monitoring" depends on IP_ADVANCED_ROUTER diff -ruN linux-2.6.9.split/net/ipv4/Makefile linux-2.6.9.nicbalancing/net/ipv4/Makefile --- linux-2.6.9.split/net/ipv4/Makefile 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/Makefile 2004-12-15 12:07:15.000000000 +0100 @@ -20,6 +20,10 @@ obj-$(CONFIG_INET_IPCOMP) += ipcomp.o obj-$(CONFIG_INET_TUNNEL) += xfrm4_tunnel.o obj-$(CONFIG_IP_PNP) += ipconfig.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RR) += multipath_rr.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_RANDOM) += multipath_random.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_WRANDOM) += multipath_wrandom.o +obj-$(CONFIG_IP_ROUTE_MULTIPATH_DRR) += multipath_drr.o obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_IP_VS) += ipvs/ obj-$(CONFIG_IP_TCPDIAG) += tcp_diag.o diff -ruN linux-2.6.9.split/net/ipv4/fib_hash.c linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c --- linux-2.6.9.split/net/ipv4/fib_hash.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_hash.c 2004-12-15 12:07:15.000000000 +0100 @@ -261,6 +261,7 @@ err = fib_semantic_match(&f->fn_alias, flp, res, + f->fn_key, fz->fz_mask, fz->fz_order); if (err <= 0) goto out; diff -ruN linux-2.6.9.split/net/ipv4/fib_lookup.h linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h --- 
linux-2.6.9.split/net/ipv4/fib_lookup.h 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_lookup.h 2004-12-15 12:07:15.000000000 +0100 @@ -19,7 +19,8 @@ /* Exported by fib_semantics.c */ extern int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen); + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen); extern void fib_release_info(struct fib_info *); extern struct fib_info *fib_create_info(const struct rtmsg *r, struct kern_rta *rta, diff -ruN linux-2.6.9.split/net/ipv4/fib_semantics.c linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c --- linux-2.6.9.split/net/ipv4/fib_semantics.c 2004-12-15 12:04:31.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/fib_semantics.c 2004-12-15 12:07:15.000000000 +0100 @@ -763,7 +763,8 @@ } int fib_semantic_match(struct list_head *head, const struct flowi *flp, - struct fib_result *res, int prefixlen) + struct fib_result *res, __u32 zone, __u32 mask, + int prefixlen) { struct fib_alias *fa; int nh_sel = 0; @@ -827,6 +828,11 @@ res->type = fa->fa_type; res->scope = fa->fa_scope; res->fi = fa->fa_info; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_WRANDOM + res->netmask = mask; + res->network = zone & + (0xFFFFFFFF >> (32 - prefixlen)); +#endif atomic_inc(&res->fi->fib_clntref); return 0; } diff -ruN linux-2.6.9.split/net/ipv4/multipath_drr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c --- linux-2.6.9.split/net/ipv4/multipath_drr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_drr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,292 @@ +/* + * Device round robin policy for multipath. 
+ * + * + * Version: $Id: multipath_drr.c,v 1.1.2.1 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct multipath_device +{ + int ifi; /* interface index of device */ + atomic_t usecount; + int allocated; +}; + +#define MULTIPATH_MAX_DEVICECANDIDATES 10 + +static struct multipath_device state[MULTIPATH_MAX_DEVICECANDIDATES]; +static spinlock_t state_lock = SPIN_LOCK_UNLOCKED; +static int registered_dev_notifier = 0; +static struct rtable *last_selection = NULL; + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + + + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +static int __inline__ __multipath_findslot(void) { + int i; + for (i=0; iifindex); + if (devidx != -1) { + state[devidx].allocated = 0; + state[devidx].ifi = 0; + atomic_set(&state[devidx].usecount, 0); + RTprint(KERN_DEBUG"%s: successfully removed device " \ + "with index %d\n",__FUNCTION__, devidx); + } + else { + RTprint(KERN_DEBUG"%s: Device not relevant for " \ + " multipath: %d\n", + __FUNCTION__, devidx); + } + + spin_unlock_bh(&state_lock); + break; + } + return NOTIFY_DONE; +} + +struct notifier_block multipath_dev_notifier = { + .notifier_call = multipath_dev_event, +}; + +void __multipath_remove(struct rtable* rt) { + if (last_selection == rt) { + last_selection = NULL; + } +} + +void __multipath_safe_inc(atomic_t* usecount) +{ + int n; + atomic_inc(usecount); + n = atomic_read(usecount); + if (n<=0) { + int i; + RTprint("%s: detected overflow, now ill will reset all "\ + "usecounts\n", __FUNCTION__); + + spin_lock_bh(&state_lock); + for (i=0; iflags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_selection != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", __FUNCTION__ ); + result = last_selection; + *rp = result; + return; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. 
determine the new candidate to be returned */ + result = NULL; + cur_min = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + int nh_ifidx = nh->u.dst.dev->ifindex; + nh->u.dst.lastuse = jiffies; + nh->u.dst.__use++; + if (result != NULL) { + continue; + } + + /* search for the output interface */ + /* this is not SMP safe, only add/remove are + SMP safe as wrong usecount updates have no big + impact */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add the interface to the array + SMP safe */ + spin_lock_bh(&state_lock); + + /* due to SMP: search again */ + devidx = __multipath_finddev(nh_ifidx); + if (devidx==-1) { + /* add entry for device */ + devidx = __multipath_findslot(); + if (devidx==-1) { + /* unlikely but possible */ + RTprint(KERN_DEBUG"%s: " \ + "out of space\n", + __FUNCTION__); + continue; + } + + state[devidx].allocated = 1; + state[devidx].ifi = nh_ifidx; + atomic_set(&state[devidx].usecount, 0); + min_usecount = 0; + RTprint(KERN_DEBUG"%s: created " \ + " for " \ + "device %d and " \ + "min_usecount " \ + " == -1\n", + __FUNCTION__, + nh_ifidx ); + } + + spin_unlock_bh(&state_lock); + } + + if (min_usecount == 0) { + /* if the device has not been used it is + the primary target */ + RTprint(KERN_DEBUG"%s: now setting " \ + "result to device %d\n", + __FUNCTION__, nh_ifidx ); + + __multipath_safe_inc(&state[devidx].usecount); + result = nh; + } + else { + int count = + atomic_read(&state[devidx].usecount); + + if (min_usecount == -1 || + count < min_usecount) { + cur_min = nh; + cur_min_devidx = devidx; + min_usecount = count; + + RTprint(KERN_DEBUG"%s: found " \ + "device " \ + "%d with usecount == %d\n", + __FUNCTION__, + nh_ifidx, + min_usecount); + } + } + } + } + + if (!result) { + if (cur_min) { + RTprint( KERN_DEBUG"%s: index of device in state "\ + "array: %d\n", + __FUNCTION__, 
cur_min_devidx ); + __multipath_safe_inc(&state[cur_min_devidx].usecount); + result = cur_min; + } + else { + RTprint( KERN_DEBUG"%s: utilized first\n", + __FUNCTION__); + result = first; + } + } + else { + RTprint(KERN_DEBUG"%s: utilize result: found device " \ + "%d with usecount == %d\n", + __FUNCTION__, result->u.dst.dev->ifindex, + min_usecount); + + } + + *rp = result; + last_selection = result; +} diff -ruN linux-2.6.9.split/net/ipv4/multipath_random.c linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c --- linux-2.6.9.split/net/ipv4/multipath_random.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_random.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,124 @@ +/* + * Random policy for multipath. + * + * + * Version: $Id: multipath_random.c,v 1.1.2.3 2004/09/21 08:42:11 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + unsigned char candidate_count = 0; + + /* count all candidate */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + ++candidate_count; + } + } + + /* choose a random candidate */ + decision = first; + if ( candidate_count > 1 ) { + unsigned char i = 0; + unsigned char candidate_no = (unsigned char) + random(candidate_count); + RTprint( "%s: randomly chosen candidate: %d (count: %d)\n", + __FUNCTION__, candidate_no, candidate_count ); + + + /* find chosen candidate and adjust GC data for all candidates + to ensure they stay in cache */ + for (rt = first; rt; rt = rt->u.rt_next) { + if ( ( rt->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + rt->u.dst.lastuse = jiffies; + if (i == candidate_no) { + decision = rt; + } + if (i >= candidate_count) { + break; + } + i++; + } + } + } + + decision->u.dst.__use++; + *rp = decision; +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return RANDOM_SEED % ubound; +} + diff -ruN 
linux-2.6.9.split/net/ipv4/multipath_rr.c linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c --- linux-2.6.9.split/net/ipv4/multipath_rr.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_rr.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,118 @@ +/* + * Round robin policy for multipath. + * + * + * Version: $Id: multipath_rr.c,v 1.1.2.2 2004/09/16 07:42:34 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define RTprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_MAX_CANDIDATES 40 + +static struct rtable* last_used = NULL; + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + +void __multipath_remove(struct rtable *rt) +{ + if (last_used==rt) { + last_used = NULL; + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, struct rtable **rp) +{ + struct rtable *nh, *result, *min_use_cand = NULL; + int min_use = -1; + + /* if necessary and possible utilize the old alternative */ + if ( ( flp->flags & FLOWI_FLAG_MULTIPATHOLDROUTE ) != 0 && + last_used != NULL ) { + RTprint( KERN_CRIT"%s: holding route \n", + __FUNCTION__ ); + result = last_used; + goto out; + } + + + /* 1. make sure all alt. nexthops have the same GC related data */ + /* 2. 
determine the new candidate to be returned */ + result = NULL; + for (nh = rcu_dereference(first); nh; + nh = rcu_dereference(nh->u.rt_next)) { + if ( ( nh->u.dst.flags & DST_BALANCED ) != 0 && + multipath_comparekeys(&nh->fl, flp ) ) { + nh->u.dst.lastuse = jiffies; + + if (min_use == -1 || nh->u.dst.__use < min_use) { + min_use = nh->u.dst.__use; + min_use_cand = nh; + } + RTprint( KERN_CRIT"%s: found balanced entry\n", + __FUNCTION__ ); + } + } + result = min_use_cand; + if (!result) { + result = first; + } + + out: + last_used = result; + result->u.dst.__use++; + *rp = result; +} + + diff -ruN linux-2.6.9.split/net/ipv4/multipath_wrandom.c linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c --- linux-2.6.9.split/net/ipv4/multipath_wrandom.c 1970-01-01 01:00:00.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/multipath_wrandom.c 2004-12-15 12:07:15.000000000 +0100 @@ -0,0 +1,374 @@ +/* + * Weighted random policy for multipath. + * + * + * Version: $Id: multipath_wrandom.c,v 1.1.2.3 2004/09/22 07:51:40 elueck Exp $ + * + * Authors: Einar Lueck + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define MPprint(a...) 
// printk(KERN_DEBUG a) + +#define MULTIPATH_STATE_SIZE 15 + +struct multipath_candidate { + struct multipath_candidate *next; + int power; + struct rtable *rt; +}; + +struct multipath_dest { + struct list_head list; + + const struct fib_nh *nh_info; + __u32 netmask; + __u32 network; + unsigned char prefixlen; + + struct rcu_head rcu; +}; + +struct multipath_bucket { + struct list_head head; + spinlock_t lock; +}; + +struct multipath_route { + struct list_head list; + + int oif; + __u32 gw; + struct list_head dests; + + struct rcu_head rcu; +}; + + +/* state: primarily weight per route information */ +static int multipath_state_initialized = 0; +static spinlock_t state_big_lock = SPIN_LOCK_UNLOCKED; +static struct multipath_bucket state[MULTIPATH_STATE_SIZE]; + + +/* interface to random number generation */ +static unsigned int RANDOM_SEED = 93186752; +static __inline__ unsigned int random(unsigned int ubound); + +static int __inline__ multipath_comparekeys(const struct flowi *flp1, + const struct flowi *flp2) { + return flp1->fl4_dst == flp2->fl4_dst && + flp1->fl4_src == flp2->fl4_src && + flp1->oif == flp2->oif && +#ifdef CONFIG_IP_ROUTE_FWMARK + flp1->fl4_fwmark == flp2->fl4_fwmark && +#endif + !((flp1->fl4_tos ^ flp2->fl4_tos) & + (IPTOS_RT_MASK | RTO_ONLINK)); +} + + +static unsigned char __multipath_lookup_weight(const struct flowi *fl, + const struct rtable *rt) { + const int state_idx = rt->idev->dev->ifindex % MULTIPATH_STATE_SIZE; + struct multipath_route *r; + struct multipath_route *target_route = NULL; + struct multipath_dest *d; + int weight = 1; + + /* lookup the weight information for a certain route */ + rcu_read_lock(); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == rt->rt_gateway && + r->oif == rt->idev->dev->ifindex) { + target_route = r; + break; + } + } + if (!target_route) { + /* this should not happen... 
but we are prepared */ + printk( KERN_CRIT"%s: missing state for gateway: %u and " \ + "device %d\n", __FUNCTION__, rt->rt_gateway, + rt->idev->dev->ifindex); + goto out; + } + + /* find state entry for destination */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + __u32 targetnetwork = fl->fl4_dst & + (0xFFFFFFFF >> (32 - d->prefixlen)); + + if ((targetnetwork & d->netmask) == d->network) { + weight = d->nh_info->nh_weight; + MPprint("%s: found weight %d for gateway %u\n", + __FUNCTION__, weight, rt->rt_gateway); + goto out; + } + } + + out: + rcu_read_unlock(); + return weight; +} + +static void __multipath_init_state(void) +{ + spin_lock(&state_big_lock); + + /* check again due to SMP and to prevent contention */ + if (!multipath_state_initialized) { + int i; + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + INIT_LIST_HEAD(&state[i].head); + state[i].lock = SPIN_LOCK_UNLOCKED; + } + } + + /* now mark initialized */ + multipath_state_initialized = 1; + + spin_unlock(&state_big_lock); +} + +static void inline __multipath_init(void) +{ + /* do not spinlock to reduce unnecessary contention */ + if (!multipath_state_initialized) { + __multipath_init_state(); + } +} + +void __multipath_selectroute(const struct flowi *flp, + struct rtable *first, + struct rtable **rp) { + struct rtable *rt; + struct rtable *decision; + struct multipath_candidate *first_mpc = NULL; + struct multipath_candidate *mpc, *last_mpc = NULL; + int power = 0; + int last_power; + int selector; + const size_t size_mpc = sizeof(struct multipath_candidate); + + /* init state if necessary */ + __multipath_init(); + + + /* collect all candidates and identify their weights */ + for (rt = rcu_dereference(first); rt; + rt = rcu_dereference(rt->u.rt_next)) { + if ((rt->u.dst.flags & DST_BALANCED) != 0 && + multipath_comparekeys(&rt->fl, flp) ) { + struct multipath_candidate* mpc = + (struct multipath_candidate*) + kmalloc(size_mpc, GFP_KERNEL); + + power += __multipath_lookup_weight(flp, rt) * 
10000; + + mpc->power = power; + mpc->rt = rt; + mpc->next = NULL; + + if (!first_mpc) + first_mpc = mpc; + else + last_mpc->next = mpc; + + last_mpc = mpc; + } + } + + /* choose a weighted random candidate */ + decision = first; + selector = random(power); + MPprint("%s: random number %d in range %d\n", __FUNCTION__, selector, + power); + last_power = 0; + + + /* select candidate, adjust GC data and cleanup local state */ + decision = first; + last_mpc = NULL; + for (mpc = first_mpc; mpc; mpc = mpc->next) { + mpc->rt->u.dst.lastuse = jiffies; + MPprint("%s: last_power = %d, selector: %d, mpc->power: %d\n", + __FUNCTION__, last_power, selector, mpc->power); + if (last_power <= selector && selector < mpc->power) { + decision = mpc->rt; + MPprint("%s: selected %u\n", __FUNCTION__, + decision->rt_gateway); + } + last_power = mpc->power; + if (last_mpc) { + kfree(last_mpc); + } + last_mpc = mpc; + } + if (last_mpc) { + /* concurrent __multipath_flush may lead to !last_mpc */ + kfree(last_mpc); + } + + decision->u.dst.__use++; + *rp = decision; +} + +void __multipath_set_nhinfo(__u32 network, + __u32 netmask, + unsigned char prefixlen, + const struct fib_nh* nh) +{ + const int state_idx = nh->nh_oif % MULTIPATH_STATE_SIZE; + struct multipath_route *r, *target_route = NULL; + struct multipath_dest *d, *target_dest = NULL; + + /* init state if necessary */ + __multipath_init(); + + /* store the weight information for a certain route */ + spin_lock(&state[state_idx].lock); + + /* find state entry for gateway or add one if necessary */ + list_for_each_entry_rcu(r, &state[state_idx].head, list) { + if (r->gw == nh->nh_gw && r->oif == nh->nh_oif) { + target_route = r; + break; + } + } + if (!target_route) { + const size_t size_rt = sizeof(struct multipath_route); + target_route = (struct multipath_route *) + kmalloc(size_rt, GFP_KERNEL); + + target_route->gw = nh->nh_gw; + target_route->oif = nh->nh_oif; + memset(&target_route->rcu, 0, sizeof(struct rcu_head)); +
INIT_LIST_HEAD(&target_route->dests); + + list_add_rcu(&target_route->list, &state[state_idx].head); + } + + /* find state entry for destination or add one if necessary */ + list_for_each_entry_rcu(d, &target_route->dests, list) { + if (d->nh_info == nh) { + target_dest = d; + break; + } + } + if (!target_dest) { + const size_t size_dst = sizeof(struct multipath_dest); + target_dest = (struct multipath_dest*) + kmalloc(size_dst, GFP_KERNEL); + + target_dest->nh_info = nh; + target_dest->network = network; + target_dest->netmask = netmask; + target_dest->prefixlen = prefixlen; + memset(&target_dest->rcu, 0, sizeof(struct rcu_head)); + + list_add_rcu(&target_dest->list, &target_route->dests); + } + /* else: we already stored this info for another destination => + we are finished */ + + spin_unlock(&state[state_idx].lock); +} + + +static void __multipath_free(struct rcu_head *head) +{ + struct multipath_route *rt = container_of(head, struct multipath_route, + rcu); + kfree(rt); +} + +static void __multipath_free_dst(struct rcu_head *head) +{ + struct multipath_dest *dst = container_of(head, + struct multipath_dest, + rcu); + kfree(dst); +} + +void __multipath_flush(void) +{ + int i; + + MPprint("%s: called\n", __FUNCTION__); + + /* init state if necessary */ + __multipath_init(); + + /* defer deletion of all entries */ + for (i = 0; i < MULTIPATH_STATE_SIZE; ++i) { + struct multipath_route *r; + spin_lock(&state[i].lock); + + list_for_each_entry_rcu(r, &state[i].head, list) { + struct multipath_dest *d; + list_for_each_entry_rcu(d, &r->dests, list) { + list_del_rcu(&d->list); + call_rcu(&d->rcu, + __multipath_free_dst); + + } + list_del_rcu(&r->list); + call_rcu(&r->rcu, + __multipath_free); + } + + spin_unlock(&state[i].lock); + } + + MPprint("%s: finished\n", __FUNCTION__); +} + +static __inline__ unsigned int random(unsigned int ubound) +{ + static unsigned int a = 1588635695, + q = 2, + r = 1117695901; + RANDOM_SEED = a*(RANDOM_SEED % q) - r*(RANDOM_SEED / q); + return
RANDOM_SEED % ubound; +} diff -ruN linux-2.6.9.split/net/ipv4/route.c linux-2.6.9.nicbalancing/net/ipv4/route.c --- linux-2.6.9.split/net/ipv4/route.c 2004-12-15 12:05:32.000000000 +0100 +++ linux-2.6.9.nicbalancing/net/ipv4/route.c 2004-12-15 12:07:15.000000000 +0100 @@ -129,7 +129,7 @@ int ip_rt_secret_interval = 10 * 60 * HZ; static unsigned long rt_deadline; -#define RTprint(a...) printk(KERN_DEBUG a) +#define RTprint(a...) // printk(KERN_DEBUG a) static struct timer_list rt_flush_timer; static struct timer_list rt_periodic_timer; @@ -450,11 +450,13 @@ static __inline__ void rt_free(struct rtable *rt) { + multipath_remove( rt ); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } static __inline__ void rt_drop(struct rtable *rt) { + multipath_remove( rt ); ip_rt_put(rt); call_rcu_bh(&rt->u.dst.rcu_head, dst_rcu_free); } @@ -516,6 +518,54 @@ return score; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED +static struct rtable **rt_remove_balanced_route(struct rtable **chain_head, + struct rtable *expentry, + int* removed_count) +{ + int passedexpired = 0; + struct rtable **nextstep = NULL; + struct rtable **rthp = chain_head; + struct rtable *rth; + if (removed_count) + *removed_count = 0; + while ((rth = *rthp) != NULL) { + if ( rth == expentry ) { + passedexpired = 1; + } + + if (((*rthp)->u.dst.flags & DST_BALANCED) != 0 && + compare_keys(&(*rthp)->fl, &expentry->fl)) { + if (*rthp == expentry) { + *rthp = rth->u.rt_next; + continue; + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + if (removed_count) + ++(*removed_count); + } + } + else { + if ( !((*rthp)->u.dst.flags & DST_BALANCED) && + passedexpired && !nextstep ) { + nextstep = &rth->u.rt_next; + } + rthp = &rth->u.rt_next; + } + } + + rt_free(expentry); + if (removed_count) + ++(*removed_count); + + return nextstep; +} + +#endif + + /* This runs via a timer and thus is always in BH context. */ static void rt_check_expire(unsigned long dummy) { @@ -547,8 +597,24 @@ } /* Cleanup aged off entries. 
*/ - *rthp = rth->u.rt_next; - rt_free(rth); +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, NULL); + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ + *rthp = rth->u.rt_next; + rt_free(rth); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock(&rt_hash_table[i].lock); @@ -596,6 +662,9 @@ if (delay < 0) delay = ip_rt_min_delay; + /* flush existing multipath state*/ + multipath_flush(); + spin_lock_bh(&rt_flush_lock); if (del_timer(&rt_flush_timer) && delay > 0 && rt_deadline) { @@ -714,9 +783,29 @@ rthp = &rth->u.rt_next; continue; } +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + /* remove all related balanced entries if necessary */ + if ( rth->u.dst.flags & DST_BALANCED ) { + int r; + rthp = rt_remove_balanced_route( + &rt_hash_table[i].chain, + rth, + &r); + goal -= r; + if (!rthp) { + break; + } + } + else { + *rthp = rth->u.rt_next; + rt_free(rth); + goal--; + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ *rthp = rth->u.rt_next; rt_free(rth); goal--; +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } spin_unlock_bh(&rt_hash_table[k].lock); if (goal <= 0) @@ -797,7 +886,12 @@ spin_lock_bh(&rt_hash_table[hash].lock); while ((rth = *rthp) != NULL) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (!(rth->u.dst.flags & DST_BALANCED) && + compare_keys(&rth->fl, &rt->fl)) { +#else if (compare_keys(&rth->fl, &rt->fl)) { +#endif /* Put it first */ *rthp = rth->u.rt_next; /* @@ -1628,6 +1722,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if ( res->fi->fib_nhs > 1 ) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_policy) rth->u.dst.flags |= DST_NOPOLICY; if (in_dev->cnf.no_xfrm) @@ -1675,7 +1773,7 @@ unsigned hash; #ifdef CONFIG_IP_ROUTE_MULTIPATH - if (res->fi->fib_nhs > 1 && 
fl->oif == 0) + if (res->fi && res->fi->fib_nhs > 1 && fl->oif == 0) fib_select_multipath(fl, res); #endif @@ -1696,7 +1794,65 @@ struct in_device *in_dev, u32 daddr, u32 saddr, u32 tos) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + struct rtable* rth; + unsigned char hop, hopcount, lasthop; + int err = -EINVAL; + unsigned hash; + if (res->fi) { + hopcount = res->fi->fib_nhs; + } + else { + hopcount = 1; + } + lasthop = hopcount - 1; + + /* distinguish between multipath and singlepath */ + if ( hopcount < 2 ) + return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, + saddr, tos); + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", __FUNCTION__, + hopcount); + + /* add all alternatives to the routing cache */ + for ( hop = 0; hop < hopcount; ++hop ) { + res->nh_sel = hop; + + RTprint( KERN_DEBUG"%s: entered (hopcount: %d)\n", + __FUNCTION__, hopcount); + + /* create a routing cache entry */ + err = __mkroute_input( skb, res, in_dev, daddr, saddr, tos, + &rth ); + if ( err ) + return err; + + + /* put it into the cache */ + hash = rt_hash_code(daddr, saddr ^ (fl->iif << 5), tos); + err = rt_intern_hash(hash, rth, (struct rtable**)&skb->dst); + if ( err ) + return err; + + /* forward hop information to multipath impl. 
*/ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + + + /* only for the last hop the reference count is handled + outside */ + RTprint( KERN_DEBUG"%s: balanced entry created: %d\n", + __FUNCTION__, rth ); + if ( hop == lasthop ) + atomic_set(&(skb->dst->__refcnt), 1); + } + return err; +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_input_def(skb, res, fl, in_dev, daddr, saddr, tos); +#endif /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ } @@ -2017,6 +2173,10 @@ } rth->u.dst.flags= DST_HOST; +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + if (res->fi && res->fi->fib_nhs > 1) + rth->u.dst.flags |= DST_BALANCED; +#endif if (in_dev->cnf.no_xfrm) rth->u.dst.flags |= DST_NOXFRM; if (in_dev->cnf.no_policy) @@ -2108,7 +2268,77 @@ struct net_device *dev_out, unsigned flags) { +#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED + u32 tos = RT_FL_TOS(oldflp); + unsigned char hop; + unsigned hash; + int err = -EINVAL; + struct rtable* rth; + + if (res->fi && res->fi->fib_nhs > 1) { + unsigned char hopcount = res->fi->fib_nhs; + RTprint( KERN_DEBUG"%s: entered (hopcount: %d, fl->oif: %d)\n", + __FUNCTION__, hopcount, fl->oif); + + for ( hop = 0; hop < hopcount; ++hop ) { + struct net_device *dev2nexthop; + RTprint( KERN_DEBUG"%s: hop %d of %d\n", __FUNCTION__, + hop, hopcount ); + + res->nh_sel = hop; + + /* hold a work reference to the output device */ + dev2nexthop = FIB_RES_DEV(*res); + dev_hold(dev2nexthop); + + err = __mkroute_output(&rth, res, fl, oldflp, + dev2nexthop, flags); + + /** FIXME remove debug code */ + RTprint( "%s: balanced entry created: %d " \ + " (GW: %u)\n", + __FUNCTION__, + &rth->u.dst, + FIB_RES_GW(*res) ); + + if ( err != 0 ) { + goto cleanup; + } + + RTprint( KERN_DEBUG"%s: created successfully %d\n", + __FUNCTION__, hop ); + + hash = rt_hash_code(oldflp->fl4_dst, + oldflp->fl4_src ^ + (oldflp->oif << 5), tos); + err = rt_intern_hash(hash, rth, rp); + RTprint( KERN_DEBUG"%s: hashed %d\n", + 
__FUNCTION__, hop ); + + /* forward hop information to multipath impl. */ + multipath_set_nhinfo(FIB_RES_NETWORK(*res), + FIB_RES_NETMASK(*res), + res->prefixlen, + &FIB_RES_NH(*res)); + cleanup: + /* release work reference to output device */ + dev_put(dev2nexthop); + + if ( err != 0 ) { + return err; + } + } + RTprint( "%s: exited loop\n", __FUNCTION__ ); + atomic_set(&(*rp)->u.dst.__refcnt, 1); + return err; + } + else { + return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, + flags); + } +#else /* CONFIG_IP_ROUTE_MULTIPATH_CACHED */ return ip_mkroute_output_def(rp, res, fl, oldflp, dev_out, flags); +#endif } /* @@ -2137,6 +2367,7 @@ int free_res = 0; int err; + res.fi = NULL; #ifdef CONFIG_IP_MULTIPLE_TABLES res.r = NULL; @@ -2186,6 +2417,8 @@ dev_put(dev_out); dev_out = NULL; } + + if (oldflp->oif) { dev_out = dev_get_by_index(oldflp->oif); err = -ENODEV; @@ -2292,9 +2525,11 @@ dev_hold(dev_out); fl.oif = dev_out->ifindex; + make_route: err = ip_mkroute_output(rp, &res, &fl, oldflp, dev_out, flags); + if (free_res) fib_res_put(&res); if (dev_out) @@ -2321,6 +2556,15 @@ #endif !((rth->fl.fl4_tos ^ flp->fl4_tos) & (IPTOS_RT_MASK | RTO_ONLINK))) { + /* check for multipath routes and choose one if + necessary */ + if (multipath_selectroute(flp, rth, rp)) { + dst_hold(&(*rp)->u.dst); + RT_CACHE_STAT_INC(out_hit); + rcu_read_unlock_bh(); + return 0; + } + rth->u.dst.lastuse = jiffies; dst_hold(&rth->u.dst); rth->u.dst.__use++; --------------080301040700050000030800-- From arnd@arndb.de Mon Dec 20 08:17:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 08:17:27 -0800 (PST) Received: from e4.ny.us.ibm.com (e4.ny.us.ibm.com [32.97.182.144]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKGGxp7003286 for ; Mon, 20 Dec 2004 08:17:19 -0800 Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234]) by e4.ny.us.ibm.com (8.12.10/8.12.10) with ESMTP id iBKGGU50019053 for ; Mon, 20 Dec 2004 11:16:30 -0500 Received: from 
d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay02.pok.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBKGGU5X247440 for ; Mon, 20 Dec 2004 11:16:30 -0500 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id iBKGGTpC002072 for ; Mon, 20 Dec 2004 11:16:29 -0500 Received: from dyn-9-152-210-124.boeblingen.de.ibm.com (dyn-9-152-210-124.boeblingen.de.ibm.com [9.152.210.124]) by d01av02.pok.ibm.com (8.12.11/8.12.11) with ESMTP id iBKGGTTX002046; Mon, 20 Dec 2004 11:16:29 -0500 From: Arnd Bergmann To: YOSHIFUJI Hideaki / =?utf-8?q?=E5=90=89=E8=97=A4=E8=8B=B1=E6=98=8E?= Subject: Re: [PATCH][v4][19/24] Add IPoIB (IP-over-InfiniBand) driver Date: Mon, 20 Dec 2004 17:10:03 +0100 User-Agent: KMail/1.6.2 Cc: roland@topspin.com, linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com References: <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> <200412201314.35502.arnd@arndb.de> <20041220.221709.99112884.yoshfuji@linux-ipv6.org> In-Reply-To: <20041220.221709.99112884.yoshfuji@linux-ipv6.org> MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Boundary-02=_clvxBV4W2Uef7Q4"; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <200412201710.04025.arnd@arndb.de> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12931 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: arnd@arndb.de Precedence: bulk X-list: netdev --Boundary-02=_clvxBV4W2Uef7Q4 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Maandag 20 Dezember 2004 14:17, YOSHIFUJI Hideaki / 吉藤英明 wrote: > In article <200412201314.35502.arnd@arndb.de> (at Mon, 20 Dec 2004 13:14:35 +0100), Arnd Bergmann says: > >
See also include/linux/ide.h for another example where this is done. > > No, it is not the similar case. Sorry, it was a typo on my side, I meant include/linux/ata.h. Arnd <>< --Boundary-02=_clvxBV4W2Uef7Q4 Content-Type: application/pgp-signature Content-Description: signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (GNU/Linux) iD8DBQBBxvlc5t5GS2LDRf4RAqiSAKCliKQJuBZfAWuZxxJveg8mkCh+ogCggt9K Iw8Cq2dZ47rfjBMiFDJ3xpQ= =JBwb -----END PGP SIGNATURE----- --Boundary-02=_clvxBV4W2Uef7Q4-- From roland@topspin.com Mon Dec 20 08:46:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 08:46:41 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKGkEHt008678 for ; Mon, 20 Dec 2004 08:46:34 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 20 Dec 2004 08:45:50 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 20 Dec 2004 08:45:50 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CgQfN-0000Cs-H4; Mon, 20 Dec 2004 08:45:49 -0800 To: YOSHIFUJI Hideaki / =?iso-2022-jp?b?GyRCNUhGIzFRGyhC?= =?iso-2022-jp?b?GyRCTEAbKEI=?= Cc: linux-kernel@vger.kernel.org, openib-general@openib.org, netdev@oss.sgi.com X-Message-Flag: Warning: May contain useful information References: <200412192215.69tnzAhGIT1vQGLF@topspin.com> <200412192215.fZX1ZQqQD4QGkKcF@topspin.com> <20041220.155836.75677852.yoshfuji@linux-ipv6.org> From: Roland Dreier Date: Mon, 20 Dec 2004 08:45:49 -0800 In-Reply-To: <20041220.155836.75677852.yoshfuji@linux-ipv6.org> (YOSHIFUJI Hideaki's message of "Mon, 20 Dec 2004 15:58:36 +0900 (JST)") Message-ID: <52is6wkjeq.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re:
[PATCH][v4][19/24] Add IPoIB (IP-over-InfiniBand) driver Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 20 Dec 2004 16:45:50.0074 (UTC) FILETIME=[5C9FFDA0:01C4E6B3] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12932 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev YOSHIFUJI> above entries does not seem to appropriate for enum YOSHIFUJI> (than #define). As Arnd mentioned, I thought enum values were preferred to using the preprocessor. What's the advantage of converting to macros (which have no type, are invisible to the compiler, etc)? Thanks, Roland From jgarzik@pobox.com Mon Dec 20 10:55:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 10:56:05 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKItUsp015625 for ; Mon, 20 Dec 2004 10:55:51 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CgSgQ-0003GE-KC; Mon, 20 Dec 2004 18:55:02 +0000 Message-ID: <41C71FFD.7090308@pobox.com> Date: Mon, 20 Dec 2004 13:54:53 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Tommy Christensen , Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma Subject: Re: [patch 4/10] s390: network driver. 
References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> <1103551830.1047.316.camel@jzny.localdomain> In-Reply-To: <1103551830.1047.316.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12933 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev jamal wrote: > I'm beginning to think that's the simplest way to achieve this: i.e. not to > stop the queue but rather to let the packets continue showing up and > drop them at the driver when the link is down. The netlink async > carrier signal to the app is to be used to reroute instead of being a > signal to flush buffers. In other words the other Thomas got it right > (with the exception of setting the IFF_RUNNING flags) > > Jeff? I haven't heard anything to convince me that the same change should be deployed across NNN drivers. The drivers already signal the net core that the link is down; to me, that implies there should be code in _one_ place that handles this condition, not NNN places. Further, * if this policy ("drop them in the driver") ever changes, we must again touch NNN drivers * dropping them in the driver but not stopping the queue means that the system is allowed to continue to stream data into the driver, only for the driver to free it. That will scale -- right up to (worst case) 100% CPU, with userland sending packets as fast as it can, and the driver dropping packets as fast as it can. The only places the net stack currently checks carrier are dev_get_flags() and dev_watchdog(). * If you need a hook to flush the in-hardware buffers, add such a hook. Don't hack it in like this.
Yeah, adding a hook touches NNN drivers but at least the hook is far more self-contained, its semantics will be clearer, and it will increase the likelihood that the driver changes do not affect the hot path nor current netif_{start,stop}_queue() logic. Jeff From jgarzik@pobox.com Mon Dec 20 10:57:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 10:57:26 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKIv1Js015841 for ; Mon, 20 Dec 2004 10:57:21 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CgShx-0003JW-4i; Mon, 20 Dec 2004 18:56:37 +0000 Message-ID: <41C72060.1010400@pobox.com> Date: Mon, 20 Dec 2004 13:56:32 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Paul Jakma , hadi@cyberus.ca CC: Tommy Christensen , Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com Subject: Re: [patch 4/10] s390: network driver. References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12934 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Paul Jakma wrote: >> responsibility of the application to flush the socket on >> link-down events (by down'ing the interface?). > > > That seems more complex than needs be, for userspace at least.
It is the responsibility of the kernel to push complexity to userland. Some applications may NOT desire that the socket be flushed. That's an app policy decision. If this is the core issue, then I am even more inclined to think that the kernel is not what needs to be modified here. Jeff From tgraf@suug.ch Mon Dec 20 12:08:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 12:08:12 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKK7hC9018894 for ; Mon, 20 Dec 2004 12:08:04 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 6314684; Mon, 20 Dec 2004 21:06:57 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 8A23B1C0EA; Mon, 20 Dec 2004 21:07:39 +0100 (CET) Date: Mon, 20 Dec 2004 21:07:39 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation Message-ID: <20041220200739.GX17998@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103552215.1048.333.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12935 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103552215.1048.333.camel@jzny.localdomain> 2004-12-20 09:16 > Hehe. 
I am sure that's a cutnpaste(LinuxWay) from some code in the kernel > probably sch_api.c (or maybe the code it was cutnpasted from has been fixed > in the last 3 years ;->). > That needs fixing. Who is sending the patch? I'll put it into my patchset so it gets into the test cycles. > It's not that expensive since done on config path. But agree when proper > codes exist it's not needed. It might cause random data to be printed onto the console ;) I'll remove it in my patchset. From jdmason@us.ibm.com Mon Dec 20 12:55:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 12:55:58 -0800 (PST) Received: from e31.co.us.ibm.com (e31.co.us.ibm.com [32.97.110.129]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKKtNaU023939 for ; Mon, 20 Dec 2004 12:55:49 -0800 Received: from d03relay05.boulder.ibm.com (d03relay05.boulder.ibm.com [9.17.195.107]) by e31.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id iBKKsrjd377164 for ; Mon, 20 Dec 2004 15:54:53 -0500 Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167]) by d03relay05.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBKKsraE244616 for ; Mon, 20 Dec 2004 13:54:53 -0700 Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1]) by d03av01.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBKKsqjC022969 for ; Mon, 20 Dec 2004 13:54:52 -0700 Received: from dreadnought.austin.ibm.com (dreadnought.austin.ibm.com [9.41.94.123]) by d03av01.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBKKsqkc022957; Mon, 20 Dec 2004 13:54:52 -0700 From: Jon Mason Organization: IBM To: Xose Vazquez Perez Subject: Re: No Maintainer entry for dl2k driver Date: Mon, 20 Dec 2004 14:54:50 -0600 User-Agent: KMail/1.7 Cc: netdev@oss.sgi.com, edward_peng@alphanetworks.com References: <41C377E9.1040305@wanadoo.es> In-Reply-To: <41C377E9.1040305@wanadoo.es> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id:
<200412201454.50114.jdmason@us.ibm.com> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12936 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jdmason@us.ibm.com Precedence: bulk X-list: netdev Thank you for the correction. However my point remains the same, there is no entry in the MAINTAINERS file for the dl2k driver. So, now it should look something like this: --- MAINTAINERS.orig 2004-12-17 16:01:37.955336376 -0600 +++ MAINTAINERS 2004-12-17 16:07:21.027181528 -0600 @@ -704,6 +704,10 @@ M: mvw@planets.elm.net L: linux-kernel@vger.kernel.org S: Maintained +DL2K NETWORK DRIVER +P: Edward Peng +M: edward_peng@alphanetworks.com +L: linux-net@vger.kernel.org +S: Maintained + DAVICOM FAST ETHERNET (DMFE) NETWORK DRIVER P: Tobias Ringstrom M: tori@unhappy.mine.nu Also, his e-mail address in the dl2k driver needs to be updated, as the one currently in the driver (edward_peng@dlink.com.tw) bounces. Thanks, Jon On Friday 17 December 2004 06:20 pm, Xose Vazquez Perez wrote: > Jon Mason wrote: > > It has come to my attention that there is no entry in the kernel > > MAINTAINERS file for the D-Link Gigabit Ethernet Driver (and possibly no > > maintainer). So, I have created a patch for that file with an entry for > > the driver status as Orphaned. > > Edward Peng is the current maintainer. > The 2.4 kernel already has latest release, and 'I think' he is working > on the 2.6 driver. 
-- Jon Mason jdmason@us.ibm.com From oxymoron@waste.org Mon Dec 20 13:15:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 13:15:38 -0800 (PST) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKLF3gk024868 for ; Mon, 20 Dec 2004 13:15:23 -0800 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBKLEPkC000891; Mon, 20 Dec 2004 15:14:25 -0600 Received: (from oxymoron@localhost) by waste.org (8.12.3/8.12.3/Debian-7.1) id iBKLEJH8000870; Mon, 20 Dec 2004 15:14:19 -0600 Date: Mon, 20 Dec 2004 13:14:19 -0800 From: Matt Mackall To: Mark Broadbent Cc: romieu@fr.zoreil.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041220211419.GC5974@waste.org> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new X-Virus-Status: Clean X-archive-position: 12937 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote: > > Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP' > with the r8169 device using the above patch on top of 2.6.10-rc3-bk10. Ok, that suggests a problem localized to netpoll itself. 
Do you have spinlock debugging turned on by any chance? -- Mathematics is the supreme nostalgia of our time. From Rudolf.ladyzhenskii@opennw.com Mon Dec 20 13:35:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 13:35:11 -0800 (PST) Received: from opennw.com ([61.88.96.230]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBKLYh8F026203 for ; Mon, 20 Dec 2004 13:35:04 -0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: RE: Required resources on network driver programming Date: Tue, 21 Dec 2004 08:34:10 +1100 Message-ID: From: "Rudolf Ladyzhenskii" To: "linux lover" , Cc: X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBKLYh8F026203 X-archive-position: 12938 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Rudolf.ladyzhenskii@opennw.com Precedence: bulk X-list: netdev Hi, Firstly, there is a software package called ETHEREAL that will do that for you. If you really want to do it yourself, search the web on how to interface with NETFILTER. Rudolf -----Original Message----- From: netdev-bounce@oss.sgi.com [mailto:netdev-bounce@oss.sgi.com]On Behalf Of linux lover Sent: Monday, December 20, 2004 6:46 PM To: linux-net@vger.kernel.org Cc: netdev@oss.sgi.com Subject: Required resources on network driver programming Hi all, I am newbie to Linux kernel programming. I want to write my own virtual network device driver that take every packets from IP layer just print the contents of packet(header part with its starting addresses only) and send it to actual device driver for packet transmission and at receiving end receive packet from NIC card again print the header addresses and send it to upper layer for normal packet processing. 
I require help about where can i get resources or any book for writing virtual network driver with SAMPLE EXAMPLES? regards, linux_lover __________________________________ Do you Yahoo!? Send holiday email and support a worthy cause. Do good. http://celebrity.mail.yahoo.com From shemminger@osdl.org Mon Dec 20 14:42:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 14:42:22 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKMftGP028801 for ; Mon, 20 Dec 2004 14:42:15 -0800 Received: from dxpl.pdx.osdl.net (dxpl.pdx.osdl.net [172.20.1.103]) by mail.osdl.org (8.11.6/8.11.6) with ESMTP id iBKMfK908431; Mon, 20 Dec 2004 14:41:20 -0800 Date: Mon, 20 Dec 2004 14:41:20 -0800 From: Stephen Hemminger To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH]] New: tcp_input.c:3038 missing newline character Message-Id: <20041220144120.5dfb4ee2@dxpl.pdx.osdl.net> In-Reply-To: <200412202215.iBKMFSCr025060@fire-1.osdl.org> References: <200412202215.iBKMFSCr025060@fire-1.osdl.org> Organization: Open Source Development Lab X-Mailer: Sylpheed version 0.9.10claws (GTK+ 1.2.10; x86_64-suse-linux) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12939 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shemminger@osdl.org Precedence: bulk X-list: netdev http://bugme.osdl.org/show_bug.cgi?id=3924 The printk at net/ipv4/tcp_input.c:3038 is missing the newline (\n) character. 
Signed-off-by: Stephen Hemminger diff -Nru a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c 2004-12-20 14:42:23 -08:00 +++ b/net/ipv4/tcp_input.c 2004-12-20 14:42:23 -08:00 @@ -3029,7 +3029,7 @@ if(tp->snd_wscale > 14) { if(net_ratelimit()) printk(KERN_INFO "tcp_parse_options: Illegal window " - "scaling value %d >14 received.", + "scaling value %d >14 received.\n", tp->snd_wscale); tp->snd_wscale = 14; } From sri@us.ibm.com Mon Dec 20 15:07:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:07:25 -0800 (PST) Received: from e33.co.us.ibm.com (e33.co.us.ibm.com [32.97.110.131]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKN6l75029935 for ; Mon, 20 Dec 2004 15:07:16 -0800 Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11]) by e33.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id iBKN6HDr405424 for ; Mon, 20 Dec 2004 18:06:18 -0500 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by westrelay02.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBKN6H2L455846 for ; Mon, 20 Dec 2004 16:06:17 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBKN6H8N023231 for ; Mon, 20 Dec 2004 16:06:17 -0700 Received: from w-sridhar.beaverton.ibm.com (w-sridhar.beaverton.ibm.com [9.47.18.19]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBKN6GsA023168; Mon, 20 Dec 2004 16:06:16 -0700 Date: Mon, 20 Dec 2004 15:06:15 -0800 (PST) From: Sridhar Samudrala X-X-Sender: sridhar@w-sridhar.beaverton.ibm.com To: Arnaldo Carvalho de Melo cc: "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: FYI: struct sock class hierarchy In-Reply-To: <41C50633.1010102@conectiva.com.br> Message-ID: References: <41C50633.1010102@conectiva.com.br> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12940 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sri@us.ibm.com Precedence: bulk X-list: netdev Arnaldo, The name struct stream_sock seems to indicate that this structure is needed only for SOCK_STREAM sockets. But in reality this structure is required for any connected sockets(ex: SOCK_SEQPACKET). I cannot think of good name for this structure, but struct conn_sock to indicate any connected socket seems to be the best i can come up with. Also If i understand the hierarchy correctly, stream_sock is a specialization of struct inet_sock. So can we include stream_sock also in the hierarchy as follows although i would prefer renaming stream_sock to conn_sock. > tcp6_sock <- tcp_sock <- stream_sock <- inet_sock <- sock struct stream_sock { struct inet_sock inet; struct stream_opt sopt; } struct tcp_sock { struct stream_sock ssk; struct tcp_opt tcp; } Thanks Sridhar On Sun, 19 Dec 2004, Arnaldo Carvalho de Melo wrote: > Dave, > > Further info on that struct sock hierarchy stuff I'm mentioned > that I planned doing while at the netsummit, with the changes I have > in one of my pending trees, things are now looking like this: > > struct inet_sock { > struct sock sk; > struct stream_sock *pssk; > #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > struct ipv6_pinfo *pinet6; > #endif > struct inet_opt inet; > }; > > I.e. inet sock is an specialization of struct sock (the pointer to any > instance of both structs point to the same memory address). 
> > Now tcp: > > struct tcp_sock { > struct inet_sock inet; > struct tcp_opt tcp; > struct stream_sock ssk; > }; > > Now it is tcp_sock that is an specialization of struct inet_sock, that is, > in turn, as said above, an specialization of struct sock (the pointer > to any instance of the three structs points to the same memory address) > > And finally struct tcp6_sock: > > struct tcp6_sock { > struct tcp_sock tcp; > struct ipv6_pinfo inet6; > }; > > I guess you got the idea: > > tcp6_sock <- tcp_sock <- inet_sock <- sock > > This was done for struct udp_sock, raw6_sock, sctp_sock, etc > > Using this approach we see clearly how the layouts are organized and > the relations among the, ho-hum, classes, i.e. the class hierarchy. > > For further eye candy please take a look at: > > http://master.kernel.org/~acme/sock_class_hierarchy.ps > > That has the complete struct sock class hierarchy subtree for inet > protocols, including SCTP, DCCP, raw, raw6, tcp_tw_bucket, etc. > > Apart from the introduction of struct stream_sock it is completely > equivalent to the current state of things in Linus tree, but much > clearer, IMHO, by not having all those cut'n'pasted layouts, complete > with the #ifdef IPV6 optimization for when IPv6 is not compiled. > > BTW, this #ifdef IPV6 is wrong, as it leads to a kernel where IPV6 > can't be later compiled and loaded, but this remains a futile exercise > while the other #ifdef IPV6 is scattered in the common IPV6/IPV4 > code: > > [acme@oldpandora stream_sock-2.6]$ grep -l IPV6 net/ipv4/*.c | wc -l > 6 > > But this is something we'll possibly address in the future... 
:-) > > - Arnaldo > > > From davem@davemloft.net Mon Dec 20 15:37:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:37:20 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKNaqOU031282 for ; Mon, 20 Dec 2004 15:37:13 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgWyt-0008Sf-00; Mon, 20 Dec 2004 15:30:23 -0800 Date: Mon, 20 Dec 2004 15:30:23 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: netdev@oss.sgi.com Subject: Re: [PATCH]] New: tcp_input.c:3038 missing newline character Message-Id: <20041220153023.3f5cccee.davem@davemloft.net> In-Reply-To: <20041220144120.5dfb4ee2@dxpl.pdx.osdl.net> References: <200412202215.iBKMFSCr025060@fire-1.osdl.org> <20041220144120.5dfb4ee2@dxpl.pdx.osdl.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12941 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 20 Dec 2004 14:41:20 -0800 Stephen Hemminger wrote: > http://bugme.osdl.org/show_bug.cgi?id=3924 > The printk at net/ipv4/tcp_input.c:3038 is missing the newline (\n) character. > > Signed-off-by: Stephen Hemminger Applied, thanks Stephen. 
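A note on the "struct sock class hierarchy" thread above: the layout Arnaldo describes (tcp6_sock <- tcp_sock <- inet_sock <- sock) works because each struct embeds its parent as its first member, so a pointer to any level is also a valid pointer to every level beneath it. The following is only a minimal userspace sketch of that idea in C. The struct bodies are cut-down stand-ins, and the helper names (to_inet_sock, to_tcp_sock, same_address_demo) are hypothetical; none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the structures named in the thread.
 * The real kernel definitions carry many more fields. */
struct sock       { int sk_family; };
struct inet_opt   { unsigned short sport; };
struct tcp_opt    { unsigned char snd_wscale; };
struct ipv6_pinfo { int hop_limit; };

/* Each "derived" struct embeds its "base" as the FIRST member. */
struct inet_sock { struct sock sk;        struct inet_opt inet; };
struct tcp_sock  { struct inet_sock inet; struct tcp_opt tcp; };
struct tcp6_sock { struct tcp_sock tcp;   struct ipv6_pinfo inet6; };

/* Illustrative downcast helpers. They are only valid when the caller
 * knows the object really is (at least) of the derived type. */
struct inet_sock *to_inet_sock(struct sock *sk)
{
	return (struct inet_sock *)sk;
}

struct tcp_sock *to_tcp_sock(struct sock *sk)
{
	return (struct tcp_sock *)sk;
}

/* Returns 1 if every view of one tcp6_sock shares the same address and
 * writes made through a downcast pointer land in the enclosing object. */
int same_address_demo(void)
{
	static struct tcp6_sock t6;        /* zero-initialised */
	struct sock *sk = &t6.tcp.inet.sk; /* "upcast" to the base type */

	to_tcp_sock(sk)->tcp.snd_wscale = 14;
	to_inet_sock(sk)->inet.sport = 80;

	return (void *)sk == (void *)&t6
	    && (void *)to_inet_sock(sk) == (void *)&t6.tcp.inet
	    && (void *)to_tcp_sock(sk) == (void *)&t6.tcp
	    && t6.tcp.tcp.snd_wscale == 14
	    && t6.tcp.inet.inet.sport == 80;
}
```

Calling same_address_demo() from a small main() should return 1. Sridhar's suggested variant would simply add one more level (stream_sock, or conn_sock) between inet_sock and tcp_sock, following the same first-member rule.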
From bcasavan@sgi.com Mon Dec 20 15:38:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:38:09 -0800 (PST) Received: from omx2.sgi.com (omx2-ext.sgi.com [192.48.171.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKNbgi3031393 for ; Mon, 20 Dec 2004 15:38:02 -0800 Received: from flecktone.americas.sgi.com (flecktone.americas.sgi.com [198.149.16.15]) by omx2.sgi.com (8.12.11/8.12.9/linux-outbound_gateway-1.1) with ESMTP id iBL0xquw013592 for ; Mon, 20 Dec 2004 16:59:52 -0800 Received: from tulip-e236.americas.sgi.com (tulip-e236.americas.sgi.com [128.162.236.208]) by flecktone.americas.sgi.com (8.12.9/8.12.10/SGI_generic_relay-1.2) with ESMTP id iBKNbHCK4748520; Mon, 20 Dec 2004 17:37:18 -0600 (CST) Received: from kzerza.americas.sgi.com (kzerza.americas.sgi.com [128.162.233.27]) by tulip-e236.americas.sgi.com (8.12.9/SGI-server-1.8) with ESMTP id iBKNbHgs13827327; Mon, 20 Dec 2004 17:37:17 -0600 (CST) Date: Mon, 20 Dec 2004 17:37:17 -0600 From: Brent Casavant To: akpm@osdl.org cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: [PATCH 3/3] TCP hashes: NUMA interleaving Message-ID: Organization: "Silicon Graphics, Inc." MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12942 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bcasavan@sgi.com Precedence: bulk X-list: netdev Resend: Submitting for inclusion, as this patch series drew no objections from interested parties, and required no changes. The following patch against 2.6.10-rc3 modifies the TCP ehash and TCP bhash to enable the use of vmalloc to alleviate boottime memory allocation imbalances on NUMA systems, utilizing flags to the alloc_large_system_hash routine in order to centralize the enabling of this behavior. 
This patch (3/3) depends on patch 1/3 in this patch series. It does not depend on, nor is depended upon by, patch 2/3 in the series. Signed-off-by: Brent Casavant Index: linux/net/ipv4/tcp.c =================================================================== --- linux.orig/net/ipv4/tcp.c 2004-12-10 18:09:49.011272637 -0600 +++ linux/net/ipv4/tcp.c 2004-12-10 18:10:52.285839780 -0600 @@ -256,6 +256,7 @@ #include #include #include +#include #include #include @@ -2254,7 +2255,6 @@ void __init tcp_init(void) { struct sk_buff *skb = NULL; - unsigned long goal; int order, i; if (sizeof(struct tcp_skb_cb) > sizeof(skb->cb)) @@ -2287,43 +2287,35 @@ * * The methodology is similar to that of the buffer cache. */ - if (num_physpages >= (128 * 1024)) - goal = num_physpages >> (21 - PAGE_SHIFT); - else - goal = num_physpages >> (23 - PAGE_SHIFT); - - if (thash_entries) - goal = (thash_entries * sizeof(struct tcp_ehash_bucket)) >> PAGE_SHIFT; - for (order = 0; (1UL << order) < goal; order++) - ; - do { - tcp_ehash_size = (1UL << order) * PAGE_SIZE / - sizeof(struct tcp_ehash_bucket); - tcp_ehash_size >>= 1; - while (tcp_ehash_size & (tcp_ehash_size - 1)) - tcp_ehash_size--; - tcp_ehash = (struct tcp_ehash_bucket *) - __get_free_pages(GFP_ATOMIC, order); - } while (!tcp_ehash && --order > 0); - - if (!tcp_ehash) - panic("Failed to allocate TCP established hash table\n"); + tcp_ehash = (struct tcp_ehash_bucket *) + alloc_large_system_hash("TCP established", + sizeof(struct tcp_ehash_bucket), + thash_entries, + (num_physpages >= 128 * 1024) ? 
+ (25 - PAGE_SHIFT) : + (27 - PAGE_SHIFT), + HASH_HIGHMEM, + &tcp_ehash_size, + NULL, + 0); + tcp_ehash_size = (1 << tcp_ehash_size) >> 1; for (i = 0; i < (tcp_ehash_size << 1); i++) { rwlock_init(&tcp_ehash[i].lock); INIT_HLIST_HEAD(&tcp_ehash[i].chain); } - do { - tcp_bhash_size = (1UL << order) * PAGE_SIZE / - sizeof(struct tcp_bind_hashbucket); - if ((tcp_bhash_size > (64 * 1024)) && order > 0) - continue; - tcp_bhash = (struct tcp_bind_hashbucket *) - __get_free_pages(GFP_ATOMIC, order); - } while (!tcp_bhash && --order >= 0); - - if (!tcp_bhash) - panic("Failed to allocate TCP bind hash table\n"); + tcp_bhash = (struct tcp_bind_hashbucket *) + alloc_large_system_hash("TCP bind", + sizeof(struct tcp_bind_hashbucket), + tcp_ehash_size, + (num_physpages >= 128 * 1024) ? + (25 - PAGE_SHIFT) : + (27 - PAGE_SHIFT), + HASH_HIGHMEM, + &tcp_bhash_size, + NULL, + 64 * 1024); + tcp_bhash_size = 1 << tcp_bhash_size; for (i = 0; i < tcp_bhash_size; i++) { spin_lock_init(&tcp_bhash[i].lock); INIT_HLIST_HEAD(&tcp_bhash[i].chain); @@ -2332,6 +2324,10 @@ /* Try to be a bit smarter and adjust defaults depending * on available memory. */ + for (order = 0; ((1 << order) << PAGE_SHIFT) < + (tcp_bhash_size * sizeof(struct tcp_bind_hashbucket)); + order++) + ; if (order > 4) { sysctl_local_port_range[0] = 32768; sysctl_local_port_range[1] = 61000; -- Brent Casavant If you had nothing to fear, bcasavan@sgi.com how then could you be brave? Silicon Graphics, Inc. 
-- Queen Dama, Source Wars From davem@davemloft.net Mon Dec 20 15:43:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:43:27 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKNgxVY032314 for ; Mon, 20 Dec 2004 15:43:20 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgX4j-0008Tu-00; Mon, 20 Dec 2004 15:36:25 -0800 Date: Mon, 20 Dec 2004 15:36:24 -0800 From: "David S. Miller" To: Thomas Graf Cc: kaber@trash.net, netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix double locking in tcindex destroy path Message-Id: <20041220153624.58a1d93e.davem@davemloft.net> In-Reply-To: <20041210124445.GV1371@postel.suug.ch> References: <20041210014918.GT1371@postel.suug.ch> <41B90B81.1020102@trash.net> <20041210124445.GV1371@postel.suug.ch> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12943 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 13:44:45 +0100 Thomas Graf wrote: > * Patrick McHardy <41B90B81.1020102@trash.net> 2004-12-10 03:35 > > Thomas Graf wrote: > > > > >tcindex's destroy uses its own delete functions to destroy its > > >configuration. The delete function (correctly) takes the qdisc_tree_lock > > >to prevent list walkings from happening while removing from the list. 
> > >The qdisc_tree_lock is already held if we're comming via the destroy > > >path and thus a double locking takes place. > > > > > >Patch not needed for 2.4 since both destroy paths are unlocked but will > > >be needed if we add them. > > > > > > > > Looks correct, but 2.4 does need this. qdisc_destroy in 2.4 always > > happens under dev->queue_lock. For example dev_shutdown from 2.4: > > Not 100% correct since cls_api.c drops the lock before calling > tcf_destroy but the patch is indeed needed and it's not a problem > if dev->queue_lock is not taken since it is already unlinked as you > correctly stated in your previous mail. Thanks Patrick. > > Patch also applies to 2.4 with some fuzz. > > Signed-off-by: Thomas Graf I think the conditional locking is quite ugly, but I can't suggest something better at this time. Patch applied to both 2.4.x and 2.6.x, thanks Patrick and Thomas. From davem@davemloft.net Mon Dec 20 15:46:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:46:14 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKNjlgP000353 for ; Mon, 20 Dec 2004 15:46:07 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgX7U-0008Un-00; Mon, 20 Dec 2004 15:39:16 -0800 Date: Mon, 20 Dec 2004 15:39:16 -0800 From: "David S. 
Miller" To: Stephen Hemminger Cc: michael.vittrup.larsen@ericsson.com, netdev@oss.sgi.com Subject: Re: [PATCH] tcp: efficient port randomistion (rev 3) Message-Id: <20041220153916.6c00c114.davem@davemloft.net> In-Reply-To: <20041210170900.11d41d56.shemminger@osdl.org> References: <20041027092531.78fe438c@guest-251-240.pdx.osdl.net> <20041202135252.04e64f51.davem@davemloft.net> <41B14E57.5080803@osdl.org> <200412060918.04441.michael.vittrup.larsen@ericsson.com> <20041206094234.34861c78@dxpl.pdx.osdl.net> <20041208235524.202ff3a1.davem@davemloft.net> <20041210170900.11d41d56.shemminger@osdl.org> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12944 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Fri, 10 Dec 2004 17:09:00 -0800 Stephen Hemminger wrote: > okay, here is the revised version. Testing shows that it > is more consistent, and just as fast as existing code, > probably because of the getting rid of portalloc_lock and > better distribution. > > Signed-off-by: Stephen Hemminger Queued up for 2.6.11, thanks Stephen. 
From davem@davemloft.net Mon Dec 20 15:50:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:50:23 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKNntDM000930 for ; Mon, 20 Dec 2004 15:50:16 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgXBW-0008V9-00; Mon, 20 Dec 2004 15:43:26 -0800 Date: Mon, 20 Dec 2004 15:43:26 -0800 From: "David S. Miller" To: Stephen Hemminger Cc: kaber@trash.net, netem@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH] netem: restart device after inserting packets Message-Id: <20041220154326.4eedb936.davem@davemloft.net> In-Reply-To: <20041214131113.32d080fb@dxpl.pdx.osdl.net> References: <20041208123103.4cc6b005@dxpl.pdx.osdl.net> <20041208210031.63f0963f.davem@davemloft.net> <41B91901.3070304@trash.net> <20041214113249.0725a655.davem@davemloft.net> <20041214131113.32d080fb@dxpl.pdx.osdl.net> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12945 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 14 Dec 2004 13:11:13 -0800 Stephen Hemminger wrote: > 2.4 version of the netem wakeup patch. Also fixes the qlen > in a couple of places. This makes code basically same as 2.6 > > Signed-off-by: Stephen Hemminger Applied, thanks a lot Stephen. 
From davem@davemloft.net Mon Dec 20 15:58:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 15:58:29 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBKNw37U001568 for ; Mon, 20 Dec 2004 15:58:23 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgXJO-00005K-00; Mon, 20 Dec 2004 15:51:34 -0800 Date: Mon, 20 Dec 2004 15:51:34 -0800 From: "David S. Miller" To: Thomas Graf Cc: netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer Message-Id: <20041220155134.580396cb.davem@davemloft.net> In-Reply-To: <20041215130128.GK8493@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12946 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 14:01:28 +0100 Thomas Graf wrote: > This should go in before 2.6.10. It fixes a forgotten case to provide > police backward compatibility statistics for old iproute2 versions > running on a new kernel with actions enabled. Should make distributions > happy with older iproute2 versions and all-included kernel configs > since they probably favour actions over plain policer. Applied, thanks Thomas. About testsuites. 
Figure out what we want, and I can even put up a BK repository for the testsuite on kernel.bkbits.net Unlike what some others appear to be suggesting, I see no problem with multiple testsuites :-) From tommy.christensen@tpack.net Mon Dec 20 16:13:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:13:15 -0800 (PST) Received: from mail.tpack.net (ip18.tpack.net [213.173.228.18]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBL0CkY6002361 for ; Mon, 20 Dec 2004 16:13:07 -0800 Received: (qmail 8233 invoked from network); 21 Dec 2004 00:12:16 -0000 Received: from dhcp-188.cph.tpack.net (HELO ?172.17.159.11?) (192.168.0.188) by 0 with SMTP; 21 Dec 2004 00:12:16 -0000 Message-ID: <41C76AA0.7020800@tpack.net> Date: Tue, 21 Dec 2004 01:13:20 +0100 From: Tommy Christensen User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jeff Garzik CC: hadi@cyberus.ca, Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma Subject: Re: [patch 4/10] s390: network driver. References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> <1103551830.1047.316.camel@jzny.localdomain> <41C71FFD.7090308@pobox.com> In-Reply-To: <41C71FFD.7090308@pobox.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12947 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tommy.christensen@tpack.net Precedence: bulk X-list: netdev Jeff Garzik wrote: > I haven't heard anything to convince me that the same change should be > deployed across NNN drivers. 
> The drivers already signal the net core
> that the link is down; to me, that implies there should be code in _one_
> place that handles this condition, not NNN places.

AFAICS only a handful of (newer) drivers call netif_stop_queue() directly.
Others may do this indirectly if the MAC stops taking packets from the
DMA ringbuffer. At least some MACs/drivers certainly don't.

I've always thought of netif_stop_queue() as the replacement for the old
tbusy flag, signaling a transient condition where the HW is unable to keep
up with the flow of packets. And it seems to be used for just this in most
cases. Perhaps somebody confused netif_stop_queue() with dev_deactivate()?

OK, another view on this: isn't it problematic to have skbs stuck in the
network stack "indefinitely"? They hold references to a dst_entry and
a sock (and probably more).

So how about this for the FAQ:

Q: Why can't I unload the af_packet module?
A: Ohh, you'll have to plug in the darn cable to eth0 first!

*Please* tell me, I've got this all wrong.
-Tommy From peterc@gelato.unsw.edu.au Mon Dec 20 16:13:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:13:16 -0800 (PST) Received: from note.orchestra.cse.unsw.EDU.AU (root@note.orchestra.cse.unsw.EDU.AU [129.94.242.24]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL0CmLe002362 for ; Mon, 20 Dec 2004 16:13:09 -0800 Received: From lemon.gelato.unsw.edu.au ([129.94.173.27]) (for ) By note With Smtp ; Tue, 21 Dec 2004 11:12:23 +1100 Received: from berry.gelato.unsw.edu.au ([129.94.173.230]:37472) by lemon.gelato.unsw.edu.au with esmtp (Exim 4.34) id 1CgXdX-0008Du-RL for netdev@oss.sgi.com; Tue, 21 Dec 2004 11:12:23 +1100 Received: from peterc by berry.gelato.unsw.EDU.AU with local (Exim 3.36 #1 (Debian)) id 1CgXdX-0005LM-00 for ; Tue, 21 Dec 2004 11:12:23 +1100 From: Peter Chubb To: netdev@oss.sgi.com Date: Tue, 21 Dec 2004 11:12:23 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16839.27239.264551.415058@berry.gelato.unsw.EDU.AU> Subject: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) X-Mailer: VM 7.17 under 21.4 (patch 15) "Security Through Obscurity" XEmacs Lucid Comments: Hyperbole mail buttons accepted, v04.18. X-Face: GgFg(Z>fx((4\32hvXq<)|jndSniCH~~$D)Ka:P@e@JR1P%Vr}EwUdfwf-4j\rUs#JR{'h# !]])6%Jh~b$VA|ALhnpPiHu[-x~@<"@Iv&|%R)Fq[[,(&Z'O)Q)xCqe1\M[F8#9l8~}#u$S$Rm`S9% \'T@`:&8>Sb*c5d'=eDYI&GF`+t[LfDH="MP5rwOO]w>ALi7'=QJHz&y&C&TE_3j! X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12948 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: peterc@gelato.unsw.edu.au Precedence: bulk X-list: netdev Hi, I've tracked down the problem I was having with the TG3 driver on the HP 16-way. If you'll recall, the problem was that the link was always marked as down, and autonegotiation was failing. 
The problem is that the driver gives up before the switch that the NIC is connected to has finished the negotiation phase. Here's a simple patch. I changed the way the loop works too, because tg3_readphys() sets *val to 0xffffffff if it fails. Signed-off-by: Peter Chubb ===== drivers/net/tg3.c 1.222 vs edited ===== --- 1.222/drivers/net/tg3.c 2004-11-15 23:53:08 +00:00 +++ edited/drivers/net/tg3.c 2004-12-21 00:00:58 +00:00 @@ -1554,10 +1554,11 @@ static int tg3_setup_copper_phy(struct t } } - bmsr = 0; - for (i = 0; i < 100; i++) { - tg3_readphy(tp, MII_BMSR, &bmsr); + + for (i = 0; i < 1000; i++) { tg3_readphy(tp, MII_BMSR, &bmsr); + if (tg3_readphy(tp, MII_BMSR, &bmsr)) + bmsr = 0; if (bmsr & BMSR_LSTATUS) break; udelay(40); From tgraf@suug.ch Mon Dec 20 16:16:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:16:54 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL0GQWl003261 for ; Mon, 20 Dec 2004 16:16:47 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 3B0B384; Tue, 21 Dec 2004 01:15:39 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 137731C0EA; Tue, 21 Dec 2004 01:16:22 +0100 (CET) Date: Tue, 21 Dec 2004 01:16:21 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer Message-ID: <20041221001621.GY17998@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> <1103552830.1049.355.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103552830.1049.355.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12949 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103552830.1049.355.camel@jzny.localdomain> 2004-12-20 09:27 > I just stared quickly at what Thomas has and realize its not really > automated. In my case it is easier because i can click on the proverbial > one-button and run 20 tests (including a subset of the policer ones) > and even capturing tcpdumps. I have attached a sample testcase. I've put up a snapshot of what I have at: http://people.suug.ch/~tgr/tmp/tc-testsuite.tar.gz .. and a special for jamal not including iproute2 sources so you can save a few bucks ;-> http://people.suug.ch/~tgr/tmp/tc-testsuite-jamal.tar.bz2 I'm not sure what you mean, it's all automated and actually quite similar to what you've included. Actually, we could simply run your testscript under my testsuite if you replace the "tc" with a variable given to the script. 
My indev test: (requires a patched iproute2 to correctly print [ indev INDEV ] in usage text, patches follow later)

----
#!/bin/bash
# vim: ft=sh
#

source lib/generic.sh

TYPE=$1; shift
FILTER=$@

HELP=`$TC filter add ${TYPE} help 2>&1`

if [ "`echo $HELP | grep \"indev DEV\"`" ]; then
    if [ $CONFIG_NET_CLS_IND ]; then
        ts_tc "attr-indev" "filter addition" $FILTER indev "eth1"
    else
        ts_log "attr-indev: No INDEV support, checking error return value"
        RES=`$TC $FILTER indev "eth1" 2>&1`
        if [ "`echo $RES | head -1`" != \
             "RTNETLINK answers: Operation not supported" ]; then
            echo "Warning: Kernel did not report EOPNOTSUPP"
        fi
    fi
else
    # no support for indev, add filter anyway to not break deletion
    ts_log "ip version doesn't support indev, skipping"
    ts_tc "attr-indev" "dummy addition" $FILTER
fi
---

Here's an example run:

tgr:axs ~/dev/tc-testsuite make alltests
Running cls-testbed.t [iproute2-2.4.7/2.6.10-rc3-bk12]: FAILED
Running cls-testbed.t [iproute2-2.6.9/2.6.10-rc3-bk12]: PASS
Running dsmark.t [iproute2-2.4.7/2.6.10-rc3-bk12]: FAILED
Running dsmark.t [iproute2-2.6.9/2.6.10-rc3-bk12]: PASS
Running std-cbq.t [iproute2-2.4.7/2.6.10-rc3-bk12]: PASS
Running std-cbq.t [iproute2-2.6.9/2.6.10-rc3-bk12]: PASS
tgr:axs ~/dev/tc-testsuite

So, 2 of them failed, let's see what happened:

tgr:axs ~/dev/tc-testsuite cat results/cls-testbed.t.iproute2-2.4.7.out
...
Preparing classifier testbed with qdisc dsmark
cls-testbed: dsmark root qdisc creation failed:
command: iproute2/iproute2-2.4.7/tc/tc qdisc add dev lo root handle 10:0 dsmark indices 64 default_index 1 set_tc_index
stderr output:
Unknown qdisc "dsmark", hence option "indices" is unparsable
...

I'll take a closer look at tcng to see how we can put the best parts of both together. It will definitely be no problem to integrate your expect-based tests.
From tgraf@suug.ch Mon Dec 20 16:22:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:23:01 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL0MXrF007258 for ; Mon, 20 Dec 2004 16:22:54 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E78C584; Tue, 21 Dec 2004 01:21:47 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 4B1AF1C0EA; Tue, 21 Dec 2004 01:22:30 +0100 (CET) Date: Tue, 21 Dec 2004 01:22:30 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation Message-ID: <20041221002230.GZ17998@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103552215.1048.333.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12951 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Dave, drop this one. I will move indev into the abstraction layer i'm going to introduce .11 which will fix the current issues, remove the ifdefs and will make removal/replacement much simpler once we have metadata match. 
From davem@davemloft.net Mon Dec 20 16:22:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:22:51 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL0MO0P007251 for ; Mon, 20 Dec 2004 16:22:44 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgXgu-0000AN-00; Mon, 20 Dec 2004 16:15:52 -0800 Date: Mon, 20 Dec 2004 16:15:52 -0800 From: "David S. Miller" To: Peter Chubb Cc: netdev@oss.sgi.com Subject: Re: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Message-Id: <20041220161552.2b88aa3d.davem@davemloft.net> In-Reply-To: <16839.27239.264551.415058@berry.gelato.unsw.EDU.AU> References: <16839.27239.264551.415058@berry.gelato.unsw.EDU.AU> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12950 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 21 Dec 2004 11:12:23 +1100 Peter Chubb wrote: > The problem is that the driver gives up before the switch that the NIC > is connected to has finished the negotiation phase. > > Here's a simple patch. I changed the way the loop works too, because > tg3_readphys() sets *val to 0xffffffff if it fails. This patch shouldn't be needed. This waiting loop is just an optimization in case we can negotiate quickly.
If the loop fails, we just wait for the chip to interrupt us when the link status changes next (or we notice a link status change via timer based polling which we use on chips which can't provide the interrupt based notification properly), at which time we'll call tg3_setup_copper_phy() from the interrupt handler. You need to figure out why that isn't working correctly. From romieu@fr.zoreil.com Mon Dec 20 16:25:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:25:12 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL0OgEu008265 for ; Mon, 20 Dec 2004 16:25:03 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iBL0MJvr005333; Tue, 21 Dec 2004 01:22:19 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iBL0MISQ005332; Tue, 21 Dec 2004 01:22:18 +0100 Date: Tue, 21 Dec 2004 01:22:18 +0100 From: Francois Romieu To: Matt Mackall Cc: Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041221002218.GA1487@electric-eye.fr.zoreil.com> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041220211419.GC5974@waste.org> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. 
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12952 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Matt Mackall : > On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote: > > > > Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP' > > with the r8169 device using the above patch on top of 2.6.10-rc3-bk10. > > Ok, that suggests a problem localized to netpoll itself. Do you have > spinlock debugging turned on by any chance? Any chance of: 1 dev_queue_xmit 2 dev->xmit_lock taken 3 interruption 4 printk 5 netconsole write 6 dev->xmit_lock again 7 lockup ? This is probably the silly question of the day. -- Ueimor From oxymoron@waste.org Mon Dec 20 16:56:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 16:56:20 -0800 (PST) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL0tpGj009466 for ; Mon, 20 Dec 2004 16:56:12 -0800 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBL0tLkC031679; Mon, 20 Dec 2004 18:55:21 -0600 Received: (from oxymoron@localhost) by waste.org (8.12.3/8.12.3/Debian-7.1) id iBL0tLJr031676; Mon, 20 Dec 2004 18:55:21 -0600 Date: Mon, 20 Dec 2004 16:55:21 -0800 From: Matt Mackall To: Francois Romieu Cc: Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041221005521.GD5974@waste.org> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> 
<36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041221002218.GA1487@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new X-Virus-Status: Clean X-archive-position: 12953 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev On Tue, Dec 21, 2004 at 01:22:18AM +0100, Francois Romieu wrote: > Matt Mackall : > > On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote: > > > > > > Exactly the same happens, I still get a 'NMI Watchdog detected LOCKUP' > > > with the r8169 device using the above patch on top of 2.6.10-rc3-bk10. > > > > Ok, that suggests a problem localized to netpoll itself. Do you have > > spinlock debugging turned on by any chance? > > Any chance of: > 1 dev_queue_xmit > 2 dev->xmit_lock taken > 3 interruption > 4 printk > 5 netconsole write > 6 dev->xmit_lock again > 7 lockup > > ? > > This is probably the silly question of the day. Maybe, but the answer isn't obvious to me at the moment as I haven't been thinking about such stuff enough lately. Silly response of the day: Mark, can you try this (again completely untested, but at least compiles) patch? I'm afraid I don't have a proper test rig to reproduce this at the moment. This will attempt to grab the lock, and if it fails, will check for recursion. Then it will try to print a message on the local console, temporarily disabling netconsole to allow the printk to get through.. 
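To make the idea concrete, the following is only a toy, single-threaded model of the try-lock-then-check-owner logic described above, written as plain user-space C. It is not the netpoll code: struct toy_lock, toy_trylock(), toy_unlock() and guarded_send() are hypothetical names made up for this sketch, the lock is not atomic, and a plain integer stands in for the CPU id.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Toy stand-in for a spinlock plus an owner field (assumption, not kernel API). */
struct toy_lock {
	bool locked;
	int owner;	/* id of the CPU holding the lock, or NO_OWNER */
};

enum send_result {
	SEND_OK,		/* lock taken, packet "sent", lock released */
	SEND_DROPPED_RECURSION,	/* this CPU already holds the lock: drop */
	SEND_WOULD_SPIN		/* another CPU holds it: real code would spin */
};

/* Take the lock only if it is free; never blocks. */
static bool toy_trylock(struct toy_lock *l, int cpu)
{
	if (l->locked)
		return false;
	l->locked = true;
	l->owner = cpu;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	assert(l->locked);
	l->locked = false;
	l->owner = NO_OWNER;
}

/*
 * Try the lock first. On failure, compare the recorded owner with the
 * current CPU: a match means we interrupted ourselves while holding the
 * lock, so spinning would never finish and the packet is dropped instead.
 */
enum send_result guarded_send(struct toy_lock *l, int cpu)
{
	if (!toy_trylock(l, cpu)) {
		if (l->owner == cpu)
			return SEND_DROPPED_RECURSION;
		return SEND_WOULD_SPIN;
	}

	/* ... hand the packet to the driver here ... */

	toy_unlock(l);
	return SEND_OK;
}
```

In this model a caller that already holds the lock on the same CPU gets SEND_DROPPED_RECURSION back instead of spinning forever, which is the case the seven-step scenario quoted above would hit.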
Index: l/net/core/netpoll.c =================================================================== --- l.orig/net/core/netpoll.c 2004-11-04 10:53:23.388610000 -0800 +++ l/net/core/netpoll.c 2004-12-20 16:45:40.212709000 -0800 @@ -31,6 +31,8 @@ #define MAX_SKBS 32 #define MAX_UDP_CHUNK 1460 +static int netpoll_kill; + static spinlock_t skb_list_lock = SPIN_LOCK_UNLOCKED; static int nr_skbs; static struct sk_buff *skbs; @@ -183,13 +185,24 @@ int status; repeat: - if(!np || !np->dev || !netif_running(np->dev)) { + if(!np || !np->dev || !netif_running(np->dev) || netpoll_kill) { __kfree_skb(skb); return; } - spin_lock(&np->dev->xmit_lock); - np->dev->xmit_lock_owner = smp_processor_id(); + if(spin_trylock(&np->dev->xmit_lock)) + np->dev->xmit_lock_owner = smp_processor_id(); + else { + if(np->dev->xmit_lock_owner == smp_processor_id()) { + netpoll_kill = 1; + __kfree_skb(skb); + printk("Tried to recursively get dev->xmit_lock"); + netpoll_kill = 0; + return; + } + spin_lock(&np->dev->xmit_lock); + + } /* * network drivers do not expect to be called if the queue is -- Mathematics is the supreme nostalgia of our time. From davem@davemloft.net Mon Dec 20 17:03:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 17:03:18 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL12qWO010075 for ; Mon, 20 Dec 2004 17:03:12 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgYK0-0000GO-00; Mon, 20 Dec 2004 16:56:16 -0800 Date: Mon, 20 Dec 2004 16:56:16 -0800 From: "David S. 
Miller" To: Thomas Graf Cc: hadi@cyberus.ca, kaber@trash.net, netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation Message-Id: <20041220165616.60729dbc.davem@davemloft.net> In-Reply-To: <20041221002230.GZ17998@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041221002230.GZ17998@postel.suug.ch> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12954 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 21 Dec 2004 01:22:30 +0100 Thomas Graf wrote: > Dave, drop this one. I will move indev into the abstraction layer > i'm going to introduce .11 which will fix the current issues, remove > the ifdefs and will make removal/replacement much simpler once > we have metadata match. OK. From davem@davemloft.net Mon Dec 20 17:09:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 17:09:21 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL18sRO010711 for ; Mon, 20 Dec 2004 17:09:14 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgYPu-0000Gb-00; Mon, 20 Dec 2004 17:02:22 -0800 Date: Mon, 20 Dec 2004 17:02:22 -0800 From: "David S. 
Miller" To: hadi@cyberus.ca Cc: kaber@trash.net, tgraf@suug.ch, netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs Message-Id: <20041220170222.5ee14588.davem@davemloft.net> In-Reply-To: <1103552026.1048.324.camel@jzny.localdomain> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> <41C687EE.1090205@trash.net> <1103552026.1048.324.camel@jzny.localdomain> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12955 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On 20 Dec 2004 09:13:46 -0500 jamal wrote: > Certainly not a big deal; shouldnt care if once in a while tcpdump > actually gets to see the real packet that went out the wire. It's not just tcpdump. Any modification of the packet data for a shared SKB is illegal, no matter where it occurs. This can corrupt TCP packets, which share the transmitted packet with the socket retransmit queue. We have a similar problem with TSO and some gigabit cards whose drivers muck with the iphdr->tot_len field on transmit. I am still not sure how I want to address that case. Since transmitted TCP data packets are always shared/cloned, we'll have to do a data copy on every TSO send on these cards, which frankly nullifies much of the performance gain TSO gives. If we end up fixing it via a copy we'll probably need to seriously consider not doing TSO unless we are doing sendfile.
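The copy-before-write rule above can be sketched in user-space C. This is only an illustration under stated assumptions: a reference-counted data area stands in for packet data shared between a transmit path and a retransmit queue, and struct pkt_data, struct pkt, pkt_make_writable() and pkt_set_byte() are hypothetical names for this sketch, not kernel API (the kernel has its own helpers for this).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PKT_MAX 64

/* Data area that several packet handles may point at (toy model). */
struct pkt_data {
	int refcnt;			/* number of handles sharing this area */
	size_t len;			/* valid bytes in bytes[] */
	unsigned char bytes[PKT_MAX];
};

/* A packet handle; clones share the same pkt_data. */
struct pkt {
	struct pkt_data *data;
};

/*
 * Make sure p owns its data exclusively. If the area is shared, copy it
 * and drop our reference on the original so other holders are untouched.
 * Returns 0 on success, -1 if the copy could not be allocated.
 */
int pkt_make_writable(struct pkt *p)
{
	struct pkt_data *copy;

	assert(p && p->data && p->data->refcnt >= 1);

	if (p->data->refcnt == 1)
		return 0;		/* already private, no copy needed */

	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;

	memcpy(copy, p->data, sizeof(*copy));
	copy->refcnt = 1;
	p->data->refcnt--;
	p->data = copy;
	return 0;
}

/* Modify one byte, taking a private copy first if the data is shared. */
int pkt_set_byte(struct pkt *p, size_t off, unsigned char val)
{
	if (off >= p->data->len)
		return -1;
	if (pkt_make_writable(p) != 0)
		return -1;
	p->data->bytes[off] = val;
	return 0;
}
```

The cost discussed above shows up directly in this model: every write to a shared area pays for a full memcpy(), which is why doing it on every send would eat into the benefit of offloading.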
From peter@chubb.wattle.id.au Mon Dec 20 17:12:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 17:12:46 -0800 (PST) Received: from mail15.syd.optusnet.com.au (mail15.syd.optusnet.com.au [211.29.132.196]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL1CHPG011229 for ; Mon, 20 Dec 2004 17:12:38 -0800 Received: from mail.chubb.wattle.id.au (c220-237-8-57.randw1.nsw.optusnet.com.au [220.237.8.57]) by mail15.syd.optusnet.com.au (8.12.11/8.12.11) with ESMTP id iBL1BeR5009758; Tue, 21 Dec 2004 12:11:41 +1100 Received: from peterc by mail.chubb.wattle.id.au with local (Exim 3.36 #1 (Debian)) id 1CgYYu-0003YZ-00; Tue, 21 Dec 2004 12:11:40 +1100 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16839.30796.413939.333935@wombat.chubb.wattle.id.au> Date: Tue, 21 Dec 2004 12:11:40 +1100 From: Peter Chubb To: "David S. Miller" Cc: Peter Chubb , netdev@oss.sgi.com Subject: Re: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) In-Reply-To: <20041220161552.2b88aa3d.davem@davemloft.net> References: <16839.27239.264551.415058@berry.gelato.unsw.EDU.AU> <20041220161552.2b88aa3d.davem@davemloft.net> X-Mailer: VM 7.17 under 21.4 (patch 15) "Security Through Obscurity" XEmacs Lucid Comments: Hyperbole mail buttons accepted, v04.18. X-Face: GgFg(Z>fx((4\32hvXq<)|jndSniCH~~$D)Ka:P@e@JR1P%Vr}EwUdfwf-4j\rUs#JR{'h# !]])6%Jh~b$VA|ALhnpPiHu[-x~@<"@Iv&|%R)Fq[[,(&Z'O)Q)xCqe1\M[F8#9l8~}#u$S$Rm`S9% \'T@`:&8>Sb*c5d'=eDYI&GF`+t[LfDH="MP5rwOO]w>ALi7'=QJHz&y&C&TE_3j! 
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12956 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: peterc@gelato.unsw.edu.au Precedence: bulk X-list: netdev >>>>> "David" == David S Miller writes: David> On Tue, 21 Dec 2004 11:12:23 +1100 Peter Chubb David> wrote: >> The problem is that the driver gives up before the switch that the >> NIC is connected to has finished the negotiation phase. >> >> Here's a simple patch. I changed the way the loop works too, >> because tg3_readphys() sets *val to 0xffffffff if it fails. David> This patch shouldn't be needed. This waiting loop is just an David> optimization in case we can negotiation quickly. David> You need to figure out why that isn't working correctly. It doesn't work because tg3_readphy() sets bmsr to 0xffffffff when it can't read the value. This breaks out of that loop early, and because BMSR_LSTATUS is set in that value, all sorts of things happen before the card is ready. Why is tg3_readphy() returning 0xffffffff if it can't read a value anyway? Here's another possible fix.
===== drivers/net/tg3.c 1.222 vs edited ===== Index: linux-2.6.10-rc3/drivers/net/tg3.c =================================================================== --- linux-2.6.10-rc3.orig/drivers/net/tg3.c 2004-12-21 00:25:34.000000000 +0000 +++ linux-2.6.10-rc3/drivers/net/tg3.c 2004-12-21 01:00:56.586108274 +0000 @@ -1554,10 +1554,11 @@ } } - bmsr = 0; + for (i = 0; i < 100; i++) { tg3_readphy(tp, MII_BMSR, &bmsr); - tg3_readphy(tp, MII_BMSR, &bmsr); + if (tg3_readphy(tp, MII_BMSR, &bmsr)) + bmsr = 0; if (bmsr & BMSR_LSTATUS) break; udelay(40); From jgarzik@pobox.com Mon Dec 20 17:20:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 17:20:53 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL1KORY011877 for ; Mon, 20 Dec 2004 17:20:45 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CgYgn-0004rF-Cm; Tue, 21 Dec 2004 01:19:49 +0000 Message-ID: <41C77A2E.3090000@pobox.com> Date: Mon, 20 Dec 2004 20:19:42 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Tommy Christensen CC: hadi@cyberus.ca, Thomas Spatzier , "David S. Miller" , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma Subject: Re: [patch 4/10] s390: network driver. 
References: <1103484552.1046.155.camel@jzny.localdomain> <41C600D7.70005@tpack.net> <1103497516.1046.231.camel@jzny.localdomain> <41C612BC.5070909@tpack.net> <1103551830.1047.316.camel@jzny.localdomain> <41C71FFD.7090308@pobox.com> <41C76AA0.7020800@tpack.net> In-Reply-To: <41C76AA0.7020800@tpack.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12957 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Tommy Christensen wrote: > Jeff Garzik wrote: > >> I haven't heard anything to convince me that the same change should be >> deployed across NNN drivers. The drivers already signal the net core >> that the link is down; to me, that implies there should be code in >> _one_ place that handles this condition, not NNN places. > > > AFAICS only a handful of (newer) drivers call netif_stop_queue() directly. > Others may do this indirectly if the MAC stops taking packets from the > DMA ringbuffer. At least some MAC's/drivers certainly don't. Incorrect. Use of netif_stop_queue() is -required- to signal that the hardware cannot accept any more skbs from the system. Far more than a "handful" and required for all but a few very strange drivers. > OK, another view on this: isn't is problematic to have skb's stuck in > the network stack "indefinitely" ? > They hold references to a dst_entry and a sock (and probably more). > So how about this for the FAQ: > Q: Why can't I unload the af_packet module? > A: Ohh, you'll have to plug in the darn cable to eth0 first! > *Please* tell me, I've got this all wrong. You've got this all wrong. 
Jeff From davem@davemloft.net Mon Dec 20 22:26:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 20 Dec 2004 22:26:43 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL6QClu023552 for ; Mon, 20 Dec 2004 22:26:33 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CgdMu-0000n2-00; Mon, 20 Dec 2004 22:19:36 -0800 Date: Mon, 20 Dec 2004 22:19:36 -0800 From: "David S. Miller" To: Peter Chubb Cc: peterc@gelato.unsw.edu.au, netdev@oss.sgi.com Subject: Re: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Message-Id: <20041220221936.056b6812.davem@davemloft.net> In-Reply-To: <16839.30796.413939.333935@wombat.chubb.wattle.id.au> References: <16839.27239.264551.415058@berry.gelato.unsw.EDU.AU> <20041220161552.2b88aa3d.davem@davemloft.net> <16839.30796.413939.333935@wombat.chubb.wattle.id.au> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12958 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 21 Dec 2004 12:11:40 +1100 Peter Chubb wrote: > It doesn't work because tg3_readphy sets bmsr to 0xffffffff even if it > can't read the value. This breaks that loop early; and because > BBMSR_LSTATUS is set, all sorts of things happen before the card is ready. > > Why is tg3_readphys returning 0xffffffff if it can't read a value anyway? 
Because callers are not supposed to depend upon the value being set to anything valid if tg3_readphy() returns an error. I thought it should never be returning an error at this spot. The PHY should always return a valid value within PHY_BUSY_LOOPS. If MI_COM_BUSY is staying set for such a long time that's a pretty serious problem. Taking a peek at the bcm5700 driver by Broadcom, they handle all PHY read timeouts the way your patch does in this one spot, by setting the returned value to zero. So it seems the device can time out like that in these situations, and your patch is something close to the correct fix. Good catch Peter, I'll think some more about this and probably end up using something similar to your second patch. Thanks. From kaber@trash.net Tue Dec 21 00:52:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 00:53:01 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL8qXlp031215 for ; Tue, 21 Dec 2004 00:52:54 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Cgfk0-0005v3-1t; Tue, 21 Dec 2004 09:51:36 +0100 Message-ID: <41C7F6E4.1010507@trash.net> Date: Tue, 21 Dec 2004 11:11:48 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Thomas Graf , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> <1103552830.1049.355.camel@jzny.localdomain> In-Reply-To: <1103552830.1049.355.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12959 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > I havent looked closely at tcng although Werner has showed it to me a > few times (may be under influence). We need to pick one or other test > setup. I dont care if its what I have, tcng or what Thomas has. > I just stared quickly at what Thomas has and realize its not really > automated. In my case it is easier because i can click on the proverbial > one-button and run 20 tests (including a subset of the policer ones) > and even capturing tcpdumps. I have attached a sample testcase. > They are harder to create and require the environment i have. > But once you create them, you should be saying "go" - go do something > and come back and get results. > Whatever we end up having, my preference would be something along those > lines, tcsim has one major advantage, you can test the actual scheduling algorithms for their behaviour under very controlled conditions. It gave me a lot more confidence when replacing the HFSC lists by rbtrees. But, as Thomas notes, it does all its tests in userspace, which might not be ideal for things besides scheduling algorithms. So a combination of both seems to be best. 
Regards Patrick From kaber@trash.net Tue Dec 21 00:58:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 00:58:32 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBL8w57Z031776 for ; Tue, 21 Dec 2004 00:58:26 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CgfpP-0005vP-C6; Tue, 21 Dec 2004 09:57:11 +0100 Message-ID: <41C7F833.4000909@trash.net> Date: Tue, 21 Dec 2004 11:17:23 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: jamal , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041220200739.GX17998@postel.suug.ch> In-Reply-To: <20041220200739.GX17998@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12960 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * jamal <1103552215.1048.333.camel@jzny.localdomain> 2004-12-20 09:16 > >>Hehe. I am sure thats a cutnpaste(LinuxWay) from some code in the kernel >>probably sch_api.c (or maybe the code it was cutnpasted has been fixed >>in the last 3 years ;->). >>That needs fixing. Who is sending the patch? > > I'll put it into my patchset so it gets into the test cycles. Could you make your patchset available somehow ? 
I'm not sure which areas of the code I can still touch without conflicting with you :) Regards Patrick From linux.lover2004@gmail.com Tue Dec 21 02:07:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 02:07:12 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.192]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLA6hN3001382 for ; Tue, 21 Dec 2004 02:07:04 -0800 Received: by rproxy.gmail.com with SMTP id c16so285056rne for ; Tue, 21 Dec 2004 02:06:15 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:mime-version:content-type:content-transfer-encoding; b=pztHeNF+k/b1QraDrI90es4kYGInElYSwOglKzkEx6yf023E15n1IC6NkSaPcrXs8y8gO15n8h0kQerIdkCMfZKBCEwkZl9hUgsiqFpL0ArApn4AVHTLvFD6rQngwg7zbiBG5zWijdzGXoLdW02zFhTcjbhN/7YgnjnfuEtqgTY= Received: by 10.38.72.4 with SMTP id u4mr491700rna; Tue, 21 Dec 2004 02:06:14 -0800 (PST) Received: by 10.38.207.46 with HTTP; Tue, 21 Dec 2004 02:06:14 -0800 (PST) Message-ID: <72c6e379041221020653e7a488@mail.gmail.com> Date: Tue, 21 Dec 2004 15:36:14 +0530 From: linux lover Reply-To: linux lover To: linux-net@vger.kernel.org Subject: what is use of dev_queue_xmit_nit? Cc: netdev@oss.sgi.com Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12961 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux.lover2004@gmail.com Precedence: bulk X-list: netdev Hello all, I want to know what is use of dev_queue_xmit_nit function in dev.c file? Also why it calls following statement ptype->func(skb2, skb->dev, ptype); Also why skb2 is created by cloning skb? 
Actually, I traced TCP packets and found that control goes from neigh_resolve_output directly to dev_queue_xmit_nit and then to the hardware driver, 8139too.c. I want to know why it does not go through dev_queue_xmit. I placed printk statements and found that after dev_queue_xmit_nit, control moves to the network interface driver. Please help me understand the packet's control path. Thanks in advance.

linux_lover

From markb@wetlettuce.com Tue Dec 21 02:24:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 02:25:00 -0800 (PST) Received: from piglet.wetlettuce.com (piglet.wetlettuce.com [82.68.149.69]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLAOUvt002295 for ; Tue, 21 Dec 2004 02:24:53 -0800 Received: from robin ([127.0.0.1] helo=wetlettuce.com ident=www-data) by piglet.wetlettuce.com with smtp (Exim 3.35 #1 (Debian)) id 1CghB6-0000hn-00; Tue, 21 Dec 2004 10:23:40 +0000 Received: from 192.102.214.6 (SquirrelMail authenticated user lists) by webmail.wetlettuce.com with HTTP; Tue, 21 Dec 2004 10:23:40 -0000 (GMT) Message-ID: <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> Date: Tue, 21 Dec 2004 10:23:40 -0000 (GMT) Subject: Re: Lockup with 2.6.9-ac15 related to netconsole From: "Mark Broadbent" To: In-Reply-To: <20041221005521.GD5974@waste.org> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> X-Priority: 3 Importance: Normal X-MSMail-Priority: Normal Cc: , , Reply-To: markb@wetlettuce.com X-Mailer: SquirrelMail (version 1.2.6) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MailScanner: Mail is clear of Viree X-Virus-Scanned: 
ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12962 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: markb@wetlettuce.com Precedence: bulk X-list: netdev

Matt Mackall said:
> On Tue, Dec 21, 2004 at 01:22:18AM +0100, Francois Romieu wrote:
>> Matt Mackall :
>> > On Mon, Dec 20, 2004 at 09:42:08AM -0000, Mark Broadbent wrote:
>> > >
>> > > Exactly the same happens, I still get a 'NMI Watchdog detected
>> > > LOCKUP' with the r8169 device using the above patch on top of
>> > > 2.6.10-rc3-bk10.
>> >
>> > Ok, that suggests a problem localized to netpoll itself. Do you have
>> > spinlock debugging turned on by any chance?
>>
>> Any chance of:
>> 1 dev_queue_xmit
>> 2 dev->xmit_lock taken
>> 3 interruption
>> 4 printk
>> 5 netconsole write
>> 6 dev->xmit_lock again
>> 7 lockup
>>
>> ?
>>
>> This is probably the silly question of the day.
>
> Maybe, but the answer isn't obvious to me at the moment as I haven't
> been thinking about such stuff enough lately. Silly response of the
> day:
>
> Mark, can you try this (again completely untested, but at least
> compiles) patch? I'm afraid I don't have a proper test rig to
> reproduce this at the moment. This will attempt to grab the lock, and
> if it fails, will check for recursion. Then it will try to print a
> message on the local console, temporarily disabling netconsole to
> allow the printk to get through..

OK, patch applied and spinlock debugging enabled. Testing with eth1
(r8169) doesn't yield any additional messages, just the standard
'NMI Watchdog detected lockup'.
Thanks Mark -- Mark Broadbent Web: http://www.wetlettuce.com From romieu@fr.zoreil.com Tue Dec 21 04:41:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 04:41:22 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLCes5h012692 for ; Tue, 21 Dec 2004 04:41:15 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iBLCbSvr014009; Tue, 21 Dec 2004 13:37:28 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iBLCbRx0014008; Tue, 21 Dec 2004 13:37:27 +0100 Date: Tue, 21 Dec 2004 13:37:27 +0100 From: Francois Romieu To: Mark Broadbent Cc: mpm@selenic.com, romieu@fr.zoreil.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041221123727.GA13606@electric-eye.fr.zoreil.com> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. 
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12963 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev

Mark Broadbent :
[...]
> OK, patch applied and spinlock debugging enabled. Testing with eth1
> (r8169) doesn't yield any additional messages, just the standard
> 'NMI Watchdog detected lockup'.

Does the modified version below trigger _exactly_ the same hang ?

--- net/core/netpoll.c 2004-12-21 13:09:51.000000000 +0100
+++ net/core/netpoll.c 2004-12-21 13:27:01.000000000 +0100
@@ -31,6 +31,8 @@
 #define MAX_SKBS 32
 #define MAX_UDP_CHUNK 1460
 
+static int netpoll_kill;
+
 static spinlock_t skb_list_lock = SPIN_LOCK_UNLOCKED;
 static int nr_skbs;
 static struct sk_buff *skbs;
@@ -184,11 +186,21 @@ void netpoll_send_skb(struct netpoll *np
 
 repeat:
 	if(!np || !np->dev || !netif_running(np->dev)) {
+too_bad:
 		__kfree_skb(skb);
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
+	if (!spin_trylock(&np->dev->xmit_lock)) {
+		netpoll_kill = 1;
+		goto too_bad;
+	}
+
+	if (netpoll_kill) {
+		if (net_ratelimit())
+			printk(KERN_ERR "netconsole raced");
+		netpoll_kill = 0;
+	}
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*

From markb@wetlettuce.com Tue Dec 21 05:30:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 05:30:54 -0800 (PST) Received: from piglet.wetlettuce.com (piglet.wetlettuce.com [82.68.149.69]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLDUPXF015059 for ; Tue, 21 Dec 2004 05:30:46 -0800 Received: from robin ([127.0.0.1] helo=wetlettuce.com ident=www-data) by piglet.wetlettuce.com with smtp (Exim 3.35 #1 (Debian)) id 1Cgk4o-0000sp-00; Tue, 21 Dec 2004 13:29:22 +0000 Received: from 192.102.214.6 (SquirrelMail authenticated user lists) by webmail.wetlettuce.com with HTTP; Tue, 21 Dec 2004 13:29:22 -0000 (GMT) Message-ID: 
<49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> Date: Tue, 21 Dec 2004 13:29:22 -0000 (GMT) Subject: Re: Lockup with 2.6.9-ac15 related to netconsole From: "Mark Broadbent" To: In-Reply-To: <20041221123727.GA13606@electric-eye.fr.zoreil.com> References: <59719.192.102.214.6.1103214002.squirrel@webmail.wetlettuce.com> <20041216211024.GK2767@waste.org> <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> X-Priority: 3 Importance: Normal X-MSMail-Priority: Normal Cc: , , Reply-To: markb@wetlettuce.com X-Mailer: SquirrelMail (version 1.2.6) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MailScanner: Mail is clear of Viree X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12964 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: markb@wetlettuce.com Precedence: bulk X-list: netdev

Francois Romieu said:
> Mark Broadbent :
> [...]
>> OK, patch applied and spinlock debugging enabled. Testing with eth1
>> (r8169) doesn't yield any additional messages, just the standard
>> 'NMI Watchdog detected lockup'.
>
> Does the modified version below trigger _exactly_ the same hang ?

Using the patch supplied I get no hang, just the message 'netconsole
raced' output to the console, and the packet capture proceeds as normal.
Thanks
Mark

-- 
Mark Broadbent
Web: http://www.wetlettuce.com

From eeb@bartonsoftware.com Tue Dec 21 11:09:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 11:09:27 -0800 (PST) Received: from smtp-out2.blueyonder.co.uk (smtp-out2.blueyonder.co.uk [195.188.213.5]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLJ8vUI003880 for ; Tue, 21 Dec 2004 11:09:18 -0800 Received: from ebpc ([82.33.116.203]) by smtp-out2.blueyonder.co.uk with Microsoft SMTPSVC(5.0.2195.6713); Tue, 21 Dec 2004 19:09:03 +0000 From: "Eric Barton" To: Subject: tcp retransmission Date: Tue, 21 Dec 2004 19:08:33 -0000 Message-ID: <005d01c4e790$78030ff0$0281a8c0@ebpc> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook 8.5, Build 4.71.2173.0 X-Message-Flag: Follow up X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441 Importance: Normal Reply-By: Tue, 21 Dec 2004 17:00:00 -0000 X-OriginalArrivalTime: 21 Dec 2004 19:09:03.0610 (UTC) FILETIME=[892F45A0:01C4E790] X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12965 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eeb@bartonsoftware.com Precedence: bulk X-list: netdev

Hi,

Can someone please help me understand this code in tcp_output.c...

int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
{
	struct tcp_opt *tp = tcp_sk(sk);
	unsigned int cur_mss = tcp_current_mss(sk, 0);
	int err;

	/* Do not sent more than we queued. 1/4 is reserved for possible
	 * copying overhead: frgagmentation, tunneling, mangling etc.
	 */
	if (atomic_read(&sk->sk_wmem_alloc) >
	    min(sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2), sk->sk_sndbuf))
		return -EAGAIN;

...it's the decision to return -EAGAIN that has me baffled.
AFAIK sk_wmem_queued is the number of unacknowledged bytes buffered for sending, whereas sk_wmem_alloc is the total number of bytes allocated for buffering sends. How does this comparison make sense?

I ask because I'm trying to investigate a TCP hangup, where (for some reason) the NIC fails to send an skb, but all the retransmit attempts fail because this condition traps them.

Also, when trawling through the code trying to understand some context, I came across the following in tcp_retransmit_timer()...

	if (tcp_retransmit_skb(sk, skb_peek(&sk->sk_write_queue)) > 0) {
		/* Retransmission failed because of local congestion,
		 * do not backoff.
		 */

...maybe I'm being blind but I don't quite get under which circumstances tcp_retransmit_skb() can return > 0.

Thanks in advance...

-
To unsubscribe from this list: send the line "unsubscribe linux-net" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

From jgarzik@pobox.com Tue Dec 21 12:04:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 12:04:13 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLK3iwX005779 for ; Tue, 21 Dec 2004 12:04:05 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (SSLv3:AES256-SHA:256) (Exim 4.33) id 1CgqE4-00014H-9p; Tue, 21 Dec 2004 20:03:20 +0000 Message-ID: <41C88183.3020906@pobox.com> Date: Tue, 21 Dec 2004 15:03:15 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: James Ketrenos CC: Netdev Subject: Re: Steps for netdev-2.6 inclusion? 
References: <41AE7143.80505@linux.intel.com> In-Reply-To: <41AE7143.80505@linux.intel.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12966 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev James Ketrenos wrote: > > Ok, its been a long time coming, but it appears the ipw* wireless > drivers are to the point where being more proactive at getting them into > the kernel is appropriate (at least based on the frequency of emails I'm > getting of 'why isn't this in mainline?') > > So, what would be the set of steps required to get a version in the > queue for inclusion? Any updates? Where do we stand with this? I would suggest sending a patch for each driver, diff'd against -mm (which auto-includes my netdev-2.6 and wireless-2.6 queues). 
Jeff From romieu@fr.zoreil.com Tue Dec 21 12:53:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 12:53:15 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLKqk8F010809 for ; Tue, 21 Dec 2004 12:53:07 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iBLKmsvr023257; Tue, 21 Dec 2004 21:48:54 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iBLKmrLi023256; Tue, 21 Dec 2004 21:48:54 +0100 Date: Tue, 21 Dec 2004 21:48:53 +0100 From: Francois Romieu To: Mark Broadbent Cc: mpm@selenic.com, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041221204853.GA20869@electric-eye.fr.zoreil.com> References: <34721.192.102.214.6.1103274614.squirrel@webmail.wetlettuce.com> <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. 
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12967 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Mark Broadbent : [...] > Using the patch supplied I get no hang, just the message 'netconsole > raced' output to the console and the packet capture proceeds as normal. > Thanks The patch is more a bandaid for debugging than a real fix. The netconsole will drop some messages until its locking is fixed If you can issue one more test, I'd like to know if some messages appear on the VGA console around the time at which tcpdump is started (the test assumes that netconsole is not used/insmoded at all). Please check that the console log_level is set high enough as it will be really disappointing if nothing appears :o) -- Ueimor From oxymoron@waste.org Tue Dec 21 13:28:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 13:28:36 -0800 (PST) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLLS9C6012113 for ; Tue, 21 Dec 2004 13:28:29 -0800 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBLLRbkC010087; Tue, 21 Dec 2004 15:27:37 -0600 Received: (from oxymoron@localhost) by waste.org (8.12.3/8.12.3/Debian-7.1) id iBLLRbcY010084; Tue, 21 Dec 2004 15:27:37 -0600 Date: Tue, 21 Dec 2004 13:27:37 -0800 From: Matt Mackall To: Francois Romieu Cc: Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041221212737.GK5974@waste.org> References: <20041217215752.GP2767@waste.org> <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> 
<20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041221204853.GA20869@electric-eye.fr.zoreil.com> User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new X-Virus-Status: Clean X-archive-position: 12968 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev On Tue, Dec 21, 2004 at 09:48:53PM +0100, Francois Romieu wrote: > Mark Broadbent : > [...] > > Using the patch supplied I get no hang, just the message 'netconsole > > raced' output to the console and the packet capture proceeds as normal. > > Thanks > > The patch is more a bandaid for debugging than a real fix. The netconsole > will drop some messages until its locking is fixed Unfortunately there's no good way to fix its locking in this circumstance (or the harder case of driver-private locks). I think I'll just have to come up with some scheme for queueing messages that arrive when the queue lock is held. > If you can issue one more test, I'd like to know if some messages appear > on the VGA console around the time at which tcpdump is started (the test > assumes that netconsole is not used/insmoded at all). Please check that > the console log_level is set high enough as it will be really disappointing > if nothing appears :o) I think it's the promiscuous mode message itself that's the problem but I've not had time to reproduce it. -- Mathematics is the supreme nostalgia of our time. 
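
A minimal userspace sketch of the idea discussed in this thread: try the transmit lock instead of spinning on it, and queue the message for later when the lock is already held, rather than deadlocking or dropping it outright. This is an illustration only, not netpoll code. Every name in it (xmit_lock, console_write, console_flush, DEFER_SLOTS, MSG_MAX) is a hypothetical stand-in, the C11 atomic_flag merely plays the role of dev->xmit_lock, and the deferred queue has no locking of its own, which a real implementation would have to add.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEFER_SLOTS 8   /* hypothetical queue depth */
#define MSG_MAX     64  /* hypothetical maximum message length */

/* Hypothetical stand-in for dev->xmit_lock: non-recursive, try-lock only. */
static atomic_flag xmit_lock = ATOMIC_FLAG_INIT;

/* Messages that arrived while the lock was held (not thread-safe here). */
static char deferred[DEFER_SLOTS][MSG_MAX];
static size_t deferred_count;
static size_t dropped_count;

/* Returns true if the lock was free and is now ours. */
bool xmit_trylock(void)
{
	return !atomic_flag_test_and_set(&xmit_lock);
}

void xmit_unlock(void)
{
	atomic_flag_clear(&xmit_lock);
}

/* Pretend to hand a message to the hardware; caller holds xmit_lock. */
static void hw_send(const char *msg)
{
	assert(msg != NULL);
}

/*
 * Send msg now if the lock is free. If the lock is busy (possibly held by
 * our own caller, as in the printk-under-xmit_lock case), queue the message
 * instead of spinning. Returns true only when sent immediately.
 */
bool console_write(const char *msg)
{
	if (!xmit_trylock()) {
		if (deferred_count < DEFER_SLOTS) {
			strncpy(deferred[deferred_count], msg, MSG_MAX - 1);
			deferred[deferred_count][MSG_MAX - 1] = '\0';
			deferred_count++;
		} else {
			dropped_count++; /* queue full: message is lost */
		}
		return false;
	}
	hw_send(msg);
	xmit_unlock();
	return true;
}

/* Send everything queued so far; returns the number of messages flushed. */
size_t console_flush(void)
{
	size_t flushed = 0;

	if (!xmit_trylock())
		return 0; /* still busy, try again later */
	for (size_t i = 0; i < deferred_count; i++) {
		hw_send(deferred[i]);
		flushed++;
	}
	deferred_count = 0;
	xmit_unlock();
	return flushed;
}
```

Calling console_write() while the lock is held returns false and leaves the message queued; a later console_flush(), made once the holder has released the lock, sends the backlog in arrival order. Where such a flush would be triggered in a real driver path is left open here, as it is in the thread.
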
From andreaf@cs.columbia.edu Tue Dec 21 13:30:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 13:30:53 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLLUOUH012498 for ; Tue, 21 Dec 2004 13:30:45 -0800 Received: from lion.cs.columbia.edu (IDENT:afjX2vQmfnkH+NMxU3fUjHD24Tx3eS1J@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iBLLTugf000872 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT); Tue, 21 Dec 2004 16:29:57 -0500 (EST) Received: from [128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iBLLTtqG005969; Tue, 21 Dec 2004 16:29:56 -0500 Message-ID: <41C895D6.2040001@cs.columbia.edu> Date: Tue, 21 Dec 2004 16:29:58 -0500 From: Andrea G Forte User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Very slow change of IP in kernel (slow socket?). Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.21.24 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0' X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12969 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andreaf@cs.columbia.edu Precedence: bulk X-list: netdev Hi all, after some talking I decided to try again and post a specific thread for this problem. 
I noticed that when I change the IP address of the same wireless card (since I am moving to a different subnet I need a new IP), the change actually takes effect in the kernel 300 to 500 ms later. In particular, after changing the IP (ip route add...) and updating the routing table and default gateway, data packets are sent from the new IP only 300 to 500 ms after all of the above has been set.

Does any of you know what this could be related to, or where in the kernel code I could start looking for answers? I have already had feedback suggesting I look into the route_cache; however, this does not look to me like a routing problem but rather a socket problem, perhaps some kind of timer that refreshes the socket info in the kernel.

Any help would be really appreciated.

Thanks all,
Andrea

From romieu@fr.zoreil.com Tue Dec 21 15:01:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 15:01:16 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBLN0mVl020590 for ; Tue, 21 Dec 2004 15:01:09 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iBLMwWvr026244; Tue, 21 Dec 2004 23:58:32 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iBLMwVNW026243; Tue, 21 Dec 2004 23:58:31 +0100 Date: Tue, 21 Dec 2004 23:58:31 +0100 From: Francois Romieu To: Matt Mackall Cc: Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041221225831.GA20910@electric-eye.fr.zoreil.com> References: <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> 
<52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041221212737.GK5974@waste.org> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12970 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Matt Mackall : [...] > I think it's the promiscuous mode message itself that's the problem Yes. dev_mc_add takes dev->xmit_lock and the game is over. Marc, if the patch below happens to work, it should not drop messages like the previous one (it is an ugly short-term suggestion). 
--- net/core/netpoll.c 2004-12-21 13:09:51.000000000 +0100
+++ net/core/netpoll.c 2004-12-21 23:35:25.000000000 +0100
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * We maintain a small pool of fully-sized skbs, to make sure the
@@ -184,11 +187,19 @@ void netpoll_send_skb(struct netpoll *np
 
 repeat:
 	if(!np || !np->dev || !netif_running(np->dev)) {
 		__kfree_skb(skb);
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
+	while (!spin_trylock(&np->dev->xmit_lock)) {
+		if (np->dev->xmit_lock_owner == smp_processor_id()) {
+			struct Qdisc *q = dev->qdisc;
+
+			q->ops->enqueue(skb, q);
+			return;
+		}
+	}
+
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*

From tgraf@suug.ch Tue Dec 21 16:32:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 21 Dec 2004 16:32:15 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBM0Vk8W026740 for ; Tue, 21 Dec 2004 16:32:07 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 8820782; Wed, 22 Dec 2004 01:30:59 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 9E7911C0EA; Wed, 22 Dec 2004 01:31:42 +0100 (CET) Date: Wed, 22 Dec 2004 01:31:42 +0100 From: Thomas Graf To: Patrick McHardy Cc: jamal , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation Message-ID: <20041222003142.GB7884@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041220200739.GX17998@postel.suug.ch> <41C7F833.4000909@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41C7F833.4000909@trash.net> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12971 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

* Patrick McHardy <41C7F833.4000909@trash.net> 2004-12-21 11:17
> Could you make your patchset available somehow ?

http://people.suug.ch/~tgr/patches/queue/

Unfinished and untested.

> I'm not sure which areas of the code I can still
> touch without conflicting with you :)

 include/linux/pkt_cls.h   |    6
 include/linux/rtnetlink.h |    3
 include/net/pkt_cls.h     |   45 +++++
 net/sched/cls_api.c       |  205 +++++++++++++++++++++++++
 net/sched/cls_fw.c        |  145 +++---------------
 net/sched/cls_route.c     |  355 ++++++++++++++++++++++++--------------------
 net/sched/cls_rsvp.h      |   88 +++++------
 net/sched/cls_tcindex.c   |  364 +++++++++++++++++++++++++---------------------
 net/sched/cls_u32.c       |  131 ++++------------
 9 files changed, 766 insertions(+), 576 deletions(-)

From ahu@outpost.ds9a.nl Wed Dec 22 00:28:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 00:28:24 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBM8RkfF016989 for ; Wed, 22 Dec 2004 00:28:07 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 9875E4495; Wed, 22 Dec 2004 09:27:20 +0100 (CET) Date: Wed, 22 Dec 2004 09:27:20 +0100 From: bert hubert To: 
Andrea G Forte Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Very slow change of IP in kernel (slow socket?). Message-ID: <20041222082720.GA26089@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Andrea G Forte , netdev@oss.sgi.com, linux-net@vger.kernel.org References: <41C895D6.2040001@cs.columbia.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41C895D6.2040001@cs.columbia.edu> User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12972 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev On Tue, Dec 21, 2004 at 04:29:58PM -0500, Andrea G Forte wrote: > Does anyone of you know what this could be related to? Or, does anyone 'ip route flush cache' may help. -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO From kaber@trash.net Wed Dec 22 01:36:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 01:36:37 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBM9ZvOG020382 for ; Wed, 22 Dec 2004 01:36:18 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ch2tt-0007pJ-7j; Wed, 22 Dec 2004 10:35:21 +0100 Message-ID: <41C93FAB.9090708@trash.net> Date: Wed, 22 Dec 2004 10:34:35 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Francois Romieu CC: Matt Mackall , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole References: <20041217233524.GA11202@electric-eye.fr.zoreil.com> 
<36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> In-Reply-To: <20041221225831.GA20910@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12973 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Francois Romieu wrote: > Marc, if the patch below happens to work, it should not drop messages > like the previous one (it is an ugly short-term suggestion). > > - spin_lock(&np->dev->xmit_lock); > + while (!spin_trylock(&np->dev->xmit_lock)) { > + if (np->dev->xmit_lock_owner == smp_processor_id()) { > + struct Qdisc *q = dev->qdisc; > + > + q->ops->enqueue(skb, q); Shouldn't this be requeue ? 
Regards Patrick From kaber@trash.net Wed Dec 22 01:38:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 01:39:11 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBM9cQrh020651 for ; Wed, 22 Dec 2004 01:38:47 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ch2w2-0007pf-Rz; Wed, 22 Dec 2004 10:37:35 +0100 Message-ID: <41C94030.4050806@trash.net> Date: Wed, 22 Dec 2004 10:36:48 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: jamal , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041220200739.GX17998@postel.suug.ch> <41C7F833.4000909@trash.net> <20041222003142.GB7884@postel.suug.ch> In-Reply-To: <20041222003142.GB7884@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12974 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * Patrick McHardy <41C7F833.4000909@trash.net> 2004-12-21 11:17 > >>I'm not sure which areas of the code I can still >>touch without conflicting with you :) > > > include/linux/pkt_cls.h | 6 > include/linux/rtnetlink.h | 3 > include/net/pkt_cls.h | 45 +++++ > net/sched/cls_api.c | 205 +++++++++++++++++++++++++ > net/sched/cls_fw.c | 145 +++--------------- > net/sched/cls_route.c | 355 ++++++++++++++++++++++++-------------------- > net/sched/cls_rsvp.h | 88 +++++------ > 
net/sched/cls_tcindex.c | 364 +++++++++++++++++++++++++--------------------- > net/sched/cls_u32.c | 131 ++++------------ > 9 files changed, 766 insertions(+), 576 deletions(-) Thanks. I'm going to take on the action code then. Regards Patrick From kaber@trash.net Wed Dec 22 02:55:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 02:55:49 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMAtGok026512 for ; Wed, 22 Dec 2004 02:55:37 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ch495-0007vS-Nz; Wed, 22 Dec 2004 11:55:08 +0100 Message-ID: <41C9525F.4070805@trash.net> Date: Wed, 22 Dec 2004 11:54:23 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Francois Romieu CC: Matt Mackall , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole References: <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> In-Reply-To: <41C93FAB.9090708@trash.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12975 X-ecartis-version: Ecartis v1.0.0 Sender: 
netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev

Patrick McHardy wrote:
> Francois Romieu wrote:
>
>> Marc, if the patch below happens to work, it should not drop messages
>> like the previous one (it is an ugly short-term suggestion).
>>
>
>> -	spin_lock(&np->dev->xmit_lock);
>> +	while (!spin_trylock(&np->dev->xmit_lock)) {
>> +		if (np->dev->xmit_lock_owner == smp_processor_id()) {
>> +			struct Qdisc *q = dev->qdisc;
>> +
>> +			q->ops->enqueue(skb, q);
>
>
> Shouldn't this be requeue ?

Since the code doesn't dequeue itself, enqueue seems fine to keep
at least the queued messages ordered. But you need to grab
dev->queue_lock, otherwise you risk corrupting qdisc internal data.
You should probably also deal with the noqueue-qdisc, which doesn't
have an enqueue function. So it should look something like this:

while (!spin_trylock(&np->dev->xmit_lock)) {
	if (np->dev->xmit_lock_owner == smp_processor_id()) {
		struct Qdisc *q;

		rcu_read_lock();
		q = rcu_dereference(dev->qdisc);
		if (q->enqueue) {
			spin_lock(&dev->queue_lock);
			q->ops->enqueue(skb, q);
			spin_unlock(&dev->queue_lock);
			netif_schedule(np->dev);
		} else
			kfree_skb(skb);
		rcu_read_unlock();
		return;
	}
}

Regards
Patrick

From thomas.spatzier@de.ibm.com Wed Dec 22 03:00:08 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 03:00:20 -0800 (PST)
Received: from mtagate2.de.ibm.com (mtagate2.de.ibm.com [195.212.29.151])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMAxjxI027132
	for ; Wed, 22 Dec 2004 03:00:06 -0800
Received: from d12nrmr1607.megacenter.de.ibm.com (d12nrmr1607.megacenter.de.ibm.com [9.149.167.49])
	by mtagate2.de.ibm.com (8.12.10/8.12.10) with ESMTP id iBMAxYbP188498
	for ; Wed, 22 Dec 2004 10:59:34 GMT
Received: from d12av02.megacenter.de.ibm.com (d12av02.megacenter.de.ibm.com [9.149.165.228])
	by d12nrmr1607.megacenter.de.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBMB0HpJ142570
	for ; Wed, 22 Dec 2004 12:00:17 +0100
Received: from d12av02.megacenter.de.ibm.com (loopback [127.0.0.1])
	by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id iBMAxXox030196
	for ; Wed, 22 Dec 2004 11:59:33 +0100
Received: from d12ml061.megacenter.de.ibm.com (d12nrml1501.megacenter.de.ibm.com [9.149.164.51] (may be forged))
	by d12av02.megacenter.de.ibm.com (8.12.11/8.12.11) with ESMTP id iBMAxXO9030193;
	Wed, 22 Dec 2004 11:59:33 +0100
In-Reply-To: <41C71FFD.7090308@pobox.com>
Subject: Re: [patch 4/10] s390: network driver.
To: Jeff Garzik
Cc: "David S. Miller" , hadi@cyberus.ca, Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma , Tommy Christensen
X-Mailer: Lotus Notes Release 6.0.2CF1 June 9, 2003
Message-ID:
From: Thomas Spatzier
Date: Wed, 22 Dec 2004 11:56:06 +0100
X-MIMETrack: Serialize by Router on D12ML061/12/M/IBM(Release 6.51HF338 | June 21, 2004) at 22/12/2004 12:00:08
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004
	clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12976
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: thomas.spatzier@de.ibm.com
Precedence: bulk
X-list: netdev

Jeff Garzik wrote on 20.12.2004 19:54:53:
> I haven't heard anything to convince me that the same change should be
> deployed across NNN drivers. The drivers already signal the net core
> that the link is down; to me, that implies there should be code in _one_
> place that handles this condition, not NNN places.
>

This sounds plausible and I'm with Jeff here. For me as the driver
author it's the smallest change. I will put it like this:

on cable gone:
	netif_stop_queue();
	netif_carrier_off();

on cable reconnected:
	netif_carrier_on();
	netif_wake_queue();

Is that ok, i.e. what all drivers do or should do?

For the problems that applications might have (i.e. sockets being
blocked etc.) another solution should be found.
And as Jeff pointed out, this should be a central solution and not be implemented in drivers. Regards, Thomas. From garzik@havoc.gtf.org Wed Dec 22 03:08:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 03:08:34 -0800 (PST) Received: from havoc.gtf.org (havoc.gtf.org [63.115.148.101]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMB86So027957 for ; Wed, 22 Dec 2004 03:08:26 -0800 Received: from havoc.gtf.org (havoc.gtf.org [127.0.0.1]) by havoc.gtf.org (Postfix) with ESMTP id 196517891; Wed, 22 Dec 2004 06:07:54 -0500 (EST) Received: (from garzik@localhost) by havoc.gtf.org (8.12.10/8.12.10/Submit) id iBMB7qQR012019; Wed, 22 Dec 2004 06:07:52 -0500 Date: Wed, 22 Dec 2004 06:07:52 -0500 From: Jeff Garzik To: Thomas Spatzier Cc: "David S. Miller" , hadi@cyberus.ca, Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma , Tommy Christensen Subject: Re: [patch 4/10] s390: network driver. Message-ID: <20041222110752.GA11969@havoc.gtf.org> References: <41C71FFD.7090308@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12977 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev On Wed, Dec 22, 2004 at 11:56:06AM +0100, Thomas Spatzier wrote: > on cable reconnected: > netif_carrier_on(); > netif_wake_queue(); Side note to all driver authors, make sure you only ever start or wake the queue if there is truly space available for another skb. 
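[Editorial sketch, not part of the original message.] The side note above can be illustrated with a small user-space model of a fixed-size TX ring. This is a sketch under stated assumptions, not kernel code: `struct tx_ring`, `ring_space()`, `ring_xmit()`, `ring_tx_done()`, `ring_link_down()` and `ring_link_up()` are hypothetical names, and the two booleans only stand in for the state that netif_stop_queue()/netif_wake_queue() and netif_carrier_off()/netif_carrier_on() manipulate in a real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a driver TX ring; not kernel code. */
#define TX_RING_SIZE 4

struct tx_ring {
	size_t in_flight;     /* descriptors currently owned by the "hardware" */
	bool   queue_stopped; /* stands in for netif_stop_queue()/netif_wake_queue() */
	bool   carrier_ok;    /* stands in for netif_carrier_on()/netif_carrier_off() */
};

size_t ring_space(const struct tx_ring *r)
{
	return TX_RING_SIZE - r->in_flight;
}

/* Models the transmit entry point: take one slot, stop the queue when full. */
bool ring_xmit(struct tx_ring *r)
{
	if (r->queue_stopped || ring_space(r) == 0)
		return false;
	r->in_flight++;
	if (ring_space(r) == 0)
		r->queue_stopped = true;
	return true;
}

/* Models TX completion: reclaim n slots, wake only if that is useful. */
void ring_tx_done(struct tx_ring *r, size_t n)
{
	assert(n <= r->in_flight);
	r->in_flight -= n;
	if (r->queue_stopped && r->carrier_ok && ring_space(r) > 0)
		r->queue_stopped = false;
}

/* Cable gone: stop the queue, then drop the carrier. */
void ring_link_down(struct tx_ring *r)
{
	r->queue_stopped = true;
	r->carrier_ok = false;
}

/* Cable back: carrier on, but wake only if another packet actually fits. */
void ring_link_up(struct tx_ring *r)
{
	r->carrier_ok = true;
	if (ring_space(r) > 0)
		r->queue_stopped = false;
}
```

The point of the model is the guard in `ring_link_up()`: if the ring was already full when the link dropped, the queue stays stopped until a completion frees a slot.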
Jeff From romieu@fr.zoreil.com Wed Dec 22 04:40:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 04:41:00 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMCeNiP003990 for ; Wed, 22 Dec 2004 04:40:44 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.12.10/8.12.1) with ESMTP id iBMCdfvr004364; Wed, 22 Dec 2004 13:39:41 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.12.10/8.12.10/Submit) id iBMCdfFD004363; Wed, 22 Dec 2004 13:39:41 +0100 Date: Wed, 22 Dec 2004 13:39:41 +0100 From: Francois Romieu To: Patrick McHardy Cc: Matt Mackall , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041222123940.GA4241@electric-eye.fr.zoreil.com> References: <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> <41C9525F.4070805@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41C9525F.4070805@trash.net> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12978 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Patrick McHardy : [...] > at least the queued messages ordered. 
But you need to grab > dev->queue_lock, otherwise you risk corrupting qdisc internal data. > You should probably also deal with the noqueue-qdisc, which doesn't > have an enqueue function. So it should look something like this: If I am not mistaken, a failure on spin_trylock + the test on xmit_lock_owner imply that it is safe to directly handle the queue. It means that qdisc_run() has been interrupted on the current cpu and the other paths seem fine as well. Counter-example is welcome (no joke). Of course the patch is completely ugly and violates any layering principle one could think of. It was not submitted for inclusion :o) > while (!spin_trylock(&np->dev->xmit_lock)) { > if (np->dev->xmit_lock_owner == smp_processor_id()) { > struct Qdisc *q; > > rcu_read_lock(); > q = rcu_dereference(dev->qdisc); > if (q->enqueue) { > spin_lock(&dev->queue_lock); I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted on the current cpu and a printk is issued as dev->queue_lock will have been taken elsewhere. -- Ueimor From hadi@cyberus.ca Wed Dec 22 05:10:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:10:36 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDA8uR005696 for ; Wed, 22 Dec 2004 05:10:28 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Ch6Fv-0002XV-O8 for netdev@oss.sgi.com; Wed, 22 Dec 2004 08:10:19 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ch6Fs-0007HB-2x; Wed, 22 Dec 2004 08:10:16 -0500 Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041221001621.GY17998@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> <1103552830.1049.355.camel@jzny.localdomain> <20041221001621.GY17998@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103721013.1090.30.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 22 Dec 2004 08:10:13 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12979 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Dont have time to look at it right now - If you can also have it generate traffic in that framework and validate results that would be similar to what i have. cheers, jamal On Mon, 2004-12-20 at 19:16, Thomas Graf wrote: > * jamal <1103552830.1049.355.camel@jzny.localdomain> 2004-12-20 09:27 > > I just stared quickly at what Thomas has and realize its not really > > automated. In my case it is easier because i can click on the proverbial > > one-button and run 20 tests (including a subset of the policer ones) > > and even capturing tcpdumps. I have attached a sample testcase. > > I've put up a snapshot of what I have at: > http://people.suug.ch/~tgr/tmp/tc-testsuite.tar.gz > > .. and a special for jamal not including iproute2 sources > so you can save a few bucks ;-> > http://people.suug.ch/~tgr/tmp/tc-testsuite-jamal.tar.bz2 > > I'm not sure what you mean, it's all automated and actually > quite similar to what you've included. Actually, we could > simply run your testscript under my testsuite if you > replace the "tc" with a variable given to the script. 
>
> My indev test: (requires a patched iproute2 to correctly
> print [ indev INDEV ] in usage text, patches follow later)
>
> ----
> #!/bin/bash
> # vim: ft=sh
> #
> source lib/generic.sh
>
> TYPE=$1; shift
> FILTER=$@
>
> HELP=`$TC filter add ${TYPE} help 2>&1`
>
> if [ "`echo $HELP | grep \"indev DEV\"`" ]; then
> 	if [ $CONFIG_NET_CLS_IND ]; then
> 		ts_tc "attr-indev" "filter addition" $FILTER indev "eth1"
> 	else
> 		ts_log "attr-indev: No INDEV support, checking error return value"
> 		RES=`$TC $FILTER indev "eth1" 2>&1`
> 		if [ "`echo $RES | head -1`" != \
> 			"RTNETLINK answers: Operation not supported" ]; then
> 			echo "Warning: Kernel did not report EOPNOTSUPP"
> 		fi
> 	fi
> else
> 	# no support for indev, add filter anyway to not break deletion
> 	ts_log "ip version doesn't support indev, skipping"
> 	ts_tc "attr-indev" "dummy addition" $FILTER
> fi
> ---
>
> Here's an example run:
>
> tgr:axs ~/dev/tc-testsuite make alltests
> Running cls-testbed.t [iproute2-2.4.7/2.6.10-rc3-bk12]: FAILED
> Running cls-testbed.t [iproute2-2.6.9/2.6.10-rc3-bk12]: PASS
> Running dsmark.t [iproute2-2.4.7/2.6.10-rc3-bk12]: FAILED
> Running dsmark.t [iproute2-2.6.9/2.6.10-rc3-bk12]: PASS
> Running std-cbq.t [iproute2-2.4.7/2.6.10-rc3-bk12]: PASS
> Running std-cbq.t [iproute2-2.6.9/2.6.10-rc3-bk12]: PASS
> tgr:axs ~/dev/tc-testsuite
>
> So, 2 of them failed, let's see what happened:
> tgr:axs ~/dev/tc-testsuite cat results/cls-testbed.t.iproute2-2.4.7.out
> ...
> Preparing classifier testbed with qdisc dsmark
> cls-testbed: dsmark root qdisc creation failed:
> command: iproute2/iproute2-2.4.7/tc/tc qdisc add dev lo root handle 10:0 dsmark indices 64 default_index 1 set_tc_index
> stderr output:
> Unknown qdisc "dsmark", hence option "indices" is unparsable
> ...
>
> I'll take a closer look at tcng to see how we can put the best parts of both together.
> It will definitely be no problem to integrate your expect based tests.
> From hadi@cyberus.ca Wed Dec 22 05:14:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:14:27 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDE0Vn006300 for ; Wed, 22 Dec 2004 05:14:20 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Ch6Je-0002Oh-6b for netdev@oss.sgi.com; Wed, 22 Dec 2004 08:14:10 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ch6Jc-000841-W4; Wed, 22 Dec 2004 08:14:09 -0500 Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs From: jamal Reply-To: hadi@cyberus.ca To: "David S. Miller" Cc: kaber@trash.net, tgraf@suug.ch, netdev@oss.sgi.com In-Reply-To: <20041220170222.5ee14588.davem@davemloft.net> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> <41C687EE.1090205@trash.net> <1103552026.1048.324.camel@jzny.localdomain> <20041220170222.5ee14588.davem@davemloft.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103721246.1093.39.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 22 Dec 2004 08:14:06 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12980 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-20 at 20:02, David S. Miller wrote: > On 20 Dec 2004 09:13:46 -0500 > jamal wrote: > > > Certainly not a big deal; shouldnt care if once in a while tcpdump > > actually gets to see the real packet that went out the wire. > > It's not just tcpdump. 
Any modification of the packet data
> for a shared SKB is illegal, no matter where it occurs.
> This can corrupt TCP packets, which share the transmitted
> packet with the socket retransmit queue.
>
> We have a similar problem with TSO and some gigabit cards whose
> drivers muck with the iphdr->tot_len field on transmit. I still
> am not sure how I want to address that case yet. Since transmitted
> TCP data packets are always shared/cloned, we'll have to do a data
> copy on every TSO send on these cards which frankly nullifies much
> of the performance gain TSO gives. If we end up fixing it via a copy
> we'll probably need to seriously consider not doing TSO unless we
> are doing sendfile.

Ok, makes sense. I can see how TSO may not be useful if you have to
copy. I suppose NICs with hardware based retransmits will behave
differently.

cheers,
jamal

From tgraf@suug.ch Wed Dec 22 05:32:36 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:32:44 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191])
	by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDWFbg007527
	for ; Wed, 22 Dec 2004 05:32:36 -0800
Received: from postel.suug.ch (unknown [195.134.158.23])
	(using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits))
	(No client certificate requested)
	by b.mx.projectdream.org (Postfix) with ESMTP id 2801482;
	Wed, 22 Dec 2004 14:32:09 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001)
	id 7AA9E1C0EA; Wed, 22 Dec 2004 14:32:51 +0100 (CET)
Date: Wed, 22 Dec 2004 14:32:51 +0100
From: Thomas Graf
To: jamal
Cc: Patrick McHardy , "David S.
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer Message-ID: <20041222133251.GC7884@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> <1103552830.1049.355.camel@jzny.localdomain> <20041221001621.GY17998@postel.suug.ch> <1103721013.1090.30.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103721013.1090.30.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12981 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103721013.1090.30.camel@jzny.localdomain> 2004-12-22 08:10 > Dont have time to look at it right now - If you can also have it > generate traffic in that framework and validate results that would > be similar to what i have. Sure, no problem. Here's an example: ts_tc "attr-police" "police creation" $FILTER police rate 2kbit buffer 10k drop ts_log "cls-police: Sending 10 ICMP echo requests" hping2 -c 10 -1 --fast ${DEST} &> /dev/null ts_tc "cls-police" "police dumping" -s -s -d filter list dev $DEV parent 10:0 It could be done a lot better but I'm not aiming at replacing tcsim but rather to complement it so we have tcsim to validate the correctness of queueing and classification algorithms and my naive brutal testing framework for regression and stress testing under real conditions. 
From hadi@cyberus.ca Wed Dec 22 05:32:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:33:04 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDWaYn007544 for ; Wed, 22 Dec 2004 05:32:57 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1Ch6bj-00039D-2s for netdev@oss.sgi.com; Wed, 22 Dec 2004 08:32:51 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ch6bg-0002YP-CD; Wed, 22 Dec 2004 08:32:48 -0500 Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041222003142.GB7884@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041220200739.GX17998@postel.suug.ch> <41C7F833.4000909@trash.net> <20041222003142.GB7884@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103722366.1089.75.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 22 Dec 2004 08:32:46 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12982 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-21 at 19:31, Thomas Graf wrote: > * Patrick McHardy <41C7F833.4000909@trash.net> 2004-12-21 11:17 > > Could you make your patchset available somehow ? > > http://people.suug.ch/~tgr/patches/queue/ > > Unfinished and untested. I just took a quick glimpse. 
1) Recall: Policer will have to die at some point - the only reason for
its existence is backward compat. New iproute2 code should sooner
rather than later stop using that interface so we can kill it.
I suspect we can kill it in a year or two and definitely the day 2.7
comes out.

2) The name tcf_attrs doesn't sound right - attributes are normally data
pieces, not methods. Can't think of a good name.

3) What can i say? dang - this indev thing is getting out of control ;->
If you are going to go this far for beautification's sake then kill the
.indev thing please before it becomes a beast.
Do what we discussed a while back:
- have a generic, very basic extended match API which indev
uses and which gets invoked from the classifier.
It should take no more than one page to write the indev extension - if
it exceeds that you are doing something wrong.
There should be a capability to mix and match these extenders so i can
say in u32 something like:
	match ip src X match extend indev src eth0 match protocol tcp
	match extended metadata fwmark 0x10
etc.

I think it's time we did this right rather than deferring.
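[Editorial sketch, not part of the original message.] One possible shape for the extended match API described in point 3, written as a user-space model. Everything here is an assumption made for illustration: `struct pkt_meta`, `struct ext_match`, the `match_*()` helpers and `ext_match_all()` are hypothetical names and do not come from the thread or from any kernel header. The sketch only shows the "mix and match" idea: each extender is a small callback, and a classifier rule is the AND of a list of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-packet metadata an extender may look at. */
struct pkt_meta {
	const char *indev;    /* name of the input device */
	uint32_t    fwmark;   /* firewall mark */
	uint8_t     ip_proto; /* IP protocol number, e.g. 6 for TCP */
};

/* One extender: a callback plus its private argument. */
struct ext_match {
	bool (*match)(const struct pkt_meta *pkt, const void *arg);
	const void *arg;
};

/* indev extender: compares the input device name. */
bool match_indev(const struct pkt_meta *pkt, const void *arg)
{
	return pkt->indev != NULL && strcmp(pkt->indev, (const char *)arg) == 0;
}

/* metadata extender: compares the firewall mark. */
bool match_fwmark(const struct pkt_meta *pkt, const void *arg)
{
	return pkt->fwmark == *(const uint32_t *)arg;
}

/* protocol extender: compares the IP protocol number. */
bool match_ip_proto(const struct pkt_meta *pkt, const void *arg)
{
	return pkt->ip_proto == *(const uint8_t *)arg;
}

/* Called from the classifier: all n extenders must match. */
bool ext_match_all(const struct pkt_meta *pkt,
		   const struct ext_match *m, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		assert(m[i].match != NULL);
		if (!m[i].match(pkt, m[i].arg))
			return false;
	}
	return true;
}
```

In this model the indev extender itself is a handful of lines, which is in the spirit of the "no more than one page" remark; how such extenders would be registered and configured over netlink is left open here.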
Of course all backward compatibility rules apply ;-> cheers, jamal From hadi@cyberus.ca Wed Dec 22 05:33:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:33:37 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDX92h007664 for ; Wed, 22 Dec 2004 05:33:30 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Ch6cE-0006L2-Bl for netdev@oss.sgi.com; Wed, 22 Dec 2004 08:33:22 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ch6cA-0002f6-9f; Wed, 22 Dec 2004 08:33:18 -0500 Subject: Re: Lockup with 2.6.9-ac15 related to netconsole From: jamal Reply-To: hadi@cyberus.ca To: Francois Romieu Cc: Patrick McHardy , Matt Mackall , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <20041222123940.GA4241@electric-eye.fr.zoreil.com> References: <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> <41C9525F.4070805@trash.net> <20041222123940.GA4241@electric-eye.fr.zoreil.com> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103722395.1090.77.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 22 Dec 2004 08:33:15 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12983 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-22 at 07:39, Francois Romieu wrote: > If I am not mistaken, a failure on spin_trylock + the test on > xmit_lock_owner imply that it is safe to directly handle the > queue. It means that qdisc_run() has been interrupted on the > current cpu and the other paths seem fine as well. Counter-example > is welcome (no joke). Think more than 2 processors ;-> cheers, jamal From hadi@cyberus.ca Wed Dec 22 05:48:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:48:51 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDmNQI009580 for ; Wed, 22 Dec 2004 05:48:43 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Ch6qz-0003zC-GX for netdev@oss.sgi.com; Wed, 22 Dec 2004 08:48:37 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ch6qs-0004vh-Lf; Wed, 22 Dec 2004 08:48:30 -0500 Subject: Re: [patch 4/10] s390: network driver. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Spatzier Cc: Jeff Garzik , "David S. 
Miller" , Hasso Tepper , Herbert Xu , netdev@oss.sgi.com, Paul Jakma , Tommy Christensen
In-Reply-To:
References:
Content-Type: text/plain
Organization: jamalopolous
Message-Id: <1103723307.1092.83.camel@jzny.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2
Date: 22 Dec 2004 08:48:28 -0500
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004
	clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 12984
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

On Wed, 2004-12-22 at 05:56, Thomas Spatzier wrote:
>
> Is that ok, i.e. what all drivers do or should do?
>
> For the problems that applications might have (i.e. sockets being
> blocked etc.) another solution should be found.
> And as Jeff pointed out, this should be a central solution and
> not be implemented in drivers.
>

I think this needs to be resolved too.
It is possible to have a centralized action instead of requiring drivers
to make changes if we know the driver is in the carrier-off state.
What that would require is that on cable gone, you just say:
	netif_carrier_off();
and the top layer code will junk the packets before they hit the driver.
This way the socket code can continue sending whatever it wants, but if
there's no link, then it's fair to drop those packets?
If this is acceptable i can generate a quick patch.
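[Editorial sketch, not part of the original message.] The centralized drop proposed above might be modelled roughly as follows. This is a user-space illustration under assumptions, not the patch being offered: `struct netdev_model`, `model_carrier_off()`, `model_carrier_on()`, `driver_xmit()` and `core_xmit()` are hypothetical names. The driver only flips the carrier flag; the single check lives in the core transmit path, which drops packets while the link is down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a network device; not the kernel structure. */
struct netdev_model {
	bool          carrier_ok; /* set/cleared by the driver on link events */
	unsigned long tx_packets; /* packets handed to the driver */
	unsigned long tx_dropped; /* packets junked by the core */
};

enum xmit_result { XMIT_OK, XMIT_DROPPED };

/* What the driver does on "cable gone" in this model: nothing else. */
void model_carrier_off(struct netdev_model *dev)
{
	dev->carrier_ok = false;
}

/* What the driver does on "cable reconnected". */
void model_carrier_on(struct netdev_model *dev)
{
	dev->carrier_ok = true;
}

/* Stand-in for the driver's transmit routine. */
void driver_xmit(struct netdev_model *dev, size_t len)
{
	assert(dev->carrier_ok); /* the core must not call us without link */
	(void)len;
	dev->tx_packets++;
}

/* Central transmit path: the one place that handles "no carrier". */
enum xmit_result core_xmit(struct netdev_model *dev, size_t len)
{
	if (!dev->carrier_ok) {
		dev->tx_dropped++; /* junk the packet before it hits the driver */
		return XMIT_DROPPED;
	}
	driver_xmit(dev, len);
	return XMIT_OK;
}
```

With this split, senders are never blocked by a dead link, at the cost of silently losing whatever is sent while the carrier is off; whether that trade-off is acceptable is exactly the question the message asks.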
cheers, jamal From tgraf@suug.ch Wed Dec 22 05:50:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:50:21 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDnr8Q009895 for ; Wed, 22 Dec 2004 05:50:14 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 2040F82; Wed, 22 Dec 2004 14:49:48 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id DA3981C0EA; Wed, 22 Dec 2004 14:50:30 +0100 (CET) Date: Wed, 22 Dec 2004 14:50:30 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , kaber@trash.net, netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark must take care of shared/cloned skbs Message-ID: <20041222135030.GD7884@postel.suug.ch> References: <20041218170017.GH17998@postel.suug.ch> <1103487827.1048.188.camel@jzny.localdomain> <20041219203641.GL17998@postel.suug.ch> <41C687EE.1090205@trash.net> <1103552026.1048.324.camel@jzny.localdomain> <20041220170222.5ee14588.davem@davemloft.net> <1103721246.1093.39.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103721246.1093.39.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12985 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103721246.1093.39.camel@jzny.localdomain> 2004-12-22 08:14 > On Mon, 2004-12-20 at 20:02, David S. Miller wrote: > > We have a similar problem with TSO and some gigabit cards whose > > drivers muck with the iphdr->tot_len field on transmit. 
I still > > am not sure how I want to address that case yet. Since transmitted > > TCP data packets are always shared/cloned, we'll have to do a data > > copy on every TSO send on these cards which frankly nullifies much > > of the performance gain TSO gives. If we end of fixing it via a copy > > we'll probably need to seriously consider not doing TSO unless we > > are doing sendfile. > > Ok, makes sense. I can see how TSO may not be useful if you have to > copy. I suppose NICS with hardware based retransmits will behave > differently. I think we should move all packet manngling to be an action and warn about the loss of performance. We might be able to add a fast path for simple modifications such as dsmark dscp maping and other simple header modifications by not copying but mangle the fragment itself assuming we can accept some drawbacks with packet sockets. More advanced mangling as done by pedit is less of a problem since it will likely be used in combination with heavy filtering but we could of course do some analysis of the edit request and go the fast path under some circumstances. From hadi@cyberus.ca Wed Dec 22 05:54:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 05:54:52 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMDsPKF010763 for ; Wed, 22 Dec 2004 05:54:45 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Ch6wq-0002mJ-Hu for netdev@oss.sgi.com; Wed, 22 Dec 2004 08:54:40 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ch6wn-0005wK-23; Wed, 22 Dec 2004 08:54:37 -0500 Subject: Re: [PATCH] PKT_SCHED: Provide compat policer stats in action policer From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , "David S. 
Miller" , netdev@oss.sgi.com, Werner Almesberger In-Reply-To: <20041222133251.GC7884@postel.suug.ch> References: <20041215130128.GK8493@postel.suug.ch> <1103119774.1077.74.camel@jzny.localdomain> <41C05B60.6030504@trash.net> <1103484249.1046.143.camel@jzny.localdomain> <41C6A6CC.1050105@trash.net> <1103552830.1049.355.camel@jzny.localdomain> <20041221001621.GY17998@postel.suug.ch> <1103721013.1090.30.camel@jzny.localdomain> <20041222133251.GC7884@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1103723674.1090.90.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 22 Dec 2004 08:54:34 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12986 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-22 at 08:32, Thomas Graf wrote: > * jamal <1103721013.1090.30.camel@jzny.localdomain> 2004-12-22 08:10 > > Dont have time to look at it right now - If you can also have it > > generate traffic in that framework and validate results that would > > be similar to what i have. > > Sure, no problem. Here's an example: > > ts_tc "attr-police" "police creation" $FILTER police rate 2kbit buffer 10k drop > > ts_log "cls-police: Sending 10 ICMP echo requests" > hping2 -c 10 -1 --fast ${DEST} &> /dev/null > > ts_tc "cls-police" "police dumping" -s -s -d filter list dev $DEV parent 10:0 > Good start. Now if there was two hooks which could be empty. First one to say what to do with traffic - you have it going to /dev/null Note that i had tcpdump capture it because i wanted to analyze it for pedit sake at the second spot: after you finish testing. 
In order to really tell whether test passed or not, you need to see those counters and analyze them too ;-> (I dont - but it is valuable). > It could be done a lot better but I'm not aiming at replacing tcsim > but rather to complement it so we have tcsim to validate the > correctness of queueing and classification algorithms and my naive brutal > testing framework for regression and stress testing under real conditions. I think tcsim for testing the infrastructure sounds reasonable. Whats Mr. Almesberger's long term plans for it? CCed him. cheers, jamal From tgraf@suug.ch Wed Dec 22 06:26:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 06:26:21 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMEPq7H012710 for ; Wed, 22 Dec 2004 06:26:13 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 9D61682; Wed, 22 Dec 2004 15:25:54 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 4FFEA1C0EA; Wed, 22 Dec 2004 15:26:37 +0100 (CET) Date: Wed, 22 Dec 2004 15:26:37 +0100 From: Thomas Graf To: jamal Cc: Patrick McHardy , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation Message-ID: <20041222142637.GE7884@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041220200739.GX17998@postel.suug.ch> <41C7F833.4000909@trash.net> <20041222003142.GB7884@postel.suug.ch> <1103722366.1089.75.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1103722366.1089.75.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12987 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1103722366.1089.75.camel@jzny.localdomain> 2004-12-22 08:32 > On Tue, 2004-12-21 at 19:31, Thomas Graf wrote: > > * Patrick McHardy <41C7F833.4000909@trash.net> 2004-12-21 11:17 > > > Could you make your patchset available somehow ? > > > > http://people.suug.ch/~tgr/patches/queue/ > > > > Unfinished and untested. > > I just took a quick glimpse. > > 1)Recall: Policer will have to die at some point - only reason for its > existence is for backward compat. > New iproute2 code sooner than later stop using that inteface so we can > kill it. I suspect we can kill it in a year or two and definetely the > day 2.7 comes out. I fully agree. The patchset looks like a beautification but it isn't. Its main purpose is to make changing consistent again. I tried achieving this with the existing API and the ifdef hell got horrible and ended up having over 60 lines of redundant code per classifier. I would rather be implementing new fancy stuff but fixing the existing issues comes first. > 2) The name tcf_attrs doesnt sound right - attributes are normally > data pieces not methods. Cant think of a good name. 
I feel the same, I was thinking of extensions but wasn't pleased either.
Suggestions appreciated.

> 3) What can i say? dang - this indev thing is getting out of control ;->

Too late, it's there, we must maintain it ;->

> If you are going to go this far for beautification sake then
> kill the .indev thing please before it becomes a beast.

Again, it's not for beautification, validating the indev tlv must be
done before changing native classifier attributes which lead to more
ifdefs. Putting it into tcf_attrs saves code and makes it available
to the other classifiers at the same time. It's a dodgy situation, the
current status is unsatisfying and all changes made now will be removed
again.

> Do what we discussed a while back:
> - have a generic very basic extended generic match API which indev uses
> that gets invoked from the classifier. It should take no more than one
> page to write the indev extension - if it exceeds that you are doing
> something wrong. There should be capability to mix and match these
> extenders so i can say in u32 something like:
> match ip src X
> match extend indev src eth0
> match protocol tcp
> match extended metadata fwmark 0x10

My actual plan was to introduce a nested TCA_TYPE_ATTRS TLV containing
all the generic matches and attributes so a classifier no longer has to
be changed but the backward compatibility will be a PITA.

> I think its time we did this right than defering.

Indeed, what about this: I'll try and do a proposal on a new generic
matching layer holding the action bits and providing backward
compatibility to police/indev. We can then build the metadata match on
top of it. I'm going to post the classifier I'm working on for a long
time even if it's still buggy and unfinished so we can at least take
over some ideas and concepts and then come up with something final that
doesn't need to be changed again in 2 months.

From markb@wetlettuce.com Wed Dec 22 06:38:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 06:38:16 -0800 (PST) Received: from piglet.wetlettuce.com (piglet.wetlettuce.com [82.68.149.69]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMEbjK9013688 for ; Wed, 22 Dec 2004 06:38:06 -0800 Received: from robin ([127.0.0.1] helo=wetlettuce.com ident=www-data) by piglet.wetlettuce.com with smtp (Exim 3.35 #1 (Debian)) id 1Ch7cK-0000qt-00; Wed, 22 Dec 2004 14:37:32 +0000 Received: from 192.102.214.6 (SquirrelMail authenticated user lists) by webmail.wetlettuce.com with HTTP; Wed, 22 Dec 2004 14:37:32 -0000 (GMT) Message-ID: <60880.192.102.214.6.1103726252.squirrel@webmail.wetlettuce.com> Date: Wed, 22 Dec 2004 14:37:32 -0000 (GMT) Subject: Re: Lockup with 2.6.9-ac15 related to netconsole From: "Mark Broadbent" To: In-Reply-To: <41C9525F.4070805@trash.net> References: <20041217233524.GA11202@electric-eye.fr.zoreil.com> <36901.192.102.214.6.1103535728.squirrel@webmail.wetlettuce.com> <20041220211419.GC5974@waste.org> <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> <41C9525F.4070805@trash.net> X-Priority: 3 Importance: Normal X-MSMail-Priority: Normal Cc: , , , Reply-To: markb@wetlettuce.com X-Mailer: SquirrelMail (version 1.2.6) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-MailScanner: Mail is clear of Viree X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12988 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: markb@wetlettuce.com Precedence: bulk X-list: netdev

Patrick McHardy said:
> Patrick McHardy wrote:
>> Francois Romieu wrote:
>>
>>> Marc, if the patch below happens to work, it should not drop messages
>>> like the previous one (it is an ugly short-term suggestion).
>>>
>>
>>> -	spin_lock(&np->dev->xmit_lock);
>>> +	while (!spin_trylock(&np->dev->xmit_lock)) {
>>> +		if (np->dev->xmit_lock_owner == smp_processor_id()) {
>>> +			struct Qdisc *q = dev->qdisc;
>>> +
>>> +			q->ops->enqueue(skb, q);
>>
>>
>> Shouldn't this be requeue ?
>
> Since the code doesn't dequeue itself, enqueue seems fine to keep
> at least the queued messages ordered. But you need to grab
> dev->queue_lock, otherwise you risk corrupting qdisc internal data. You
> should probably also deal with the noqueue-qdisc, which doesn't have an
> enqueue function. So it should look something like this:
>
> while (!spin_trylock(&np->dev->xmit_lock)) {
> 	if (np->dev->xmit_lock_owner == smp_processor_id()) {
> 		struct Qdisc *q;
>
> 		rcu_read_lock();
> 		q = rcu_dereference(dev->qdisc);
> 		if (q->enqueue) {
> 			spin_lock(&dev->queue_lock);
> 			q->ops->enqueue(skb, q);
> 			spin_unlock(&dev->queue_lock);
> 			netif_schedule(np->dev);
> 		} else
> 			kfree_skb(skb);
> 		rcu_read_unlock();
> 		return;
> 	}
> }

I've tried both patches (modified slightly to get them to compile) but
they both produce hard NMI detected lockups (as before).

Thanks
Mark

Patches after modification to allow compilation:

Francois' patch (against 2.6.10-rc3-bk10):

diff -X dontdiff -urN linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c linux-2.6.9-rc3-bk10/net/core/netpoll.c
--- linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c	2004-12-22 12:09:32.000000000 +0000
+++ linux-2.6.9-rc3-bk10/net/core/netpoll.c	2004-12-22 14:13:54.000000000 +0000
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * We maintain a small pool of fully-sized skbs, to make sure the
@@ -188,7 +189,15 @@
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
+	while (!spin_trylock(&np->dev->xmit_lock)) {
+		if (np->dev->xmit_lock_owner == smp_processor_id()) {
+			struct Qdisc *q = np->dev->qdisc;
+
+			q->ops->enqueue(skb, q);
+			return;
+		}
+	}
+
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*

Patrick's patch (against 2.6.10-rc3-bk10):

diff -X dontdiff -urN linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c linux-2.6.9-rc3-bk10/net/core/netpoll.c
--- linux-2.6.9-rc3-bk10.orig/net/core/netpoll.c	2004-12-22 12:09:32.000000000 +0000
+++ linux-2.6.9-rc3-bk10/net/core/netpoll.c	2004-12-22 11:08:06.000000000 +0000
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * We maintain a small pool of fully-sized skbs, to make sure the
@@ -188,7 +189,24 @@
 		return;
 	}
 
-	spin_lock(&np->dev->xmit_lock);
+	while (!spin_trylock(&np->dev->xmit_lock)) {
+		if (np->dev->xmit_lock_owner == smp_processor_id()) {
+			struct Qdisc *q;
+
+			rcu_read_lock();
+			q = rcu_dereference(np->dev->qdisc);
+			if (q->enqueue) {
+				spin_lock(&np->dev->queue_lock);
+				q->ops->enqueue(skb, q);
+				spin_unlock(&np->dev->queue_lock);
+				netif_schedule(np->dev);
+			} else
+				__kfree_skb(skb);
+			rcu_read_unlock();
+			return;
+		}
+	}
+
 	np->dev->xmit_lock_owner = smp_processor_id();
 
 	/*

--
Mark Broadbent
Web: http://www.wetlettuce.com

From kaber@trash.net Wed Dec 22 06:58:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 06:58:52 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com
(8.13.0/8.13.0) with ESMTP id iBMEwLaY015009 for ; Wed, 22 Dec 2004 06:58:43 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ch7wo-0008CW-Jc; Wed, 22 Dec 2004 15:58:42 +0100 Message-ID: <41C98B75.9020802@trash.net> Date: Wed, 22 Dec 2004 15:57:57 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Francois Romieu CC: Matt Mackall , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole References: <20041221002218.GA1487@electric-eye.fr.zoreil.com> <20041221005521.GD5974@waste.org> <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> <41C9525F.4070805@trash.net> <20041222123940.GA4241@electric-eye.fr.zoreil.com> In-Reply-To: <20041222123940.GA4241@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12989 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Francois Romieu wrote: > Patrick McHardy : > [...] > >>at least the queued messages ordered. But you need to grab >>dev->queue_lock, otherwise you risk corrupting qdisc internal data. >>You should probably also deal with the noqueue-qdisc, which doesn't >>have an enqueue function. 
So it should look something like this:
>
>
> If I am not mistaken, a failure on spin_trylock + the test on
> xmit_lock_owner imply that it is safe to directly handle the
> queue. It means that qdisc_run() has been interrupted on the
> current cpu and the other paths seem fine as well. Counter-example
> is welcome (no joke).

enqueue is only protected by dev->queue_lock, and dev->queue_lock
is dropped as soon as dev->xmit_lock is grabbed, so any other CPU
might call enqueue at the same time.

Example:

CPU1                                    CPU2
dev_queue_xmit                          dev_queue_xmit
lock(dev->queue_lock)                   lock(dev->queue_lock)
q->enqueue
qdisc_run
qdisc_restart
trylock(dev->xmit_lock), ok
unlock(dev->queue_lock)
...
printk("something")
...
netpoll_send_skb
trylock(dev->xmit_lock), fails
q->enqueue                              q->enqueue

> Of course the patch is completely ugly and violates any layering
> principle one could think of. It was not submitted for inclusion :o)

Sure, but I think we should have a short-term workaround until
a better solution has been invented. Maybe dropping the packets
would be best for now, it only affects printks issued in paths
starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing
the packets might also cause reordering since not all packets
are queued.

>>while (!spin_trylock(&np->dev->xmit_lock)) {
>>	if (np->dev->xmit_lock_owner == smp_processor_id()) {
>>		struct Qdisc *q;
>>
>>		rcu_read_lock();
>>		q = rcu_dereference(dev->qdisc);
>>		if (q->enqueue) {
>>			spin_lock(&dev->queue_lock);
>
>
> I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted
> on the current cpu and a printk is issued as dev->queue_lock will have
> been taken elsewhere.

Hmm this is complicated, let me think some more about it.

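A single-threaded userspace model of the "just drop it" workaround mentioned above, offered only as a sketch. `struct xmit_lock`, `xmit_trylock()` and `model_netpoll_send()` are hypothetical names that imitate dev->xmit_lock and dev->xmit_lock_owner; this is not kernel code and has no real atomicity. It shows the decision only: if the trylock fails and the owner is the current CPU, the message is dropped instead of spinning or touching the qdisc.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical model of dev->xmit_lock plus dev->xmit_lock_owner. */
struct xmit_lock {
	bool locked;
	int owner;  /* id of the CPU holding the lock, or NO_OWNER */
};

static bool xmit_trylock(struct xmit_lock *l, int cpu)
{
	if (l->locked)
		return false;
	l->locked = true;
	l->owner = cpu;
	return true;
}

static void xmit_unlock(struct xmit_lock *l)
{
	l->owner = NO_OWNER;
	l->locked = false;
}

enum send_result { SENT, DROPPED_RECURSION, BUSY_OTHER_CPU };

/* Model of the netpoll send path: never wait on a lock this CPU holds. */
static enum send_result model_netpoll_send(struct xmit_lock *l, int cpu)
{
	if (!xmit_trylock(l, cpu)) {
		if (l->owner == cpu)
			return DROPPED_RECURSION;  /* spinning here would deadlock */
		return BUSY_OTHER_CPU;             /* real code would retry */
	}
	/* ... the packet would be handed to the driver here ... */
	xmit_unlock(l);
	return SENT;
}
```

Dropping avoids both the qdisc corruption and the queue_lock deadlock discussed above, at the cost of losing printks issued from within the transmit path.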
Regards Patrick From kleptog@svana.org Wed Dec 22 08:26:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 08:26:45 -0800 (PST) Received: from svana.org (mail@svana.org [203.20.62.76]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMGQH8f022776 for ; Wed, 22 Dec 2004 08:26:37 -0800 Received: from kleptog by svana.org with local (Exim 3.35 #1 (Debian)) id 1Ch9Jo-000869-00; Thu, 23 Dec 2004 03:26:32 +1100 Date: Wed, 22 Dec 2004 17:26:32 +0100 From: Martijn van Oosterhout To: Andrea G Forte Cc: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Very slow change of IP in kernel (slow socket?). Message-ID: <20041222162632.GC29278@svana.org> Reply-To: Martijn van Oosterhout References: <41C895D6.2040001@cs.columbia.edu> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="ghzN8eJ9Qlbqn3iT" Content-Disposition: inline In-Reply-To: <41C895D6.2040001@cs.columbia.edu> User-Agent: Mutt/1.3.28i X-PGP-Key-ID: Length=1024; ID=0x0DC67BE6 X-PGP-Key-Fingerprint: 295F A899 A81A 156D B522 48A7 6394 F08A 0DC6 7BE6 X-PGP-Key-URL: X-Warning: Normally my messages are signed, but not on the ozTiVo list due to legacy application support X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12990 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kleptog@svana.org Precedence: bulk X-list: netdev --ghzN8eJ9Qlbqn3iT Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Seems long, but have you condered other possible delays, like ARP, routing daemons, etc. BTW, I don't quite understand what you mean by changing IPs, because "ip route add" adds routes, not IPs. Presumably you're using an unbound UDP socket. You need to be a bit clearer about what you're actually trying to do... 

On Tue, Dec 21, 2004 at 04:29:58PM -0500, Andrea G Forte wrote:
> Hi all,
>
> after some talking I decided to try again and post a specific thread for
> this problem.
> I noticed that when I change IP address for the same wireless card
> (since I am moving to a different subnet I need a new IP), the actual
> change in the kernel happens between 300 to 500 ms later. In particular,
> after changing the IP (ip route add...) and updating route table and
> default gw, the actual data packets are sent using the new IP only after
> 300 to 500 ms after setting all the above.
> Does anyone of you know what this could be related to? Or, does anyone
> of you know where in the kernel code I could start looking for some answers?
> I have already had some feedback with suggestions to look into the
> route_cache, however this does not seem to me as a route problem but
> more as a socket problem and perhaps some kind of timer that is set to
> refresh the socket info in the kernel.
>
> Any help would be really appreciated.
>
> Thanks all,
> Andrea
> -
> To unsubscribe from this list: send the line "unsubscribe linux-net" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Martijn van Oosterhout http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

--ghzN8eJ9Qlbqn3iT Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQFByaA4Y5Twig3Ge+YRAkZDAJkBSeRFUdN25wAQY0muOxwjlKI8IgCfYB2B Y0GJbUUjhs7bD9kVMwmF5cg= =wlKq -----END PGP SIGNATURE----- --ghzN8eJ9Qlbqn3iT-- From oxymoron@waste.org Wed Dec 22 09:18:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 09:18:28 -0800 (PST) Received: from waste.org (waste.org [216.27.176.166]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMHI08w025806 for ; Wed, 22 Dec 2004 09:18:20 -0800 Received: from waste.org (localhost [127.0.0.1]) by waste.org (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBMHIbkC025405; Wed, 22 Dec 2004 11:18:37 -0600 Received: (from oxymoron@localhost) by waste.org (8.12.3/8.12.3/Debian-7.1) id iBMHIa9Z025402; Wed, 22 Dec 2004 11:18:36 -0600 Date: Wed, 22 Dec 2004 09:18:36 -0800 From: Matt Mackall To: Patrick McHardy Cc: Francois Romieu , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: Lockup with 2.6.9-ac15 related to netconsole Message-ID: <20041222171836.GL5974@waste.org> References: <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> <41C9525F.4070805@trash.net> <20041222123940.GA4241@electric-eye.fr.zoreil.com> <41C98B75.9020802@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41C98B75.9020802@trash.net> User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new X-Virus-Status: Clean X-archive-position: 12991 X-ecartis-version: 
Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mpm@selenic.com Precedence: bulk X-list: netdev On Wed, Dec 22, 2004 at 03:57:57PM +0100, Patrick McHardy wrote: > >Of course the patch is completely ugly and violates any layering > >principle one could think of. It was not submitted for inclusion :o) > > Sure, but I think we should have a short-term workaround until > a better solution has been invented. Maybe dropping the packets > would be best for now, it only affects printks issued in paths > starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing > the packets might also cause reordering since not all packets > are queued. When I mentioned queueing, I was thinking of a netpoll-private queue that would be hooked to a softirq or some such so that it would be pushed out as soon as possible. Dropping may be the better approach as queueing throws away netpoll's immediacy and ordering properties. And getting netpoll _more_ tangled in the net stack mechanics is definitely a step in the wrong direction. More generally, I'm tempted to add some warn_on style functionality so that printks in such troublesome paths can be lifted out. -- Mathematics is the supreme nostalgia of our time. 
From andreaf@cs.columbia.edu Wed Dec 22 09:36:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 09:36:55 -0800 (PST) Received: from cs.columbia.edu (cs.columbia.edu [128.59.16.20]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMHaSWG027144 for ; Wed, 22 Dec 2004 09:36:48 -0800 Received: from lion.cs.columbia.edu (IDENT:Scrvf6Rp5TvnB7nFpBNA83iAhwDLaC93@lion.cs.columbia.edu [128.59.16.120]) by cs.columbia.edu (8.12.10/8.12.10) with ESMTP id iBMHbFgf006247 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NOT); Wed, 22 Dec 2004 12:37:15 -0500 (EST) Received: from [128.59.17.219] (dhcp69.cs.columbia.edu [128.59.17.219]) by lion.cs.columbia.edu (8.12.9/8.12.9) with ESMTP id iBMHbBbE014791; Wed, 22 Dec 2004 12:37:14 -0500 Message-ID: <41C9B0CA.6060409@cs.columbia.edu> Date: Wed, 22 Dec 2004 12:37:14 -0500 From: Andrea G Forte User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Martijn van Oosterhout CC: netdev@oss.sgi.com, linux-net@vger.kernel.org Subject: Re: Very slow change of IP in kernel (slow socket?). References: <41C895D6.2040001@cs.columbia.edu> <20041222162632.GC29278@svana.org> In-Reply-To: <20041222162632.GC29278@svana.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-PMX-Version: 4.7.0.111621, Antispam-Engine: 2.0.2.0, Antispam-Data: 2004.12.21.55 X-PerlMx-Spam: Gauge=IIIIIII, Probability=7%, Report='__CT 0, __CTE 0, __CT_TEXT_PLAIN 0, __HAS_MSGID 0, __MIME_VERSION 0, __SANE_MSGID 0' X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12992 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andreaf@cs.columbia.edu Precedence: bulk X-list: netdev Yes, I apologize for the confusion. I am trying to achieve a faster L3 handoff on a wireless network. 
Right now it seems like one of the biggest bottlenecks is in the kernel.
After I change the IP address of the client
(ip address add xxx.xxx.xxx.xxx/yy dev wlan0) and update all the routing
info (ip route replace default via zzz.zzz.zzz.zzz,
ip route replace kkk.kkk.0.0/yy dev wlan0 proto kernel src xxx.xxx.xxx.xxx),
the old IP is still used to send packets, as I mentioned earlier.
Note that I did not delete the old IP; I just set a new IP and changed
the routing info. The old IP, however, is invalid in the new subnet, but
the kernel does not seem to realize it at all, at least for about 500 ms
(even though I have changed all the routing info with the new IP).
Yes, perhaps it could be a problem related to the routing table being
slow to update. I will do some more testing and let you know.

Thank you,
Andrea

Martijn van Oosterhout wrote:

>Seems long, but have you condered other possible delays, like ARP,
>routing daemons, etc. BTW, I don't quite understand what you mean by
>changing IPs, because "ip route add" adds routes, not IPs. Presumably
>you're using an unbound UDP socket. You need to be a bit clearer about
>what you're actually trying to do...
>
>On Tue, Dec 21, 2004 at 04:29:58PM -0500, Andrea G Forte wrote:
>
>
>>Hi all,
>>
>>after some talking I decided to try again and post a specific thread for
>>this problem.
>>I noticed that when I change IP address for the same wireless card
>>(since I am moving to a different subnet I need a new IP), the actual
>>change in the kernel happens between 300 to 500 ms later. In particular,
>>after changing the IP (ip route add...) and updating route table and
>>default gw, the actual data packets are sent using the new IP only after
>>300 to 500 ms after setting all the above.
>>Does anyone of you know what this could be related to? Or, does anyone
>>of you know where in the kernel code I could start looking for some answers?
>>I have already had some feedback with suggestions to look into the >>route_cache, however this does not seem to me as a route problem but >>more as a socket problem and perhaps some kind of timer that is set to >>refresh the socket info in the kernel. >> >>Any help would be really appreciated. >> >>Thanks all, >>Andrea >>- >>To unsubscribe from this list: send the line "unsubscribe linux-net" in >>the body of a message to majordomo@vger.kernel.org >>More majordomo info at http://vger.kernel.org/majordomo-info.html >> >> > > > From eric.lemoine@gmail.com Wed Dec 22 10:49:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 10:49:33 -0800 (PST) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.197]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBMImw2T026211 for ; Wed, 22 Dec 2004 10:49:24 -0800 Received: by wproxy.gmail.com with SMTP id 70so1313wra for ; Wed, 22 Dec 2004 10:49:48 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:references; b=e9BhxdMCrPpdlobEkEt0glU1RdXhnXYNfTK3fC34M1bshKe6dazKYfahzW16/1TqR/7S3wgYnKmfFAfUS13GRHnNN9iwZ9LSQORe3aN7emgBB3ENRJwjm6LfwBRNom5uj1NXO661yT3uaJ9wton6ScXELgPSIivSa/CT+zigDt4= Received: by 10.54.18.13 with SMTP id 13mr178449wrr; Wed, 22 Dec 2004 10:49:48 -0800 (PST) Received: by 10.54.30.8 with HTTP; Wed, 22 Dec 2004 10:49:48 -0800 (PST) Message-ID: <5cac192f04122210491d64d4b6@mail.gmail.com> Date: Wed, 22 Dec 2004 19:49:48 +0100 From: Eric Lemoine Reply-To: Eric Lemoine To: hadi@cyberus.ca Subject: Re: LLTX and netif_stop_queue Cc: "David S. 
Miller" , Roland Dreier , netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <1103484675.1050.158.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_911_15124942.1103741388450" References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12993 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eric.lemoine@gmail.com Precedence: bulk X-list: netdev ------=_Part_911_15124942.1103741388450 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Content-Disposition: inline On 19 Dec 2004 14:31:15 -0500, jamal wrote: > On Sat, 2004-12-18 at 00:44, David S. Miller wrote: > > > Perhaps one way to fix this is to add a pointer to a spinlock to > > the netdev struct, and have hold that the upper level grab that > > when NETIF_F_LLTX when doing queue state checks. Actually, that > > could end up being racy too. > > How about releasing the qlock only when the LLTX transmit lock is > grabbed? That should bring it to par with what it was originally. I dont like the idea of releasing inside the driver a lock taken outside. That might be just me... Instead, I would suggest to have LLTX drivers check whether queue is stopped after they grab their private tx lock and before they check tx ring fullness. That way we close the race window but keep the driver bug check around. See attached sungem patch. 
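A userspace sketch of the ordering suggested above, under stated assumptions: `struct fake_nic` and `fake_start_xmit()` are hypothetical and only model a transmit routine, with the private tx lock reduced to a comment. The stopped-queue test comes first and returns "busy"; only a full ring seen while the queue is still marked running is counted as a driver bug.

```c
#include <assert.h>
#include <stdbool.h>

enum tx_status { TX_OK, TX_BUSY, TX_RING_BUG };

/* Hypothetical NIC state; not taken from any real driver. */
struct fake_nic {
	bool queue_stopped;          /* models netif_queue_stopped() */
	unsigned int free_slots;     /* free tx descriptors */
	unsigned int ring_bug_hits;  /* times the "hard error" path ran */
};

static enum tx_status fake_start_xmit(struct fake_nic *nic, unsigned int needed)
{
	/* The driver's private tx lock would be taken here. */

	/* Check 1: another CPU stopped the queue before we got the lock. */
	if (nic->queue_stopped)
		return TX_BUSY;

	/* Check 2: ring full although the queue runs: a real driver bug. */
	if (nic->free_slots < needed) {
		nic->queue_stopped = true;
		nic->ring_bug_hits++;
		return TX_RING_BUG;
	}

	nic->free_slots -= needed;
	if (nic->free_slots == 0)
		nic->queue_stopped = true;  /* stop before the ring overflows */
	return TX_OK;
}
```

Keeping the second check means a genuine accounting bug in the driver is still reported, while the benign race only produces a retry.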
-- Eric ------=_Part_911_15124942.1103741388450 Content-Type: application/octet-stream; name="sungem-lltx.patch" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="sungem-lltx.patch" PT09PT0gZHJpdmVycy9uZXQvc3VuZ2VtLmMgMS43MSB2cyBlZGl0ZWQgPT09PT0KLS0tIDEuNzEv ZHJpdmVycy9uZXQvc3VuZ2VtLmMJMjAwNC0xMS0xNCAxMDo0NTozNiArMDE6MDAKKysrIGVkaXRl ZC9kcml2ZXJzL25ldC9zdW5nZW0uYwkyMDA0LTEyLTIyIDE5OjM0OjA3ICswMTowMApAQCAtOTc2 LDYgKzk3NiwxMiBAQAogCQlyZXR1cm4gTkVUREVWX1RYX0xPQ0tFRDsKIAl9CiAKKwkvKiBUaGlz IGhhbmRsZXMgYSBMTFRYLXJlbGF0ZWQgcmFjZSBjb25kaXRpb24gKi8KKwlpZiAobmV0aWZfcXVl dWVfc3RvcHBlZChkZXYpKSB7CisJCXNwaW5fdW5sb2NrX2lycXJlc3RvcmUoJmdwLT50eF9sb2Nr LCBmbGFncyk7CisJCXJldHVybiBORVRERVZfVFhfQlVTWTsKKwl9CisKIAkvKiBUaGlzIGlzIGEg aGFyZCBlcnJvciwgbG9nIGl0LiAqLwogCWlmIChUWF9CVUZGU19BVkFJTChncCkgPD0gKHNrYl9z aGluZm8oc2tiKS0+bnJfZnJhZ3MgKyAxKSkgewogCQluZXRpZl9zdG9wX3F1ZXVlKGRldik7Cg== ------=_Part_911_15124942.1103741388450-- From mchan@broadcom.com Wed Dec 22 16:01:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 16:02:02 -0800 (PST) Received: from mms1.broadcom.com (mms-nat.broadcom.com [63.70.210.58]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBN01VX5005501 for ; Wed, 22 Dec 2004 16:01:54 -0800 Received: from 63.70.210.1 by mms1.broadcom.com with ESMTP (Broadcom SMTP Relay (MMS v5.6.0)); Wed, 22 Dec 2004 16:02:45 -0800 X-Server-Uuid: 97B92932-364A-4474-92D6-5CFE9C59AD14 Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id QAA28762; Wed, 22 Dec 2004 16:02:45 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Subject: RE: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Date: Wed, 22 Dec 2004 16:02:44 -0800 Message-ID: Thread-Topic: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Thread-Index: 
AcTnJhYZSF/AskYiRfWYZ70tzyfq6QBWiqKQ From: "Michael Chan" To: "David S. Miller" , "Peter Chubb" cc: peterc@gelato.unsw.edu.au, netdev@oss.sgi.com X-WSS-ID: 6DD4D4AF1FC3303205-01-01 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/627/Sun Dec 12 11:53:11 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBN01VX5005501 X-archive-position: 12994 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev > Because callers are not supposed to depend upon the value > being set to anything valid if tg3_readphy() returns an error. > > I thought it should never be returning an error at this spot. > The PHY should always return a valid value within > PHY_BUSY_LOOPS. If MI_COM_BUSY is staying set for such a > long time that's a pretty serious problem. > > Taking a peek at the bcm5700 driver by Broadcom, they handle > all PHY read timeouts the way your patch does in this one > spot, by setting the returned value to zero. So it seems the > device can time out like that in these situations, and your > patch is something close to the correct fix. > > Good catch Peter, I'll think some more about this and > probably end up using something similar to your second patch. > > Thanks. > David, While the 2nd patch or something similar should be applied, I think the underlying cause of tg3_readphy() returning error should be further investigated. Peter, if you provide more information, such as registers MAC_MI_MODE (0x454) and MAC_MI_COM (0x44c) after the failure, I can look into it. 
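The behaviour discussed in the quoted text (poll a busy bit for a bounded number of loops, and hand back zero instead of an undefined value when the PHY never becomes ready) can be sketched in user space as follows. This is not the tg3 code: the `toy_*` names, the loop budget, the bit positions, and the register-reader callback are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PHY_BUSY_LOOPS 5000u       /* assumed poll budget */
#define TOY_MI_COM_BUSY    (1u << 29)  /* hypothetical "busy" bit */
#define TOY_MI_DATA_MASK   0xffffu     /* hypothetical data field */

/* Callback standing in for a read of the MI communication register. */
typedef uint32_t (*toy_reg_reader)(void *ctx);

/* Returns 0 and stores the PHY data on success.
 * Returns -1 on timeout and stores 0, so callers that ignore the
 * error never see an uninitialized value. */
int toy_readphy(toy_reg_reader read_reg, void *ctx, uint32_t *val)
{
    unsigned int loops;

    assert(read_reg && val);
    for (loops = 0; loops < TOY_PHY_BUSY_LOOPS; loops++) {
        uint32_t frame = read_reg(ctx);

        if ((frame & TOY_MI_COM_BUSY) == 0) {
            *val = frame & TOY_MI_DATA_MASK;
            return 0;
        }
    }
    *val = 0;
    return -1;
}

/* Test double: a PHY interface that never leaves the busy state. */
uint32_t toy_always_busy(void *ctx)
{
    (void)ctx;
    return TOY_MI_COM_BUSY;
}

/* Test double: busy for *ctx polls, then returns data 0x1234. */
uint32_t toy_busy_then_ready(void *ctx)
{
    unsigned int *polls_left = ctx;

    if (*polls_left > 0) {
        (*polls_left)--;
        return TOY_MI_COM_BUSY;
    }
    return 0x1234u;
}
```

Zeroing the value on timeout only hides the symptom from callers; as the message above says, why the busy bit stays set is still worth investigating.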
Michael From davem@davemloft.net Wed Dec 22 20:35:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 20:35:08 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBN4YeIP010829 for ; Wed, 22 Dec 2004 20:35:01 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1ChKbI-0005aF-00; Wed, 22 Dec 2004 20:29:20 -0800 Date: Wed, 22 Dec 2004 20:29:19 -0800 From: "David S. Miller" To: Eric Lemoine Cc: hadi@cyberus.ca, roland@topspin.com, netdev@oss.sgi.com, openib-general@openib.org Subject: Re: LLTX and netif_stop_queue Message-Id: <20041222202919.057b8331.davem@davemloft.net> In-Reply-To: <5cac192f04122210491d64d4b6@mail.gmail.com> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12995 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 22 Dec 2004 19:49:48 +0100 Eric Lemoine wrote: > Instead, I would suggest to have LLTX drivers check whether queue is > stopped after they grab their private tx lock and before they check tx > ring fullness. That way we close the race window but keep the driver > bug check around. > > See attached sungem patch. That sounds about right. 
Nice idea. It solves the race, and retains the error state check. I'll apply Eric's patch, and do something similar in the other LLTX drivers (except loopback which has not "queue" per se so doesn't need this stuff). From davem@davemloft.net Wed Dec 22 20:37:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 20:37:28 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBN4b1jE011104 for ; Wed, 22 Dec 2004 20:37:21 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1ChKdW-0005aa-00; Wed, 22 Dec 2004 20:31:38 -0800 Date: Wed, 22 Dec 2004 20:31:38 -0800 From: "David S. Miller" To: "Michael Chan" Cc: peterc@gelato.unsw.edu.au, netdev@oss.sgi.com Subject: Re: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Message-Id: <20041222203138.778fedb3.davem@davemloft.net> In-Reply-To: References: X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12996 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 22 Dec 2004 16:02:44 -0800 "Michael Chan" wrote: > David, While the 2nd patch or something similar should be applied, I > think the underlying cause of tg3_readphy() returning error should be > further investigated. 
Would this condition be possible if something, such as ASF, were continually polling the PHY in parallel with the driver? On the other hand, it doesn't seem so foreign for the PHY to block out register accesses for long periods of time for various reasons. But yes I'd also like to know more about exactly what is going on in this case. From roland@topspin.com Wed Dec 22 20:38:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 20:38:11 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBN4bh9W011293 for ; Wed, 22 Dec 2004 20:38:03 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Wed, 22 Dec 2004 20:38:35 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 22 Dec 2004 20:38:34 -0800 Received: from roland by eddore with local (Exim 4.34) id 1ChKkE-0004dT-CB; Wed, 22 Dec 2004 20:38:34 -0800 To: "David S. Miller" Cc: Eric Lemoine , hadi@cyberus.ca, netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> <20041222202919.057b8331.davem@davemloft.net> From: Roland Dreier Date: Wed, 22 Dec 2004 20:38:34 -0800 In-Reply-To: <20041222202919.057b8331.davem@davemloft.net> (David S. 
Miller's message of "Wed, 22 Dec 2004 20:29:19 -0800") Message-ID: <52pt11bpdh.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Security Through Obscurity, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: LLTX and netif_stop_queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 23 Dec 2004 04:38:34.0772 (UTC) FILETIME=[43328D40:01C4E8A9] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12997 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev David> That sounds about right. Nice idea. It solves the race, David> and retains the error state check. Great, I've made a similar change to the IP-over-IB driver. Thanks, Roland From cranium2003@yahoo.com Wed Dec 22 22:53:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 22 Dec 2004 22:53:54 -0800 (PST) Received: from web41406.mail.yahoo.com (web41406.mail.yahoo.com [66.218.93.72]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBN6rOG8016647 for ; Wed, 22 Dec 2004 22:53:44 -0800 Received: (qmail 29951 invoked by uid 60001); 23 Dec 2004 06:54:49 -0000 Comment: DomainKeys? 
See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=Cn0WxoLi8OfAHqUr/14PIQELb7hTfwgpaSptI30qA5fjq6nntGnom7kDSNaRBrDamdueNH+7czCcKqTI9PYCKpjVAV7aLs4WyqfClJOBBxpZ+UrRgAYY6kzOgnbQVgHmU8NPcicBV9aG7ag1spbGxe6k8C3rdmub6IRnnEPfybE= ; Message-ID: <20041223065449.29949.qmail@web41406.mail.yahoo.com> Received: from [203.199.141.99] by web41406.mail.yahoo.com via HTTP; Wed, 22 Dec 2004 22:54:49 PST Date: Wed, 22 Dec 2004 22:54:49 -0800 (PST) From: cranium2003 Subject: device driver problem To: net dev MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 12998 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: cranium2003@yahoo.com Precedence: bulk X-list: netdev Hello all, I am trying to use snull driver from Ldd book. I have RH9 with 2.4.20-8 kernel but i install new kernel 2.4.26 also to try all my programming stuff on it. following is snull code i am attching here.I am compiling it with command gcc -c snull.c -I/usr/src/linux-2.4.26/include Then i put host numbers into /etc/hosts as given in Chapter 14:Network Drivers in LDD. 
192.168.0.1 local0
192.168.0.2 remote0
192.168.1.2 local1
192.168.1.1 remote1

and then I run

insmod snull.o
ifconfig sn0 local0
ifconfig sn1 local1

Then, to check it, I give the following command:

ping -c 2 remote0

and I get an oops message with EIP c88bedd8. The first 10 lines of my /proc/ksyms are:

c88bf680 __insmod_tun_S.data_L96 [tun]
c88be000 __insmod_tun_O/lib/modules/2.4.26/kernel/drivers/net/tun.o_M41CA53CB_V132122 [tun]
c88befc8 __insmod_tun_S.rodata_L20 [tun]
c88be180 tun_net_init [tun]
c88beaf0 tun_cleanup [tun]
c88be060 __insmod_tun_S.text_L3591 [tun]
c88beaa0 tun_init [tun]
c88bc000 __insmod_8139too_S.data_L1096 [8139too]
c88b8000 __insmod_8139too_O/lib/modules/2.4.26/kernel/drivers/net/8139too.o_M41CA53CB_V132122 [8139too]
c88b8060 __insmod_8139too_S.text_L10506 [8139too]

Where am I going wrong in the following code and the above setup? I would also like to know what the requirements are to run snull on my PC. My PC has one NIC, eth0, with IP 192.168.1.200 set, and no other PC is connected to it. Do I need another two PCs for snullnet0 and snullnet1?
regards, cranium my snull.c is #define __KERNEL__ #define MODULE #include #include /* struct net_device & other headers*/ #include #include #include /* printk() */ #include /* kmalloc() */ #include /* error codes */ #include /* size_t */ #include /* mark_bh */ #include #include /* eth_type_trans */ #include /* struct iphdr */ #include /* struct tcphdr */ #include /* struct sk_buff */ #define SNULL_RX_INTR 0x0001 #define SNULL_TX_INTR 0x0002 /* CHECKSUM macro as defined in sk_buff.h #define CHECKSUM_NONE 0 #define CHECKSUM_HW 1 #define CHECKSUM_UNNECESSARY 2 */ extern struct net_device snull_devs[ ]; /* struct used for statical info of any interface by inconfig command */ struct snull_priv { struct net_device_stats stats; int status; int rx_packetlen; u8 *rx_packetdata; int tx_packetlen; u8 *tx_packetdata; struct sk_buff *skb; spinlock_t lock; }; static int lockup = 0; MODULE_PARM(lockup,"i"); #define SNULL_TIMEOUT 5 /* in jiffies*/ #ifdef HAVE_TX_TIMEOUT static int timeout = SNULL_TIMEOUT; MODULE_PARM(timeout,"i"); #endif static int eth=0; MODULE_PARM(eth,"i"); int snull_eth; int snull_open(struct net_device *dev); void snull_interrupt(int irq,void *dev_id,struct pt_regs *regs); void snull_rx(struct net_device *dev,int len,unsigned char *buf); int snull_tx(struct sk_buff *skb,struct net_device *dev); void snull_hw_tx(char *buf, int len, struct net_device *dev); void snull_tx_timeout(struct net_device *dev); int snull_rebuild_header(struct sk_buff *skb); int snull_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, void *daddr, void *saddr, unsigned int len); struct net_device_stats *snull_stats(struct net_device *dev); int snull_release(struct net_device *dev); int snull_init(struct net_device *dev); int snull_config(struct net_device *dev, struct ifmap *map); int snull_ioctl(struct net_device *dev,struct ifreq *rq,int cmd); int snull_open(struct net_device *dev) { printk("snull_open \n"); memcpy(dev->dev_addr,"\0SNUL0",ETH_ALEN); 
dev->dev_addr[ETH_ALEN - 1] += (dev - snull_devs); /* the number */ netif_start_queue(dev); return 0; } int snull_config(struct net_device *dev, struct ifmap *map) { printk("snull_ifconfig is called"); if(dev->flags & IFF_UP) { printk("snull_ifconfig :device is UP"); return -EBUSY; } return 0; } int snull_ioctl(struct net_device *dev,struct ifreq *rq,int cmd) { return 0; } /* The interface interrupts the processor to signal one of two possible events : (1) a new packet is arrived or (2) transmission of outgoing packet is complete */ void snull_interrupt(int irq,void *dev_id,struct pt_regs *regs) { int statusword; struct snull_priv *priv; struct net_device *dev = (struct net_device *) dev_id; printk(" snull_interrupt is called \n"); /* check with hw if it is really ours*/ if(!dev) { printk("hw is not ours\n"); return; } /* lock the device */ priv = (struct snull_priv *) dev->priv; spin_lock(&priv->lock); /* retrieve the statusword : real hardware use I/O instructions */ statusword = priv->status; if(statusword && SNULL_RX_INTR) { /* send it to snull_rx for handling arrived packet */ snull_rx(dev,priv->rx_packetlen,priv->rx_packetdata); } if(statusword && SNULL_TX_INTR) { /* transmission is over : free skb */ priv->stats.tx_packets++; priv->stats.tx_bytes += priv->tx_packetlen; dev_kfree_skb(priv->skb); } /* unlock the device and we are done */ spin_unlock(&priv->lock); return; } /* receiving a packet & send it to upper layers */ /* snull_rx is called after the hardware has received packet & it is already in comp memory */ void snull_rx(struct net_device *dev,int len,unsigned char *buf) { struct sk_buff *skb; struct snull_priv *priv = (struct snull_priv *) dev->priv; printk("snull_rx is called\n"); /* the packet has been reterived from the transmission medium . 
Build an skb around it, so upper layer can handle it */ skb = dev_alloc_skb(len+2); if(!skb) { printk("snull_rx : low on mem hence packet dropped "); priv->stats.rx_dropped++; return; } memcpy(skb_put(skb,len),buf,len); /* write metadata ,and then pass to the receive level */ skb->dev = dev; skb->protocol = eth_type_trans(skb,dev); skb->ip_summed = CHECKSUM_UNNECESSARY; /* don't ch! eck it */ priv->stats.rx_packets++; priv->stats.rx_bytes += len; netif_rx(skb); } /* the socket buffer passed to snull_tx (hard_start_xmit) contains the physical packet as it should appear on the media , compelete with the transmission-level headers. The interface doesn't neet to modify the data being transmitted . skb->data points to the packet being transmitted, and skb->len is its length, in octets. */ int snull_tx(struct sk_buff *skb,struct net_device *dev) { int len; char *data; struct snull_priv *priv = (struct snull_priv *) dev->priv; printk("snull_tx is called \n"); len = skb->len < ETH_ZLEN ? ETH_ZLEN : skb->len; data = skb->data; /* svae the timestamp */ dev->trans_start = jiffies ; /* remember the skb, so we can free at interrup time */ priv->skb = skb ; /* actual delivery data that is device specific */ snull_hw_tx(data,len,dev); return 0; /* Our simple device cann't fail */ } void snull_hw_tx(char *buf, int len, struct net_device *dev) { struct iphdr *ih; struct net_device *dest; struct snull_priv *priv; u32 *saddr, *daddr; printk("snull_hw_tx is called \n"); /* I am paranoid ,Ain't I? */ if(len < sizeof(struct ethhdr) + sizeof(struct iphdr)) { printk("snull_hw_tx:Packet is too short compare to size=%i(octets)\n", len); return; } /* Ethhdr is 14 bytes , but the kernel arranges for iphdr to be aligned (i.e. 
ethhdr is unaligned */ ih = (struct iphdr *) (buf+sizeof(struct ethhdr)); /* sir i think here i am getting problem due to pointer derefrence */ saddr = &ih->saddr; daddr = &ih->daddr; /* change the third octet (class)*/ ((u8 *)saddr)[2] ^= 1; ((u8 *)daddr)[2] ^= 1; /* Ok,now packet is ready for transmission : first send a receive interrupt on the twin device , then send a transmission-done to the transmitting device */ dest = snull_devs + (dev==snull_devs ? 1 : 0); priv = (struct snull_priv *) dest->priv; priv->status = SNULL_RX_INTR; priv->rx_packetlen = len; priv->rx_packetdata = buf; snull_interrupt(0,dest,NULL); priv = (struct snull_priv *) dev->priv; priv->status = SNULL_TX_INTR; priv->tx_packetlen = len; priv->tx_packetdata = buf; if(lockup && ((priv->stats.tx_packets + 1) % lockup) == 0) { /* Simulate a dropped transmit interrupt */ netif_stop_queue(dev); printk("Simulate lockup at %ld ,txp %ld \n",jiffies,(unsigned long) priv->stats.tx_packets); } else snull_interrupt(0,dev,NULL); } void snull_tx_timeout(struct net_device *dev) { struct snull_priv *priv = (struct snull_priv *) dev->priv; printk("snull_tx_timeout is called \n"); printk("Transmission timeout at %ld, latency %ld \n", jiffies, jiffies - dev->trans_start); priv->status = SNULL_TX_INTR; snull_interrupt(0,dev,NULL); priv->stats.tx_errors++; netif_wake_queue(dev); return; } /* Snull cann't use ARP because driver change IP addresses in packets being transmitted , and ARP packets exchanges IP addresses as well. snull_header method handle physical-layer headers directly. If device driver wants to use the usual hardware header without running ARP, then we need to override the default dev->hard_header method (as soon in snull_header method. 
*/ int snull_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, void * daddr, void *saddr, unsigned int len) { struct ethhdr *eth = ( struct ethhdr *) skb_push(skb,ETH_HLEN); printk("snull_header is called \n"); eth->h_proto = ETH_P_IP; memcpy(eth->h_source, saddr ? saddr : dev->dev_addr,dev->addr_len); memcpy(eth->h_dest, daddr ? daddr : dev->dev_addr,dev->addr_len); eth->h_dest[ETH_ALEN-1] ^= 0x11; return (dev->hard_header_len); } int snull_rebuild_header(struct sk_buff *skb) { struct ethhdr *eth = (struct ethhdr *) skb->data; struct net_device *dev = skb->dev; printk("snull_rebuild_header is called \n"); memcpy(eth->h_source,dev->dev_addr,dev->addr_len); memcpy(eth->h_dest,dev->dev_addr,dev->addr_len); eth->h_dest[ETH_ALEN-1] ^= 0x01; return 0; } struct net_device_stats *snull_stats(struct net_device *dev) { struct snull_priv *priv = (struct snull_priv *) dev->priv; printk("snull_stats is called \n"); return &priv->stats; } int snull_release(struct net_device *dev) { printk("snull_release called\n"); netif_stop_queue(dev); return 0; } int snull_init(struct net_device *dev) { printk("snull_init called \n"); ether_setup(dev); dev->open = snull_open; dev->stop = snull_release; dev->set_config = snull_config; dev->hard_start_xmit = snull_tx; dev->do_ioctl = snull_ioctl; dev->get_stats = snull_stats; dev->rebuild_header = snull_rebuild_header; dev->hard_header = snull_header; #ifdef HAVE_TX_TIMEOUT dev->tx_timeout = snull_tx_timeout; dev->watchdog_timeo = timeout; #endif dev->flags |= IFF_NOARP; // keep default flags ,just add NOARP dev->hard_header_cache = NULL; //disable caching SET_MODULE_OWNER(dev); dev->priv = kmalloc(sizeof( struct snull_priv),GFP_KERNEL); if(dev->priv==NULL) { printk("could not allocated memory for statical info struct of dev = sn%i",(dev-snull_devs)); return -ENOMEM; } memset(dev->priv,0,sizeof(struct snull_priv)); spin_lock_init(&((struct snull_priv *)dev->priv)->lock); return 0; } struct net_device snull_devs[2] = { { 
init : snull_init }, { init : snull_init }, }; int init_module(void) { int i,result,devs_present=0; printk("init_module called\n"); snull_eth = eth; if(!snull_eth) { strcpy(snull_devs[0].name,"sn0"); strcpy(snull_devs[1].name,"sn1"); } else { strcpy(snull_devs[0].name,"eth%d"); strcpy(snull_devs[1].name,"eth%d"); } for(i=0;i<2;i++) { if(result=register_netdev(snull_devs+i)) printk("eroor=%i while registering device = %s \n", result,snull_devs[i].name); else devs_present++; } printk("no of registered devices = %i \n",devs_present); return 0; } void cleanup_module(void) { int i; printk("cleanup_module"); for(i=0;i<2;i++) { kfree(snull_devs[i].priv); unregister_netdev(snull_devs+i); } } __________________________________ Do you Yahoo!? The all-new My Yahoo! - What will yours do? http://my.yahoo.com From mchan@broadcom.com Thu Dec 23 00:13:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 00:14:04 -0800 (PST) Received: from mms3.broadcom.com (mms-nat.broadcom.com [63.70.210.58]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBN8Cvu9020122 for ; Thu, 23 Dec 2004 00:13:28 -0800 Received: from 63.70.210.1 by mms3.broadcom.com with ESMTP (Broadcom SMTP Relay (MMS v5.6.0)); Thu, 23 Dec 2004 00:14:12 -0800 X-Server-Uuid: 062D48FB-9769-4139-967C-478C67B5F9C9 Received: from nt-irva-0741.brcm.ad.broadcom.com ( nt-irva-0741.brcm.ad.broadcom.com [10.8.194.54]) by mon-irva-10.broadcom.com (8.9.1/8.9.1) with ESMTP id AAA28997; Thu, 23 Dec 2004 00:14:11 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Subject: RE: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Date: Thu, 23 Dec 2004 00:14:11 -0800 Message-ID: Thread-Topic: TG3 fix for slow switches (Was: TG3 driver failure on HP 16-way) Thread-Index: AcToqUv3WYbMhI9bRy+/paHfI/WpywAHJSYb From: "Michael Chan" To: "David S. 
Miller" cc: peterc@gelato.unsw.edu.au, netdev@oss.sgi.com X-WSS-ID: 6DD4A1DE2344569364-01-01 Content-Type: text/plain; charset=iso-8859-1 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBN8Cvu9020122 X-archive-position: 12999 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: mchan@broadcom.com Precedence: bulk X-list: netdev > Would this condition be possible if something, such as ASF, were > continually polling the PHY in parallel with the driver? Without proper handshake with ASF, I think it may be possible if ASF is in a very tight loop polling the PHY. The tg3_readphy() poll loop is not very tight so it is possible to continually hit the busy condition if ASF is polling PHY registers. If this is the case, even if tg3_readphy() eventually gets the data, the data will most likely be from a different PHY register (that ASF is trying to read). But I don't think this is an ASF related problem because if it were, the patch would not have fixed it. 
Michael From eric.lemoine@gmail.com Thu Dec 23 01:09:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 01:09:38 -0800 (PST) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.200]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBN99AuR026153 for ; Thu, 23 Dec 2004 01:09:31 -0800 Received: by wproxy.gmail.com with SMTP id 71so75406wri for ; Thu, 23 Dec 2004 01:10:35 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=TiuJhu3uBjraWexE5SSCUybZeCBWz20krqmWpR2n58MvCw8Q2xTempN4sWtq6emUN5YFtc0RMqsAX9P/L09tmSELJCELh1Szo5AdOmGBzpNNGG+vtBZ7e8rf89r3SYdKSsctKLyPKGRxA6m57yna/erXe1zXD28T9WYGQ1/GI4w= Received: by 10.54.30.6 with SMTP id d6mr416193wrd; Thu, 23 Dec 2004 01:10:34 -0800 (PST) Received: by 10.54.30.8 with HTTP; Thu, 23 Dec 2004 01:10:34 -0800 (PST) Message-ID: <5cac192f0412230110628749e3@mail.gmail.com> Date: Thu, 23 Dec 2004 10:10:34 +0100 From: Eric Lemoine Reply-To: Eric Lemoine To: "David S. Miller" Subject: Re: LLTX and netif_stop_queue Cc: hadi@cyberus.ca, roland@topspin.com, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <20041222202919.057b8331.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> <20041222202919.057b8331.davem@davemloft.net> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13000 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eric.lemoine@gmail.com Precedence: bulk X-list: netdev On Wed, 22 Dec 2004 20:29:19 -0800, David S. 
Miller wrote:
> On Wed, 22 Dec 2004 19:49:48 +0100
> Eric Lemoine wrote:
>
> > Instead, I would suggest to have LLTX drivers check whether queue is
> > stopped after they grab their private tx lock and before they check tx
> > ring fullness. That way we close the race window but keep the driver
> > bug check around.
> >
> > See attached sungem patch.
>
> That sounds about right. Nice idea. It solves the race, and retains
> the error state check.
>
> I'll apply Eric's patch, and do something similar in the other LLTX
> drivers (except loopback which has not "queue" per se so doesn't need
> this stuff).

Dave,

I still have one concern with the LLTX code (and it may be that the correct patch is Jamal's):

Without LLTX we do: lock(queue_lock), lock(xmit_lock), release(queue_lock), release(xmit_lock).
With LLTX (without Jamal's patch) we do: lock(queue_lock), release(queue_lock), lock(tx_lock), release(tx_lock).

LLTX doesn't look correct because it opens a race window between the two lock-protected sections. So you may want to reconsider Jamal's patch or pull out LLTX...

Thanks,
--
Eric

From kaber@trash.net Thu Dec 23 08:37:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 08:37:46 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNGbIF3025029 for ; Thu, 23 Dec 2004 08:37:39 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1ChVyi-0001X8-Kq; Thu, 23 Dec 2004 17:38:16 +0100 Message-ID: <41CAF444.3000305@trash.net> Date: Thu, 23 Dec 2004 17:37:24 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Eric Lemoine CC: "David S.
Miller" , hadi@cyberus.ca, roland@topspin.com, netdev@oss.sgi.com, openib-general@openib.org Subject: Re: LLTX and netif_stop_queue References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> <20041222202919.057b8331.davem@davemloft.net> <5cac192f0412230110628749e3@mail.gmail.com> In-Reply-To: <5cac192f0412230110628749e3@mail.gmail.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13001 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev

Eric Lemoine wrote:
> I still have one concern with the LLTX code (and it may be that the
> correct patch is Jamal's) :
>
> Without LLTX we do : lock(queue_lock), lock(xmit_lock),
> release(queue_lock), release(xmit_lock). With LLTX (without Jamal's
> patch) we do : lock(queue_lock), release(queue_lock), lock(tx_lock),
> release(tx_lock). LLTX doesn't look correct because it creates a race
> condition window between the two lock-protected sections. So you
> may want to reconsider Jamal's patch or pull out LLTX...

You're right, it can cause packet reordering if something like this
happens:

CPU1                    CPU2
lock(queue_lock)
dequeue
unlock(queue_lock)
                        lock(queue_lock)
                        dequeue
                        unlock(queue_lock)
                        lock(xmit_lock)
                        hard_start_xmit
                        unlock(xmit_lock)
lock(xmit_lock)
hard_start_xmit
unlock(xmit_lock)

Jamal's patch should fix this.
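The reordering described above can be replayed deterministically in a single-threaded toy model. This is only an illustration under assumptions: the `toy_*` names are hypothetical, locks are represented by the order of the calls rather than by real mutexes, and the "serialized" variant simply assumes that dequeue and transmit are not separated by an unlocked window (which is the effect Jamal's suggestion of keeping the queue lock until the transmit lock is held is meant to have).

```c
#include <assert.h>
#include <stddef.h>

#define TOY_WIRE_MAX 8

/* Toy qdisc: packets are numbered in the order the stack queued them. */
struct toy_queue { int next; int limit; };

/* What the device put on the wire, in transmission order. */
struct toy_wire { int sent[TOY_WIRE_MAX]; size_t count; };

/* Models "lock(queue_lock); dequeue; unlock(queue_lock)".
 * Returns the packet number, or -1 when the queue is empty. */
int toy_dequeue(struct toy_queue *q)
{
    return q->next < q->limit ? q->next++ : -1;
}

/* Models "lock(xmit_lock); hard_start_xmit; unlock(xmit_lock)". */
void toy_xmit(struct toy_wire *w, int pkt)
{
    assert(pkt >= 0 && w->count < TOY_WIRE_MAX);
    w->sent[w->count++] = pkt;
}

/* The interleaving from the trace: both CPUs dequeue first, then CPU2
 * happens to win the transmit lock before CPU1. */
void toy_lltx_trace(struct toy_wire *w)
{
    struct toy_queue q = { 0, 2 };
    int cpu1_pkt = toy_dequeue(&q);  /* CPU1 gets packet 0 */
    int cpu2_pkt = toy_dequeue(&q);  /* CPU2 gets packet 1 */

    toy_xmit(w, cpu2_pkt);           /* CPU2 transmits first */
    toy_xmit(w, cpu1_pkt);           /* CPU1 transmits second */
}

/* Assumed fixed behaviour: each CPU's dequeue and transmit form one
 * unbroken step, so queue order is preserved on the wire. */
void toy_serialized_trace(struct toy_wire *w)
{
    struct toy_queue q = { 0, 2 };

    toy_xmit(w, toy_dequeue(&q));    /* CPU1: packet 0 */
    toy_xmit(w, toy_dequeue(&q));    /* CPU2: packet 1 */
}
```

Running `toy_lltx_trace` leaves packet 1 ahead of packet 0 in `sent`, while `toy_serialized_trace` keeps queue order. The real kernel path also has to cope with trylock failures and requeueing, which this model ignores.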
Regards
Patrick

From linux.lover2004@gmail.com Thu Dec 23 08:57:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 08:57:17 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.203]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNGunmN026386 for ; Thu, 23 Dec 2004 08:57:10 -0800 Received: by rproxy.gmail.com with SMTP id f1so109174rne for ; Thu, 23 Dec 2004 08:58:17 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=aup8t7Ohvd2fUXvquRcA5bVMe3GO4xMHfPZg72sYw1mLt2/Tjzac5Xb8bl7EopDsXiNm6nmybXUolMng0J8i1arxNGnIIeHFIbLAzB366vajR5WwnA+9tmhmDrfal98mbEjIYOfDiV7qG6sbmF7+wgBOGscpP4aJxULNPaRI0Bs= Received: by 10.38.76.71 with SMTP id y71mr178571rna; Thu, 23 Dec 2004 08:58:16 -0800 (PST) Received: by 10.38.207.9 with HTTP; Thu, 23 Dec 2004 08:58:16 -0800 (PST) Message-ID: <72c6e37904122308584977c621@mail.gmail.com> Date: Thu, 23 Dec 2004 22:28:16 +0530 From: linux lover Reply-To: linux lover To: netdev@oss.sgi.com Subject: why packet is duplicated dev_queue_xmit_nit? Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13002 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux.lover2004@gmail.com Precedence: bulk X-list: netdev

Hello all,

What is the purpose of the dev_queue_xmit_nit function in dev.c? Why does it call the following statement?

ptype->func(skb2, skb->dev, ptype);

And why is skb2 created by cloning skb? I traced TCP packets and found that control goes from neigh_resolve_output directly to dev_queue_xmit_nit and then to the hardware driver, 8139too.c. Why does it not go through dev_queue_xmit?
I placed printk statements and found that after dev_queue_xmit_nit, control moves to the network interface driver. Please help me understand the path a packet takes.

Thanks in advance.
linux_lover

From eric.lemoine@gmail.com Thu Dec 23 10:10:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 10:10:56 -0800 (PST) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.203]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNIASZN030456 for ; Thu, 23 Dec 2004 10:10:48 -0800 Received: by wproxy.gmail.com with SMTP id 71so92332wra for ; Thu, 23 Dec 2004 10:11:50 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=eXTqpZdj1eCeeJb/t4XYUywbqrAEqfELXgN3/jVH1DEQkisoGDqVxXBVKc9TptnWUDTce9uZ/BlsePX3sqYHgR4lXqX0C6kx8Vqoa4M8bBieqfs76IoGEbKN3XBdVz+Hu8Fn/BD1NxIGlV+vHtX/pQs3ddcXr+nkHYSb+1HOJrA= Received: by 10.54.38.25 with SMTP id l25mr174683wrl; Thu, 23 Dec 2004 10:11:50 -0800 (PST) Received: by 10.54.30.8 with HTTP; Thu, 23 Dec 2004 10:11:50 -0800 (PST) Message-ID: <5cac192f0412231011471763f3@mail.gmail.com> Date: Thu, 23 Dec 2004 19:11:50 +0100 From: Eric Lemoine Reply-To: Eric Lemoine To: Patrick McHardy Subject: Re: LLTX and netif_stop_queue Cc: "David S.
Miller" , hadi@cyberus.ca, roland@topspin.com, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <41CAF444.3000305@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> <20041222202919.057b8331.davem@davemloft.net> <5cac192f0412230110628749e3@mail.gmail.com> <41CAF444.3000305@trash.net> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13003 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eric.lemoine@gmail.com Precedence: bulk X-list: netdev

On Thu, 23 Dec 2004 17:37:24 +0100, Patrick McHardy wrote:
> Eric Lemoine wrote:
> > I still have one concern with the LLTX code (and it may be that the
> > correct patch is Jamal's) :
> >
> > Without LLTX we do : lock(queue_lock), lock(xmit_lock),
> > release(queue_lock), release(xmit_lock). With LLTX (without Jamal's
> > patch) we do : lock(queue_lock), release(queue_lock), lock(tx_lock),
> > release(tx_lock). LLTX doesn't look correct because it creates a race
> > condition window between the two lock-protected sections. So you
> > may want to reconsider Jamal's patch or pull out LLTX...
>
> You're right, it can cause packet reordering

That's exactly what I was thinking of.

> if something like this
> happens:
>
> CPU1                    CPU2
> lock(queue_lock)
> dequeue
> unlock(queue_lock)
>                         lock(queue_lock)
>                         dequeue
>                         unlock(queue_lock)
>                         lock(xmit_lock)
>                         hard_start_xmit
>                         unlock(xmit_lock)
> lock(xmit_lock)
> hard_start_xmit
> unlock(xmit_lock)
>
> Jamal's patch should fix this.
--
Eric

From tgraf@suug.ch Thu Dec 23 11:47:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 11:47:14 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNJkjR2002643 for ; Thu, 23 Dec 2004 11:47:06 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id C467882 for ; Thu, 23 Dec 2004 20:47:52 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id A752E1C0EA; Thu, 23 Dec 2004 20:48:34 +0100 (CET) Date: Thu, 23 Dec 2004 20:48:34 +0100 From: Thomas Graf To: netdev@oss.sgi.com Subject: [RFC] Extended Generic Packet Classifier Message-ID: <20041223194834.GF7884@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13004 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev

The following patch contains the classifier I've been talking about a
few times already. It builds upon the existing cls API and provides an
extended API to so-called keys. It's not something totally new but
rather a collection of various ideas and algorithms put together.

The architecture is split into three parts: the filter management part,
a layer to abstract any kind of value, and a set of keys to implement
the actual classification algorithms.

The patch including the userspace tools can be found at
http://people.suug.ch/~tgr/egp/. Be warned, this is totally unfinished
work: some parts are not fully implemented yet and it still contains a
lot of known issues. Nevertheless, I think it is time to publish it.
tcf_proto->root contains a list of classifiers addressable by their
handle, just like u32 or fw. One can add/remove to/from this list or
change existing classifiers. Every classifier holds any number of keys
arranged in a tree where every tree node contains a list of keys
interconnectable with AND and OR with a strict left-to-right processing
order. A key's result can be inverted. A classifier matches if the
logical expression is true. This basically allows implementing any kind
of logical expression. (a and b) or (c and d) would look like this:

          x OR y
         /      \
    a AND b    c AND d

where x and y are dummy nodes representing the result of their child
nodes. The tree is implemented as an array of key lists where a key can
point to another key list in the array. This simplifies the transfer
from userspace and allows reusing parts of the tree.

This is already quite powerful but isn't new and is already nearly
doable with u32. The thing making the whole classifier powerful is the
value abstraction layer. Classifying is all about numbers, be it port
numbers, ipv4 addresses, sequence numbers, dscp values, interface
indexes, classids, nfmark or simply results of a matching procedure.
The abstraction layer takes advantage of this, hides all the different
kinds of values and brings them down to a simple integer.

The following value types have been implemented:

 o Simple u32 value (read/write)
 o Metadata such as random value, input device, real device, load
   average, nr of running processes, tcindex, socket protocol, packet
   length, socket family, data len, netfilter mark, socket receive
   queue, socket priority, ack backlog, ...
 o Kernel global register values (read/write). Can be used to
   communicate between ingress/egress.
 o Reference to another value
 o Classifier result (classid)
 o Result of key evaluation
 o Packet content, (u8/u16/u32) with support for layers, a mask and
   left/right shift to access single bits.
   The configuration parameters offset, mask, lshift, rshift are
   abstract values again, which means offsets can be dynamically
   calculated.
 o Term to combine all of the above with support for precedence by
   making use of references, so you can configure values such as
   (nfmark - register_2) >> (offset(u8 at 2@2 mask 0xf) * 4)

In order to use the API, all the values must be defined and an index
must be assigned to them. Keys are given these indices to access the
value, which basically means that a key has no knowledge about where
the data is coming from; all it gets is an integer.

The actual matching is done in the keys. The following keys have been
implemented:

 o simple_cmp: Simple comparison of two values supporting eq, ne, lt,
   le, gt, and ge. Sounds boring but the value abstraction makes this
   really powerful already.
 o nbyte: Compares a pattern against packet content at a specific
   offset. Intended for IPv6 address matching but can be used for any
   kind of pattern.
 o kmp: Knuth-Morris-Pratt text search. Is basically equal to nbyte
   except that it looks for the pattern in a given range.
 o regexp: Very simple regular expression to match dynamic data.
   Supports wildcard, specific characters and various groups such as
   digit, xdigit, print, alpha, ... Allows recurrences of 1, 0..1,
   0..n, and 1..n. That's it, very simple but fair enough for
   classification.
 o true: always true
 o cmd: Implements a pseudo machine similar to BPF. Processes a list
   of instructions with up to 3 arguments until a RET is processed or
   the tail of the list has been found. A hard limit of backward jumps
   can be configured to avoid endless loops. Supports all the basic
   calculation instructions, basic branching instructions and some
   specific instructions to convert numbers from network to host byte
   order and vice versa, shortcuts to make the filter match and an
   instruction to write a character or a number to the console.
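The key tree described earlier, e.g. (a and b) or (c and d) stored as an array of key lists, can be sketched in plain C roughly as follows. This is not the code from the patch: `struct key`, `struct key_list`, `eval_list()`, `MAX_DEPTH`, and the `value` field standing in for a real match are all assumptions made for illustration, and the recursion limit is only one possible way to guard against lists that reference each other.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8     /* assumed guard against cyclic list references */

/* How a key is connected to the key that follows it in the same list. */
enum rel { REL_END, REL_AND, REL_OR };

struct key {
    int value;      /* stand-in for a real match: nonzero means "matched" */
    int invert;     /* nonzero: negate this key's result */
    int sublist;    /* index of a child key list, or -1 for a leaf key */
    enum rel next;  /* relation to the following key */
};

struct key_list {
    const struct key *keys;
    size_t nkeys;
};

/*
 * Evaluate key list `idx` strictly left to right, with no operator
 * precedence: "a OR b AND c" is computed as "(a OR b) AND c".
 * A key with sublist >= 0 takes its result from that list instead
 * of from its own value.
 */
static int eval_list(const struct key_list *lists, size_t nlists,
                     size_t idx, int depth)
{
    enum rel pending = REL_OR;  /* "0 OR first key" yields the first key */
    int result = 0;

    assert(lists != NULL);
    if (idx >= nlists || depth > MAX_DEPTH)
        return 0;

    for (size_t i = 0; i < lists[idx].nkeys; i++) {
        const struct key *k = &lists[idx].keys[i];
        int r = (k->sublist >= 0)
            ? eval_list(lists, nlists, (size_t)k->sublist, depth + 1)
            : (k->value != 0);

        if (k->invert)
            r = !r;
        result = (pending == REL_AND) ? (result && r) : (result || r);
        pending = k->next;
        if (pending == REL_END)
            break;
    }
    return result;
}

/* Builds the example tree: list 0 is "x OR y", lists 1 and 2 the leaves. */
static int match_ab_or_cd(int a, int b, int c, int d)
{
    const struct key ab[]   = { { a, 0, -1, REL_AND }, { b, 0, -1, REL_END } };
    const struct key cd[]   = { { c, 0, -1, REL_AND }, { d, 0, -1, REL_END } };
    const struct key root[] = { { 0, 0, 1, REL_OR },   { 0, 0, 2, REL_END } };
    const struct key_list lists[] = { { root, 2 }, { ab, 2 }, { cd, 2 } };

    return eval_list(lists, 3, 0, 0);
}
```

Because a key refers to a child list by index, the same list can be referenced from several places, which is the reuse the array layout is said to allow.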
That's all there is in the kernel part. Describing the userspace part
would take too long and it's probably easier if you look at examples/.
Just a few words: the configuration is done by writing a .egp file,
which is then processed by a preprocessor and converted to an XML-based
format that can be loaded into the kernel with the ectl tool. The .egp
language is fairly intuitive and a mix of C, functional languages and
assembler. The highlight is probably the internal on-the-fly converter
that converts a C-like language into the instruction set understood by
the cmd key.

I attached a small example dumping the packet content like below to
show how it can be abused.

-- Dumping 0x56 octets --
00 02 44 63 ca 27 00 02 44 63 ed 53 08 00 45 00  ..Dc.'..Dc.S..E.
00 48 47 4d 40 00 40 11 43 f9 c0 a8 17 01 c0 a8  .HGM@.@.C.......
17 0d 80 2f 00 35 00 34 15 f3 c9 4f 01 00 00 01  .../.5.4...O....
00 00 00 00 00 00 02 31 32 02 32 33 03 31 36 38  .......12.23.168
03 31 39 32 07 69 6e 2d 61 64 64 72 04 61 72 70  .192.in-addr.arp
61 00 00 0c 00 01                                a.....
Pay attention to the semicolon after the label, it's needed for now ;->

result default 0:0; /* default class */

off := 0;
pos := 1;
tmp := 0;

main() {
    cmd {
        puts("-- Dumping 0x");
        puti(%PKTLEN);
        puts(" octets --\n");

        for (off = 0, pos = 1; off < %PKTLEN; off++) {
            puti(offset(u8 at off@0));
            puts(" ");

            if (pos == 16) {
print_ascii:;
                puts(" ");
                while (pos > 0) {
                    tmp = offset(u8 at {off-(pos-1)}@0);
                    if (tmp >= 0x20) {
                        if (tmp <= 0x7e)
                            putc(tmp);
                        else
                            puts(".");
                    } else
                        puts(".");
                    pos--;
                }
                puts("\n");
                pos = 1;
            } else
                pos++;
        }

        if (pos > 1) {
            for (tmp = pos; tmp <= 16; tmp++) {
                puts(" ");
            }
            off--;
            pos--;
            goto print_ascii;
        }

        puts("\n");
        return 1;
    }
}

From afleming@freescale.com Thu Dec 23 12:25:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 12:25:15 -0800 (PST) Received: from de01egw02.freescale.net (de01egw02.freescale.net [192.88.165.103]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNKOlwb008190 for ; Thu, 23 Dec 2004 12:25:07 -0800 Received: from de01smr02.am.mot.com (de01smr02 [10.208.0.151]) by de01egw02.freescale.net (8.12.11/de01egw02) with ESMTP id iBNKSRMH024778; Thu, 23 Dec 2004 13:28:27 -0700 (MST) Received: from [10.82.17.240] ([10.82.17.240]) by de01smr02.am.mot.com (8.13.1/8.13.0) with ESMTP id iBNKUSLR022519; Thu, 23 Dec 2004 14:30:29 -0600 (CST) Mime-Version: 1.0 (Apple Message framework v619) Content-Type: multipart/mixed; boundary=Apple-Mail-2-967209165 Message-Id: Cc: Kumar Gala , Embedded PPC Linux list From: Andy Fleming Subject: [RFC] Patch to Abstract Ethernet PHY support (using driver model) Date: Thu, 23 Dec 2004 13:01:00 -0600 To: Netdev X-Mailer: Apple Mail (2.619) X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13005 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: afleming@freescale.com Precedence: bulk X-list: netdev
--Apple-Mail-2-967209165
Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII; format=flowed Adds a Phy Abstraction Layer which allows ethernet controllers to manage their PHYs without knowing the details of how the particular PHY device operates. This code steals heavily from BenH's sungem driver, and got some stuff out of Jason McMullan's patch. Primary features of the code: * Allows drivers to only use what they want (to a degree). If you want to handle it all yourself, but use some of the data structures and functions, that's ok. If you want to handle your own interrupts, that's fine. However, it also allows you to minimize PHY management code. See the gianfar driver patches (included for reference). * Integrates with current ethtool/mii defined fields. * Uses the driver model to manage binding PHY drivers to PHY devices, and MDIO bus drivers to MDIO bus devices. * Doesn't affect drivers which don't use it. Here's the patch: --Apple-Mail-2-967209165 Content-Transfer-Encoding: 7bit Content-Type: application/octet-stream; x-unix-mode=0644; name="phy_12232004.patch" Content-Disposition: attachment; filename=phy_12232004.patch diff -Nru a/arch/ppc/platforms/85xx/Makefile b/arch/ppc/platforms/85xx/Makefile --- a/arch/ppc/platforms/85xx/Makefile 2004-12-23 12:39:15 -06:00 +++ b/arch/ppc/platforms/85xx/Makefile 2004-12-23 12:39:15 -06:00 @@ -1,6 +1,7 @@ # # Makefile for the PowerPC 85xx linux kernel. 
# +obj-$(CONFIG_85xx) += mpc85xx.o obj-$(CONFIG_MPC8540_ADS) += mpc85xx_ads_common.o mpc8540_ads.o obj-$(CONFIG_MPC8555_CDS) += mpc85xx_cds_common.o diff -Nru a/arch/ppc/platforms/85xx/mpc8540.c b/arch/ppc/platforms/85xx/mpc8540.c --- a/arch/ppc/platforms/85xx/mpc8540.c 2004-12-23 12:39:16 -06:00 +++ b/arch/ppc/platforms/85xx/mpc8540.c 2004-12-23 12:39:16 -06:00 @@ -22,7 +22,7 @@ extern struct ocp_gfar_data mpc85xx_tsec1_def; extern struct ocp_gfar_data mpc85xx_tsec2_def; extern struct ocp_gfar_data mpc85xx_fec_def; -extern struct ocp_mpc_i2c_data mpc85xx_i2c1_def; +extern struct ocp_mpc_fs_data mpc85xx_i2c1_def; /* We use offsets for paddr since we do not know at compile time * what CCSRBAR is, platform code should fix this up in diff -Nru a/arch/ppc/platforms/85xx/mpc8540_ads.c b/arch/ppc/platforms/85xx/mpc8540_ads.c --- a/arch/ppc/platforms/85xx/mpc8540_ads.c 2004-12-23 12:39:15 -06:00 +++ b/arch/ppc/platforms/85xx/mpc8540_ads.c 2004-12-23 12:39:15 -06:00 @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -48,50 +49,24 @@ #include #include #include -#include #include +#include #include #include struct ocp_gfar_data mpc85xx_tsec1_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC1_TX, - .interruptError = MPC85xx_IRQ_TSEC1_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC1_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON - | GFAR_HAS_PHY_INTR | GFAR_HAS_COALESCE), - .phyid = 0, - .phyregidx = 0, }; - struct ocp_gfar_data mpc85xx_tsec2_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC2_TX, - .interruptError = MPC85xx_IRQ_TSEC2_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC2_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON - | GFAR_HAS_PHY_INTR | GFAR_HAS_COALESCE), - .phyid = 1, - .phyregidx = 0, }; - struct ocp_gfar_data mpc85xx_fec_def = { - .interruptTransmit = MPC85xx_IRQ_FEC, - .interruptError = MPC85xx_IRQ_FEC, - 
.interruptReceive = MPC85xx_IRQ_FEC, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = 0, - .phyid = 3, - .phyregidx = 0, }; - struct ocp_fs_i2c_data mpc85xx_i2c1_def = { .flags = FS_I2C_SEPARATE_DFSRR, }; +extern void * get_platform_data(enum fsl_devices dev); + /* ************************************************************************ * * Setup the architecture @@ -100,10 +75,10 @@ static void __init mpc8540ads_setup_arch(void) { - struct ocp_def *def; - struct ocp_gfar_data *einfo; bd_t *binfo = (bd_t *) __res; unsigned int freq; + struct gianfar_platform_data *pdata; + struct gianfar_mdio_data *mdata; /* get the core frequency */ freq = binfo->bi_intfreq; @@ -130,23 +105,26 @@ invalidate_tlbcam_entry(NUM_TLBCAMS - 1); #endif - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 0); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enetaddr, 6); - } - - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 1); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enet1addr, 6); - } - - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 2); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enet2addr, 6); - } + /* setup the board related information for the enet controllers */ + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC1); + pdata->bus_id = "phy0:0"; + memcpy(pdata->mac_addr, binfo->bi_enetaddr, 6); + + mdata = (struct gianfar_mdio_data *) get_platform_data(MPC85xx_MDIO); + mdata->paddr += binfo->bi_immr_base; + memset(&mdata->irq, -1, sizeof(mdata->irq)); + mdata->irq[0] = MPC85xx_IRQ_EXT5; + mdata->irq[1] = MPC85xx_IRQ_EXT5; + mdata->irq[2] = MPC85xx_IRQ_EXT5; + mdata->irq[3] = MPC85xx_IRQ_EXT5; + + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC2); + pdata->bus_id = "phy0:1"; + memcpy(pdata->mac_addr, binfo->bi_enet1addr, 6); + + pdata = 
(struct gianfar_platform_data *) get_platform_data(MPC85xx_FEC); + pdata->bus_id = "phy0:3"; + memcpy(pdata->mac_addr, binfo->bi_enet2addr, 6); #ifdef CONFIG_BLK_DEV_INITRD if (initrd_start) @@ -158,8 +136,6 @@ #else ROOT_DEV = Root_HDA1; #endif - - ocp_for_each_device(mpc85xx_update_paddr_ocp, &(binfo->bi_immr_base)); } /* ************************************************************************ */ diff -Nru a/arch/ppc/platforms/85xx/mpc8560_ads.c b/arch/ppc/platforms/85xx/mpc8560_ads.c --- a/arch/ppc/platforms/85xx/mpc8560_ads.c 2004-12-23 12:39:16 -06:00 +++ b/arch/ppc/platforms/85xx/mpc8560_ads.c 2004-12-23 12:39:16 -06:00 @@ -48,7 +48,6 @@ #include #include #include -#include #include #include @@ -59,33 +58,15 @@ extern void cpm2_reset(void); struct ocp_gfar_data mpc85xx_tsec1_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC1_TX, - .interruptError = MPC85xx_IRQ_TSEC1_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC1_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON | GFAR_HAS_COALESCE - | GFAR_HAS_PHY_INTR), - .phyid = 0, - .phyregidx = 0, }; - struct ocp_gfar_data mpc85xx_tsec2_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC2_TX, - .interruptError = MPC85xx_IRQ_TSEC2_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC2_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON | GFAR_HAS_COALESCE - | GFAR_HAS_PHY_INTR), - .phyid = 1, - .phyregidx = 0, }; - struct ocp_fs_i2c_data mpc85xx_i2c1_def = { .flags = FS_I2C_SEPARATE_DFSRR, }; +extern void * get_platform_data(enum fsl_devices dev); + /* ************************************************************************ * * Setup the architecture @@ -95,10 +76,10 @@ static void __init mpc8560ads_setup_arch(void) { - struct ocp_def *def; - struct ocp_gfar_data *einfo; bd_t *binfo = (bd_t *) __res; unsigned int freq; + struct gianfar_platform_data *pdata; + struct gianfar_mdio_data *mdata; cpm2_reset(); @@ -117,17 
+98,22 @@ mpc85xx_setup_hose(); #endif - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 0); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enetaddr, 6); - } - - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 1); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enet1addr, 6); - } + /* setup the board related information for the enet controllers */ + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC1); + pdata->bus_id = "phy0:0"; + memcpy(pdata->mac_addr, binfo->bi_enetaddr, 6); + + mdata = (struct gianfar_mdio_data *) get_platform_data(MPC85xx_MDIO); + mdata->paddr += binfo->bi_immr_base; + memset(&mdata->irq, -1, sizeof(mdata->irq)); + mdata->irq[0] = MPC85xx_IRQ_EXT5; + mdata->irq[1] = MPC85xx_IRQ_EXT5; + mdata->irq[2] = MPC85xx_IRQ_EXT5; + mdata->irq[3] = MPC85xx_IRQ_EXT5; + + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC2); + pdata->bus_id = "phy0:1"; + memcpy(pdata->mac_addr, binfo->bi_enet1addr, 6); #ifdef CONFIG_BLK_DEV_INITRD if (initrd_start) @@ -139,8 +125,6 @@ #else ROOT_DEV = Root_HDA1; #endif - - ocp_for_each_device(mpc85xx_update_paddr_ocp, &(binfo->bi_immr_base)); } static irqreturn_t cpm2_cascade(int irq, void *dev_id, struct pt_regs *regs) diff -Nru a/drivers/base/platform.c b/drivers/base/platform.c --- a/drivers/base/platform.c 2004-12-23 12:39:15 -06:00 +++ b/drivers/base/platform.c 2004-12-23 12:39:15 -06:00 @@ -58,6 +58,42 @@ } /** + * platform_get_resource_byname - get a resource for a device by name + * @dev: platform device + * @type: resource type + * @name: resource name + */ +struct resource * +platform_get_resource_byname(struct platform_device *dev, unsigned int type, + char * name) +{ + int i; + + for (i = 0; i < dev->num_resources; i++) { + struct resource *r = &dev->resource[i]; + + if ((r->flags & (IORESOURCE_IO|IORESOURCE_MEM| + 
IORESOURCE_IRQ|IORESOURCE_DMA)) + == type) + if (!strcmp(r->name, name)) + return r; + } + return NULL; +} + +/** + * platform_get_irq - get an IRQ for a device + * @dev: platform device + * @name: IRQ name + */ +int platform_get_irq_byname(struct platform_device *dev, char * name) +{ + struct resource *r = platform_get_resource_byname(dev, IORESOURCE_IRQ, name); + + return r ? r->start : 0; +} + +/** * platform_add_devices - add a numbers of platform devices * @devs: array of platform devices to add * @num: number of platform devices in array @@ -103,7 +139,8 @@ for (i = 0; i < pdev->num_resources; i++) { struct resource *p, *r = &pdev->resource[i]; - r->name = pdev->dev.bus_id; + if (r->name == NULL) + r->name = pdev->dev.bus_id; p = NULL; if (r->flags & IORESOURCE_MEM) @@ -308,3 +345,5 @@ EXPORT_SYMBOL_GPL(platform_device_unregister); EXPORT_SYMBOL_GPL(platform_get_irq); EXPORT_SYMBOL_GPL(platform_get_resource); +EXPORT_SYMBOL_GPL(platform_get_irq_byname); +EXPORT_SYMBOL_GPL(platform_get_resource_byname); diff -Nru a/drivers/net/Kconfig b/drivers/net/Kconfig --- a/drivers/net/Kconfig 2004-12-23 12:39:16 -06:00 +++ b/drivers/net/Kconfig 2004-12-23 12:39:16 -06:00 @@ -155,6 +155,8 @@ source "drivers/net/arcnet/Kconfig" endif +source "drivers/net/phy/Kconfig" + # # Ethernet # @@ -188,14 +190,6 @@ kernel: saying N will just cause the configurator to skip all the questions about Ethernet network cards. If unsure, say N. -config MII - tristate "Generic Media Independent Interface device support" - depends on NET_ETHERNET - help - Most ethernet controllers have MII transceiver either as an external - or internal device. It is safe to say Y or M here even if your - ethernet card lack MII. - source "drivers/net/arm/Kconfig" config MACE @@ -2079,17 +2073,6 @@ To compile this driver as a module, choose M here: the module will be called tg3. This is recommended. 
-config GIANFAR - tristate "Gianfar Ethernet" - depends on 85xx - help - This driver supports the Gigabit TSEC on the MPC85xx - family of chips, and the FEC on the 8540 - -config GFAR_NAPI - bool "NAPI Support" - depends on GIANFAR - config MV643XX_ETH tristate "MV-643XX Ethernet support" depends on MOMENCO_OCELOT_C || MOMENCO_JAGUAR_ATX @@ -2117,6 +2100,18 @@ help This enables support for Port 2 of the Marvell MV643XX Gigabit Ethernet. + +config GIANFAR + tristate "Gianfar Ethernet" + depends on 85xx + depends on MII + help + This driver supports the Gigabit TSEC on the MPC85xx + family of chips, and the FEC on the 8540 + +config GFAR_NAPI + bool "NAPI Support" + depends on GIANFAR endmenu diff -Nru a/drivers/net/Makefile b/drivers/net/Makefile --- a/drivers/net/Makefile 2004-12-23 12:39:15 -06:00 +++ b/drivers/net/Makefile 2004-12-23 12:39:15 -06:00 @@ -12,7 +12,7 @@ obj-$(CONFIG_BONDING) += bonding/ obj-$(CONFIG_GIANFAR) += gianfar_driver.o -gianfar_driver-objs := gianfar.o gianfar_ethtool.o gianfar_phy.o +gianfar_driver-objs := gianfar.o gianfar_ethtool.o gianfar_mii.o # # link order important here @@ -63,6 +63,7 @@ # obj-$(CONFIG_MII) += mii.o +obj-$(CONFIG_MII) += phy/ obj-$(CONFIG_SUNDANCE) += sundance.o obj-$(CONFIG_HAMACHI) += hamachi.o diff -Nru a/drivers/net/gianfar.c b/drivers/net/gianfar.c --- a/drivers/net/gianfar.c 2004-12-23 12:39:15 -06:00 +++ b/drivers/net/gianfar.c 2004-12-23 12:39:15 -06:00 @@ -1,4 +1,4 @@ -/* +/* * drivers/net/gianfar.c * * Gianfar Ethernet Driver @@ -24,27 +24,22 @@ * Theory of operation * This driver is designed for the Triple-speed Ethernet * controllers on the Freescale 8540/8560 integrated processors, - * as well as the Fast Ethernet Controller on the 8540. - * - * The driver is initialized through OCP. Structures which - * define the configuration needed by the board are defined in a - * board structure in arch/ppc/platforms (though I do not + * as well as the Fast Ethernet Controller on the 8540. 
+ * + * The driver is initialized through platform_device. Structures + * which define the configuration needed by the board are defined + * in a board structure in arch/ppc/platforms (though I do not * discount the possibility that other architectures could one - * day be supported. One assumption the driver currently makes - * is that the PHY is configured in such a way to advertise all - * capabilities. This is a sensible default, and on certain - * PHYs, changing this default encounters substantial errata - * issues. Future versions may remove this requirement, but for - * now, it is best for the firmware to ensure this is the case. + * day be supported. * * The Gianfar Ethernet Controller uses a ring of buffer * descriptors. The beginning is indicated by a register - * pointing to the physical address of the start of the ring. - * The end is determined by a "wrap" bit being set in the + * pointing to the physical address of the start of the ring. + * The end is determined by a "wrap" bit being set in the * last descriptor of the ring. * * When a packet is received, the RXF bit in the - * IEVENT register is set, triggering an interrupt when the + * IEVENT register is set, triggering an interrupt when the * corresponding bit in the IMASK register is also set (if * interrupt coalescing is active, then the interrupt may not * happen immediately, but will wait until either a set number @@ -52,7 +47,7 @@ * interrupt handler will signal there is work to be done, and * exit. Without NAPI, the packet(s) will be handled * immediately. Both methods will start at the last known empty - * descriptor, and process every subsequent descriptor until there + * descriptor, and process every subsequent descriptor until there * are none left with data (NAPI will stop after a set number of * packets to give time to other tasks, but will eventually * process all the packets). 
The data arrives inside a @@ -76,6 +71,7 @@ #include #include #include +#include #include #include #include @@ -85,6 +81,7 @@ #include #include #include +#include #include #include @@ -93,9 +90,11 @@ #include #include #include +#include +#include #include "gianfar.h" -#include "gianfar_phy.h" +#include "gianfar_mii.h" #define TX_TIMEOUT (1*HZ) #define SKB_ALLOC_TIMEOUT 1000000 @@ -109,9 +108,8 @@ #endif const char gfar_driver_name[] = "Gianfar Ethernet"; -const char gfar_driver_version[] = "1.1"; +const char gfar_driver_version[] = "1.2"; -int startup_gfar(struct net_device *dev); static int gfar_enet_open(struct net_device *dev); static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev); static void gfar_timeout(struct net_device *dev); @@ -122,17 +120,13 @@ static int gfar_change_mtu(struct net_device *dev, int new_mtu); static irqreturn_t gfar_error(int irq, void *dev_id, struct pt_regs *regs); static irqreturn_t gfar_transmit(int irq, void *dev_id, struct pt_regs *regs); -irqreturn_t gfar_receive(int irq, void *dev_id, struct pt_regs *regs); static irqreturn_t gfar_interrupt(int irq, void *dev_id, struct pt_regs *regs); -static irqreturn_t phy_interrupt(int irq, void *dev_id, struct pt_regs *regs); -static void gfar_phy_change(void *data); -static void gfar_phy_timer(unsigned long data); -static void adjust_link(struct net_device *dev); +static void adjust_link(struct device *dev); static void init_registers(struct net_device *dev); static int init_phy(struct net_device *dev); -static int gfar_probe(struct ocp_device *ocpdev); -static void gfar_remove(struct ocp_device *ocpdev); -void free_skb_resources(struct gfar_private *priv); +static int gfar_probe(struct device *device); +static int gfar_remove(struct device *device); +static void free_skb_resources(struct gfar_private *priv); static void gfar_set_multi(struct net_device *dev); static void gfar_set_hash_for_addr(struct net_device *dev, u8 *addr); #ifdef CONFIG_GFAR_NAPI @@ -140,57 +134,38 @@ 
#endif static int gfar_clean_rx_ring(struct net_device *dev, int rx_work_limit); static int gfar_process_frame(struct net_device *dev, struct sk_buff *skb, int length); -static void gfar_phy_startup_timer(unsigned long data); - -extern struct ethtool_ops gfar_ethtool_ops; MODULE_AUTHOR("Freescale Semiconductor, Inc"); MODULE_DESCRIPTION("Gianfar Ethernet Driver"); MODULE_LICENSE("GPL"); -/* Called by the ocp code to initialize device data structures - * required for bringing up the device - * returns 0 on success */ -static int gfar_probe(struct ocp_device *ocpdev) +/* Set up the ethernet device structure, private data, + * and anything else we need before we start */ +static int gfar_probe(struct device *device) { u32 tempval; - struct ocp_device *mdiodev; struct net_device *dev = NULL; struct gfar_private *priv = NULL; - struct ocp_gfar_data *einfo; + struct platform_device *pdev = to_platform_device(device); + struct gianfar_platform_data *einfo; + struct resource *r; int idx; int err = 0; int dev_ethtool_ops = 0; - einfo = (struct ocp_gfar_data *) ocpdev->def->additions; + einfo = (struct gianfar_platform_data *) pdev->dev.platform_data; if (einfo == NULL) { printk(KERN_ERR "gfar %d: Missing additional data!\n", - ocpdev->def->index); + pdev->id); return -ENODEV; } - /* get a pointer to the register memory which can - * configure the PHYs. 
If it's different from this set, - * get the device which has those regs */ - if ((einfo->phyregidx >= 0) && - (einfo->phyregidx != ocpdev->def->index)) { - mdiodev = ocp_find_device(OCP_ANY_ID, - OCP_FUNC_GFAR, einfo->phyregidx); - - /* If the device which holds the MDIO regs isn't - * up, wait for it to come up */ - if (mdiodev == NULL) - return -EAGAIN; - } else { - mdiodev = ocpdev; - } - /* Create an ethernet device instance */ dev = alloc_etherdev(sizeof (*priv)); - if (dev == NULL) + if (NULL == dev) return -ENOMEM; priv = netdev_priv(dev); @@ -198,27 +173,28 @@ /* Set the info in the priv to the current info */ priv->einfo = einfo; + /* fill out IRQ fields */ + if (einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) { + priv->interruptTransmit = platform_get_irq_byname(pdev, "tx"); + priv->interruptReceive = platform_get_irq_byname(pdev, "rx"); + priv->interruptError = platform_get_irq_byname(pdev, "error"); + } else { + priv->interruptTransmit = platform_get_irq(pdev, 0); + } + /* get a pointer to the register memory */ + r = platform_get_resource(pdev, IORESOURCE_MEM, 0); priv->regs = (struct gfar *) - ioremap(ocpdev->def->paddr, sizeof (struct gfar)); + ioremap(r->start, sizeof (struct gfar)); if (priv->regs == NULL) { err = -ENOMEM; goto regs_fail; } - /* Set the PHY base address */ - priv->phyregs = (struct gfar *) - ioremap(mdiodev->def->paddr, sizeof (struct gfar)); - - if (priv->phyregs == NULL) { - err = -ENOMEM; - goto phy_regs_fail; - } - spin_lock_init(&priv->lock); - ocp_set_drvdata(ocpdev, dev); + dev_set_drvdata(device, dev); /* Stop the DMA engine now, in case it was running before */ /* (The firmware could have used it, and left it running). 
*/ @@ -255,7 +231,7 @@ dev->base_addr = (unsigned long) (priv->regs); SET_MODULE_OWNER(dev); - SET_NETDEV_DEV(dev, &ocpdev->dev); + SET_NETDEV_DEV(dev, device); /* Fill in the dev structure */ dev->open = gfar_enet_open; @@ -274,10 +250,10 @@ /* Index into the array of possible ethtool * ops to catch all 4 possibilities */ - if((priv->einfo->flags & GFAR_HAS_RMON) == 0) + if((priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_RMON) == 0) dev_ethtool_ops += 1; - if((priv->einfo->flags & GFAR_HAS_COALESCE) == 0) + if((priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_COALESCE) == 0) dev_ethtool_ops += 2; dev->ethtool_ops = gfar_op_array[dev_ethtool_ops]; @@ -324,119 +300,59 @@ return 0; register_fail: - iounmap((void *) priv->phyregs); -phy_regs_fail: iounmap((void *) priv->regs); regs_fail: free_netdev(dev); - return -ENOMEM; + return err; } -static void gfar_remove(struct ocp_device *ocpdev) +static int gfar_remove(struct device *device) { - struct net_device *dev = ocp_get_drvdata(ocpdev); + struct net_device *dev = dev_get_drvdata(device); struct gfar_private *priv = netdev_priv(dev); - ocp_set_drvdata(ocpdev, NULL); + dev_set_drvdata(device, NULL); iounmap((void *) priv->regs); - iounmap((void *) priv->phyregs); free_netdev(dev); + + return 0; } -/* Configure the PHY for dev. - * returns 0 if success. -1 if failure + +/* Initializes driver PHY state, and attaches to the PHY. + * Returns 0 on success, errno on failure to attach. */ static int init_phy(struct net_device *dev) { struct gfar_private *priv = netdev_priv(dev); - struct phy_info *curphy; - unsigned int timeout = PHY_INIT_TIMEOUT; - struct gfar *phyregs = priv->phyregs; - struct gfar_mii_info *mii_info; - int err; + uint gigabit_support = + priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_GIGABIT ? 
+ SUPPORTED_1000baseT_Full : 0; + struct phy_device *phydev; priv->oldlink = 0; priv->oldspeed = 0; priv->oldduplex = -1; - mii_info = kmalloc(sizeof(struct gfar_mii_info), - GFP_KERNEL); - - if(NULL == mii_info) { - printk(KERN_ERR "%s: Could not allocate mii_info\n", - dev->name); - return -ENOMEM; - } - - mii_info->speed = SPEED_1000; - mii_info->duplex = DUPLEX_FULL; - mii_info->pause = 0; - mii_info->link = 1; - - mii_info->advertising = (ADVERTISED_10baseT_Half | - ADVERTISED_10baseT_Full | - ADVERTISED_100baseT_Half | - ADVERTISED_100baseT_Full | - ADVERTISED_1000baseT_Full); - mii_info->autoneg = 1; - - mii_info->mii_id = priv->einfo->phyid; - - mii_info->dev = dev; - - mii_info->mdio_read = &read_phy_reg; - mii_info->mdio_write = &write_phy_reg; - - priv->mii_info = mii_info; - - /* Reset the management interface */ - gfar_write(&phyregs->miimcfg, MIIMCFG_RESET); + phydev = phy_connect(dev->class_dev.dev, priv->einfo->bus_id, + &adjust_link); - /* Setup the MII Mgmt clock speed */ - gfar_write(&phyregs->miimcfg, MIIMCFG_INIT_VALUE); - - /* Wait until the bus is free */ - while ((gfar_read(&phyregs->miimind) & MIIMIND_BUSY) && - timeout--) - cpu_relax(); - - if(timeout <= 0) { - printk(KERN_ERR "%s: The MII Bus is stuck!\n", - dev->name); - err = -1; - goto bus_fail; - } - - /* get info for this PHY */ - curphy = get_phy_info(priv->mii_info); - - if (curphy == NULL) { - printk(KERN_ERR "%s: No PHY found\n", dev->name); - err = -1; - goto no_phy; + if(NULL == phydev) { + printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name); + return errno; } - mii_info->phyinfo = curphy; + /* Remove any features not supported by the controller */ + phydev->supported &= (GFAR_SUPPORTED | gigabit_support); + phydev->advertising = phydev->supported; - /* Run the commands which initialize the PHY */ - if(curphy->init) { - err = curphy->init(priv->mii_info); - - if (err) - goto phy_init_fail; - } + priv->phydev = phydev; return 0; - -phy_init_fail: -no_phy: -bus_fail: - 
kfree(mii_info); - - return err; } + static void init_registers(struct net_device *dev) { struct gfar_private *priv = netdev_priv(dev); @@ -470,7 +386,7 @@ gfar_write(&priv->regs->rctrl, 0x00000000); /* Zero out the rmon mib registers if it has them */ - if (priv->einfo->flags & GFAR_HAS_RMON) { + if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_RMON) { memset((void *) &(priv->regs->rmon), 0, sizeof (struct rmon_mib)); @@ -506,13 +422,11 @@ unsigned long flags; u32 tempval; + phy_stop(priv->phydev); + /* Lock it down */ spin_lock_irqsave(&priv->lock, flags); - /* Tell the kernel the link is down */ - priv->mii_info->link = 0; - adjust_link(dev); - /* Mask all interrupts */ gfar_write(&regs->imask, IMASK_INIT_CLEAR); @@ -536,30 +450,15 @@ tempval &= ~(MACCFG1_RX_EN | MACCFG1_TX_EN); gfar_write(&regs->maccfg1, tempval); - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) { - /* Clear any pending interrupts */ - mii_clear_phy_interrupt(priv->mii_info); - - /* Disable PHY Interrupts */ - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_DISABLED); - } - spin_unlock_irqrestore(&priv->lock, flags); /* Free the IRQs */ - if (priv->einfo->flags & GFAR_HAS_MULTI_INTR) { - free_irq(priv->einfo->interruptError, dev); - free_irq(priv->einfo->interruptTransmit, dev); - free_irq(priv->einfo->interruptReceive, dev); + if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) { + free_irq(priv->interruptError, dev); + free_irq(priv->interruptTransmit, dev); + free_irq(priv->interruptReceive, dev); } else { - free_irq(priv->einfo->interruptTransmit, dev); - } - - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) { - free_irq(priv->einfo->interruptPHY, dev); - } else { - del_timer_sync(&priv->phy_info_timer); + free_irq(priv->interruptTransmit, dev); } free_skb_resources(priv); @@ -573,7 +472,7 @@ /* If there are any tx skbs or rx skbs still around, free them. 
* Then free tx_skbuff and rx_skbuff */ -void free_skb_resources(struct gfar_private *priv) +static void free_skb_resources(struct gfar_private *priv) { struct rxbd8 *rxbdp; struct txbd8 *txbdp; @@ -638,7 +537,7 @@ gfar_write(&regs->imask, IMASK_INIT_CLEAR); /* Allocate memory for the buffer descriptors */ - vaddr = (unsigned long) dma_alloc_coherent(NULL, + vaddr = (unsigned long) dma_alloc_coherent(NULL, sizeof (struct txbd8) * priv->tx_ring_size + sizeof (struct rxbd8) * priv->rx_ring_size, &addr, GFP_KERNEL); @@ -727,54 +626,48 @@ /* If the device has multiple interrupts, register for * them. Otherwise, only register for the one */ - if (priv->einfo->flags & GFAR_HAS_MULTI_INTR) { - /* Install our interrupt handlers for Error, + if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) { + /* Install our interrupt handlers for Error, * Transmit, and Receive */ - if (request_irq(priv->einfo->interruptError, gfar_error, + if (request_irq(priv->interruptError, gfar_error, 0, "enet_error", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d\n", - dev->name, priv->einfo->interruptError); + dev->name, priv->interruptError); err = -1; goto err_irq_fail; } - if (request_irq(priv->einfo->interruptTransmit, gfar_transmit, + if (request_irq(priv->interruptTransmit, gfar_transmit, 0, "enet_tx", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d\n", - dev->name, priv->einfo->interruptTransmit); + dev->name, priv->interruptTransmit); err = -1; goto tx_irq_fail; } - if (request_irq(priv->einfo->interruptReceive, gfar_receive, + if (request_irq(priv->interruptReceive, gfar_receive, 0, "enet_rx", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d (receive0)\n", - dev->name, priv->einfo->interruptReceive); + dev->name, priv->interruptReceive); err = -1; goto rx_irq_fail; } } else { - if (request_irq(priv->einfo->interruptTransmit, gfar_interrupt, + if (request_irq(priv->interruptTransmit, gfar_interrupt, 0, "gfar_interrupt", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d\n", - 
dev->name, priv->einfo->interruptError); + dev->name, priv->interruptError); err = -1; goto err_irq_fail; } } - /* Set up the PHY change work queue */ - INIT_WORK(&priv->tq, gfar_phy_change, dev); - - init_timer(&priv->phy_info_timer); - priv->phy_info_timer.function = &gfar_phy_startup_timer; - priv->phy_info_timer.data = (unsigned long) priv->mii_info; - mod_timer(&priv->phy_info_timer, jiffies + HZ); + phy_start(priv->phydev); /* Configure the coalescing support */ if (priv->txcoalescing) @@ -815,9 +708,9 @@ return 0; rx_irq_fail: - free_irq(priv->einfo->interruptTransmit, dev); + free_irq(priv->interruptTransmit, dev); tx_irq_fail: - free_irq(priv->einfo->interruptError, dev); + free_irq(priv->interruptError, dev); err_irq_fail: rx_skb_fail: free_skb_resources(priv); @@ -828,11 +721,6 @@ priv->tx_bd_base, gfar_read(&regs->tbase)); - if (priv->mii_info->phyinfo->close) - priv->mii_info->phyinfo->close(priv->mii_info); - - kfree(priv->mii_info); - return err; } @@ -880,7 +768,7 @@ /* Set buffer length and pointer */ txbdp->length = skb->len; - txbdp->bufPtr = dma_map_single(NULL, skb->data, + txbdp->bufPtr = dma_map_single(NULL, skb->data, skb->len, DMA_TO_DEVICE); /* Save the skb pointer so we can free it later */ @@ -932,11 +820,9 @@ struct gfar_private *priv = netdev_priv(dev); stop_gfar(dev); - /* Shutdown the PHY */ - if (priv->mii_info->phyinfo->close) - priv->mii_info->phyinfo->close(priv->mii_info); - - kfree(priv->mii_info); + /* Disconnect from the PHY */ + phy_disconnect(priv->phydev); + priv->phydev = NULL; netif_stop_queue(dev); @@ -1122,7 +1008,7 @@ skb->dev = dev; bdp->bufPtr = dma_map_single(NULL, skb->data, - priv->rx_buffer_size + RXBUF_ALIGNMENT, + priv->rx_buffer_size + RXBUF_ALIGNMENT, DMA_FROM_DEVICE); bdp->length = 0; @@ -1252,7 +1138,7 @@ } /* gfar_clean_rx_ring() -- Processes each frame in the rx ring - * until the budget/quota has been reached. Returns the number + * until the budget/quota has been reached. 
Returns the number * of frames handled */ static int gfar_clean_rx_ring(struct net_device *dev, int rx_work_limit) @@ -1452,164 +1338,44 @@ return IRQ_HANDLED; } -static irqreturn_t phy_interrupt(int irq, void *dev_id, struct pt_regs *regs) -{ - struct net_device *dev = (struct net_device *) dev_id; - struct gfar_private *priv = netdev_priv(dev); - - /* Clear the interrupt */ - mii_clear_phy_interrupt(priv->mii_info); - - /* Disable PHY interrupts */ - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_DISABLED); - - /* Schedule the phy change */ - schedule_work(&priv->tq); - - return IRQ_HANDLED; -} - -/* Scheduled by the phy_interrupt/timer to handle PHY changes */ -static void gfar_phy_change(void *data) -{ - struct net_device *dev = (struct net_device *) data; - struct gfar_private *priv = netdev_priv(dev); - int result = 0; - - /* Delay to give the PHY a chance to change the - * register state */ - msleep(1); - - /* Update the link, speed, duplex */ - result = priv->mii_info->phyinfo->read_status(priv->mii_info); - - /* Adjust the known status as long as the link - * isn't still coming up */ - if((0 == result) || (priv->mii_info->link == 0)) - adjust_link(dev); - - /* Reenable interrupts, if needed */ - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_ENABLED); -} - -/* Called every so often on systems that don't interrupt - * the core for PHY changes */ -static void gfar_phy_timer(unsigned long data) -{ - struct net_device *dev = (struct net_device *) data; - struct gfar_private *priv = netdev_priv(dev); - - schedule_work(&priv->tq); - - mod_timer(&priv->phy_info_timer, jiffies + - GFAR_PHY_CHANGE_TIME * HZ); -} - -/* Keep trying aneg for some time - * If, after GFAR_AN_TIMEOUT seconds, it has not - * finished, we switch to forced. 
- * Either way, once the process has completed, we either - * request the interrupt, or switch the timer over to - * using gfar_phy_timer to check status */ -static void gfar_phy_startup_timer(unsigned long data) -{ - int result; - static int secondary = GFAR_AN_TIMEOUT; - struct gfar_mii_info *mii_info = (struct gfar_mii_info *)data; - struct gfar_private *priv = netdev_priv(mii_info->dev); - - /* Configure the Auto-negotiation */ - result = mii_info->phyinfo->config_aneg(mii_info); - - /* If autonegotiation failed to start, and - * we haven't timed out, reset the timer, and return */ - if (result && secondary--) { - mod_timer(&priv->phy_info_timer, jiffies + HZ); - return; - } else if (result) { - /* Couldn't start autonegotiation. - * Try switching to forced */ - mii_info->autoneg = 0; - result = mii_info->phyinfo->config_aneg(mii_info); - - /* Forcing failed! Give up */ - if(result) { - printk(KERN_ERR "%s: Forcing failed!\n", - mii_info->dev->name); - return; - } - } - - /* Kill the timer so it can be restarted */ - del_timer_sync(&priv->phy_info_timer); - - /* Grab the PHY interrupt, if necessary/possible */ - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) { - if (request_irq(priv->einfo->interruptPHY, - phy_interrupt, - SA_SHIRQ, - "phy_interrupt", - mii_info->dev) < 0) { - printk(KERN_ERR "%s: Can't get IRQ %d (PHY)\n", - mii_info->dev->name, - priv->einfo->interruptPHY); - } else { - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_ENABLED); - return; - } - } - - /* Start the timer again, this time in order to - * handle a change in status */ - init_timer(&priv->phy_info_timer); - priv->phy_info_timer.function = &gfar_phy_timer; - priv->phy_info_timer.data = (unsigned long) mii_info->dev; - mod_timer(&priv->phy_info_timer, jiffies + - GFAR_PHY_CHANGE_TIME * HZ); -} - /* Called every time the controller might need to be made * aware of new link state. 
The PHY code conveys this - * information through variables in the priv structure, and this + * information through variables in the phydev structure, and this * function converts those variables into the appropriate * register values, and can bring down the device if needed. */ -static void adjust_link(struct net_device *dev) +static void adjust_link(struct device *d) { + struct net_device *dev = dev_get_drvdata(d); struct gfar_private *priv = netdev_priv(dev); struct gfar *regs = priv->regs; u32 tempval; - struct gfar_mii_info *mii_info = priv->mii_info; + unsigned long flags; + struct phy_device *phydev = priv->phydev; + int new_state = 0; - if (mii_info->link) { + spin_lock_irqsave(&priv->lock, flags); + if (phydev->link) { /* Now we make sure that we can be in full duplex mode. * If not, we operate in half-duplex mode. */ - if (mii_info->duplex != priv->oldduplex) { - if (!(mii_info->duplex)) { + if (phydev->duplex != priv->oldduplex) { + new_state = 1; + if (!(phydev->duplex)) { tempval = gfar_read(&regs->maccfg2); tempval &= ~(MACCFG2_FULL_DUPLEX); gfar_write(&regs->maccfg2, tempval); - - printk(KERN_INFO "%s: Half Duplex\n", - dev->name); } else { tempval = gfar_read(&regs->maccfg2); tempval |= MACCFG2_FULL_DUPLEX; gfar_write(&regs->maccfg2, tempval); - - printk(KERN_INFO "%s: Full Duplex\n", - dev->name); } - priv->oldduplex = mii_info->duplex; + priv->oldduplex = phydev->duplex; } - if (mii_info->speed != priv->oldspeed) { - switch (mii_info->speed) { + if (phydev->speed != priv->oldspeed) { + new_state = 1; + switch (phydev->speed) { case 1000: tempval = gfar_read(&regs->maccfg2); tempval = @@ -1626,31 +1392,41 @@ default: printk(KERN_WARNING "%s: Ack! 
Speed (%d) is not 10/100/1000!\n", - dev->name, mii_info->speed); + dev->name, phydev->speed); break; } - printk(KERN_INFO "%s: Speed %dBT\n", dev->name, - mii_info->speed); - - priv->oldspeed = mii_info->speed; + priv->oldspeed = phydev->speed; } if (!priv->oldlink) { - printk(KERN_INFO "%s: Link is up\n", dev->name); + new_state = 1; priv->oldlink = 1; netif_carrier_on(dev); netif_schedule(dev); } } else { if (priv->oldlink) { - printk(KERN_INFO "%s: Link is down\n", dev->name); + new_state = 1; priv->oldlink = 0; priv->oldspeed = 0; priv->oldduplex = -1; netif_carrier_off(dev); } } + + if (new_state) { + pr_info("%s: Link is %s", dev->name, + phydev->link ? "Up" : "Down"); + if (phydev->link) + printk(" - %d/%s", phydev->speed, + DUPLEX_FULL == phydev->duplex ? + "Full" : "Half"); + + printk("\n"); + } + + spin_unlock_irqrestore(&priv->lock, flags); } @@ -1829,35 +1605,32 @@ } /* Structure for a device driver */ -static struct ocp_device_id gfar_ids[] = { - {.vendor = OCP_ANY_ID,.function = OCP_FUNC_GFAR}, - {.vendor = OCP_VENDOR_INVALID} -}; - -static struct ocp_driver gfar_driver = { - .name = "gianfar", - .id_table = gfar_ids, - +static struct device_driver gfar_driver = { + .name = "fsl-gianfar", + .bus = &platform_bus_type, .probe = gfar_probe, .remove = gfar_remove, }; static int __init gfar_init(void) { - int rc; + int err = gfar_mdio_init(); - rc = ocp_register_driver(&gfar_driver); - if (rc != 0) { - ocp_unregister_driver(&gfar_driver); - return -ENODEV; - } + if (err) + return err; - return 0; + err = driver_register(&gfar_driver); + + if (err) + gfar_mdio_exit(); + + return err; } static void __exit gfar_exit(void) { - ocp_unregister_driver(&gfar_driver); + driver_unregister(&gfar_driver); + gfar_mdio_exit(); } module_init(gfar_init); diff -Nru a/drivers/net/gianfar.h b/drivers/net/gianfar.h --- a/drivers/net/gianfar.h 2004-12-23 12:39:16 -06:00 +++ b/drivers/net/gianfar.h 2004-12-23 12:39:16 -06:00 @@ -17,7 +17,6 @@ * * Still left to do: * -Add 
support for module parameters - * -Add support for ethtool -s * -Add patch for ethtool phys id */ #ifndef __GIANFAR_H @@ -37,6 +36,8 @@ #include #include #include +#include +#include #include #include @@ -47,8 +48,8 @@ #include #include #include -#include -#include "gianfar_phy.h" +#include +#include "gianfar_mii.h" /* The maximum number of packets to be handled in one call of gfar_poll */ #define GFAR_DEV_WEIGHT 64 @@ -67,7 +68,7 @@ #define PHY_INIT_TIMEOUT 100000 #define GFAR_PHY_CHANGE_TIME 2 -#define DEVICE_NAME "%s: Gianfar Ethernet Controller Version 1.1, " +#define DEVICE_NAME "%s: Gianfar Ethernet Controller Version 1.2, " #define DRV_NAME "gfar-enet" extern const char gfar_driver_name[]; extern const char gfar_driver_version[]; @@ -422,12 +423,7 @@ u32 hafdup; /* 0x.50c - Half Duplex Register */ u32 maxfrm; /* 0x.510 - Maximum Frame Length Register */ u8 res18[12]; - u32 miimcfg; /* 0x.520 - MII Management Configuration Register */ - u32 miimcom; /* 0x.524 - MII Management Command Register */ - u32 miimadd; /* 0x.528 - MII Management Address Register */ - u32 miimcon; /* 0x.52c - MII Management Control Register */ - u32 miimstat; /* 0x.530 - MII Management Status Register */ - u32 miimind; /* 0x.534 - MII Management Indicator Register */ + u8 gfar_mii_regs[24]; /* See gianfar_phy.h */ u8 res19[4]; u32 ifstat; /* 0x.53c - Interface Status Register */ u32 macstnaddr1; /* 0x.540 - Station Address Part 1 Register */ @@ -496,9 +492,6 @@ struct txbd8 *cur_tx; /* Next free ring entry */ struct txbd8 *dirty_tx; /* The Ring entry to be freed. 
*/ struct gfar *regs; /* Pointer to the GFAR memory mapped Registers */ - struct gfar *phyregs; - struct work_struct tq; - struct timer_list phy_info_timer; struct net_device_stats stats; /* linux network statistics */ struct gfar_extra_stats extra_stats; spinlock_t lock; @@ -510,9 +503,13 @@ unsigned int rxclean; /* Info structure initialized by board setup code */ - struct ocp_gfar_data *einfo; + unsigned int interruptTransmit; + unsigned int interruptReceive; + unsigned int interruptError; + struct gianfar_platform_data *einfo; - struct gfar_mii_info *mii_info; + struct phy_device *phydev; + struct mii_bus *mii_bus; int oldspeed; int oldduplex; int oldlink; @@ -531,5 +528,9 @@ } extern struct ethtool_ops *gfar_op_array[]; + +extern irqreturn_t gfar_receive(int irq, void *dev_id, struct pt_regs *regs); +extern int startup_gfar(struct net_device *dev); +extern void stop_gfar(struct net_device *dev); #endif /* __GIANFAR_H */ diff -Nru a/drivers/net/gianfar_ethtool.c b/drivers/net/gianfar_ethtool.c --- a/drivers/net/gianfar_ethtool.c 2004-12-23 12:39:16 -06:00 +++ b/drivers/net/gianfar_ethtool.c 2004-12-23 12:39:16 -06:00 @@ -39,15 +39,13 @@ #include #include #include +#include +#include #include "gianfar.h" #define is_power_of_2(x) ((x) != 0 && (((x) & ((x) - 1)) == 0)) -extern int startup_gfar(struct net_device *dev); -extern void stop_gfar(struct net_device *dev); -extern void gfar_receive(int irq, void *dev_id, struct pt_regs *regs); - void gfar_fill_stats(struct net_device *dev, struct ethtool_stats *dummy, u64 * buf); void gfar_gstrings(struct net_device *dev, u32 stringset, u8 * buf); @@ -181,32 +179,72 @@ drvinfo->eedump_len = 0; } + +static int gfar_ssettings(struct net_device *dev, struct ethtool_cmd *cmd) +{ + struct gfar_private *priv = netdev_priv(dev); + struct phy_device *phydev = priv->phydev; + + if (NULL == phydev) + return -ENODEV; + + /* We make sure that we don't pass unsupported + * values in to the PHY */ + cmd->advertising &= 
phydev->supported; + + /* Verify the settings we care about. */ + if (cmd->autoneg != AUTONEG_ENABLE && cmd->autoneg != AUTONEG_DISABLE) + return -EINVAL; + + if (cmd->autoneg == AUTONEG_ENABLE && cmd->advertising == 0) + return -EINVAL; + + if (cmd->autoneg == AUTONEG_DISABLE + && ((cmd->speed != SPEED_1000 + && cmd->speed != SPEED_100 + && cmd->speed != SPEED_10) + || (cmd->duplex != DUPLEX_HALF + && cmd->duplex != DUPLEX_FULL))) + return -EINVAL; + + phydev->autoneg = cmd->autoneg; + + phydev->speed = cmd->speed; + + phydev->advertising = cmd->advertising; + + if (AUTONEG_ENABLE == cmd->autoneg) + phydev->advertising |= ADVERTISED_Autoneg; + else + phydev->advertising &= ~ADVERTISED_Autoneg; + + phydev->duplex = cmd->duplex; + + /* Restart the PHY */ + phy_start_aneg(phydev); + + return 0; +} + /* Return the current settings in the ethtool_cmd structure */ int gfar_gsettings(struct net_device *dev, struct ethtool_cmd *cmd) { struct gfar_private *priv = netdev_priv(dev); - uint gigabit_support = - priv->einfo->flags & GFAR_HAS_GIGABIT ? SUPPORTED_1000baseT_Full : 0; - uint gigabit_advert = - priv->einfo->flags & GFAR_HAS_GIGABIT ? 
ADVERTISED_1000baseT_Full: 0; - - cmd->supported = (SUPPORTED_10baseT_Half - | SUPPORTED_100baseT_Half - | SUPPORTED_100baseT_Full - | gigabit_support | SUPPORTED_Autoneg); - - /* For now, we always advertise everything */ - cmd->advertising = (ADVERTISED_10baseT_Half - | ADVERTISED_100baseT_Half - | ADVERTISED_100baseT_Full - | gigabit_advert | ADVERTISED_Autoneg); + struct phy_device *phydev = priv->phydev; + + if (NULL == phydev) + return -ENODEV; + + cmd->supported = phydev->supported; - cmd->speed = priv->mii_info->speed; - cmd->duplex = priv->mii_info->duplex; + cmd->advertising = phydev->advertising; + + cmd->speed = phydev->speed; + cmd->duplex = phydev->duplex; cmd->port = PORT_MII; - cmd->phy_address = priv->mii_info->mii_id; + cmd->phy_address = phydev->addr; cmd->transceiver = XCVR_EXTERNAL; - cmd->autoneg = AUTONEG_ENABLE; + cmd->autoneg = phydev->autoneg; cmd->maxtxpkt = priv->txcount; cmd->maxrxpkt = priv->rxcount; @@ -245,14 +283,14 @@ unsigned int count; /* The timer is different, depending on the interface speed */ - switch (priv->mii_info->speed) { - case 1000: + switch (priv->phydev->speed) { + case SPEED_1000: count = GFAR_GBIT_TIME; break; - case 100: + case SPEED_100: count = GFAR_100_TIME; break; - case 10: + case SPEED_10: default: count = GFAR_10_TIME; break; @@ -269,14 +307,14 @@ unsigned int count; /* The timer is different, depending on the interface speed */ - switch (priv->mii_info->speed) { - case 1000: + switch (priv->phydev->speed) { + case SPEED_1000: count = GFAR_GBIT_TIME; break; - case 100: + case SPEED_100: count = GFAR_100_TIME; break; - case 10: + case SPEED_10: default: count = GFAR_10_TIME; break; @@ -293,6 +331,9 @@ { struct gfar_private *priv = netdev_priv(dev); + if (NULL == priv->phydev) + return -ENODEV; + cvals->rx_coalesce_usecs = gfar_ticks2usecs(priv, priv->rxtime); cvals->rx_max_coalesced_frames = priv->rxcount; @@ -346,6 +387,9 @@ else priv->rxcoalescing = 1; + if (NULL == priv->phydev) + return -ENODEV; + 
priv->rxtime = gfar_usecs2ticks(priv, cvals->rx_coalesce_usecs); priv->rxcount = cvals->rx_max_coalesced_frames; @@ -462,6 +506,7 @@ } struct ethtool_ops gfar_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, @@ -477,6 +522,7 @@ }; struct ethtool_ops gfar_normon_nocoalesce_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, @@ -490,6 +536,7 @@ }; struct ethtool_ops gfar_nocoalesce_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, @@ -503,6 +550,7 @@ }; struct ethtool_ops gfar_normon_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, diff -Nru a/drivers/net/gianfar_mii.c b/drivers/net/gianfar_mii.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/gianfar_mii.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,212 @@ +/* + * drivers/net/gianfar_mii.c + * + * Gianfar Ethernet Driver -- MIIM bus implementation + * Provides Bus interface for MIIM regs + * + * Author: Andy Fleming + * Maintainer: Kumar Gala (kumar.gala@freescale.com) + * + * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "gianfar.h" +#include "gianfar_mii.h" + +extern void * get_platform_data(enum fsl_devices dev); + +/* Write value to the PHY at mii_id at register regnum, + * on the bus, waiting until the write is done before returning. + * All PHY configuration is done through the TSEC1 MIIM regs */ +void gfar_mdio_write(struct mii_bus *bus, int mii_id, int regnum, u16 value) +{ + struct gfar_mii *regs = bus->priv; + + /* Set the PHY address and the register address we want to write */ + gfar_write(&regs->miimadd, (mii_id << 8) | regnum); + + /* Write out the value we want */ + gfar_write(&regs->miimcon, value); + + /* Wait for the transaction to finish */ + while (gfar_read(&regs->miimind) & MIIMIND_BUSY) + cpu_relax(); +} + +/* Read the bus for PHY at addr mii_id, register regnum, and + * return the value. Clears miimcom first. 
All PHY + * configuration has to be done through the TSEC1 MIIM regs */ +u16 gfar_mdio_read(struct mii_bus *bus, int mii_id, int regnum) +{ + struct gfar_mii *regs = bus->priv; + u16 value; + + /* Set the PHY address and the register address we want to read */ + gfar_write(&regs->miimadd, (mii_id << 8) | regnum); + + /* Clear miimcom, and then initiate a read */ + gfar_write(&regs->miimcom, 0); + gfar_write(&regs->miimcom, MII_READ_COMMAND); + + /* Wait for the transaction to finish */ + while (gfar_read(&regs->miimind) & (MIIMIND_NOTVALID | MIIMIND_BUSY)) + cpu_relax(); + + /* Grab the value of the register from miimstat */ + value = gfar_read(&regs->miimstat); + + return value; +} + + +/* Reset the MIIM registers, and wait for the bus to free */ +int gfar_mdio_reset(struct mii_bus *bus) +{ + struct gfar_mii *regs = bus->priv; + unsigned int timeout = PHY_INIT_TIMEOUT; + + spin_lock_bh(&bus->mdio_lock); + + /* Reset the management interface */ + gfar_write(&regs->miimcfg, MIIMCFG_RESET); + + /* Setup the MII Mgmt clock speed */ + gfar_write(&regs->miimcfg, MIIMCFG_INIT_VALUE); + + /* Wait until the bus is free */ + while ((gfar_read(&regs->miimind) & MIIMIND_BUSY) && + timeout--) + cpu_relax(); + + spin_unlock_bh(&bus->mdio_lock); + + if(timeout <= 0) { + printk(KERN_ERR "%s: The MII Bus is stuck!\n", + bus->name); + return -EBUSY; + } + + return 0; +} + + +int gfar_mdio_probe(struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + struct gianfar_mdio_data *pdata; + struct gfar_mii *regs; + struct mii_bus *new_bus; + int err = 0; + + if (NULL == dev) + return -EINVAL; + + new_bus = kmalloc(sizeof(struct mii_bus), GFP_KERNEL); + + if (NULL == new_bus) + return -ENOMEM; + + new_bus->name = "Gianfar MII Bus", + new_bus->read = &gfar_mdio_read, + new_bus->write = &gfar_mdio_write, + new_bus->reset = &gfar_mdio_reset, + new_bus->id = pdev->id; + + pdata = get_platform_data(MPC85xx_MDIO); + + /* Set the PHY base address */ + regs = (struct gfar_mii *) 
ioremap(pdata->paddr, + sizeof (struct gfar_mii)); + + if (NULL == regs) { + err = -ENOMEM; + goto reg_map_fail; + } + + new_bus->priv = regs; + + new_bus->irq = pdata->irq; + + new_bus->dev = dev; + dev_set_drvdata(dev, new_bus); + + err = register_mdiobus(new_bus); + + if (0 != err) { + printk (KERN_ERR "%s: Cannot register as MDIO bus\n", + new_bus->name); + goto bus_register_fail; + } + + return 0; + +bus_register_fail: + iounmap((void *) regs); +reg_map_fail: + kfree(new_bus); + + return err; +} + + +int gfar_mdio_remove(struct device *dev) +{ + struct mii_bus *bus = dev_get_drvdata(dev); + + dev_set_drvdata(dev, NULL); + + iounmap((void *) bus->priv); + bus->priv = NULL; + kfree(bus); + + return 0; +} + +static struct device_driver gianfar_mdio_driver = { + .name = "fsl-gianfar_mdio", + .bus = &platform_bus_type, + .probe = gfar_mdio_probe, + .remove = gfar_mdio_remove, +}; + +int __init gfar_mdio_init(void) +{ + return driver_register(&gianfar_mdio_driver); +} + +void __exit gfar_mdio_exit(void) +{ + driver_unregister(&gianfar_mdio_driver); +} diff -Nru a/drivers/net/gianfar_mii.h b/drivers/net/gianfar_mii.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/gianfar_mii.h 2004-12-23 12:39:15 -06:00 @@ -0,0 +1,45 @@ +/* + * drivers/net/gianfar_mii.h + * + * Gianfar Ethernet Driver -- MII Management Bus Implementation + * Driver for the MDIO bus controller in the Gianfar register space + * + * Author: Andy Fleming + * Maintainer: Kumar Gala (kumar.gala@freescale.com) + * + * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ +#ifndef __GIANFAR_MII_H +#define __GIANFAR_MII_H + +#define MIIMIND_BUSY 0x00000001 +#define MIIMIND_NOTVALID 0x00000004 + +#define MII_READ_COMMAND 0x00000001 + +#define GFAR_SUPPORTED (SUPPORTED_10baseT_Half \ + | SUPPORTED_100baseT_Half \ + | SUPPORTED_100baseT_Full \ + | SUPPORTED_Autoneg \ + | SUPPORTED_MII) + +struct gfar_mii { + u32 miimcfg; /* 0x.520 - MII Management Config Register */ + u32 miimcom; /* 0x.524 - MII Management Command Register */ + u32 miimadd; /* 0x.528 - MII Management Address Register */ + u32 miimcon; /* 0x.52c - MII Management Control Register */ + u32 miimstat; /* 0x.530 - MII Management Status Register */ + u32 miimind; /* 0x.534 - MII Management Indicator Register */ +}; + +u16 gfar_mdio_read(struct mii_bus *bus, int mii_id, int regnum); +void gfar_mdio_write(struct mii_bus *bus, int mii_id, int regnum, u16 value); +int __init gfar_mdio_init(void); +void __exit gfar_mdio_exit(void); +#endif /* __GIANFAR_MII_H */ diff -Nru a/drivers/net/gianfar_phy.c b/drivers/net/gianfar_phy.c --- a/drivers/net/gianfar_phy.c 2004-12-23 12:39:16 -06:00 +++ /dev/null Wed Dec 31 16:00:00 196900 @@ -1,661 +0,0 @@ -/* - * drivers/net/gianfar_phy.c - * - * Gianfar Ethernet Driver -- PHY handling - * Driver for FEC on MPC8540 and TSEC on MPC8540/MPC8560 - * Based on 8260_io/fcc_enet.c - * - * Author: Andy Fleming - * Maintainer: Kumar Gala (kumar.gala@freescale.com) - * - * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 2 of the License, or (at your - * option) any later version. 
- * - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include - -#include "gianfar.h" -#include "gianfar_phy.h" - -static void config_genmii_advert(struct gfar_mii_info *mii_info); -static void genmii_setup_forced(struct gfar_mii_info *mii_info); -static void genmii_restart_aneg(struct gfar_mii_info *mii_info); -static int gbit_config_aneg(struct gfar_mii_info *mii_info); -static int genmii_config_aneg(struct gfar_mii_info *mii_info); -static int genmii_update_link(struct gfar_mii_info *mii_info); -static int genmii_read_status(struct gfar_mii_info *mii_info); -u16 phy_read(struct gfar_mii_info *mii_info, u16 regnum); -void phy_write(struct gfar_mii_info *mii_info, u16 regnum, u16 val); - -/* Write value to the PHY for this device to the register at regnum, */ -/* waiting until the write is done before it returns. All PHY */ -/* configuration has to be done through the TSEC1 MIIM regs */ -void write_phy_reg(struct net_device *dev, int mii_id, int regnum, int value) -{ - struct gfar_private *priv = netdev_priv(dev); - struct gfar *regbase = priv->phyregs; - - /* Set the PHY address and the register address we want to write */ - gfar_write(&regbase->miimadd, (mii_id << 8) | regnum); - - /* Write out the value we want */ - gfar_write(&regbase->miimcon, value); - - /* Wait for the transaction to finish */ - while (gfar_read(&regbase->miimind) & MIIMIND_BUSY) - cpu_relax(); -} - -/* Reads from register regnum in the PHY for device dev, */ -/* returning the value. Clears miimcom first. 
All PHY */ -/* configuration has to be done through the TSEC1 MIIM regs */ -int read_phy_reg(struct net_device *dev, int mii_id, int regnum) -{ - struct gfar_private *priv = netdev_priv(dev); - struct gfar *regbase = priv->phyregs; - u16 value; - - /* Set the PHY address and the register address we want to read */ - gfar_write(&regbase->miimadd, (mii_id << 8) | regnum); - - /* Clear miimcom, and then initiate a read */ - gfar_write(&regbase->miimcom, 0); - gfar_write(&regbase->miimcom, MII_READ_COMMAND); - - /* Wait for the transaction to finish */ - while (gfar_read(&regbase->miimind) & (MIIMIND_NOTVALID | MIIMIND_BUSY)) - cpu_relax(); - - /* Grab the value of the register from miimstat */ - value = gfar_read(&regbase->miimstat); - - return value; -} - -void mii_clear_phy_interrupt(struct gfar_mii_info *mii_info) -{ - if(mii_info->phyinfo->ack_interrupt) - mii_info->phyinfo->ack_interrupt(mii_info); -} - - -void mii_configure_phy_interrupt(struct gfar_mii_info *mii_info, u32 interrupts) -{ - mii_info->interrupts = interrupts; - if(mii_info->phyinfo->config_intr) - mii_info->phyinfo->config_intr(mii_info); -} - - -/* Writes MII_ADVERTISE with the appropriate values, after - * sanitizing advertise to make sure only supported features - * are advertised - */ -static void config_genmii_advert(struct gfar_mii_info *mii_info) -{ - u32 advertise; - u16 adv; - - /* Only allow advertising what this PHY supports */ - mii_info->advertising &= mii_info->phyinfo->features; - advertise = mii_info->advertising; - - /* Setup standard advertisement */ - adv = phy_read(mii_info, MII_ADVERTISE); - adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4); - if (advertise & ADVERTISED_10baseT_Half) - adv |= ADVERTISE_10HALF; - if (advertise & ADVERTISED_10baseT_Full) - adv |= ADVERTISE_10FULL; - if (advertise & ADVERTISED_100baseT_Half) - adv |= ADVERTISE_100HALF; - if (advertise & ADVERTISED_100baseT_Full) - adv |= ADVERTISE_100FULL; - phy_write(mii_info, MII_ADVERTISE, adv); -} - -static void 
genmii_setup_forced(struct gfar_mii_info *mii_info) -{ - u16 ctrl; - u32 features = mii_info->phyinfo->features; - - ctrl = phy_read(mii_info, MII_BMCR); - - ctrl &= ~(BMCR_FULLDPLX|BMCR_SPEED100|BMCR_SPEED1000|BMCR_ANENABLE); - ctrl |= BMCR_RESET; - - switch(mii_info->speed) { - case SPEED_1000: - if(features & (SUPPORTED_1000baseT_Half - | SUPPORTED_1000baseT_Full)) { - ctrl |= BMCR_SPEED1000; - break; - } - mii_info->speed = SPEED_100; - case SPEED_100: - if (features & (SUPPORTED_100baseT_Half - | SUPPORTED_100baseT_Full)) { - ctrl |= BMCR_SPEED100; - break; - } - mii_info->speed = SPEED_10; - case SPEED_10: - if (features & (SUPPORTED_10baseT_Half - | SUPPORTED_10baseT_Full)) - break; - default: /* Unsupported speed! */ - printk(KERN_ERR "%s: Bad speed!\n", - mii_info->dev->name); - break; - } - - phy_write(mii_info, MII_BMCR, ctrl); -} - - -/* Enable and Restart Autonegotiation */ -static void genmii_restart_aneg(struct gfar_mii_info *mii_info) -{ - u16 ctl; - - ctl = phy_read(mii_info, MII_BMCR); - ctl |= (BMCR_ANENABLE | BMCR_ANRESTART); - phy_write(mii_info, MII_BMCR, ctl); -} - - -static int gbit_config_aneg(struct gfar_mii_info *mii_info) -{ - u16 adv; - u32 advertise; - - if(mii_info->autoneg) { - /* Configure the ADVERTISE register */ - config_genmii_advert(mii_info); - advertise = mii_info->advertising; - - adv = phy_read(mii_info, MII_1000BASETCONTROL); - adv &= ~(MII_1000BASETCONTROL_FULLDUPLEXCAP | - MII_1000BASETCONTROL_HALFDUPLEXCAP); - if (advertise & SUPPORTED_1000baseT_Half) - adv |= MII_1000BASETCONTROL_HALFDUPLEXCAP; - if (advertise & SUPPORTED_1000baseT_Full) - adv |= MII_1000BASETCONTROL_FULLDUPLEXCAP; - phy_write(mii_info, MII_1000BASETCONTROL, adv); - - /* Start/Restart aneg */ - genmii_restart_aneg(mii_info); - } else - genmii_setup_forced(mii_info); - - return 0; -} - -static int marvell_config_aneg(struct gfar_mii_info *mii_info) -{ - /* The Marvell PHY has an errata which requires - * that certain registers get written in order - * 
to restart autonegotiation */ - phy_write(mii_info, MII_BMCR, BMCR_RESET); - - phy_write(mii_info, 0x1d, 0x1f); - phy_write(mii_info, 0x1e, 0x200c); - phy_write(mii_info, 0x1d, 0x5); - phy_write(mii_info, 0x1e, 0); - phy_write(mii_info, 0x1e, 0x100); - - gbit_config_aneg(mii_info); - - return 0; -} -static int genmii_config_aneg(struct gfar_mii_info *mii_info) -{ - if (mii_info->autoneg) { - config_genmii_advert(mii_info); - genmii_restart_aneg(mii_info); - } else - genmii_setup_forced(mii_info); - - return 0; -} - - -static int genmii_update_link(struct gfar_mii_info *mii_info) -{ - u16 status; - - /* Do a fake read */ - phy_read(mii_info, MII_BMSR); - - /* Read link and autonegotiation status */ - status = phy_read(mii_info, MII_BMSR); - if ((status & BMSR_LSTATUS) == 0) - mii_info->link = 0; - else - mii_info->link = 1; - - /* If we are autonegotiating, and not done, - * return an error */ - if (mii_info->autoneg && !(status & BMSR_ANEGCOMPLETE)) - return -EAGAIN; - - return 0; -} - -static int genmii_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - if (mii_info->autoneg) { - status = phy_read(mii_info, MII_LPA); - - if (status & (LPA_10FULL | LPA_100FULL)) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - if (status & (LPA_100FULL | LPA_100HALF)) - mii_info->speed = SPEED_100; - else - mii_info->speed = SPEED_10; - mii_info->pause = 0; - } - /* On non-aneg, we assume what we put in BMCR is the speed, - * though magic-aneg shouldn't prevent this case from occurring - */ - - return 0; -} -static int marvell_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - /* If the link is up, read the speed and duplex */ - /* If we aren't 
autonegotiating, assume speeds - * are as set */ - if (mii_info->autoneg && mii_info->link) { - int speed; - status = phy_read(mii_info, MII_M1011_PHY_SPEC_STATUS); - -#if 0 - /* If speed and duplex aren't resolved, - * return an error. Isn't this handled - * by checking aneg? - */ - if ((status & MII_M1011_PHY_SPEC_STATUS_RESOLVED) == 0) - return -EAGAIN; -#endif - - /* Get the duplexity */ - if (status & MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - - /* Get the speed */ - speed = status & MII_M1011_PHY_SPEC_STATUS_SPD_MASK; - switch(speed) { - case MII_M1011_PHY_SPEC_STATUS_1000: - mii_info->speed = SPEED_1000; - break; - case MII_M1011_PHY_SPEC_STATUS_100: - mii_info->speed = SPEED_100; - break; - default: - mii_info->speed = SPEED_10; - break; - } - mii_info->pause = 0; - } - - return 0; -} - - -static int cis820x_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - /* If the link is up, read the speed and duplex */ - /* If we aren't autonegotiating, assume speeds - * are as set */ - if (mii_info->autoneg && mii_info->link) { - int speed; - - status = phy_read(mii_info, MII_CIS8201_AUX_CONSTAT); - if (status & MII_CIS8201_AUXCONSTAT_DUPLEX) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - - speed = status & MII_CIS8201_AUXCONSTAT_SPEED; - - switch (speed) { - case MII_CIS8201_AUXCONSTAT_GBIT: - mii_info->speed = SPEED_1000; - break; - case MII_CIS8201_AUXCONSTAT_100: - mii_info->speed = SPEED_100; - break; - default: - mii_info->speed = SPEED_10; - break; - } - } - - return 0; -} - -static int marvell_ack_interrupt(struct gfar_mii_info *mii_info) -{ - /* Clear the interrupts by reading the reg */ - phy_read(mii_info, MII_M1011_IEVENT); - - return 0; -} - -static int marvell_config_intr(struct gfar_mii_info *mii_info) 
-{ - if(mii_info->interrupts == MII_INTERRUPT_ENABLED) - phy_write(mii_info, MII_M1011_IMASK, MII_M1011_IMASK_INIT); - else - phy_write(mii_info, MII_M1011_IMASK, MII_M1011_IMASK_CLEAR); - - return 0; -} - -static int cis820x_init(struct gfar_mii_info *mii_info) -{ - phy_write(mii_info, MII_CIS8201_AUX_CONSTAT, - MII_CIS8201_AUXCONSTAT_INIT); - phy_write(mii_info, MII_CIS8201_EXT_CON1, - MII_CIS8201_EXTCON1_INIT); - - return 0; -} - -static int cis820x_ack_interrupt(struct gfar_mii_info *mii_info) -{ - phy_read(mii_info, MII_CIS8201_ISTAT); - - return 0; -} - -static int cis820x_config_intr(struct gfar_mii_info *mii_info) -{ - if(mii_info->interrupts == MII_INTERRUPT_ENABLED) - phy_write(mii_info, MII_CIS8201_IMASK, MII_CIS8201_IMASK_MASK); - else - phy_write(mii_info, MII_CIS8201_IMASK, 0); - - return 0; -} - -#define DM9161_DELAY 10 - -static int dm9161_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - /* If the link is up, read the speed and duplex */ - /* If we aren't autonegotiating, assume speeds - * are as set */ - if (mii_info->autoneg && mii_info->link) { - status = phy_read(mii_info, MII_DM9161_SCSR); - if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_100H)) - mii_info->speed = SPEED_100; - else - mii_info->speed = SPEED_10; - - if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_10F)) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - } - - return 0; -} - - -static int dm9161_config_aneg(struct gfar_mii_info *mii_info) -{ - struct dm9161_private *priv = mii_info->priv; - - if(0 == priv->resetdone) - return -EAGAIN; - - return 0; -} - -static void dm9161_timer(unsigned long data) -{ - struct gfar_mii_info *mii_info = (struct gfar_mii_info *)data; - struct dm9161_private *priv = mii_info->priv; - u16 status = phy_read(mii_info, MII_BMSR); - - if (status & BMSR_ANEGCOMPLETE) { 
- priv->resetdone = 1; - } else - mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); -} - -static int dm9161_init(struct gfar_mii_info *mii_info) -{ - struct dm9161_private *priv; - - /* Allocate the private data structure */ - priv = kmalloc(sizeof(struct dm9161_private), GFP_KERNEL); - - if (NULL == priv) - return -ENOMEM; - - mii_info->priv = priv; - - /* Reset is not done yet */ - priv->resetdone = 0; - - /* Isolate the PHY */ - phy_write(mii_info, MII_BMCR, BMCR_ISOLATE); - - /* Do not bypass the scrambler/descrambler */ - phy_write(mii_info, MII_DM9161_SCR, MII_DM9161_SCR_INIT); - - /* Clear 10BTCSR to default */ - phy_write(mii_info, MII_DM9161_10BTCSR, MII_DM9161_10BTCSR_INIT); - - /* Reconnect the PHY, and enable Autonegotiation */ - phy_write(mii_info, MII_BMCR, BMCR_ANENABLE); - - /* Start a timer for DM9161_DELAY seconds to wait - * for the PHY to be ready */ - init_timer(&priv->timer); - priv->timer.function = &dm9161_timer; - priv->timer.data = (unsigned long) mii_info; - mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); - - return 0; -} - -static void dm9161_close(struct gfar_mii_info *mii_info) -{ - struct dm9161_private *priv = mii_info->priv; - - del_timer_sync(&priv->timer); - kfree(priv); -} - -#if 0 -static int dm9161_ack_interrupt(struct gfar_mii_info *mii_info) -{ - phy_read(mii_info, MII_DM9161_INTR); - - return 0; -} -#endif - -/* Cicada 820x */ -static struct phy_info phy_info_cis820x = { - 0x000fc440, - "Cicada Cis8204", - 0x000fffc0, - .features = MII_GBIT_FEATURES, - .init = &cis820x_init, - .config_aneg = &gbit_config_aneg, - .read_status = &cis820x_read_status, - .ack_interrupt = &cis820x_ack_interrupt, - .config_intr = &cis820x_config_intr, -}; - -static struct phy_info phy_info_dm9161 = { - .phy_id = 0x0181b880, - .name = "Davicom DM9161E", - .phy_id_mask = 0x0ffffff0, - .init = dm9161_init, - .config_aneg = dm9161_config_aneg, - .read_status = dm9161_read_status, - .close = dm9161_close, -}; - -static struct phy_info 
phy_info_marvell = { - .phy_id = 0x01410c00, - .phy_id_mask = 0xffffff00, - .name = "Marvell 88E1101", - .features = MII_GBIT_FEATURES, - .config_aneg = &marvell_config_aneg, - .read_status = &marvell_read_status, - .ack_interrupt = &marvell_ack_interrupt, - .config_intr = &marvell_config_intr, -}; - -static struct phy_info phy_info_genmii= { - .phy_id = 0x00000000, - .phy_id_mask = 0x00000000, - .name = "Generic MII", - .features = MII_BASIC_FEATURES, - .config_aneg = genmii_config_aneg, - .read_status = genmii_read_status, -}; - -static struct phy_info *phy_info[] = { - &phy_info_cis820x, - &phy_info_marvell, - &phy_info_dm9161, - &phy_info_genmii, - NULL -}; - -u16 phy_read(struct gfar_mii_info *mii_info, u16 regnum) -{ - u16 retval; - unsigned long flags; - - spin_lock_irqsave(&mii_info->mdio_lock, flags); - retval = mii_info->mdio_read(mii_info->dev, mii_info->mii_id, regnum); - spin_unlock_irqrestore(&mii_info->mdio_lock, flags); - - return retval; -} - -void phy_write(struct gfar_mii_info *mii_info, u16 regnum, u16 val) -{ - unsigned long flags; - - spin_lock_irqsave(&mii_info->mdio_lock, flags); - mii_info->mdio_write(mii_info->dev, - mii_info->mii_id, - regnum, val); - spin_unlock_irqrestore(&mii_info->mdio_lock, flags); -} - -/* Use the PHY ID registers to determine what type of PHY is attached - * to device dev. return a struct phy_info structure describing that PHY - */ -struct phy_info * get_phy_info(struct gfar_mii_info *mii_info) -{ - u16 phy_reg; - u32 phy_ID; - int i; - struct phy_info *theInfo = NULL; - struct net_device *dev = mii_info->dev; - - /* Grab the bits from PHYIR1, and put them in the upper half */ - phy_reg = phy_read(mii_info, MII_PHYSID1); - phy_ID = (phy_reg & 0xffff) << 16; - - /* Grab the bits from PHYIR2, and put them in the lower half */ - phy_reg = phy_read(mii_info, MII_PHYSID2); - phy_ID |= (phy_reg & 0xffff); - - /* loop through all the known PHY types, and find one that */ - /* matches the ID we read from the PHY. 
*/ - for (i = 0; phy_info[i]; i++) - if (phy_info[i]->phy_id == - (phy_ID & phy_info[i]->phy_id_mask)) { - theInfo = phy_info[i]; - break; - } - - /* This shouldn't happen, as we have generic PHY support */ - if (theInfo == NULL) { - printk("%s: PHY id %x is not supported!\n", dev->name, phy_ID); - return NULL; - } else { - printk("%s: PHY is %s (%x)\n", dev->name, theInfo->name, - phy_ID); - } - - return theInfo; -} diff -Nru a/drivers/net/gianfar_phy.h b/drivers/net/gianfar_phy.h --- a/drivers/net/gianfar_phy.h 2004-12-23 12:39:15 -06:00 +++ /dev/null Wed Dec 31 16:00:00 196900 @@ -1,213 +0,0 @@ -/* - * drivers/net/gianfar_phy.h - * - * Gianfar Ethernet Driver -- PHY handling - * Driver for FEC on MPC8540 and TSEC on MPC8540/MPC8560 - * Based on 8260_io/fcc_enet.c - * - * Author: Andy Fleming - * Maintainer: Kumar Gala (kumar.gala@freescale.com) - * - * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 2 of the License, or (at your - * option) any later version. 
- * - */ -#ifndef __GIANFAR_PHY_H -#define __GIANFAR_PHY_H - -#define MII_end ((u32)-2) -#define MII_read ((u32)-1) - -#define MIIMIND_BUSY 0x00000001 -#define MIIMIND_NOTVALID 0x00000004 - -#define GFAR_AN_TIMEOUT 2000 - -/* 1000BT control (Marvell & BCM54xx at least) */ -#define MII_1000BASETCONTROL 0x09 -#define MII_1000BASETCONTROL_FULLDUPLEXCAP 0x0200 -#define MII_1000BASETCONTROL_HALFDUPLEXCAP 0x0100 - -/* Cicada Extended Control Register 1 */ -#define MII_CIS8201_EXT_CON1 0x17 -#define MII_CIS8201_EXTCON1_INIT 0x0000 - -/* Cicada Interrupt Mask Register */ -#define MII_CIS8201_IMASK 0x19 -#define MII_CIS8201_IMASK_IEN 0x8000 -#define MII_CIS8201_IMASK_SPEED 0x4000 -#define MII_CIS8201_IMASK_LINK 0x2000 -#define MII_CIS8201_IMASK_DUPLEX 0x1000 -#define MII_CIS8201_IMASK_MASK 0xf000 - -/* Cicada Interrupt Status Register */ -#define MII_CIS8201_ISTAT 0x1a -#define MII_CIS8201_ISTAT_STATUS 0x8000 -#define MII_CIS8201_ISTAT_SPEED 0x4000 -#define MII_CIS8201_ISTAT_LINK 0x2000 -#define MII_CIS8201_ISTAT_DUPLEX 0x1000 - -/* Cicada Auxiliary Control/Status Register */ -#define MII_CIS8201_AUX_CONSTAT 0x1c -#define MII_CIS8201_AUXCONSTAT_INIT 0x0004 -#define MII_CIS8201_AUXCONSTAT_DUPLEX 0x0020 -#define MII_CIS8201_AUXCONSTAT_SPEED 0x0018 -#define MII_CIS8201_AUXCONSTAT_GBIT 0x0010 -#define MII_CIS8201_AUXCONSTAT_100 0x0008 - -/* 88E1011 PHY Status Register */ -#define MII_M1011_PHY_SPEC_STATUS 0x11 -#define MII_M1011_PHY_SPEC_STATUS_1000 0x8000 -#define MII_M1011_PHY_SPEC_STATUS_100 0x4000 -#define MII_M1011_PHY_SPEC_STATUS_SPD_MASK 0xc000 -#define MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX 0x2000 -#define MII_M1011_PHY_SPEC_STATUS_RESOLVED 0x0800 -#define MII_M1011_PHY_SPEC_STATUS_LINK 0x0400 - -#define MII_M1011_IEVENT 0x13 -#define MII_M1011_IEVENT_CLEAR 0x0000 - -#define MII_M1011_IMASK 0x12 -#define MII_M1011_IMASK_INIT 0x6400 -#define MII_M1011_IMASK_CLEAR 0x0000 - -#define MII_DM9161_SCR 0x10 -#define MII_DM9161_SCR_INIT 0x0610 - -/* DM9161 Specified Configuration 
and Status Register */ -#define MII_DM9161_SCSR 0x11 -#define MII_DM9161_SCSR_100F 0x8000 -#define MII_DM9161_SCSR_100H 0x4000 -#define MII_DM9161_SCSR_10F 0x2000 -#define MII_DM9161_SCSR_10H 0x1000 - -/* DM9161 Interrupt Register */ -#define MII_DM9161_INTR 0x15 -#define MII_DM9161_INTR_PEND 0x8000 -#define MII_DM9161_INTR_DPLX_MASK 0x0800 -#define MII_DM9161_INTR_SPD_MASK 0x0400 -#define MII_DM9161_INTR_LINK_MASK 0x0200 -#define MII_DM9161_INTR_MASK 0x0100 -#define MII_DM9161_INTR_DPLX_CHANGE 0x0010 -#define MII_DM9161_INTR_SPD_CHANGE 0x0008 -#define MII_DM9161_INTR_LINK_CHANGE 0x0004 -#define MII_DM9161_INTR_INIT 0x0000 -#define MII_DM9161_INTR_STOP \ -(MII_DM9161_INTR_DPLX_MASK | MII_DM9161_INTR_SPD_MASK \ - | MII_DM9161_INTR_LINK_MASK | MII_DM9161_INTR_MASK) - -/* DM9161 10BT Configuration/Status */ -#define MII_DM9161_10BTCSR 0x12 -#define MII_DM9161_10BTCSR_INIT 0x7800 - -#define MII_BASIC_FEATURES (SUPPORTED_10baseT_Half | \ - SUPPORTED_10baseT_Full | \ - SUPPORTED_100baseT_Half | \ - SUPPORTED_100baseT_Full | \ - SUPPORTED_Autoneg | \ - SUPPORTED_TP | \ - SUPPORTED_MII) - -#define MII_GBIT_FEATURES (MII_BASIC_FEATURES | \ - SUPPORTED_1000baseT_Half | \ - SUPPORTED_1000baseT_Full) - -#define MII_READ_COMMAND 0x00000001 - -#define MII_INTERRUPT_DISABLED 0x0 -#define MII_INTERRUPT_ENABLED 0x1 -/* Taken from mii_if_info and sungem_phy.h */ -struct gfar_mii_info { - /* Information about the PHY type */ - /* And management functions */ - struct phy_info *phyinfo; - - /* forced speed & duplex (no autoneg) - * partner speed & duplex & pause (autoneg) - */ - int speed; - int duplex; - int pause; - - /* The most recently read link state */ - int link; - - /* Enabled Interrupts */ - u32 interrupts; - - u32 advertising; - int autoneg; - int mii_id; - - /* private data pointer */ - /* For use by PHYs to maintain extra state */ - void *priv; - - /* Provided by host chip */ - struct net_device *dev; - - /* A lock to ensure that only one thing can read/write - * the MDIO 
bus at a time */ - spinlock_t mdio_lock; - - /* Provided by ethernet driver */ - int (*mdio_read) (struct net_device *dev, int mii_id, int reg); - void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); -}; - -/* struct phy_info: a structure which defines attributes for a PHY - * - * id will contain a number which represents the PHY. During - * startup, the driver will poll the PHY to find out what its - * UID--as defined by registers 2 and 3--is. The 32-bit result - * gotten from the PHY will be ANDed with phy_id_mask to - * discard any bits which may change based on revision numbers - * unimportant to functionality - * - * There are 6 commands which take a gfar_mii_info structure. - * Each PHY must declare config_aneg, and read_status. - */ -struct phy_info { - u32 phy_id; - char *name; - unsigned int phy_id_mask; - u32 features; - - /* Called to initialize the PHY */ - int (*init)(struct gfar_mii_info *mii_info); - - /* Called to suspend the PHY for power */ - int (*suspend)(struct gfar_mii_info *mii_info); - - /* Reconfigures autonegotiation (or disables it) */ - int (*config_aneg)(struct gfar_mii_info *mii_info); - - /* Determines the negotiated speed and duplex */ - int (*read_status)(struct gfar_mii_info *mii_info); - - /* Clears any pending interrupts */ - int (*ack_interrupt)(struct gfar_mii_info *mii_info); - - /* Enables or disables interrupts */ - int (*config_intr)(struct gfar_mii_info *mii_info); - - /* Clears up any memory if needed */ - void (*close)(struct gfar_mii_info *mii_info); -}; - -struct phy_info *get_phy_info(struct gfar_mii_info *mii_info); -int read_phy_reg(struct net_device *dev, int mii_id, int regnum); -void write_phy_reg(struct net_device *dev, int mii_id, int regnum, int value); -void mii_clear_phy_interrupt(struct gfar_mii_info *mii_info); -void mii_configure_phy_interrupt(struct gfar_mii_info *mii_info, u32 interrupts); - -struct dm9161_private { - struct timer_list timer; - int resetdone; -}; - -#endif /* 
GIANFAR_PHY_H */ diff -Nru a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/Kconfig 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,45 @@ +# +# PHY Layer Configuration +# + +menu "MII support" + +config MII + bool "Generic Media Independent Interface device support" + depends on NET_ETHERNET + help + Most ethernet controllers have an MII transceiver either as an + external or internal device. It is safe to say Y here even if + your ethernet card lacks MII. This code provides functions + for managing these devices, and infrastructure. + +comment "MII PHY device drivers" + depends on MII + +config MARVELL_PHY + bool "Drivers for Marvell PHYs" + ---help--- + Currently has a driver for the 88E1011S + +config DAVICOM_PHY + bool "Drivers for Davicom PHYs" + ---help--- + Currently supports dm9161e and dm9131 + +config QSEMI_PHY + bool "Drivers for Quality Semiconductor PHYs" + ---help--- + Currently supports the qs6612 + +config LXT_PHY + bool "Drivers for the Intel LXT PHYs" + ---help--- + Currently supports the lxt970, lxt971 + +config CICADA_PHY + bool "Drivers for the Cicada PHYs" + ---help--- + Currently supports the cis8204 + +endmenu + diff -Nru a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/Makefile 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,9 @@ +# Makefile for Linux PHY drivers + +obj-$(CONFIG_MII) += phy.o phy_device.o mdio_bus.o + +obj-$(CONFIG_MARVELL_PHY) += marvell.o +obj-$(CONFIG_DAVICOM_PHY) += davicom.o +obj-$(CONFIG_CICADA_PHY) += cicada.o +obj-$(CONFIG_LXT_PHY) += lxt.o +obj-$(CONFIG_QSEMI_PHY) += qsemi.o diff -Nru a/drivers/net/phy/cicada.c b/drivers/net/phy/cicada.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/cicada.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,165 @@ +/* + * drivers/net/phy/cicada.c + * + * Driver for Cicada PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* Cicada Extended Control Register 1 */ +#define MII_CIS8201_EXT_CON1 0x17 +#define MII_CIS8201_EXTCON1_INIT 0x0000 + +/* Cicada Interrupt Mask Register */ +#define MII_CIS8201_IMASK 0x19 +#define MII_CIS8201_IMASK_IEN 0x8000 +#define MII_CIS8201_IMASK_SPEED 0x4000 +#define MII_CIS8201_IMASK_LINK 0x2000 +#define MII_CIS8201_IMASK_DUPLEX 0x1000 +#define MII_CIS8201_IMASK_MASK 0xf000 + +/* Cicada Interrupt Status Register */ +#define MII_CIS8201_ISTAT 0x1a +#define MII_CIS8201_ISTAT_STATUS 0x8000 +#define MII_CIS8201_ISTAT_SPEED 0x4000 +#define MII_CIS8201_ISTAT_LINK 0x2000 +#define MII_CIS8201_ISTAT_DUPLEX 0x1000 + +/* Cicada Auxiliary Control/Status Register */ +#define MII_CIS8201_AUX_CONSTAT 0x1c +#define MII_CIS8201_AUXCONSTAT_INIT 0x0004 +#define MII_CIS8201_AUXCONSTAT_DUPLEX 0x0020 +#define MII_CIS8201_AUXCONSTAT_SPEED 0x0018 +#define MII_CIS8201_AUXCONSTAT_GBIT 0x0010 +#define MII_CIS8201_AUXCONSTAT_100 0x0008 + +static int cis820x_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + int speed; + + status = phy_read(phydev, MII_CIS8201_AUX_CONSTAT); + if (status & MII_CIS8201_AUXCONSTAT_DUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + 
+ speed = status & MII_CIS8201_AUXCONSTAT_SPEED; + + switch (speed) { + case MII_CIS8201_AUXCONSTAT_GBIT: + phydev->speed = SPEED_1000; + break; + case MII_CIS8201_AUXCONSTAT_100: + phydev->speed = SPEED_100; + break; + default: + phydev->speed = SPEED_10; + break; + } + } + + return 0; +} + +static int cis820x_probe(struct phy_device *phydev) +{ + phy_write(phydev, MII_CIS8201_AUX_CONSTAT, + MII_CIS8201_AUXCONSTAT_INIT); + phy_write(phydev, MII_CIS8201_EXT_CON1, + MII_CIS8201_EXTCON1_INIT); + + return 0; +} + +static int cis820x_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_CIS8201_ISTAT); + + return 0; +} + +static int cis820x_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_CIS8201_IMASK, MII_CIS8201_IMASK_MASK); + else + phy_write(phydev, MII_CIS8201_IMASK, 0); + + return 0; +} + +/* Cicada 820x */ +static struct phy_driver cis8204_driver = { + 0x000fc440, + "Cicada Cis8204", + 0x000fffc0, + .features = PHY_GBIT_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .probe = &cis820x_probe, + .config_aneg = &gbit_config_aneg, + .read_status = &cis820x_read_status, + .ack_interrupt = &cis820x_ack_interrupt, + .config_intr = &cis820x_config_intr, +}; + +int __init cis8204_init(void) +{ + int retval; + + retval = phy_driver_register(&cis8204_driver); + + return retval; +} + +static void __exit cis8204_exit(void) +{ + phy_driver_unregister(&cis8204_driver); +} + +module_init(cis8204_init); +module_exit(cis8204_exit); diff -Nru a/drivers/net/phy/davicom.c b/drivers/net/phy/davicom.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/davicom.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,277 @@ +/* + * drivers/net/phy/davicom.c + * + * Driver for Davicom PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#define MII_DM9161_SCR 0x10 +#define MII_DM9161_SCR_INIT 0x0610 + +/* DM9161 Specified Configuration and Status Register */ +#define MII_DM9161_SCSR 0x11 +#define MII_DM9161_SCSR_100F 0x8000 +#define MII_DM9161_SCSR_100H 0x4000 +#define MII_DM9161_SCSR_10F 0x2000 +#define MII_DM9161_SCSR_10H 0x1000 + +/* DM9161 Interrupt Register */ +#define MII_DM9161_INTR 0x15 +#define MII_DM9161_INTR_PEND 0x8000 +#define MII_DM9161_INTR_DPLX_MASK 0x0800 +#define MII_DM9161_INTR_SPD_MASK 0x0400 +#define MII_DM9161_INTR_LINK_MASK 0x0200 +#define MII_DM9161_INTR_MASK 0x0100 +#define MII_DM9161_INTR_DPLX_CHANGE 0x0010 +#define MII_DM9161_INTR_SPD_CHANGE 0x0008 +#define MII_DM9161_INTR_LINK_CHANGE 0x0004 +#define MII_DM9161_INTR_INIT 0x0000 +#define MII_DM9161_INTR_STOP \ +(MII_DM9161_INTR_DPLX_MASK | MII_DM9161_INTR_SPD_MASK \ + | MII_DM9161_INTR_LINK_MASK | MII_DM9161_INTR_MASK) + +/* DM9161 10BT Configuration/Status */ +#define MII_DM9161_10BTCSR 0x12 +#define MII_DM9161_10BTCSR_INIT 0x7800 + +struct dm9161_private { + struct timer_list timer; + int resetdone; +}; + +#define DM9161_DELAY 1 +int dm9161_config_intr(struct phy_device *phydev) +{ + u16 temp; + + temp = phy_read(phydev, MII_DM9161_INTR); + + if(PHY_INTERRUPT_ENABLED == phydev->interrupts ) + temp &= ~(MII_DM9161_INTR_STOP); + else + temp |= MII_DM9161_INTR_STOP; + + phy_write(phydev, MII_DM9161_INTR, temp); + + return 0; +} + + +static int dm9161_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* 
Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_DM9161_SCSR); + if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_100H)) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + + if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_10F)) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + } + + return 0; +} + + +static void dm9161_timer(unsigned long data) +{ + struct phy_device *phydev = (struct phy_device *)data; + struct dm9161_private *priv = phydev->priv; + u16 status = phy_read(phydev, MII_BMSR); + + spin_lock(&phydev->lock); + if (status & BMSR_ANEGCOMPLETE) { + if (PHY_PENDING == phydev->state) + phydev->state = PHY_UP; + else + phydev->state = PHY_READY; + } else + mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); + + spin_unlock(&phydev->lock); +} + + +static int dm9161_config_aneg(struct phy_device *phydev) +{ + /* Isolate the PHY */ + phy_write(phydev, MII_BMCR, BMCR_ISOLATE); + + /* Configure the new settings */ + genphy_config_advert(phydev); + + /* Reconnect the PHY, and enable Autonegotiation */ + phy_write(phydev, MII_BMCR, BMCR_ANENABLE); + +#if 0 + /* Start a timer for DM9161_DELAY seconds to wait + * for the PHY to be ready */ + init_timer(&priv->timer); + priv->timer.function = &dm9161_timer; + priv->timer.data = (unsigned long) phydev; + mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); +#endif + + return 0; +} + +static int dm9161_probe(struct phy_device *phydev) +{ + struct dm9161_private *priv; + + /* Allocate the private data structure */ + priv = kmalloc(sizeof(struct dm9161_private), GFP_KERNEL); + + if (NULL == priv) + return -ENOMEM; + + phydev->priv = priv; + + /* Reset is not done yet */ + priv->resetdone = 0; + + /* Isolate the PHY 
*/ + phy_write(phydev, MII_BMCR, BMCR_ISOLATE); + + /* Do not bypass the scrambler/descrambler */ + phy_write(phydev, MII_DM9161_SCR, MII_DM9161_SCR_INIT); + + /* Clear 10BTCSR to default */ + phy_write(phydev, MII_DM9161_10BTCSR, MII_DM9161_10BTCSR_INIT); + + /* Reconnect the PHY, and enable Autonegotiation */ + phy_write(phydev, MII_BMCR, BMCR_ANENABLE); + + phydev->state = PHY_STARTING; + + /* Start a timer for DM9161_DELAY seconds to wait + * for the PHY to be ready */ + init_timer(&priv->timer); + priv->timer.function = &dm9161_timer; + priv->timer.data = (unsigned long) phydev; + mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); + + printk(KERN_INFO "Bringing up a Davicom PHY, this could take" + " a while...\n"); + return 0; +} + +static void dm9161_remove(struct phy_device *phydev) +{ + struct dm9161_private *priv = phydev->priv; + + del_timer_sync(&priv->timer); + kfree(priv); +} + +static int dm9161_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_DM9161_INTR); + + return 0; +} + +static struct phy_driver dm9161_driver = { + .phy_id = 0x0181b880, + .name = "Davicom DM9161E", + .phy_id_mask = 0x0ffffff0, + .features = PHY_BASIC_FEATURES, + .probe = dm9161_probe, + .config_aneg = dm9161_config_aneg, + .read_status = dm9161_read_status, + .remove = dm9161_remove, +}; + +static struct phy_driver dm9131_driver = { + .phy_id = 0x00181b80, + .name = "Davicom DM9131", + .phy_id_mask = 0x0ffffff0, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = genphy_config_aneg, + .read_status = dm9161_read_status, + .ack_interrupt = dm9161_ack_interrupt, + .config_intr = dm9161_config_intr, +}; + +int __init dm9161_init(void) +{ + int retval; + + retval = phy_driver_register(&dm9161_driver); + + return retval; +} + +static void __exit dm9161_exit(void) +{ + phy_driver_unregister(&dm9161_driver); +} + +module_init(dm9161_init); +module_exit(dm9161_exit); + +int __init dm9131_init(void) +{ + int retval; + + retval = 
phy_driver_register(&dm9131_driver); + + return retval; +} + +static void __exit dm9131_exit(void) +{ + phy_driver_unregister(&dm9131_driver); +} + +module_init(dm9131_init); +module_exit(dm9131_exit); diff -Nru a/drivers/net/phy/lxt.c b/drivers/net/phy/lxt.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/lxt.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,237 @@ +/* + * drivers/net/phy/lxt.c + * + * Driver for Intel LXT PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* The Level one LXT970 is used by many boards */ + +#define MII_LXT970_MIRROR 16 /* Mirror register */ +#define MII_LXT970_IER 17 /* Interrupt Enable Register */ + +#define MII_LXT970_IER_IEN 0x0002 + +#define MII_LXT970_ISR 18 /* Interrupt Status Register */ + +#define MII_LXT970_CONFIG 19 /* Configuration Register */ +#define MII_LXT970_CSR 20 /* Chip Status Register */ + +#define MII_LXT970_CSR_DUPLEX 0x1000 +#define MII_LXT970_CSR_SPEED 0x0800 + +/* ------------------------------------------------------------------------- */ +/* The Level one LXT971 is used on some of my custom boards */ + +/* register definitions for the 971 */ + +#define MII_LXT971_PCR 16 /* Port Control Register */ + +#define MII_LXT971_SR2 17 /* Status Register 2 */ +#define MII_LXT971_SR2_DUPLEX 0x0200 +#define MII_LXT971_SR2_SPEED 0x4000 + +#define MII_LXT971_IER 18 /* Interrupt Enable Register */ +#define MII_LXT971_IER_IEN 0x00f2 + +#define MII_LXT971_ISR 19 /* Interrupt Status 
Register */ + +#define MII_LXT971_LCR 20 /* LED Control Register */ + +#define MII_LXT971_TCR 30 /* Transmit Control Register */ + + +static int lxt970_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_LXT970_CSR); + if (status & MII_LXT970_CSR_DUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + if (status & MII_LXT970_CSR_SPEED) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + } + + return 0; +} + +static int lxt970_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_BMSR); + phy_read(phydev, MII_LXT970_ISR); + + return 0; +} + +static int lxt970_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_LXT970_IER, MII_LXT970_IER_IEN); + else + phy_write(phydev, MII_LXT970_IER, 0); + + return 0; +} + +static int lxt970_probe(struct phy_device *phydev) +{ + phy_write(phydev, MII_LXT970_CONFIG, 0); + + return 0; +} + + +static int lxt971_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_LXT971_SR2); + if (status & MII_LXT971_SR2_DUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex= DUPLEX_HALF; + + if (status & MII_LXT971_SR2_SPEED) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + } + + return 0; +} + +static int lxt971_ack_interrupt(struct phy_device 
*phydev) +{ + phy_read(phydev, MII_LXT971_ISR); + + return 0; +} + +static int lxt971_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_LXT971_IER, MII_LXT971_IER_IEN); + else + phy_write(phydev, MII_LXT971_IER, 0); + + return 0; +} + +static struct phy_driver lxt970_driver = { + .phy_id = 0x07810000, + .name = "LXT970", + .phy_id_mask = 0x0fffffff, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .probe = lxt970_probe, + .config_aneg = genphy_config_aneg, + .read_status = lxt970_read_status, + .ack_interrupt = lxt970_ack_interrupt, + .config_intr = lxt970_config_intr, +}; + +static struct phy_driver lxt971_driver = { + .phy_id = 0x0001378e, + .name = "LXT971", + .phy_id_mask = 0x0fffffff, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = genphy_config_aneg, + .read_status = lxt971_read_status, + .ack_interrupt = lxt971_ack_interrupt, + .config_intr = lxt971_config_intr, +}; + +int __init lxt970_init(void) +{ + int retval; + + retval = phy_driver_register(&lxt970_driver); + + return retval; +} + +static void __exit lxt970_exit(void) +{ + phy_driver_unregister(&lxt970_driver); +} + +module_init(lxt970_init); +module_exit(lxt970_exit); + +int __init lxt971_init(void) +{ + int retval; + + retval = phy_driver_register(&lxt971_driver); + + return retval; +} + +static void __exit lxt971_exit(void) +{ + phy_driver_unregister(&lxt971_driver); +} + +module_init(lxt971_init); +module_exit(lxt971_exit); diff -Nru a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/marvell.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,173 @@ +/* + * drivers/net/phy/marvell.c + * + * Driver for Marvell PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* 88E1011 PHY Status Register */ +#define MII_M1011_PHY_SPEC_STATUS 0x11 +#define MII_M1011_PHY_SPEC_STATUS_1000 0x8000 +#define MII_M1011_PHY_SPEC_STATUS_100 0x4000 +#define MII_M1011_PHY_SPEC_STATUS_SPD_MASK 0xc000 +#define MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX 0x2000 +#define MII_M1011_PHY_SPEC_STATUS_RESOLVED 0x0800 +#define MII_M1011_PHY_SPEC_STATUS_LINK 0x0400 + +#define MII_M1011_IEVENT 0x13 +#define MII_M1011_IEVENT_CLEAR 0x0000 + +#define MII_M1011_IMASK 0x12 +#define MII_M1011_IMASK_INIT 0x6400 +#define MII_M1011_IMASK_CLEAR 0x0000 + +static int marvell_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + int speed; + status = phy_read(phydev, MII_M1011_PHY_SPEC_STATUS); + +#if 0 + /* If speed and duplex aren't resolved, + * return an error. Isn't this handled + * by checking aneg? 
+ */ + if ((status & MII_M1011_PHY_SPEC_STATUS_RESOLVED) == 0) + return -EAGAIN; +#endif + + /* Get the duplexity */ + if (status & MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + /* Get the speed */ + speed = status & MII_M1011_PHY_SPEC_STATUS_SPD_MASK; + switch(speed) { + case MII_M1011_PHY_SPEC_STATUS_1000: + phydev->speed = SPEED_1000; + break; + case MII_M1011_PHY_SPEC_STATUS_100: + phydev->speed = SPEED_100; + break; + default: + phydev->speed = SPEED_10; + break; + } + phydev->pause = 0; + } + + return 0; +} + +static int marvell_ack_interrupt(struct phy_device *phydev) +{ + /* Clear the interrupts by reading the reg */ + phy_read(phydev, MII_M1011_IEVENT); + + return 0; +} + +static int marvell_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_M1011_IMASK, MII_M1011_IMASK_INIT); + else + phy_write(phydev, MII_M1011_IMASK, MII_M1011_IMASK_CLEAR); + + return 0; +} + +static int marvell_config_aneg(struct phy_device *phydev) +{ + /* The Marvell PHY has an errata which requires + * that certain registers get written in order + * to restart autonegotiation */ + phy_write(phydev, MII_BMCR, BMCR_RESET); + + phy_write(phydev, 0x1d, 0x1f); + phy_write(phydev, 0x1e, 0x200c); + phy_write(phydev, 0x1d, 0x5); + phy_write(phydev, 0x1e, 0); + phy_write(phydev, 0x1e, 0x100); + + gbit_config_aneg(phydev); + + return 0; +} + + +static struct phy_driver m88e1101_driver = { + .phy_id = 0x01410c00, + .phy_id_mask = 0xffffff00, + .name = "Marvell 88E1101", + .features = PHY_GBIT_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = &marvell_config_aneg, + .read_status = &marvell_read_status, + .ack_interrupt = &marvell_ack_interrupt, + .config_intr = &marvell_config_intr, +}; + +int __init marvell_init(void) +{ + int retval; + + retval = phy_driver_register(&m88e1101_driver); + + return retval; +} + +static void __exit marvell_exit(void) +{ + 
phy_driver_unregister(&m88e1101_driver); +} + +module_init(marvell_init); +module_exit(marvell_exit); diff -Nru a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/mdio_bus.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,173 @@ +/* + * drivers/net/phy/mdio_bus.c + * + * MDIO Bus interface + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* register_mdiobus + * bus: The bus being registered + * + * description: Called by a bus driver to bring up all the PHYs + * on the bus, and attach them to the bus + */ +int register_mdiobus(struct mii_bus *bus) +{ + int i; + int err = 0; + + if (NULL == bus || NULL == bus->name || + NULL == bus->read || + NULL == bus->write) + return -EINVAL; + + spin_lock_init(&bus->mdio_lock); + + if (bus->reset) + bus->reset(bus); + + for (i=0; i < PHY_MAX_ADDR; i++) { + struct phy_device *phydev; + + phydev = get_phy_device(bus, i); + + /* There's a PHY at this address + * We need to set: + * 1) IRQ + * 2) bus_id + * 3) parent + * 4) bus + * 5) mii_bus + * And, we need to register it */ + if (phydev) { + phydev->irq = bus->irq[i]; + + phydev->dev.parent = bus->dev; + + phydev->dev.bus = &mdio_bus_type; + + phydev->bus = bus; + + sprintf(phydev->dev.bus_id, "phy%d:%d", bus->id, i); + + err = device_register(&phydev->dev); + + if (err) + printk(KERN_ERR "phy %d did not register (%d)\n", + i, err); + + bus->phy_map[i] = phydev; + } + } + + pr_info("%s: probed\n", 
bus->name); + + return err; +} +EXPORT_SYMBOL(register_mdiobus); + +void unregister_mdiobus(struct mii_bus *bus) +{ + int i; + + for (i=0; i < PHY_MAX_ADDR; i++) + if (bus->phy_map[i]) { + device_unregister(&bus->phy_map[i]->dev); + kfree(bus->phy_map[i]); + } + +} +EXPORT_SYMBOL(unregister_mdiobus); + +/* mdio_bus_match + * dev: a PHY device + * drv: a PHY driver + * + * description: Given a PHY device, and a PHY driver, return 1 if + * the driver supports the device. Otherwise, return 0 + */ +int mdio_bus_match(struct device *dev, struct device_driver *drv) +{ + struct phy_device *phydev = to_phy_device(dev); + struct phy_driver *phydrv = to_phy_driver(drv); + + return (phydrv->phy_id == (phydev->phy_id & phydrv->phy_id_mask)); +} + +/* Suspend and resume. Copied from platform_suspend and + * platform_resume + */ +static int mdio_bus_suspend(struct device * dev, u32 state) +{ + int ret = 0; + + if (dev->driver && dev->driver->suspend) { + ret = dev->driver->suspend(dev, state, SUSPEND_DISABLE); + if (ret == 0) + ret = dev->driver->suspend(dev, state, SUSPEND_SAVE_STATE); + if (ret == 0) + ret = dev->driver->suspend(dev, state, SUSPEND_POWER_DOWN); + } + return ret; +} + +static int mdio_bus_resume(struct device * dev) +{ + int ret = 0; + + if (dev->driver && dev->driver->resume) { + ret = dev->driver->resume(dev, RESUME_POWER_ON); + if (ret == 0) + ret = dev->driver->resume(dev, RESUME_RESTORE_STATE); + if (ret == 0) + ret = dev->driver->resume(dev, RESUME_ENABLE); + } + return ret; +} + +struct bus_type mdio_bus_type = { + .name = "mdio_bus", + .match = mdio_bus_match, + .suspend= mdio_bus_suspend, + .resume = mdio_bus_resume, +}; + +int __init mdio_bus_init(void) +{ + return bus_register(&mdio_bus_type); +} + +subsys_initcall(mdio_bus_init); diff -Nru a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/phy.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,512 @@ +/* + * drivers/net/phy/phy.c + * + * Framework for 
configuring and reading PHY devices + * Based on code in sungem_phy.c and gianfar_phy.c + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +u16 phy_read(struct phy_device *phydev, u16 regnum); +void phy_write(struct phy_device *phydev, u16 regnum, u16 val); + +/* Convenience functions for reading a given PHY register. + * This MUST NOT be called from interrupt context, + * because the bus read function may sleep + * or generally lock up. */ +u16 phy_read(struct phy_device *phydev, u16 regnum) +{ + u16 retval; + struct mii_bus *bus = phydev->bus; + + spin_lock_bh(&bus->mdio_lock); + retval = bus->read(bus, phydev->addr, regnum); + spin_unlock_bh(&bus->mdio_lock); + + return retval; +} + +void phy_write(struct phy_device *phydev, u16 regnum, u16 val) +{ + struct mii_bus *bus = phydev->bus; + + spin_lock_bh(&bus->mdio_lock); + bus->write(bus, phydev->addr, regnum, val); + spin_unlock_bh(&bus->mdio_lock); +} + + +void phy_clear_interrupt(struct phy_device *phydev) +{ + if (phydev->drv->ack_interrupt) + phydev->drv->ack_interrupt(phydev); +} + + +void phy_config_interrupt(struct phy_device *phydev, + u32 interrupts) +{ + phydev->interrupts = interrupts; + if (phydev->drv->config_intr) + phydev->drv->config_intr(phydev); +} + + +static inline void phy_read_status(struct phy_device *phydev) +{ + phydev->drv->read_status(phydev); +} + +static inline int phy_aneg_done(struct phy_device *phydev) +{ + return (phy_read(phydev, MII_BMSR) & 
BMSR_ANEGCOMPLETE); +} + +/* phy_start_aneg + * phydev: The PHY on which to initiate auto-negotiation + * + * description: Calls the PHY driver's config_aneg, and then + * sets the PHY state to PHY_AN if auto-negotiation is enabled, + * and to PHY_FORCING if auto-negotiation is disabled. Unless + * the PHY is currently HALTED. + */ +void phy_start_aneg(struct phy_device *phydev) +{ + spin_lock(&phydev->lock); + + if (AUTONEG_DISABLE == phydev->autoneg) + phy_sanitize_settings(phydev); + + phydev->drv->config_aneg(phydev); + + if (phydev->state != PHY_HALTED) { + if (AUTONEG_ENABLE == phydev->autoneg) { + phydev->state = PHY_AN; + phydev->link_timeout = PHY_AN_TIMEOUT; + } else { + phydev->state = PHY_FORCING; + phydev->link_timeout = PHY_FORCE_TIMEOUT; + } + } + + spin_unlock(&phydev->lock); +} + + +/* phy_interrupt + * irq: Interrupt number + * phy_dat: PHY device which caused the interrupt (presumably) + * regs: -- + * + * description: When a PHY interrupt occurs, the handler disables + * interrupts, and schedules a work task to clear the interrupt. + */ +static irqreturn_t phy_interrupt(int irq, void *phy_dat, struct pt_regs *regs) +{ + struct phy_device *phydev = phy_dat; + + /* The MDIO bus is not allowed to be written in interrupt + * context, so we need to disable the irq here. A work + * queue will write the PHY to disable and clear the + * interrupt, and then reenable the irq line. */ + disable_irq_nosync(irq); + + schedule_work(&phydev->phy_queue); + + return IRQ_HANDLED; +} + +/* phy_start_interrupts + * phydev: The PHY whose interrupts are being enabled + * + * description: Request the interrupt for the given PHY. If + * this fails, then we set irq to -1 so that we do polling. + * Otherwise, we enable the interrupts. + * Returns 0 on success, -1 on error. 
+ */ +int phy_start_interrupts(struct phy_device *phydev) +{ + if (request_irq(phydev->irq, phy_interrupt, + SA_SHIRQ, + "phy_interrupt", + phydev) < 0) { + printk(KERN_ERR "%s: Can't get IRQ %d (PHY)\n", + phydev->bus->name, + phydev->irq); + phydev->irq = -1; + return -1; + } + + phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED); + + return 0; +} + +/* Scheduled by the phy_interrupt/timer to handle PHY changes */ +void phy_change(void *data) +{ + struct phy_device *phydev = data; + + /* Disable PHY interrupts */ + phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + + /* Clear the interrupt */ + phy_clear_interrupt(phydev); + + spin_lock(&phydev->lock); + if ((PHY_RUNNING == phydev->state) || (PHY_NOLINK == phydev->state)) + phydev->state = PHY_CHANGELINK; + spin_unlock(&phydev->lock); + + enable_irq(phydev->irq); + + /* Reenable interrupts, if needed */ + phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED); +} + +/* Bring down the PHY link, and stop checking the status. */ +void phy_stop(struct phy_device *phydev) +{ + spin_lock(&phydev->lock); + + if (PHY_HALTED == phydev->state) { + spin_unlock(&phydev->lock); + return; + } + + if (phydev->irq != -1) { + /* Clear any pending interrupts */ + phy_clear_interrupt(phydev); + + /* Disable PHY Interrupts */ + phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + } + + phydev->state = PHY_HALTED; + + spin_unlock(&phydev->lock); +} + + +/* phy_start + * phydev: The PHY device being started + * + * description: Indicates the attached device's readiness to + * handle PHY-related work. Used during startup to start the + * PHY, and after a call to phy_stop() to resume operation. 
+ */ +void phy_start(struct phy_device *phydev) +{ + spin_lock(&phydev->lock); + + switch (phydev->state) { + case PHY_STARTING: + phydev->state = PHY_PENDING; + break; + case PHY_READY: + phydev->state = PHY_UP; + break; + case PHY_HALTED: + phydev->state = PHY_RESUMING; + default: + break; + } + spin_unlock(&phydev->lock); +} +EXPORT_SYMBOL(phy_stop); +EXPORT_SYMBOL(phy_start); + +/* A structure for mapping a particular speed and duplex + * combination to a particular SUPPORTED and ADVERTISED value */ +struct phy_setting { + int speed; + int duplex; + u32 setting; +}; + +/* A mapping of all SUPPORTED settings to speed/duplex */ +static struct phy_setting settings[] = { + { .speed = 10000, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_10000baseT_Full, + }, + { .speed = SPEED_1000, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_1000baseT_Full, + }, + { .speed = SPEED_1000, .duplex = DUPLEX_HALF, + .setting = SUPPORTED_1000baseT_Half, + }, + { .speed = SPEED_100, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_100baseT_Full, + }, + { .speed = SPEED_100, .duplex = DUPLEX_HALF, + .setting = SUPPORTED_100baseT_Half, + }, + { .speed = SPEED_10, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_10baseT_Full, + }, + { .speed = SPEED_10, .duplex = DUPLEX_HALF, + .setting = SUPPORTED_10baseT_Half, + }, +}; + +#define MAX_NUM_SETTINGS (sizeof(settings)/sizeof(struct phy_setting)) + +/* phy_find_setting + * speed: desired speed of setting + * duplex: desired duplex of setting + * + * description: Searches the settings array for the setting which + * matches the desired speed and duplex, and returns the index + * of that setting. Returns the index of the last setting if + * none of the others match. + */ +static inline int phy_find_setting(int speed, int duplex) +{ + int idx = 0; + + while (idx < MAX_NUM_SETTINGS && + (settings[idx].speed != speed || + settings[idx].duplex != duplex)) + idx++; + + return idx < MAX_NUM_SETTINGS ? 
idx : MAX_NUM_SETTINGS - 1; +} + +/* phy_find_valid + * idx: The first index in settings[] to search + * features: A mask of the valid settings + * + * description: Returns the index of the first valid setting less + * than or equal to the one pointed to by idx, as determined by + * the mask in features. Returns the index of the last setting + * if nothing else matches. + */ +static inline int phy_find_valid(int idx, u32 features) +{ + while (idx < MAX_NUM_SETTINGS && !(settings[idx].setting & features)) + idx++; + + return idx < MAX_NUM_SETTINGS ? idx : MAX_NUM_SETTINGS - 1; +} + +/* phy_sanitize_settings + * phydev: The PHY in question + * + * description: Make sure the PHY is set to supported speeds and + * duplexes. Drop down by one in this order: 1000/FULL, + * 1000/HALF, 100/FULL, 100/HALF, 10/FULL, 10/HALF + */ +void phy_sanitize_settings(struct phy_device *phydev) +{ + u32 features = phydev->supported; + int idx; + + /* Sanitize settings based on PHY capabilities */ + if ((features & SUPPORTED_Autoneg) == 0) + phydev->autoneg = 0; + + idx = phy_find_valid(phy_find_setting(phydev->speed, phydev->duplex), + features); + + phydev->speed = settings[idx].speed; + phydev->duplex = settings[idx].duplex; +} + +/* phy_force_reduction + * phydev: The PHY in question + * + * description: Reduces the speed/duplex settings by + * one notch. The order is so: + * 1000/FULL, 1000/HALF, 100/FULL, 100/HALF, + * 10/FULL, 10/HALF. The function bottoms out at 10/HALF. + */ +void phy_force_reduction(struct phy_device *phydev) +{ + int idx; + + idx = phy_find_setting(phydev->speed, phydev->duplex); + + idx++; + + idx = phy_find_valid(idx, phydev->supported); + + phydev->speed = settings[idx].speed; + phydev->duplex = settings[idx].duplex; + + printk(KERN_INFO "Trying %d/%s\n", phydev->speed, + DUPLEX_FULL == phydev->duplex ? 
"FULL" : "HALF"); +} + +/* PHY timer which handles the state machine */ +void phy_timer(unsigned long data) +{ + struct phy_device *phydev = (struct phy_device *)data; + int needs_aneg = 0; + + spin_lock(&phydev->lock); + + if (phydev->adjust_state) + phydev->adjust_state(phydev->attached_dev); + + switch(phydev->state) { + case PHY_DOWN: + case PHY_STARTING: + case PHY_READY: + case PHY_PENDING: + break; + case PHY_UP: + needs_aneg = 1; + + phydev->link_timeout = PHY_AN_TIMEOUT; + + if (phydev->irq != -1) + phy_start_interrupts(phydev); + + break; + case PHY_AN: + /* Check if negotiation is done. If so, + * we change to either RUNNING, or NOLINK */ + if (phy_aneg_done(phydev)) { + phy_read_status(phydev); + + if (phydev->link) + phydev->state = PHY_RUNNING; + else + phydev->state = PHY_NOLINK; + + phydev->adjust_link(phydev->attached_dev); + break; + } + + /* The counter expired, so either we + * switch to forced mode, or the + * magic_aneg bit exists, and we try aneg + * again */ + if (0 == phydev->link_timeout--) { + if (!(phydev->drv->flags & PHY_HAS_MAGICANEG)) { + int idx; + + /* We'll start from the + * fastest speed, and work + * our way down */ + idx = phy_find_valid(0, + phydev->supported); + + phydev->speed = settings[idx].speed; + phydev->duplex = settings[idx].duplex; + + phydev->autoneg = AUTONEG_DISABLE; + phydev->state = PHY_FORCING; + phydev->link_timeout = + PHY_FORCE_TIMEOUT; + + pr_info("Trying %d/%s\n", phydev->speed, + DUPLEX_FULL == + phydev->duplex ? 
+ "FULL" : "HALF"); + } + + needs_aneg = 1; + } + break; + case PHY_NOLINK: + phy_read_status(phydev); + + if (phydev->link) { + phydev->state = PHY_RUNNING; + phydev->adjust_link(phydev->attached_dev); + } + break; + case PHY_FORCING: + phy_read_status(phydev); + + if (phydev->link) { + phydev->state = PHY_RUNNING; + } else { + if (0 == phydev->link_timeout--) { + phy_force_reduction(phydev); + needs_aneg = 1; + } + } + + phydev->adjust_link(phydev->attached_dev); + break; + case PHY_RUNNING: + /* Only register a CHANGE if we aren't + * using interrupts */ + if (-1 == phydev->irq) + phydev->state = PHY_CHANGELINK; + break; + case PHY_CHANGELINK: + phy_read_status(phydev); + + if (phydev->link) + phydev->state = PHY_RUNNING; + else { + phydev->state = PHY_NOLINK; + } + + phydev->adjust_link(phydev->attached_dev); + + if (-1 != phydev->irq) + phy_config_interrupt(phydev, + PHY_INTERRUPT_ENABLED); + break; + case PHY_HALTED: + if (phydev->link) { + phydev->link = 0; + phydev->adjust_link(phydev->attached_dev); + } + break; + case PHY_RESUMING: + if (AUTONEG_ENABLE == phydev->autoneg) { + if (phy_aneg_done(phydev)) { + phydev->state = PHY_RUNNING; + } else { + phydev->state = PHY_AN; + phydev->link_timeout = PHY_AN_TIMEOUT; + } + } else + phydev->state = PHY_RUNNING; + break; + } + + spin_unlock(&phydev->lock); + + if (needs_aneg) + phy_start_aneg(phydev); + + mod_timer(&phydev->phy_timer, jiffies + PHY_STATE_TIME * HZ); +} diff -Nru a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/phy_device.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,598 @@ +/* + * drivers/net/phy/phy_device.c + * + * Framework for finding and configuring PHYs. + * Also contains generic PHY driver + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* get_phy_device + * bus: The bus the PHY is on + * addr: The address of the desired PHY device + * + * description: Reads the ID registers of the desired PHY, + * then allocates and returns the phy_device which + * represents it. + */ +struct phy_device * get_phy_device(struct mii_bus *bus, uint addr) +{ + u16 phy_reg; + u32 phy_id; + struct phy_device *dev = NULL; + + /* Grab the bits from PHYIR1, and put them + * in the upper half */ + phy_reg = bus->read(bus, addr, MII_PHYSID1); + phy_id = (phy_reg & 0xffff) << 16; + + /* Grab the bits from PHYIR2, and put them in the lower half */ + phy_reg = bus->read(bus, addr, MII_PHYSID2); + phy_id |= (phy_reg & 0xffff); + + /* If the phy_id is all Fs, there is no device there */ + if (0xffffffff == phy_id) + return NULL; + + /* Otherwise, we allocate the device, and initialize the + * default values */ + dev = kmalloc(sizeof(*dev), GFP_KERNEL); + + if (NULL == dev) { + errno = -ENOMEM; + return NULL; + } + + memset(dev, 0, sizeof(*dev)); + + dev->speed = 0; + dev->duplex = -1; + dev->pause = 0; + dev->link = 1; + + dev->autoneg = AUTONEG_ENABLE; + + dev->addr = addr; + dev->phy_id = phy_id; + dev->bus = bus; + + dev->state = PHY_DOWN; + + spin_lock_init(&dev->lock); + + INIT_WORK(&dev->phy_queue, phy_change, dev); + + return dev; +} + +/* phy_prepare_link: + * phydev: The PHY device whose link is being prepped + * adjust_link: The link change handler for the controller + * + * description: Tells the PHY infrastructure to 
handle the + * gory details on monitoring link status (whether through + * polling or an interrupt), and to call back to the + * connected device driver when the link status changes. + * If you want to monitor your own link state, don't call + * this function */ +void phy_prepare_link(struct phy_device *phydev, + void (*handler)(struct device *)) +{ + if (handler) + phydev->adjust_link = handler; + else + phydev->adjust_link = NULL; +} + +/* phy_start_machine: + * phydev: The PHY device whose state machine is being started + * handler: The callback function for state change notifications. + * + * description: The PHY infrastructure can run a state machine + * which tracks whether the PHY is starting up, negotiating, + * etc. This function starts the timer which tracks the state + * of the PHY. If you want to be notified when the state + * changes, pass in the callback, otherwise, pass NULL. If you + * want to maintain your own state machine, do not call this + * function. */ +void phy_start_machine(struct phy_device *phydev, + void (*handler)(struct device *)) +{ + if (handler) + phydev->adjust_state = handler; + else + phydev->adjust_state = NULL; + + init_timer(&phydev->phy_timer); + phydev->phy_timer.function = &phy_timer; + phydev->phy_timer.data = (unsigned long) phydev; + mod_timer(&phydev->phy_timer, jiffies + HZ); +} + +/* phy_stop_machine + * + * description: Stops the state machine timer, sets the state to + * UP (unless it wasn't up yet), and then frees the interrupt, + * if it is in use. This function must be called BEFORE + * phy_detach. 
+ */ +void phy_stop_machine(struct phy_device *phydev) +{ + del_timer_sync(&phydev->phy_timer); + + spin_lock(&phydev->lock); + if (phydev->state > PHY_UP) + phydev->state = PHY_UP; + spin_unlock(&phydev->lock); + + if (phydev->irq != -1) { + phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + phy_clear_interrupt(phydev); + free_irq(phydev->irq, phydev); + } + + phydev->adjust_state = NULL; +} + +/* phy_attach: + * dev: The requesting device + * phy_id: The name of the requested PHY device + * + * description: Called by drivers to attach to a particular PHY + * device. The phy_device is found, and properly hooked up + * to the phy_driver. If no driver is attached, then the + * genphy_driver is used. The phy_device is given a ptr to + * the attaching device, and given a callback for link status + * change. The phy_device is returned to the attaching + * driver. + */ +struct phy_device *phy_attach(struct device *dev, char *phy_id) +{ + struct phy_device *phydev = NULL; + struct bus_type *bus = &mdio_bus_type; + struct list_head *entry; + + /* Search the list of PHY devices on the mdio bus for the + * PHY with the requested name */ + list_for_each(entry, &bus->devices.list) + { + struct device *d = container_of(entry, struct device, bus_list); + + if (!strcmp(phy_id, d->bus_id)) { + phydev = to_phy_device(d); + break; + } + } + + if (NULL == phydev) { + printk(KERN_ERR "%s not found\n", phy_id); + errno = -ENODEV; + return NULL; + } + + /* Assume that if there is no driver, that it doesn't + * exist, and we should use the genphy driver. 
*/ + if (NULL == phydev->dev.driver) { + down_write(&phydev->dev.bus->subsys.rwsem); + phydev->dev.driver = &genphy_driver.driver; + + device_bind_driver(&phydev->dev); + up_write(&phydev->dev.bus->subsys.rwsem); + } + + if (phydev->attached_dev) { + printk(KERN_ERR "%s: %s already attached\n", + dev->bus_id, phy_id); + errno = -EBUSY; + return NULL; + } + + phydev->attached_dev = dev; + + return phydev; +} +EXPORT_SYMBOL(phy_attach); + +/* phy_connect: + * dev: The requesting device + * phy_id: The name of the requested PHY device + * adjust_link: A callback function for handling link status + * changes + * + * description: Convenience function for connecting ethernet (or + * other) devices to PHY devices. The default behavior is for + * the PHY infrastructure to handle everything, and only notify + * the connected driver when the link status changes. If you + * don't want, or can't use the provided functionality, you may + * choose to call only the subset of functions which provide + * the desired functionality. + */ +struct phy_device * phy_connect(struct device *dev, char *phy_id, + void (*handler)(struct device *)) +{ + struct phy_device *phydev; + + phydev = phy_attach(dev, phy_id); + + if (NULL == phydev) + return phydev; + + phy_prepare_link(phydev, handler); + + phy_start_machine(phydev, NULL); + + return phydev; +} +EXPORT_SYMBOL(phy_connect); + +void phy_disconnect(struct phy_device *phydev) +{ + phy_stop_machine(phydev); + + phydev->adjust_link = NULL; + + phy_detach(phydev); +} +EXPORT_SYMBOL(phy_disconnect); + +void phy_detach(struct phy_device *phydev) +{ + phydev->attached_dev = NULL; + + /* If the device had no specific driver before (i.e. 
- it + * was using the generic driver), we unbind the device + * from the generic driver so that there's a chance a + * real driver could be loaded */ + if (phydev->dev.driver == &genphy_driver.driver) { + down_write(&phydev->dev.bus->subsys.rwsem); + device_release_driver(&phydev->dev); + up_write(&phydev->dev.bus->subsys.rwsem); + } +} +EXPORT_SYMBOL(phy_detach); + + +/* Generic PHY support and helper functions */ + +/* genphy_config_advert + * + * description: Writes MII_ADVERTISE with the appropriate values, + * after sanitizing the values to make sure we only advertise + * what is supported + */ +void genphy_config_advert(struct phy_device *phydev) +{ + u32 advertise; + u16 adv; + + /* Only allow advertising what + * this PHY supports */ + phydev->advertising &= phydev->supported; + advertise = phydev->advertising; + + /* Setup standard advertisement */ + adv = phy_read(phydev, MII_ADVERTISE); + + adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4); + if (advertise & ADVERTISED_10baseT_Half) + adv |= ADVERTISE_10HALF; + if (advertise & ADVERTISED_10baseT_Full) + adv |= ADVERTISE_10FULL; + if (advertise & ADVERTISED_100baseT_Half) + adv |= ADVERTISE_100HALF; + if (advertise & ADVERTISED_100baseT_Full) + adv |= ADVERTISE_100FULL; + + phy_write(phydev, MII_ADVERTISE, adv); +} + + +/* genphy_setup_forced + * + * description: Configures MII_BMCR to force speed/duplex + * to the values in phydev. Assumes that the values are valid. 
+ * Please see phy_sanitize_settings() */ +void genphy_setup_forced(struct phy_device *phydev) +{ + u16 ctrl = phy_read(phydev, MII_BMCR); + + ctrl &= ~(BMCR_FULLDPLX|BMCR_SPEED100|BMCR_SPEED1000|BMCR_ANENABLE); + ctrl |= BMCR_RESET; + + if (SPEED_1000 == phydev->speed) + ctrl |= BMCR_SPEED1000; + else if (SPEED_100 == phydev->speed) + ctrl |= BMCR_SPEED100; + + if (DUPLEX_FULL == phydev->duplex) + ctrl |= BMCR_FULLDPLX; + + phy_write(phydev, MII_BMCR, ctrl); +} + + +/* Enable and Restart Autonegotiation */ +void genphy_restart_aneg(struct phy_device *phydev) +{ + u16 ctl; + + ctl = phy_read(phydev, MII_BMCR); + ctl |= (BMCR_ANENABLE | BMCR_ANRESTART); + phy_write(phydev, MII_BMCR, ctl); +} + + +/* gbit_config_aneg + * + * description: Does the same thing as genphy_config_advert() + * (it even calls it), but also properly configures + * MII_1000BASETCONTROL. Should only be called for + * gigabit-capable PHYs + */ +int gbit_config_aneg(struct phy_device *phydev) +{ + u16 adv; + u32 advertise; + + if (AUTONEG_ENABLE == phydev->autoneg) { + /* Configure the ADVERTISE register */ + genphy_config_advert(phydev); + advertise = phydev->advertising; + + adv = phy_read(phydev, MII_1000BASETCONTROL); + adv &= ~(MII_1000BASETCONTROL_FULLDUPLEXCAP | + MII_1000BASETCONTROL_HALFDUPLEXCAP); + if (advertise & SUPPORTED_1000baseT_Half) + adv |= MII_1000BASETCONTROL_HALFDUPLEXCAP; + if (advertise & SUPPORTED_1000baseT_Full) + adv |= MII_1000BASETCONTROL_FULLDUPLEXCAP; + phy_write(phydev, MII_1000BASETCONTROL, adv); + + /* Start/Restart aneg */ + genphy_restart_aneg(phydev); + } else + genphy_setup_forced(phydev); + + return 0; +} + + +/* genphy_config_aneg + * + * description: If auto-negotiation is enabled, we configure the + * advertising, and then restart auto-negotiation. 
If it is not + * enabled, then we write the BMCR + */ +int genphy_config_aneg(struct phy_device *phydev) +{ + if (AUTONEG_ENABLE == phydev->autoneg) { + genphy_config_advert(phydev); + genphy_restart_aneg(phydev); + } else + genphy_setup_forced(phydev); + + return 0; +} + + +/* genphy_update_link + * + * description: Update the value in phydev->link to reflect the + * current link value. In order to do this, we need to read + * the status register twice, keeping the second value + */ +int genphy_update_link(struct phy_device *phydev) +{ + u16 status; + + /* Do a fake read */ + phy_read(phydev, MII_BMSR); + + /* Read link and autonegotiation status */ + status = phy_read(phydev, MII_BMSR); + if ((status & BMSR_LSTATUS) == 0) + phydev->link = 0; + else + phydev->link = 1; + + return 0; +} + +/* genphy_read_status + * + * description: Check the link, then figure out the current state + * by comparing what we advertise with what the link partner + * advertises. This is a bit silly, since pretty much every + * PHY has actual status fields to tell you what the result + * was, but if you don't want to implement that, this should + * work. 
+ */ +int genphy_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + if (AUTONEG_ENABLE == phydev->autoneg) { + status = phy_read(phydev, MII_LPA); + + status &= phy_read(phydev, MII_ADVERTISE); + + /* If we can do 100, set it so */ + if (status & (LPA_100FULL | LPA_100HALF)) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + + /* If we have 100 full, it's full */ + if (status & (LPA_100FULL)) + phydev->duplex = DUPLEX_FULL; + + /* It's also full if we have 10 full, but not 100 half */ + else if ((status & (LPA_100HALF|LPA_10FULL)) == LPA_10FULL) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + phydev->pause = 0; + } + /* On non-aneg, we assume what we put in BMCR is the speed, + * though magic-aneg shouldn't prevent this case from occurring + */ + + return 0; +} + + +/* phy_probe + * dev: The device belonging to a PHY device + * + * description: Take care of setting up the phy_device structure, + * set the state to READY (the driver's probe function should + * set it to STARTING if needed). + */ +int phy_probe(struct device *dev) +{ + struct phy_device *phydev; + struct phy_driver *phydrv; + struct device_driver *drv; + int err = 0; + + phydev = to_phy_device(dev); + + /* Make sure the driver is held. + * XXX -- Is this correct? */ + drv = get_driver(phydev->dev.driver); + phydrv = to_phy_driver(drv); + phydev->drv = phydrv; + + /* Disable the interrupt if the PHY doesn't support it */ + if (!(phydrv->flags & PHY_HAS_INTERRUPT)) + phydev->irq = -1; + + /* Start out supporting everything. 
Eventually, + * a controller will attach, and may modify one + * or both of these values */ + phydev->supported = phydrv->features; + phydev->advertising = phydrv->features; + + spin_lock(&phydev->lock); + + /* Set the state to READY by default */ + phydev->state = PHY_READY; + + if (phydev->drv->probe) + err = phydev->drv->probe(phydev); + + spin_unlock(&phydev->lock); + + return err; +} + +int phy_remove(struct device *dev) +{ + struct phy_device *phydev; + + phydev = to_phy_device(dev); + + spin_lock(&phydev->lock); + phydev->state = PHY_DOWN; + spin_unlock(&phydev->lock); + + if (phydev->drv->remove) + phydev->drv->remove(phydev); + + put_driver(phydev->dev.driver); + phydev->drv = NULL; + + return 0; +} + +int phy_driver_register(struct phy_driver *new_driver) +{ + int retval; + + memset(&new_driver->driver, 0, sizeof(new_driver->driver)); + new_driver->driver.name = new_driver->name; + new_driver->driver.bus = &mdio_bus_type; + new_driver->driver.probe = phy_probe; + new_driver->driver.remove = phy_remove; + + retval = driver_register(&new_driver->driver); + + if (!retval) + pr_info("%s: Registered new driver\n", new_driver->name); + else + printk(KERN_ERR "%s: Error %d in registering driver\n", + new_driver->name, retval); + + return retval; +} + +void phy_driver_unregister(struct phy_driver *drv) +{ + driver_unregister(&drv->driver); +} + +struct phy_driver genphy_driver = { + .phy_id = 0x00000000, + .phy_id_mask = 0xffffffff, + .name = "Generic PHY", + .features = PHY_BASIC_FEATURES, + .config_aneg = genphy_config_aneg, + .read_status = genphy_read_status, +}; + +static int __init genphy_init(void) +{ + int retval; + + retval = phy_driver_register(&genphy_driver); + + return retval; +} + +static void __exit genphy_exit(void) +{ + phy_driver_unregister(&genphy_driver); +} + +module_init(genphy_init); +module_exit(genphy_exit); diff -Nru a/drivers/net/phy/qsemi.c b/drivers/net/phy/qsemi.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/qsemi.c 
2004-12-23 12:39:16 -06:00 @@ -0,0 +1,183 @@ +/* + * drivers/net/phy/qsemi.c + * + * Driver for Quality Semiconductor PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* ------------------------------------------------------------------------- */ +/* The Quality Semiconductor QS6612 is used on the RPX CLLF */ + +/* register definitions */ + +#define MII_QS6612_MCR 17 /* Mode Control Register */ +#define MII_QS6612_FTR 27 /* Factory Test Register */ +#define MII_QS6612_MCO 28 /* Misc. Control Register */ +#define MII_QS6612_ISR 29 /* Interrupt Source Register */ +#define MII_QS6612_IMR 30 /* Interrupt Mask Register */ +#define MII_QS6612_IMR_INIT 0x003a +#define MII_QS6612_PCR 31 /* 100BaseTx PHY Control Reg. 
*/ + +#define QS6612_PCR_AN_COMPLETE 0x1000 +#define QS6612_PCR_RLBEN 0x0200 +#define QS6612_PCR_DCREN 0x0100 +#define QS6612_PCR_4B5BEN 0x0040 +#define QS6612_PCR_TX_ISOLATE 0x0020 +#define QS6612_PCR_OPMODE_MASK 0x001c +#define QS6612_PCR_MLT3_DIS 0x0002 +#define QS6612_PCR_SCRM_DESCRM 0x0001 + +enum qs6612_opmode { + still_an=0, + up10_half, + up100_half, + repeater, + reserved, + up10_full, + up100_full, + isolate_noaneg +}; + +static int qs6612_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_QS6612_PCR); + switch((status & QS6612_PCR_OPMODE_MASK) >> 2) { + case up10_half: + phydev->speed = SPEED_10; + phydev->duplex = DUPLEX_HALF; + break; + case up100_half: + phydev->speed = SPEED_100; + phydev->duplex = DUPLEX_HALF; + break; + case up10_full: + phydev->speed = SPEED_10; + phydev->duplex = DUPLEX_FULL; + break; + case up100_full: + phydev->speed = SPEED_100; + phydev->duplex = DUPLEX_FULL; + break; + default: + /* Do nothing in the other states */ + break; + } + } + + return 0; +} + +int qs6612_probe(struct phy_device *phydev) +{ + /* The PHY powers up isolated on the RPX, + * so send a command to allow operation. + * XXX - My docs indicate this should be 0x0940 + * ...or something. The current value sets three + * reserved bits, bit 11, which specifies it should be + * set to one, bit 10, which specifies it should be set + * to 0, and bit 7, which doesn't specify. However, my + * docs are preliminary, and I will leave it like this + * until someone more knowledgeable corrects me or it. 
+ * -- Andy Fleming + */ + phy_write(phydev, MII_QS6612_PCR, 0x0dc0); + + return 0; +} + +int qs6612_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_QS6612_ISR); + phy_read(phydev, MII_BMSR); + phy_read(phydev, MII_EXPANSION); + + return 0; +} + +int qs6612_config_intr(struct phy_device *phydev) +{ + if (phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_QS6612_IMR, + MII_QS6612_IMR_INIT); + else + phy_write(phydev, MII_QS6612_IMR, 0); + + return 0; + +} + +static struct phy_driver qs6612_driver = { + .phy_id = 0x00181440, + .name = "QS6612", + .phy_id_mask = 0xfffffff0, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .probe = qs6612_probe, + .config_aneg = genphy_config_aneg, + .read_status = qs6612_read_status, + .ack_interrupt = qs6612_ack_interrupt, + .config_intr = qs6612_config_intr, +}; + +int __init qs6612_init(void) +{ + int retval; + + retval = phy_driver_register(&qs6612_driver); + + return retval; +} + +static void __exit qs6612_exit(void) +{ + phy_driver_unregister(&qs6612_driver); +} + +module_init(qs6612_init); +module_exit(qs6612_exit); diff -Nru a/include/asm-ppc/fsl_ocp.h b/include/asm-ppc/fsl_ocp.h --- a/include/asm-ppc/fsl_ocp.h 2004-12-23 12:39:16 -06:00 +++ b/include/asm-ppc/fsl_ocp.h 2004-12-23 12:39:16 -06:00 @@ -17,6 +17,7 @@ #ifndef __ASM_FS_OCP_H__ #define __ASM_FS_OCP_H__ +#define GFAR_MII_OFFSET 0x520 /* A table of information for supporting the Gianfar Ethernet Controller * This helps identify which enet controller we are dealing with, * and what type of enet controller it is @@ -25,20 +26,18 @@ uint interruptTransmit; uint interruptError; uint interruptReceive; - uint interruptPHY; uint flags; - uint phyid; - uint phyregidx; + char *bus_id; unsigned char mac_addr[6]; }; /* Flags in the flags field */ -#define GFAR_HAS_COALESCE 0x20 -#define GFAR_HAS_RMON 0x10 -#define GFAR_HAS_MULTI_INTR 0x08 -#define GFAR_FIRM_SET_MACADDR 0x04 -#define GFAR_HAS_PHY_INTR 0x02 /* if not set use a 
timer */ -#define GFAR_HAS_GIGABIT 0x01 +#define GIANFAR_HAS_COALESCE 0x20 +#define GIANFAR_HAS_RMON 0x10 +#define GIANFAR_HAS_MULTI_INTR 0x08 +#define GIANFAR_FIRM_SET_MACADDR 0x04 +#define GIANFAR_HAS_PHY_INTR 0x02 /* if not set use a timer */ +#define GIANFAR_HAS_GIGABIT 0x01 /* Data structure for I2C support. Just contains a couple flags * to distinguish various I2C implementations*/ diff -Nru a/include/asm-ppc/mpc85xx.h b/include/asm-ppc/mpc85xx.h --- a/include/asm-ppc/mpc85xx.h 2004-12-23 12:39:16 -06:00 +++ b/include/asm-ppc/mpc85xx.h 2004-12-23 12:39:16 -06:00 @@ -103,8 +103,18 @@ #define MPC85xx_CPM_SIZE (0x40000) #define MPC85xx_DMA_OFFSET (0x21000) #define MPC85xx_DMA_SIZE (0x01000) +#define MPC85xx_DMA0_OFFSET (0x21100) +#define MPC85xx_DMA0_SIZE (0x00080) +#define MPC85xx_DMA1_OFFSET (0x21180) +#define MPC85xx_DMA1_SIZE (0x00080) +#define MPC85xx_DMA2_OFFSET (0x21200) +#define MPC85xx_DMA2_SIZE (0x00080) +#define MPC85xx_DMA3_OFFSET (0x21280) +#define MPC85xx_DMA3_SIZE (0x00080) #define MPC85xx_ENET1_OFFSET (0x24000) #define MPC85xx_ENET1_SIZE (0x01000) +#define MPC85xx_MIIM_OFFSET (0x24520) +#define MPC85xx_MIIM_SIZE (0x00018) #define MPC85xx_ENET2_OFFSET (0x25000) #define MPC85xx_ENET2_SIZE (0x01000) #define MPC85xx_ENET3_OFFSET (0x26000) @@ -138,6 +148,18 @@ #else #define CCSRBAR BOARD_CCSRBAR #endif + +enum fsl_devices { + MPC85xx_TSEC1, + MPC85xx_TSEC2, + MPC85xx_FEC, + MPC85xx_IIC1, + MPC85xx_DMA0, + MPC85xx_DMA1, + MPC85xx_DMA2, + MPC85xx_DMA3, + MPC85xx_MDIO, +}; #endif /* CONFIG_85xx */ #endif /* __ASM_MPC85xx_H__ */ diff -Nru a/include/linux/device.h b/include/linux/device.h --- a/include/linux/device.h 2004-12-23 12:39:16 -06:00 +++ b/include/linux/device.h 2004-12-23 12:39:16 -06:00 @@ -382,6 +382,8 @@ extern struct resource *platform_get_resource(struct platform_device *, unsigned int, unsigned int); extern int platform_get_irq(struct platform_device *, unsigned int); +extern struct resource *platform_get_resource_byname(struct 
platform_device *, unsigned int, char *); +extern int platform_get_irq_byname(struct platform_device *, char *); extern int platform_add_devices(struct platform_device **, int); extern struct platform_device *platform_device_register_simple(char *, unsigned int, struct resource *, unsigned int); diff -Nru a/include/linux/phy.h b/include/linux/phy.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/include/linux/phy.h 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,355 @@ +/* + * include/linux/phy.h + * + * Framework and drivers for configuring and reading different PHYs + * Based on code in sungem_phy.c and gianfar_phy.c + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ + +#ifndef __PHY_H +#define __PHY_H + +#include +#include + +/* 1000BT control (Marvell & BCM54xx at least) */ +#define MII_1000BASETCONTROL 0x09 +#define MII_1000BASETCONTROL_FULLDUPLEXCAP 0x0200 +#define MII_1000BASETCONTROL_HALFDUPLEXCAP 0x0100 + +#define PHY_BASIC_FEATURES (SUPPORTED_10baseT_Half | \ + SUPPORTED_10baseT_Full | \ + SUPPORTED_100baseT_Half | \ + SUPPORTED_100baseT_Full | \ + SUPPORTED_Autoneg | \ + SUPPORTED_TP | \ + SUPPORTED_MII) + +#define PHY_GBIT_FEATURES (PHY_BASIC_FEATURES | \ + SUPPORTED_1000baseT_Half | \ + SUPPORTED_1000baseT_Full) + +#define PHY_HAS_INTERRUPT 0x00000001 +#define PHY_HAS_MAGICANEG 0x00000002 + +#define MII_BUS_MAX 4 + + +#define PHY_INIT_TIMEOUT 100000 +#define PHY_STATE_TIME 1 +#define PHY_FORCE_TIMEOUT 10 +#define PHY_AN_TIMEOUT 10 + +#define PHY_MAX_ADDR 32 + +/* The Bus class for PHYs. 
Devices which provide access to + * PHYs should register using this structure */ +struct mii_bus { + const char *name; + int id; + void *priv; + u16 (*read)(struct mii_bus *bus, int phy_id, int regnum); + void (*write)(struct mii_bus *bus, int phy_id, int regnum, u16 val); + int (*reset)(struct mii_bus *bus); + + /* A lock to ensure that only one thing can read/write + * the MDIO bus at a time */ + spinlock_t mdio_lock; + + struct device *dev; + + /* list of all PHYs on bus */ + struct phy_device *phy_map[PHY_MAX_ADDR]; + + /* Pointer to an array of interrupts, each PHY's + * interrupt at the index matching its address */ + int *irq; +}; + +#define PHY_INTERRUPT_DISABLED 0x0 +#define PHY_INTERRUPT_ENABLED 0x80000000 + +/* PHY state machine states: + * + * DOWN: PHY device and driver are not ready for anything. probe + * should be called if and only if the PHY is in this state, + * given that the PHY device exists. + * - PHY driver probe function will, depending on the PHY, set + * the state to STARTING or READY + * + * STARTING: PHY device is coming up, and the ethernet driver is + * not ready. PHY drivers may set this in the probe function. + * If they do, they are responsible for making sure the state is + * eventually set to indicate whether the PHY is UP or READY, + * depending on the state when the PHY is done starting up. + * - PHY driver will set the state to READY + * - start will set the state to PENDING + * + * READY: PHY is ready to send and receive packets, but the + * controller is not. By default, PHYs which do not implement + * probe will be set to this state by phy_probe(). If the PHY + * driver knows the PHY is ready, and the PHY state is STARTING, + * then it sets this STATE. + * - start will set the state to UP + * + * PENDING: PHY device is coming up, but the ethernet driver is + * ready. phy_start will set this state if the PHY state is + * STARTING. 
+ * - PHY driver will set the state to UP when the PHY is ready + * + * UP: The PHY and attached device are ready to do work. + * Interrupts should be started here. + * - timer moves to AN + * + * AN: The PHY is currently negotiating the link state. Link is + * therefore down for now. phy_timer will set this state when it + * detects the state is UP. config_aneg will set this state + * whenever called with phydev->autoneg set to AUTONEG_ENABLE. + * - If autonegotiation finishes, but there's no link, it sets + * the state to NOLINK. + * - If aneg finishes with link, it sets the state to RUNNING, + * and calls adjust_link + * - If autonegotiation did not finish after an arbitrary amount + * of time, autonegotiation should be tried again if the PHY + * supports "magic" autonegotiation (back to AN) + * - If it didn't finish, and no magic_aneg, move to FORCING. + * + * NOLINK: PHY is up, but not currently plugged in. + * - If the timer notes that the link comes back, we move to RUNNING + * - config_aneg moves to AN + * - phy_stop moves to HALTED + * + * FORCING: PHY is being configured with forced settings + * - if link is up, move to RUNNING + * - If link is down, we drop to the next highest setting, and + * retry (FORCING) after a timeout + * - phy_stop moves to HALTED + * + * RUNNING: PHY is currently up, running, and possibly sending + * and/or receiving packets + * - timer will set CHANGELINK if we're polling (this ensures the + * link state is polled every other cycle of this state machine, + * which makes it every other second) + * - irq will set CHANGELINK + * - config_aneg will set AN + * - phy_stop moves to HALTED + * + * CHANGELINK: PHY experienced a change in link state + * - timer moves to RUNNING if link + * - timer moves to NOLINK if the link is down + * - phy_stop moves to HALTED + * + * HALTED: PHY is up, but no polling or interrupts are done. +* Brings the link down. 
+* - phy_start moves to RESUMING +* +* RESUMING: PHY was halted, but now wants to run again. +* - If we are forcing, or aneg is done, timer moves to RUNNING +* - If aneg is not done, timer moves to AN +* - phy_stop moves to HALTED +*/ +enum phy_state { + PHY_DOWN=0, + PHY_STARTING, + PHY_READY, + PHY_PENDING, + PHY_UP, + PHY_AN, + PHY_RUNNING, + PHY_NOLINK, + PHY_FORCING, + PHY_CHANGELINK, + PHY_HALTED, + PHY_RESUMING +}; + +/* phy_device: An instance of a PHY + * + * drv: Pointer to the driver for this PHY instance + * bus: Pointer to the bus this PHY is on + * dev: driver model device structure for this PHY + * phy_id: UID for this device found during discovery + * state: state of the PHY for management purposes + * addr: Bus address of PHY + * link_timeout: The number of timer firings to wait before + * giving up on the current attempt at acquiring a link + * irq: IRQ number of the PHY's interrupt (-1 if none) + * phy_timer: The timer for handling the state machine + * phy_queue: A work_queue for the interrupt + * attached_dev: The attached enet driver's device instance ptr + * adjust_link: Callback for the enet controller to respond to + * changes in the link state. + * adjust_state: Callback for the enet driver to respond to + * changes in the state machine. 
+ * + * speed, duplex, pause, supported, advertising, and + * autoneg are used like in mii_if_info + * + * interrupts currently only supports enabled or disabled, + * but could be changed in the future to support enabling + * and disabling specific interrupts + * + * Contains some infrastructure for polling and interrupt + * handling, as well as handling shifts in PHY hardware state + */ +struct phy_device { + /* Information about the PHY type */ + /* And management functions */ + struct phy_driver *drv; + + struct mii_bus *bus; + + struct device dev; + + u32 phy_id; + + enum phy_state state; + + /* Bus address of the PHY (0-31) */ + int addr; + + /* forced speed & duplex (no autoneg) + * partner speed & duplex & pause (autoneg) + */ + int speed; + int duplex; + int pause; + + /* The most recently read link state */ + int link; + + /* Enabled Interrupts */ + u32 interrupts; + + /* Union of PHY and Attached devices' supported modes */ + /* See mii.h for more info */ + u32 supported; + u32 advertising; + + int autoneg; + + int link_timeout; + + /* Interrupt number for this PHY + * -1 means no interrupt */ + int irq; + + /* private data pointer */ + /* For use by PHYs to maintain extra state */ + void *priv; + + /* Interrupt and Polling infrastructure */ + struct work_struct phy_queue; + struct timer_list phy_timer; + + spinlock_t lock; + + struct device *attached_dev; + + void (*adjust_link)(struct device *dev); + + void (*adjust_state)(struct device *dev); +}; +#define to_phy_device(d) container_of(d, struct phy_device, dev) + +/* struct phy_driver: Driver structure for a particular PHY type + * + * phy_id: The result of reading the UID registers of this PHY + * type, and ANDing them with the phy_id_mask. 
This driver + * only works for PHYs with IDs which match this field + * name: The friendly name of this PHY type + * phy_id_mask: Defines the important bits of the phy_id + * features: A list of features (speed, duplex, etc) supported + * by this PHY + * flags: A bitfield defining certain other features this PHY + * supports (like interrupts) + * + * The drivers must implement config_aneg and read_status. All + * other functions are optional. Note that none of these + * functions should be called from interrupt time. The goal is + * for the bus read/write functions to be able to block when the + * bus transaction is happening, and be freed up by an interrupt + * (The MPC85xx has this ability, though it is not currently + * supported in the driver). + */ +struct phy_driver { + u32 phy_id; + char *name; + unsigned int phy_id_mask; + u32 features; + u32 flags; + + /* Called to initialize the PHY */ + int (*probe)(struct phy_device *phydev); + + /* PHY Power Management */ + int (*suspend)(struct phy_device *phydev); + int (*resume)(struct phy_device *phydev); + + /* Configures the advertisement and resets + * autonegotiation if phydev->autoneg is on, + * forces the speed to the current settings in phydev + * if phydev->autoneg is off */ + int (*config_aneg)(struct phy_device *phydev); + + /* Determines the negotiated speed and duplex */ + int (*read_status)(struct phy_device *phydev); + + /* Clears any pending interrupts */ + int (*ack_interrupt)(struct phy_device *phydev); + + /* Enables or disables interrupts */ + int (*config_intr)(struct phy_device *phydev); + + /* Clears up any memory if needed */ + void (*remove)(struct phy_device *phydev); + + struct device_driver driver; +}; +#define to_phy_driver(d) container_of(d, struct phy_driver, driver) + +u16 phy_read(struct phy_device *phydev, u16 regnum); +void phy_write(struct phy_device *phydev, u16 regnum, u16 val); +struct phy_device* get_phy_device(struct mii_bus *bus, uint addr); +void phy_clear_interrupt(struct 
phy_device *phydev); +void phy_config_interrupt(struct phy_device *phydev, u32 interrupts); +struct phy_device * phy_attach(struct device *dev, char *phy_id); +struct phy_device * phy_connect(struct device *dev, char *phy_id, + void (*handler)(struct device *)); +void phy_disconnect(struct phy_device *phydev); +void phy_detach(struct phy_device *phydev); +void phy_start(struct phy_device *phydev); +void phy_stop(struct phy_device *phydev); +void phy_start_aneg(struct phy_device *phydev); +int register_mdiobus(struct mii_bus *bus); +void phy_change(void *data); +void phy_timer(unsigned long data); +void phy_sanitize_settings(struct phy_device *phydev); + +void genphy_config_advert(struct phy_device *phydev); +void genphy_setup_forced(struct phy_device *phydev); +void genphy_restart_aneg(struct phy_device *phydev); +int gbit_config_aneg(struct phy_device *phydev); +int genphy_config_aneg(struct phy_device *phydev); +int genphy_update_link(struct phy_device *phydev); +int genphy_read_status(struct phy_device *phydev); +void phy_driver_unregister(struct phy_driver *drv); +int phy_driver_register(struct phy_driver *new_driver); +void phy_prepare_link(struct phy_device *phydev, + void (*adjust_link)(struct device *)); +void phy_start_machine(struct phy_device *phydev, + void (*handler)(struct device *)); +void phy_stop_machine(struct phy_device *phydev); + +extern struct bus_type mdio_bus_type; +extern struct phy_driver genphy_driver; +#endif /* __PHY_H */ --Apple-Mail-2-967209165 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII; format=flowed Andy Fleming --Apple-Mail-2-967209165-- From afleming@freescale.com Thu Dec 23 12:59:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 12:59:47 -0800 (PST) Received: from de01egw01.freescale.net (de01egw01.freescale.net [192.88.165.102]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNKxH9T010131 for ; Thu, 23 Dec 2004 12:59:38 -0800 Received: from de01smr02.am.mot.com (de01smr02 
[10.208.0.151]) by de01egw01.freescale.net (8.12.11/de01egw01) with ESMTP id iBNL4W0x002359; Thu, 23 Dec 2004 14:04:32 -0700 (MST) Received: from [10.82.17.240] ([10.82.17.240]) by de01smr02.am.mot.com (8.13.1/8.13.0) with ESMTP id iBNL4vOj025583; Thu, 23 Dec 2004 15:04:57 -0600 (CST) In-Reply-To: References: Mime-Version: 1.0 (Apple Message framework v619) Content-Type: multipart/mixed; boundary=Apple-Mail-2-974361640 Message-Id: Cc: Kumar Gala , Embedded PPC Linux list From: Andy Fleming Subject: Re: [RFC] Patch to Abstract Ethernet PHY support (using driver model) Date: Thu, 23 Dec 2004 15:00:13 -0600 To: Netdev X-Mailer: Apple Mail (2.619) X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13006 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: afleming@freescale.com Precedence: bulk X-list: netdev --Apple-Mail-2-974361640 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII; format=flowed It has been suggested that I split the patch into 2 parts: gianfar specific stuff, and PHY specific stuff. Here is the result: --Apple-Mail-2-974361640 Content-Transfer-Encoding: 7bit Content-Type: application/octet-stream; x-unix-mode=0644; name="gfar_12232004.patch" Content-Disposition: attachment; filename=gfar_12232004.patch diff -Nru a/arch/ppc/platforms/85xx/Makefile b/arch/ppc/platforms/85xx/Makefile --- a/arch/ppc/platforms/85xx/Makefile 2004-12-23 12:39:15 -06:00 +++ b/arch/ppc/platforms/85xx/Makefile 2004-12-23 12:39:15 -06:00 @@ -1,6 +1,7 @@ # # Makefile for the PowerPC 85xx linux kernel. 
# +obj-$(CONFIG_85xx) += mpc85xx.o obj-$(CONFIG_MPC8540_ADS) += mpc85xx_ads_common.o mpc8540_ads.o obj-$(CONFIG_MPC8555_CDS) += mpc85xx_cds_common.o diff -Nru a/arch/ppc/platforms/85xx/mpc8540.c b/arch/ppc/platforms/85xx/mpc8540.c --- a/arch/ppc/platforms/85xx/mpc8540.c 2004-12-23 12:39:16 -06:00 +++ b/arch/ppc/platforms/85xx/mpc8540.c 2004-12-23 12:39:16 -06:00 @@ -22,7 +22,7 @@ extern struct ocp_gfar_data mpc85xx_tsec1_def; extern struct ocp_gfar_data mpc85xx_tsec2_def; extern struct ocp_gfar_data mpc85xx_fec_def; -extern struct ocp_mpc_i2c_data mpc85xx_i2c1_def; +extern struct ocp_mpc_fs_data mpc85xx_i2c1_def; /* We use offsets for paddr since we do not know at compile time * what CCSRBAR is, platform code should fix this up in diff -Nru a/arch/ppc/platforms/85xx/mpc8540_ads.c b/arch/ppc/platforms/85xx/mpc8540_ads.c --- a/arch/ppc/platforms/85xx/mpc8540_ads.c 2004-12-23 12:39:15 -06:00 +++ b/arch/ppc/platforms/85xx/mpc8540_ads.c 2004-12-23 12:39:15 -06:00 @@ -32,6 +32,7 @@ #include #include #include +#include #include #include @@ -48,50 +49,24 @@ #include #include #include -#include #include +#include #include #include struct ocp_gfar_data mpc85xx_tsec1_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC1_TX, - .interruptError = MPC85xx_IRQ_TSEC1_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC1_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON - | GFAR_HAS_PHY_INTR | GFAR_HAS_COALESCE), - .phyid = 0, - .phyregidx = 0, }; - struct ocp_gfar_data mpc85xx_tsec2_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC2_TX, - .interruptError = MPC85xx_IRQ_TSEC2_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC2_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON - | GFAR_HAS_PHY_INTR | GFAR_HAS_COALESCE), - .phyid = 1, - .phyregidx = 0, }; - struct ocp_gfar_data mpc85xx_fec_def = { - .interruptTransmit = MPC85xx_IRQ_FEC, - .interruptError = MPC85xx_IRQ_FEC, - 
.interruptReceive = MPC85xx_IRQ_FEC, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = 0, - .phyid = 3, - .phyregidx = 0, }; - struct ocp_fs_i2c_data mpc85xx_i2c1_def = { .flags = FS_I2C_SEPARATE_DFSRR, }; +extern void * get_platform_data(enum fsl_devices dev); + /* ************************************************************************ * * Setup the architecture @@ -100,10 +75,10 @@ static void __init mpc8540ads_setup_arch(void) { - struct ocp_def *def; - struct ocp_gfar_data *einfo; bd_t *binfo = (bd_t *) __res; unsigned int freq; + struct gianfar_platform_data *pdata; + struct gianfar_mdio_data *mdata; /* get the core frequency */ freq = binfo->bi_intfreq; @@ -130,23 +105,26 @@ invalidate_tlbcam_entry(NUM_TLBCAMS - 1); #endif - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 0); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enetaddr, 6); - } - - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 1); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enet1addr, 6); - } - - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 2); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enet2addr, 6); - } + /* setup the board related information for the enet controllers */ + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC1); + pdata->bus_id = "phy0:0"; + memcpy(pdata->mac_addr, binfo->bi_enetaddr, 6); + + mdata = (struct gianfar_mdio_data *) get_platform_data(MPC85xx_MDIO); + mdata->paddr += binfo->bi_immr_base; + memset(&mdata->irq, -1, sizeof(mdata->irq)); + mdata->irq[0] = MPC85xx_IRQ_EXT5; + mdata->irq[1] = MPC85xx_IRQ_EXT5; + mdata->irq[2] = MPC85xx_IRQ_EXT5; + mdata->irq[3] = MPC85xx_IRQ_EXT5; + + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC2); + pdata->bus_id = "phy0:1"; + memcpy(pdata->mac_addr, binfo->bi_enet1addr, 6); + + pdata = 
(struct gianfar_platform_data *) get_platform_data(MPC85xx_FEC); + pdata->bus_id = "phy0:3"; + memcpy(pdata->mac_addr, binfo->bi_enet2addr, 6); #ifdef CONFIG_BLK_DEV_INITRD if (initrd_start) @@ -158,8 +136,6 @@ #else ROOT_DEV = Root_HDA1; #endif - - ocp_for_each_device(mpc85xx_update_paddr_ocp, &(binfo->bi_immr_base)); } /* ************************************************************************ */ diff -Nru a/arch/ppc/platforms/85xx/mpc8560_ads.c b/arch/ppc/platforms/85xx/mpc8560_ads.c --- a/arch/ppc/platforms/85xx/mpc8560_ads.c 2004-12-23 12:39:16 -06:00 +++ b/arch/ppc/platforms/85xx/mpc8560_ads.c 2004-12-23 12:39:16 -06:00 @@ -48,7 +48,6 @@ #include #include #include -#include #include #include @@ -59,33 +58,15 @@ extern void cpm2_reset(void); struct ocp_gfar_data mpc85xx_tsec1_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC1_TX, - .interruptError = MPC85xx_IRQ_TSEC1_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC1_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON | GFAR_HAS_COALESCE - | GFAR_HAS_PHY_INTR), - .phyid = 0, - .phyregidx = 0, }; - struct ocp_gfar_data mpc85xx_tsec2_def = { - .interruptTransmit = MPC85xx_IRQ_TSEC2_TX, - .interruptError = MPC85xx_IRQ_TSEC2_ERROR, - .interruptReceive = MPC85xx_IRQ_TSEC2_RX, - .interruptPHY = MPC85xx_IRQ_EXT5, - .flags = (GFAR_HAS_GIGABIT | GFAR_HAS_MULTI_INTR - | GFAR_HAS_RMON | GFAR_HAS_COALESCE - | GFAR_HAS_PHY_INTR), - .phyid = 1, - .phyregidx = 0, }; - struct ocp_fs_i2c_data mpc85xx_i2c1_def = { .flags = FS_I2C_SEPARATE_DFSRR, }; +extern void * get_platform_data(enum fsl_devices dev); + /* ************************************************************************ * * Setup the architecture @@ -95,10 +76,10 @@ static void __init mpc8560ads_setup_arch(void) { - struct ocp_def *def; - struct ocp_gfar_data *einfo; bd_t *binfo = (bd_t *) __res; unsigned int freq; + struct gianfar_platform_data *pdata; + struct gianfar_mdio_data *mdata; cpm2_reset(); @@ -117,17 
+98,22 @@ mpc85xx_setup_hose(); #endif - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 0); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enetaddr, 6); - } - - def = ocp_get_one_device(OCP_VENDOR_FREESCALE, OCP_FUNC_GFAR, 1); - if (def) { - einfo = (struct ocp_gfar_data *) def->additions; - memcpy(einfo->mac_addr, binfo->bi_enet1addr, 6); - } + /* setup the board related information for the enet controllers */ + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC1); + pdata->bus_id = "phy0:0"; + memcpy(pdata->mac_addr, binfo->bi_enetaddr, 6); + + mdata = (struct gianfar_mdio_data *) get_platform_data(MPC85xx_MDIO); + mdata->paddr += binfo->bi_immr_base; + memset(&mdata->irq, -1, sizeof(mdata->irq)); + mdata->irq[0] = MPC85xx_IRQ_EXT5; + mdata->irq[1] = MPC85xx_IRQ_EXT5; + mdata->irq[2] = MPC85xx_IRQ_EXT5; + mdata->irq[3] = MPC85xx_IRQ_EXT5; + + pdata = (struct gianfar_platform_data *) get_platform_data(MPC85xx_TSEC2); + pdata->bus_id = "phy0:1"; + memcpy(pdata->mac_addr, binfo->bi_enet1addr, 6); #ifdef CONFIG_BLK_DEV_INITRD if (initrd_start) @@ -139,8 +125,6 @@ #else ROOT_DEV = Root_HDA1; #endif - - ocp_for_each_device(mpc85xx_update_paddr_ocp, &(binfo->bi_immr_base)); } static irqreturn_t cpm2_cascade(int irq, void *dev_id, struct pt_regs *regs) diff -Nru a/drivers/base/platform.c b/drivers/base/platform.c --- a/drivers/base/platform.c 2004-12-23 12:39:15 -06:00 +++ b/drivers/base/platform.c 2004-12-23 12:39:15 -06:00 @@ -58,6 +58,42 @@ } /** + * platform_get_resource_byname - get a resource for a device by name + * @dev: platform device + * @type: resource type + * @name: resource name + */ +struct resource * +platform_get_resource_byname(struct platform_device *dev, unsigned int type, + char * name) +{ + int i; + + for (i = 0; i < dev->num_resources; i++) { + struct resource *r = &dev->resource[i]; + + if ((r->flags & (IORESOURCE_IO|IORESOURCE_MEM| + 
IORESOURCE_IRQ|IORESOURCE_DMA)) + == type) + if (!strcmp(r->name, name)) + return r; + } + return NULL; +} + +/** + * platform_get_irq - get an IRQ for a device + * @dev: platform device + * @name: IRQ name + */ +int platform_get_irq_byname(struct platform_device *dev, char * name) +{ + struct resource *r = platform_get_resource_byname(dev, IORESOURCE_IRQ, name); + + return r ? r->start : 0; +} + +/** * platform_add_devices - add a numbers of platform devices * @devs: array of platform devices to add * @num: number of platform devices in array @@ -103,7 +139,8 @@ for (i = 0; i < pdev->num_resources; i++) { struct resource *p, *r = &pdev->resource[i]; - r->name = pdev->dev.bus_id; + if (r->name == NULL) + r->name = pdev->dev.bus_id; p = NULL; if (r->flags & IORESOURCE_MEM) @@ -308,3 +345,5 @@ EXPORT_SYMBOL_GPL(platform_device_unregister); EXPORT_SYMBOL_GPL(platform_get_irq); EXPORT_SYMBOL_GPL(platform_get_resource); +EXPORT_SYMBOL_GPL(platform_get_irq_byname); +EXPORT_SYMBOL_GPL(platform_get_resource_byname); diff -Nru a/drivers/net/gianfar.c b/drivers/net/gianfar.c --- a/drivers/net/gianfar.c 2004-12-23 12:39:15 -06:00 +++ b/drivers/net/gianfar.c 2004-12-23 12:39:15 -06:00 @@ -1,4 +1,4 @@ -/* +/* * drivers/net/gianfar.c * * Gianfar Ethernet Driver @@ -24,27 +24,22 @@ * Theory of operation * This driver is designed for the Triple-speed Ethernet * controllers on the Freescale 8540/8560 integrated processors, - * as well as the Fast Ethernet Controller on the 8540. - * - * The driver is initialized through OCP. Structures which - * define the configuration needed by the board are defined in a - * board structure in arch/ppc/platforms (though I do not + * as well as the Fast Ethernet Controller on the 8540. + * + * The driver is initialized through platform_device. 
Structures + * which define the configuration needed by the board are defined + * in a board structure in arch/ppc/platforms (though I do not * discount the possibility that other architectures could one - * day be supported. One assumption the driver currently makes - * is that the PHY is configured in such a way to advertise all - * capabilities. This is a sensible default, and on certain - * PHYs, changing this default encounters substantial errata - * issues. Future versions may remove this requirement, but for - * now, it is best for the firmware to ensure this is the case. + * day be supported. * * The Gianfar Ethernet Controller uses a ring of buffer * descriptors. The beginning is indicated by a register - * pointing to the physical address of the start of the ring. - * The end is determined by a "wrap" bit being set in the + * pointing to the physical address of the start of the ring. + * The end is determined by a "wrap" bit being set in the * last descriptor of the ring. * * When a packet is received, the RXF bit in the - * IEVENT register is set, triggering an interrupt when the + * IEVENT register is set, triggering an interrupt when the * corresponding bit in the IMASK register is also set (if * interrupt coalescing is active, then the interrupt may not * happen immediately, but will wait until either a set number @@ -52,7 +47,7 @@ * interrupt handler will signal there is work to be done, and * exit. Without NAPI, the packet(s) will be handled * immediately. Both methods will start at the last known empty - * descriptor, and process every subsequent descriptor until there + * descriptor, and process every subsequent descriptor until there * are none left with data (NAPI will stop after a set number of * packets to give time to other tasks, but will eventually * process all the packets). 
The data arrives inside a @@ -76,6 +71,7 @@ #include #include #include +#include #include #include #include @@ -85,6 +81,7 @@ #include #include #include +#include #include #include @@ -93,9 +90,11 @@ #include #include #include +#include +#include #include "gianfar.h" -#include "gianfar_phy.h" +#include "gianfar_mii.h" #define TX_TIMEOUT (1*HZ) #define SKB_ALLOC_TIMEOUT 1000000 @@ -109,9 +108,8 @@ #endif const char gfar_driver_name[] = "Gianfar Ethernet"; -const char gfar_driver_version[] = "1.1"; +const char gfar_driver_version[] = "1.2"; -int startup_gfar(struct net_device *dev); static int gfar_enet_open(struct net_device *dev); static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev); static void gfar_timeout(struct net_device *dev); @@ -122,17 +120,13 @@ static int gfar_change_mtu(struct net_device *dev, int new_mtu); static irqreturn_t gfar_error(int irq, void *dev_id, struct pt_regs *regs); static irqreturn_t gfar_transmit(int irq, void *dev_id, struct pt_regs *regs); -irqreturn_t gfar_receive(int irq, void *dev_id, struct pt_regs *regs); static irqreturn_t gfar_interrupt(int irq, void *dev_id, struct pt_regs *regs); -static irqreturn_t phy_interrupt(int irq, void *dev_id, struct pt_regs *regs); -static void gfar_phy_change(void *data); -static void gfar_phy_timer(unsigned long data); -static void adjust_link(struct net_device *dev); +static void adjust_link(struct device *dev); static void init_registers(struct net_device *dev); static int init_phy(struct net_device *dev); -static int gfar_probe(struct ocp_device *ocpdev); -static void gfar_remove(struct ocp_device *ocpdev); -void free_skb_resources(struct gfar_private *priv); +static int gfar_probe(struct device *device); +static int gfar_remove(struct device *device); +static void free_skb_resources(struct gfar_private *priv); static void gfar_set_multi(struct net_device *dev); static void gfar_set_hash_for_addr(struct net_device *dev, u8 *addr); #ifdef CONFIG_GFAR_NAPI @@ -140,57 +134,38 @@ 
#endif static int gfar_clean_rx_ring(struct net_device *dev, int rx_work_limit); static int gfar_process_frame(struct net_device *dev, struct sk_buff *skb, int length); -static void gfar_phy_startup_timer(unsigned long data); - -extern struct ethtool_ops gfar_ethtool_ops; MODULE_AUTHOR("Freescale Semiconductor, Inc"); MODULE_DESCRIPTION("Gianfar Ethernet Driver"); MODULE_LICENSE("GPL"); -/* Called by the ocp code to initialize device data structures - * required for bringing up the device - * returns 0 on success */ -static int gfar_probe(struct ocp_device *ocpdev) +/* Set up the ethernet device structure, private data, + * and anything else we need before we start */ +static int gfar_probe(struct device *device) { u32 tempval; - struct ocp_device *mdiodev; struct net_device *dev = NULL; struct gfar_private *priv = NULL; - struct ocp_gfar_data *einfo; + struct platform_device *pdev = to_platform_device(device); + struct gianfar_platform_data *einfo; + struct resource *r; int idx; int err = 0; int dev_ethtool_ops = 0; - einfo = (struct ocp_gfar_data *) ocpdev->def->additions; + einfo = (struct gianfar_platform_data *) pdev->dev.platform_data; if (einfo == NULL) { printk(KERN_ERR "gfar %d: Missing additional data!\n", - ocpdev->def->index); + pdev->id); return -ENODEV; } - /* get a pointer to the register memory which can - * configure the PHYs. 
If it's different from this set, - * get the device which has those regs */ - if ((einfo->phyregidx >= 0) && - (einfo->phyregidx != ocpdev->def->index)) { - mdiodev = ocp_find_device(OCP_ANY_ID, - OCP_FUNC_GFAR, einfo->phyregidx); - - /* If the device which holds the MDIO regs isn't - * up, wait for it to come up */ - if (mdiodev == NULL) - return -EAGAIN; - } else { - mdiodev = ocpdev; - } - /* Create an ethernet device instance */ dev = alloc_etherdev(sizeof (*priv)); - if (dev == NULL) + if (NULL == dev) return -ENOMEM; priv = netdev_priv(dev); @@ -198,27 +173,28 @@ /* Set the info in the priv to the current info */ priv->einfo = einfo; + /* fill out IRQ fields */ + if (einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) { + priv->interruptTransmit = platform_get_irq_byname(pdev, "tx"); + priv->interruptReceive = platform_get_irq_byname(pdev, "rx"); + priv->interruptError = platform_get_irq_byname(pdev, "error"); + } else { + priv->interruptTransmit = platform_get_irq(pdev, 0); + } + /* get a pointer to the register memory */ + r = platform_get_resource(pdev, IORESOURCE_MEM, 0); priv->regs = (struct gfar *) - ioremap(ocpdev->def->paddr, sizeof (struct gfar)); + ioremap(r->start, sizeof (struct gfar)); if (priv->regs == NULL) { err = -ENOMEM; goto regs_fail; } - /* Set the PHY base address */ - priv->phyregs = (struct gfar *) - ioremap(mdiodev->def->paddr, sizeof (struct gfar)); - - if (priv->phyregs == NULL) { - err = -ENOMEM; - goto phy_regs_fail; - } - spin_lock_init(&priv->lock); - ocp_set_drvdata(ocpdev, dev); + dev_set_drvdata(device, dev); /* Stop the DMA engine now, in case it was running before */ /* (The firmware could have used it, and left it running). 
*/ @@ -255,7 +231,7 @@ dev->base_addr = (unsigned long) (priv->regs); SET_MODULE_OWNER(dev); - SET_NETDEV_DEV(dev, &ocpdev->dev); + SET_NETDEV_DEV(dev, device); /* Fill in the dev structure */ dev->open = gfar_enet_open; @@ -274,10 +250,10 @@ /* Index into the array of possible ethtool * ops to catch all 4 possibilities */ - if((priv->einfo->flags & GFAR_HAS_RMON) == 0) + if((priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_RMON) == 0) dev_ethtool_ops += 1; - if((priv->einfo->flags & GFAR_HAS_COALESCE) == 0) + if((priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_COALESCE) == 0) dev_ethtool_ops += 2; dev->ethtool_ops = gfar_op_array[dev_ethtool_ops]; @@ -324,119 +300,59 @@ return 0; register_fail: - iounmap((void *) priv->phyregs); -phy_regs_fail: iounmap((void *) priv->regs); regs_fail: free_netdev(dev); - return -ENOMEM; + return err; } -static void gfar_remove(struct ocp_device *ocpdev) +static int gfar_remove(struct device *device) { - struct net_device *dev = ocp_get_drvdata(ocpdev); + struct net_device *dev = dev_get_drvdata(device); struct gfar_private *priv = netdev_priv(dev); - ocp_set_drvdata(ocpdev, NULL); + dev_set_drvdata(device, NULL); iounmap((void *) priv->regs); - iounmap((void *) priv->phyregs); free_netdev(dev); + + return 0; } -/* Configure the PHY for dev. - * returns 0 if success. -1 if failure + +/* Initializes driver PHY state, and attaches to the PHY. + * Returns 0 on success, errno on failure to attach. */ static int init_phy(struct net_device *dev) { struct gfar_private *priv = netdev_priv(dev); - struct phy_info *curphy; - unsigned int timeout = PHY_INIT_TIMEOUT; - struct gfar *phyregs = priv->phyregs; - struct gfar_mii_info *mii_info; - int err; + uint gigabit_support = + priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_GIGABIT ? 
+ SUPPORTED_1000baseT_Full : 0; + struct phy_device *phydev; priv->oldlink = 0; priv->oldspeed = 0; priv->oldduplex = -1; - mii_info = kmalloc(sizeof(struct gfar_mii_info), - GFP_KERNEL); - - if(NULL == mii_info) { - printk(KERN_ERR "%s: Could not allocate mii_info\n", - dev->name); - return -ENOMEM; - } - - mii_info->speed = SPEED_1000; - mii_info->duplex = DUPLEX_FULL; - mii_info->pause = 0; - mii_info->link = 1; - - mii_info->advertising = (ADVERTISED_10baseT_Half | - ADVERTISED_10baseT_Full | - ADVERTISED_100baseT_Half | - ADVERTISED_100baseT_Full | - ADVERTISED_1000baseT_Full); - mii_info->autoneg = 1; - - mii_info->mii_id = priv->einfo->phyid; - - mii_info->dev = dev; - - mii_info->mdio_read = &read_phy_reg; - mii_info->mdio_write = &write_phy_reg; - - priv->mii_info = mii_info; - - /* Reset the management interface */ - gfar_write(&phyregs->miimcfg, MIIMCFG_RESET); + phydev = phy_connect(dev->class_dev.dev, priv->einfo->bus_id, + &adjust_link); - /* Setup the MII Mgmt clock speed */ - gfar_write(&phyregs->miimcfg, MIIMCFG_INIT_VALUE); - - /* Wait until the bus is free */ - while ((gfar_read(&phyregs->miimind) & MIIMIND_BUSY) && - timeout--) - cpu_relax(); - - if(timeout <= 0) { - printk(KERN_ERR "%s: The MII Bus is stuck!\n", - dev->name); - err = -1; - goto bus_fail; - } - - /* get info for this PHY */ - curphy = get_phy_info(priv->mii_info); - - if (curphy == NULL) { - printk(KERN_ERR "%s: No PHY found\n", dev->name); - err = -1; - goto no_phy; + if(NULL == phydev) { + printk(KERN_ERR "%s: Could not attach to PHY\n", dev->name); + return errno; } - mii_info->phyinfo = curphy; + /* Remove any features not supported by the controller */ + phydev->supported &= (GFAR_SUPPORTED | gigabit_support); + phydev->advertising = phydev->supported; - /* Run the commands which initialize the PHY */ - if(curphy->init) { - err = curphy->init(priv->mii_info); - - if (err) - goto phy_init_fail; - } + priv->phydev = phydev; return 0; - -phy_init_fail: -no_phy: -bus_fail: - 
kfree(mii_info); - - return err; } + static void init_registers(struct net_device *dev) { struct gfar_private *priv = netdev_priv(dev); @@ -470,7 +386,7 @@ gfar_write(&priv->regs->rctrl, 0x00000000); /* Zero out the rmon mib registers if it has them */ - if (priv->einfo->flags & GFAR_HAS_RMON) { + if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_RMON) { memset((void *) &(priv->regs->rmon), 0, sizeof (struct rmon_mib)); @@ -506,13 +422,11 @@ unsigned long flags; u32 tempval; + phy_stop(priv->phydev); + /* Lock it down */ spin_lock_irqsave(&priv->lock, flags); - /* Tell the kernel the link is down */ - priv->mii_info->link = 0; - adjust_link(dev); - /* Mask all interrupts */ gfar_write(&regs->imask, IMASK_INIT_CLEAR); @@ -536,30 +450,15 @@ tempval &= ~(MACCFG1_RX_EN | MACCFG1_TX_EN); gfar_write(&regs->maccfg1, tempval); - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) { - /* Clear any pending interrupts */ - mii_clear_phy_interrupt(priv->mii_info); - - /* Disable PHY Interrupts */ - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_DISABLED); - } - spin_unlock_irqrestore(&priv->lock, flags); /* Free the IRQs */ - if (priv->einfo->flags & GFAR_HAS_MULTI_INTR) { - free_irq(priv->einfo->interruptError, dev); - free_irq(priv->einfo->interruptTransmit, dev); - free_irq(priv->einfo->interruptReceive, dev); + if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) { + free_irq(priv->interruptError, dev); + free_irq(priv->interruptTransmit, dev); + free_irq(priv->interruptReceive, dev); } else { - free_irq(priv->einfo->interruptTransmit, dev); - } - - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) { - free_irq(priv->einfo->interruptPHY, dev); - } else { - del_timer_sync(&priv->phy_info_timer); + free_irq(priv->interruptTransmit, dev); } free_skb_resources(priv); @@ -573,7 +472,7 @@ /* If there are any tx skbs or rx skbs still around, free them.
* Then free tx_skbuff and rx_skbuff */ -void free_skb_resources(struct gfar_private *priv) +static void free_skb_resources(struct gfar_private *priv) { struct rxbd8 *rxbdp; struct txbd8 *txbdp; @@ -638,7 +537,7 @@ gfar_write(&regs->imask, IMASK_INIT_CLEAR); /* Allocate memory for the buffer descriptors */ - vaddr = (unsigned long) dma_alloc_coherent(NULL, + vaddr = (unsigned long) dma_alloc_coherent(NULL, sizeof (struct txbd8) * priv->tx_ring_size + sizeof (struct rxbd8) * priv->rx_ring_size, &addr, GFP_KERNEL); @@ -727,54 +626,48 @@ /* If the device has multiple interrupts, register for * them. Otherwise, only register for the one */ - if (priv->einfo->flags & GFAR_HAS_MULTI_INTR) { - /* Install our interrupt handlers for Error, + if (priv->einfo->device_flags & FSL_GIANFAR_DEV_HAS_MULTI_INTR) { + /* Install our interrupt handlers for Error, * Transmit, and Receive */ - if (request_irq(priv->einfo->interruptError, gfar_error, + if (request_irq(priv->interruptError, gfar_error, 0, "enet_error", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d\n", - dev->name, priv->einfo->interruptError); + dev->name, priv->interruptError); err = -1; goto err_irq_fail; } - if (request_irq(priv->einfo->interruptTransmit, gfar_transmit, + if (request_irq(priv->interruptTransmit, gfar_transmit, 0, "enet_tx", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d\n", - dev->name, priv->einfo->interruptTransmit); + dev->name, priv->interruptTransmit); err = -1; goto tx_irq_fail; } - if (request_irq(priv->einfo->interruptReceive, gfar_receive, + if (request_irq(priv->interruptReceive, gfar_receive, 0, "enet_rx", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d (receive0)\n", - dev->name, priv->einfo->interruptReceive); + dev->name, priv->interruptReceive); err = -1; goto rx_irq_fail; } } else { - if (request_irq(priv->einfo->interruptTransmit, gfar_interrupt, + if (request_irq(priv->interruptTransmit, gfar_interrupt, 0, "gfar_interrupt", dev) < 0) { printk(KERN_ERR "%s: Can't get IRQ %d\n", -
dev->name, priv->einfo->interruptError); + dev->name, priv->interruptError); err = -1; goto err_irq_fail; } } - /* Set up the PHY change work queue */ - INIT_WORK(&priv->tq, gfar_phy_change, dev); - - init_timer(&priv->phy_info_timer); - priv->phy_info_timer.function = &gfar_phy_startup_timer; - priv->phy_info_timer.data = (unsigned long) priv->mii_info; - mod_timer(&priv->phy_info_timer, jiffies + HZ); + phy_start(priv->phydev); /* Configure the coalescing support */ if (priv->txcoalescing) @@ -815,9 +708,9 @@ return 0; rx_irq_fail: - free_irq(priv->einfo->interruptTransmit, dev); + free_irq(priv->interruptTransmit, dev); tx_irq_fail: - free_irq(priv->einfo->interruptError, dev); + free_irq(priv->interruptError, dev); err_irq_fail: rx_skb_fail: free_skb_resources(priv); @@ -828,11 +721,6 @@ priv->tx_bd_base, gfar_read(&regs->tbase)); - if (priv->mii_info->phyinfo->close) - priv->mii_info->phyinfo->close(priv->mii_info); - - kfree(priv->mii_info); - return err; } @@ -880,7 +768,7 @@ /* Set buffer length and pointer */ txbdp->length = skb->len; - txbdp->bufPtr = dma_map_single(NULL, skb->data, + txbdp->bufPtr = dma_map_single(NULL, skb->data, skb->len, DMA_TO_DEVICE); /* Save the skb pointer so we can free it later */ @@ -932,11 +820,9 @@ struct gfar_private *priv = netdev_priv(dev); stop_gfar(dev); - /* Shutdown the PHY */ - if (priv->mii_info->phyinfo->close) - priv->mii_info->phyinfo->close(priv->mii_info); - - kfree(priv->mii_info); + /* Disconnect from the PHY */ + phy_disconnect(priv->phydev); + priv->phydev = NULL; netif_stop_queue(dev); @@ -1122,7 +1008,7 @@ skb->dev = dev; bdp->bufPtr = dma_map_single(NULL, skb->data, - priv->rx_buffer_size + RXBUF_ALIGNMENT, + priv->rx_buffer_size + RXBUF_ALIGNMENT, DMA_FROM_DEVICE); bdp->length = 0; @@ -1252,7 +1138,7 @@ } /* gfar_clean_rx_ring() -- Processes each frame in the rx ring - * until the budget/quota has been reached. Returns the number + * until the budget/quota has been reached.
Returns the number * of frames handled */ static int gfar_clean_rx_ring(struct net_device *dev, int rx_work_limit) @@ -1452,164 +1338,44 @@ return IRQ_HANDLED; } -static irqreturn_t phy_interrupt(int irq, void *dev_id, struct pt_regs *regs) -{ - struct net_device *dev = (struct net_device *) dev_id; - struct gfar_private *priv = netdev_priv(dev); - - /* Clear the interrupt */ - mii_clear_phy_interrupt(priv->mii_info); - - /* Disable PHY interrupts */ - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_DISABLED); - - /* Schedule the phy change */ - schedule_work(&priv->tq); - - return IRQ_HANDLED; -} - -/* Scheduled by the phy_interrupt/timer to handle PHY changes */ -static void gfar_phy_change(void *data) -{ - struct net_device *dev = (struct net_device *) data; - struct gfar_private *priv = netdev_priv(dev); - int result = 0; - - /* Delay to give the PHY a chance to change the - * register state */ - msleep(1); - - /* Update the link, speed, duplex */ - result = priv->mii_info->phyinfo->read_status(priv->mii_info); - - /* Adjust the known status as long as the link - * isn't still coming up */ - if((0 == result) || (priv->mii_info->link == 0)) - adjust_link(dev); - - /* Reenable interrupts, if needed */ - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_ENABLED); -} - -/* Called every so often on systems that don't interrupt - * the core for PHY changes */ -static void gfar_phy_timer(unsigned long data) -{ - struct net_device *dev = (struct net_device *) data; - struct gfar_private *priv = netdev_priv(dev); - - schedule_work(&priv->tq); - - mod_timer(&priv->phy_info_timer, jiffies + - GFAR_PHY_CHANGE_TIME * HZ); -} - -/* Keep trying aneg for some time - * If, after GFAR_AN_TIMEOUT seconds, it has not - * finished, we switch to forced. 
- * Either way, once the process has completed, we either - * request the interrupt, or switch the timer over to - * using gfar_phy_timer to check status */ -static void gfar_phy_startup_timer(unsigned long data) -{ - int result; - static int secondary = GFAR_AN_TIMEOUT; - struct gfar_mii_info *mii_info = (struct gfar_mii_info *)data; - struct gfar_private *priv = netdev_priv(mii_info->dev); - - /* Configure the Auto-negotiation */ - result = mii_info->phyinfo->config_aneg(mii_info); - - /* If autonegotiation failed to start, and - * we haven't timed out, reset the timer, and return */ - if (result && secondary--) { - mod_timer(&priv->phy_info_timer, jiffies + HZ); - return; - } else if (result) { - /* Couldn't start autonegotiation. - * Try switching to forced */ - mii_info->autoneg = 0; - result = mii_info->phyinfo->config_aneg(mii_info); - - /* Forcing failed! Give up */ - if(result) { - printk(KERN_ERR "%s: Forcing failed!\n", - mii_info->dev->name); - return; - } - } - - /* Kill the timer so it can be restarted */ - del_timer_sync(&priv->phy_info_timer); - - /* Grab the PHY interrupt, if necessary/possible */ - if (priv->einfo->flags & GFAR_HAS_PHY_INTR) { - if (request_irq(priv->einfo->interruptPHY, - phy_interrupt, - SA_SHIRQ, - "phy_interrupt", - mii_info->dev) < 0) { - printk(KERN_ERR "%s: Can't get IRQ %d (PHY)\n", - mii_info->dev->name, - priv->einfo->interruptPHY); - } else { - mii_configure_phy_interrupt(priv->mii_info, - MII_INTERRUPT_ENABLED); - return; - } - } - - /* Start the timer again, this time in order to - * handle a change in status */ - init_timer(&priv->phy_info_timer); - priv->phy_info_timer.function = &gfar_phy_timer; - priv->phy_info_timer.data = (unsigned long) mii_info->dev; - mod_timer(&priv->phy_info_timer, jiffies + - GFAR_PHY_CHANGE_TIME * HZ); -} - /* Called every time the controller might need to be made * aware of new link state. 
The PHY code conveys this - * information through variables in the priv structure, and this + * information through variables in the phydev structure, and this * function converts those variables into the appropriate * register values, and can bring down the device if needed. */ -static void adjust_link(struct net_device *dev) +static void adjust_link(struct device *d) { + struct net_device *dev = dev_get_drvdata(d); struct gfar_private *priv = netdev_priv(dev); struct gfar *regs = priv->regs; u32 tempval; - struct gfar_mii_info *mii_info = priv->mii_info; + unsigned long flags; + struct phy_device *phydev = priv->phydev; + int new_state = 0; - if (mii_info->link) { + spin_lock_irqsave(&priv->lock, flags); + if (phydev->link) { /* Now we make sure that we can be in full duplex mode. * If not, we operate in half-duplex mode. */ - if (mii_info->duplex != priv->oldduplex) { - if (!(mii_info->duplex)) { + if (phydev->duplex != priv->oldduplex) { + new_state = 1; + if (!(phydev->duplex)) { tempval = gfar_read(&regs->maccfg2); tempval &= ~(MACCFG2_FULL_DUPLEX); gfar_write(&regs->maccfg2, tempval); - - printk(KERN_INFO "%s: Half Duplex\n", - dev->name); } else { tempval = gfar_read(&regs->maccfg2); tempval |= MACCFG2_FULL_DUPLEX; gfar_write(&regs->maccfg2, tempval); - - printk(KERN_INFO "%s: Full Duplex\n", - dev->name); } - priv->oldduplex = mii_info->duplex; + priv->oldduplex = phydev->duplex; } - if (mii_info->speed != priv->oldspeed) { - switch (mii_info->speed) { + if (phydev->speed != priv->oldspeed) { + new_state = 1; + switch (phydev->speed) { case 1000: tempval = gfar_read(&regs->maccfg2); tempval = @@ -1626,31 +1392,41 @@ default: printk(KERN_WARNING "%s: Ack!
Speed (%d) is not 10/100/1000!\n", - dev->name, mii_info->speed); + dev->name, phydev->speed); break; } - printk(KERN_INFO "%s: Speed %dBT\n", dev->name, - mii_info->speed); - - priv->oldspeed = mii_info->speed; + priv->oldspeed = phydev->speed; } if (!priv->oldlink) { - printk(KERN_INFO "%s: Link is up\n", dev->name); + new_state = 1; priv->oldlink = 1; netif_carrier_on(dev); netif_schedule(dev); } } else { if (priv->oldlink) { - printk(KERN_INFO "%s: Link is down\n", dev->name); + new_state = 1; priv->oldlink = 0; priv->oldspeed = 0; priv->oldduplex = -1; netif_carrier_off(dev); } } + + if (new_state) { + pr_info("%s: Link is %s", dev->name, + phydev->link ? "Up" : "Down"); + if (phydev->link) + printk(" - %d/%s", phydev->speed, + DUPLEX_FULL == phydev->duplex ? + "Full" : "Half"); + + printk("\n"); + } + + spin_unlock_irqrestore(&priv->lock, flags); } @@ -1829,35 +1605,32 @@ } /* Structure for a device driver */ -static struct ocp_device_id gfar_ids[] = { - {.vendor = OCP_ANY_ID,.function = OCP_FUNC_GFAR}, - {.vendor = OCP_VENDOR_INVALID} -}; - -static struct ocp_driver gfar_driver = { - .name = "gianfar", - .id_table = gfar_ids, - +static struct device_driver gfar_driver = { + .name = "fsl-gianfar", + .bus = &platform_bus_type, .probe = gfar_probe, .remove = gfar_remove, }; static int __init gfar_init(void) { - int rc; + int err = gfar_mdio_init(); - rc = ocp_register_driver(&gfar_driver); - if (rc != 0) { - ocp_unregister_driver(&gfar_driver); - return -ENODEV; - } + if (err) + return err; - return 0; + err = driver_register(&gfar_driver); + + if (err) + gfar_mdio_exit(); + + return err; } static void __exit gfar_exit(void) { - ocp_unregister_driver(&gfar_driver); + driver_unregister(&gfar_driver); + gfar_mdio_exit(); } module_init(gfar_init); diff -Nru a/drivers/net/gianfar.h b/drivers/net/gianfar.h --- a/drivers/net/gianfar.h 2004-12-23 12:39:16 -06:00 +++ b/drivers/net/gianfar.h 2004-12-23 12:39:16 -06:00 @@ -17,7 +17,6 @@ * * Still left to do: * -Add 
support for module parameters - * -Add support for ethtool -s * -Add patch for ethtool phys id */ #ifndef __GIANFAR_H @@ -37,6 +36,8 @@ #include #include #include +#include +#include #include #include @@ -47,8 +48,8 @@ #include #include #include -#include -#include "gianfar_phy.h" +#include +#include "gianfar_mii.h" /* The maximum number of packets to be handled in one call of gfar_poll */ #define GFAR_DEV_WEIGHT 64 @@ -67,7 +68,7 @@ #define PHY_INIT_TIMEOUT 100000 #define GFAR_PHY_CHANGE_TIME 2 -#define DEVICE_NAME "%s: Gianfar Ethernet Controller Version 1.1, " +#define DEVICE_NAME "%s: Gianfar Ethernet Controller Version 1.2, " #define DRV_NAME "gfar-enet" extern const char gfar_driver_name[]; extern const char gfar_driver_version[]; @@ -422,12 +423,7 @@ u32 hafdup; /* 0x.50c - Half Duplex Register */ u32 maxfrm; /* 0x.510 - Maximum Frame Length Register */ u8 res18[12]; - u32 miimcfg; /* 0x.520 - MII Management Configuration Register */ - u32 miimcom; /* 0x.524 - MII Management Command Register */ - u32 miimadd; /* 0x.528 - MII Management Address Register */ - u32 miimcon; /* 0x.52c - MII Management Control Register */ - u32 miimstat; /* 0x.530 - MII Management Status Register */ - u32 miimind; /* 0x.534 - MII Management Indicator Register */ + u8 gfar_mii_regs[24]; /* See gianfar_phy.h */ u8 res19[4]; u32 ifstat; /* 0x.53c - Interface Status Register */ u32 macstnaddr1; /* 0x.540 - Station Address Part 1 Register */ @@ -496,9 +492,6 @@ struct txbd8 *cur_tx; /* Next free ring entry */ struct txbd8 *dirty_tx; /* The Ring entry to be freed. 
*/ struct gfar *regs; /* Pointer to the GFAR memory mapped Registers */ - struct gfar *phyregs; - struct work_struct tq; - struct timer_list phy_info_timer; struct net_device_stats stats; /* linux network statistics */ struct gfar_extra_stats extra_stats; spinlock_t lock; @@ -510,9 +503,13 @@ unsigned int rxclean; /* Info structure initialized by board setup code */ - struct ocp_gfar_data *einfo; + unsigned int interruptTransmit; + unsigned int interruptReceive; + unsigned int interruptError; + struct gianfar_platform_data *einfo; - struct gfar_mii_info *mii_info; + struct phy_device *phydev; + struct mii_bus *mii_bus; int oldspeed; int oldduplex; int oldlink; @@ -531,5 +528,9 @@ } extern struct ethtool_ops *gfar_op_array[]; + +extern irqreturn_t gfar_receive(int irq, void *dev_id, struct pt_regs *regs); +extern int startup_gfar(struct net_device *dev); +extern void stop_gfar(struct net_device *dev); #endif /* __GIANFAR_H */ diff -Nru a/drivers/net/gianfar_ethtool.c b/drivers/net/gianfar_ethtool.c --- a/drivers/net/gianfar_ethtool.c 2004-12-23 12:39:16 -06:00 +++ b/drivers/net/gianfar_ethtool.c 2004-12-23 12:39:16 -06:00 @@ -39,15 +39,13 @@ #include #include #include +#include +#include #include "gianfar.h" #define is_power_of_2(x) ((x) != 0 && (((x) & ((x) - 1)) == 0)) -extern int startup_gfar(struct net_device *dev); -extern void stop_gfar(struct net_device *dev); -extern void gfar_receive(int irq, void *dev_id, struct pt_regs *regs); - void gfar_fill_stats(struct net_device *dev, struct ethtool_stats *dummy, u64 * buf); void gfar_gstrings(struct net_device *dev, u32 stringset, u8 * buf); @@ -181,32 +179,72 @@ drvinfo->eedump_len = 0; } + +static int gfar_ssettings(struct net_device *dev, struct ethtool_cmd *cmd) +{ + struct gfar_private *priv = netdev_priv(dev); + struct phy_device *phydev = priv->phydev; + + if (NULL == phydev) + return -ENODEV; + + /* We make sure that we don't pass unsupported + * values in to the PHY */ + cmd->advertising &= 
phydev->supported; + + /* Verify the settings we care about. */ + if (cmd->autoneg != AUTONEG_ENABLE && cmd->autoneg != AUTONEG_DISABLE) + return -EINVAL; + + if (cmd->autoneg == AUTONEG_ENABLE && cmd->advertising == 0) + return -EINVAL; + + if (cmd->autoneg == AUTONEG_DISABLE + && ((cmd->speed != SPEED_1000 + && cmd->speed != SPEED_100 + && cmd->speed != SPEED_10) + || (cmd->duplex != DUPLEX_HALF + && cmd->duplex != DUPLEX_FULL))) + return -EINVAL; + + phydev->autoneg = cmd->autoneg; + + phydev->speed = cmd->speed; + + phydev->advertising = cmd->advertising; + + if (AUTONEG_ENABLE == cmd->autoneg) + phydev->advertising |= ADVERTISED_Autoneg; + else + phydev->advertising &= ~ADVERTISED_Autoneg; + + phydev->duplex = cmd->duplex; + + /* Restart the PHY */ + phy_start_aneg(phydev); + + return 0; +} + /* Return the current settings in the ethtool_cmd structure */ int gfar_gsettings(struct net_device *dev, struct ethtool_cmd *cmd) { struct gfar_private *priv = netdev_priv(dev); - uint gigabit_support = - priv->einfo->flags & GFAR_HAS_GIGABIT ? SUPPORTED_1000baseT_Full : 0; - uint gigabit_advert = - priv->einfo->flags & GFAR_HAS_GIGABIT ? 
ADVERTISED_1000baseT_Full: 0; - - cmd->supported = (SUPPORTED_10baseT_Half - | SUPPORTED_100baseT_Half - | SUPPORTED_100baseT_Full - | gigabit_support | SUPPORTED_Autoneg); - - /* For now, we always advertise everything */ - cmd->advertising = (ADVERTISED_10baseT_Half - | ADVERTISED_100baseT_Half - | ADVERTISED_100baseT_Full - | gigabit_advert | ADVERTISED_Autoneg); + struct phy_device *phydev = priv->phydev; + + if (NULL == phydev) + return -ENODEV; + + cmd->supported = phydev->supported; - cmd->speed = priv->mii_info->speed; - cmd->duplex = priv->mii_info->duplex; + cmd->advertising = phydev->advertising; + + cmd->speed = phydev->speed; + cmd->duplex = phydev->duplex; cmd->port = PORT_MII; - cmd->phy_address = priv->mii_info->mii_id; + cmd->phy_address = phydev->addr; cmd->transceiver = XCVR_EXTERNAL; - cmd->autoneg = AUTONEG_ENABLE; + cmd->autoneg = phydev->autoneg; cmd->maxtxpkt = priv->txcount; cmd->maxrxpkt = priv->rxcount; @@ -245,14 +283,14 @@ unsigned int count; /* The timer is different, depending on the interface speed */ - switch (priv->mii_info->speed) { - case 1000: + switch (priv->phydev->speed) { + case SPEED_1000: count = GFAR_GBIT_TIME; break; - case 100: + case SPEED_100: count = GFAR_100_TIME; break; - case 10: + case SPEED_10: default: count = GFAR_10_TIME; break; @@ -269,14 +307,14 @@ unsigned int count; /* The timer is different, depending on the interface speed */ - switch (priv->mii_info->speed) { - case 1000: + switch (priv->phydev->speed) { + case SPEED_1000: count = GFAR_GBIT_TIME; break; - case 100: + case SPEED_100: count = GFAR_100_TIME; break; - case 10: + case SPEED_10: default: count = GFAR_10_TIME; break; @@ -293,6 +331,9 @@ { struct gfar_private *priv = netdev_priv(dev); + if (NULL == priv->phydev) + return -ENODEV; + cvals->rx_coalesce_usecs = gfar_ticks2usecs(priv, priv->rxtime); cvals->rx_max_coalesced_frames = priv->rxcount; @@ -346,6 +387,9 @@ else priv->rxcoalescing = 1; + if (NULL == priv->phydev) + return -ENODEV; + 
priv->rxtime = gfar_usecs2ticks(priv, cvals->rx_coalesce_usecs); priv->rxcount = cvals->rx_max_coalesced_frames; @@ -462,6 +506,7 @@ } struct ethtool_ops gfar_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, @@ -477,6 +522,7 @@ }; struct ethtool_ops gfar_normon_nocoalesce_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, @@ -490,6 +536,7 @@ }; struct ethtool_ops gfar_nocoalesce_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, @@ -503,6 +550,7 @@ }; struct ethtool_ops gfar_normon_ethtool_ops = { + .set_settings = gfar_ssettings, .get_settings = gfar_gsettings, .get_drvinfo = gfar_gdrvinfo, .get_regs_len = gfar_reglen, diff -Nru a/drivers/net/gianfar_mii.c b/drivers/net/gianfar_mii.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/gianfar_mii.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,212 @@ +/* + * drivers/net/gianfar_mii.c + * + * Gianfar Ethernet Driver -- MIIM bus implementation + * Provides Bus interface for MIIM regs + * + * Author: Andy Fleming + * Maintainer: Kumar Gala (kumar.gala@freescale.com) + * + * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "gianfar.h" +#include "gianfar_mii.h" + +extern void * get_platform_data(enum fsl_devices dev); + +/* Write value to the PHY at mii_id at register regnum, + * on the bus, waiting until the write is done before returning. + * All PHY configuration is done through the TSEC1 MIIM regs */ +void gfar_mdio_write(struct mii_bus *bus, int mii_id, int regnum, u16 value) +{ + struct gfar_mii *regs = bus->priv; + + /* Set the PHY address and the register address we want to write */ + gfar_write(&regs->miimadd, (mii_id << 8) | regnum); + + /* Write out the value we want */ + gfar_write(&regs->miimcon, value); + + /* Wait for the transaction to finish */ + while (gfar_read(&regs->miimind) & MIIMIND_BUSY) + cpu_relax(); +} + +/* Read the bus for PHY at addr mii_id, register regnum, and + * return the value. Clears miimcom first. 
All PHY + * configuration has to be done through the TSEC1 MIIM regs */ +u16 gfar_mdio_read(struct mii_bus *bus, int mii_id, int regnum) +{ + struct gfar_mii *regs = bus->priv; + u16 value; + + /* Set the PHY address and the register address we want to read */ + gfar_write(&regs->miimadd, (mii_id << 8) | regnum); + + /* Clear miimcom, and then initiate a read */ + gfar_write(&regs->miimcom, 0); + gfar_write(&regs->miimcom, MII_READ_COMMAND); + + /* Wait for the transaction to finish */ + while (gfar_read(&regs->miimind) & (MIIMIND_NOTVALID | MIIMIND_BUSY)) + cpu_relax(); + + /* Grab the value of the register from miimstat */ + value = gfar_read(&regs->miimstat); + + return value; +} + + +/* Reset the MIIM registers, and wait for the bus to free */ +int gfar_mdio_reset(struct mii_bus *bus) +{ + struct gfar_mii *regs = bus->priv; + unsigned int timeout = PHY_INIT_TIMEOUT; + + spin_lock_bh(&bus->mdio_lock); + + /* Reset the management interface */ + gfar_write(&regs->miimcfg, MIIMCFG_RESET); + + /* Setup the MII Mgmt clock speed */ + gfar_write(&regs->miimcfg, MIIMCFG_INIT_VALUE); + + /* Wait until the bus is free */ + while ((gfar_read(&regs->miimind) & MIIMIND_BUSY) && + --timeout) + cpu_relax(); + + spin_unlock_bh(&bus->mdio_lock); + + if(timeout <= 0) { + printk(KERN_ERR "%s: The MII Bus is stuck!\n", + bus->name); + return -EBUSY; + } + + return 0; +} + + +int gfar_mdio_probe(struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + struct gianfar_mdio_data *pdata; + struct gfar_mii *regs; + struct mii_bus *new_bus; + int err = 0; + + if (NULL == dev) + return -EINVAL; + + new_bus = kmalloc(sizeof(struct mii_bus), GFP_KERNEL); + + if (NULL == new_bus) + return -ENOMEM; + + new_bus->name = "Gianfar MII Bus"; + new_bus->read = &gfar_mdio_read; + new_bus->write = &gfar_mdio_write; + new_bus->reset = &gfar_mdio_reset; + new_bus->id = pdev->id; + + pdata = get_platform_data(MPC85xx_MDIO); + + /* Set the PHY base address */ + regs = (struct gfar_mii *) 
ioremap(pdata->paddr, + sizeof (struct gfar_mii)); + + if (NULL == regs) { + err = -ENOMEM; + goto reg_map_fail; + } + + new_bus->priv = regs; + + new_bus->irq = pdata->irq; + + new_bus->dev = dev; + dev_set_drvdata(dev, new_bus); + + err = register_mdiobus(new_bus); + + if (0 != err) { + printk (KERN_ERR "%s: Cannot register as MDIO bus\n", + new_bus->name); + goto bus_register_fail; + } + + return 0; + +bus_register_fail: + iounmap((void *) regs); +reg_map_fail: + kfree(new_bus); + + return err; +} + + +int gfar_mdio_remove(struct device *dev) +{ + struct mii_bus *bus = dev_get_drvdata(dev); + + dev_set_drvdata(dev, NULL); + + iounmap((void *) bus->priv); + bus->priv = NULL; + kfree(bus); + + return 0; +} + +static struct device_driver gianfar_mdio_driver = { + .name = "fsl-gianfar_mdio", + .bus = &platform_bus_type, + .probe = gfar_mdio_probe, + .remove = gfar_mdio_remove, +}; + +int __init gfar_mdio_init(void) +{ + return driver_register(&gianfar_mdio_driver); +} + +void __exit gfar_mdio_exit(void) +{ + driver_unregister(&gianfar_mdio_driver); +} diff -Nru a/drivers/net/gianfar_mii.h b/drivers/net/gianfar_mii.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/gianfar_mii.h 2004-12-23 12:39:15 -06:00 @@ -0,0 +1,45 @@ +/* + * drivers/net/gianfar_mii.h + * + * Gianfar Ethernet Driver -- MII Management Bus Implementation + * Driver for the MDIO bus controller in the Gianfar register space + * + * Author: Andy Fleming + * Maintainer: Kumar Gala (kumar.gala@freescale.com) + * + * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ +#ifndef __GIANFAR_MII_H +#define __GIANFAR_MII_H + +#define MIIMIND_BUSY 0x00000001 +#define MIIMIND_NOTVALID 0x00000004 + +#define MII_READ_COMMAND 0x00000001 + +#define GFAR_SUPPORTED (SUPPORTED_10baseT_Half \ + | SUPPORTED_100baseT_Half \ + | SUPPORTED_100baseT_Full \ + | SUPPORTED_Autoneg \ + | SUPPORTED_MII) + +struct gfar_mii { + u32 miimcfg; /* 0x.520 - MII Management Config Register */ + u32 miimcom; /* 0x.524 - MII Management Command Register */ + u32 miimadd; /* 0x.528 - MII Management Address Register */ + u32 miimcon; /* 0x.52c - MII Management Control Register */ + u32 miimstat; /* 0x.530 - MII Management Status Register */ + u32 miimind; /* 0x.534 - MII Management Indicator Register */ +}; + +u16 gfar_mdio_read(struct mii_bus *bus, int mii_id, int regnum); +void gfar_mdio_write(struct mii_bus *bus, int mii_id, int regnum, u16 value); +int __init gfar_mdio_init(void); +void __exit gfar_mdio_exit(void); +#endif /* __GIANFAR_MII_H */ diff -Nru a/drivers/net/gianfar_phy.c b/drivers/net/gianfar_phy.c --- a/drivers/net/gianfar_phy.c 2004-12-23 12:39:16 -06:00 +++ /dev/null Wed Dec 31 16:00:00 196900 @@ -1,661 +0,0 @@ -/* - * drivers/net/gianfar_phy.c - * - * Gianfar Ethernet Driver -- PHY handling - * Driver for FEC on MPC8540 and TSEC on MPC8540/MPC8560 - * Based on 8260_io/fcc_enet.c - * - * Author: Andy Fleming - * Maintainer: Kumar Gala (kumar.gala@freescale.com) - * - * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 2 of the License, or (at your - * option) any later version. 
- * - */ - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include -#include -#include -#include -#include -#include -#include - -#include "gianfar.h" -#include "gianfar_phy.h" - -static void config_genmii_advert(struct gfar_mii_info *mii_info); -static void genmii_setup_forced(struct gfar_mii_info *mii_info); -static void genmii_restart_aneg(struct gfar_mii_info *mii_info); -static int gbit_config_aneg(struct gfar_mii_info *mii_info); -static int genmii_config_aneg(struct gfar_mii_info *mii_info); -static int genmii_update_link(struct gfar_mii_info *mii_info); -static int genmii_read_status(struct gfar_mii_info *mii_info); -u16 phy_read(struct gfar_mii_info *mii_info, u16 regnum); -void phy_write(struct gfar_mii_info *mii_info, u16 regnum, u16 val); - -/* Write value to the PHY for this device to the register at regnum, */ -/* waiting until the write is done before it returns. All PHY */ -/* configuration has to be done through the TSEC1 MIIM regs */ -void write_phy_reg(struct net_device *dev, int mii_id, int regnum, int value) -{ - struct gfar_private *priv = netdev_priv(dev); - struct gfar *regbase = priv->phyregs; - - /* Set the PHY address and the register address we want to write */ - gfar_write(&regbase->miimadd, (mii_id << 8) | regnum); - - /* Write out the value we want */ - gfar_write(&regbase->miimcon, value); - - /* Wait for the transaction to finish */ - while (gfar_read(&regbase->miimind) & MIIMIND_BUSY) - cpu_relax(); -} - -/* Reads from register regnum in the PHY for device dev, */ -/* returning the value. Clears miimcom first. 
All PHY */ -/* configuration has to be done through the TSEC1 MIIM regs */ -int read_phy_reg(struct net_device *dev, int mii_id, int regnum) -{ - struct gfar_private *priv = netdev_priv(dev); - struct gfar *regbase = priv->phyregs; - u16 value; - - /* Set the PHY address and the register address we want to read */ - gfar_write(&regbase->miimadd, (mii_id << 8) | regnum); - - /* Clear miimcom, and then initiate a read */ - gfar_write(&regbase->miimcom, 0); - gfar_write(&regbase->miimcom, MII_READ_COMMAND); - - /* Wait for the transaction to finish */ - while (gfar_read(&regbase->miimind) & (MIIMIND_NOTVALID | MIIMIND_BUSY)) - cpu_relax(); - - /* Grab the value of the register from miimstat */ - value = gfar_read(&regbase->miimstat); - - return value; -} - -void mii_clear_phy_interrupt(struct gfar_mii_info *mii_info) -{ - if(mii_info->phyinfo->ack_interrupt) - mii_info->phyinfo->ack_interrupt(mii_info); -} - - -void mii_configure_phy_interrupt(struct gfar_mii_info *mii_info, u32 interrupts) -{ - mii_info->interrupts = interrupts; - if(mii_info->phyinfo->config_intr) - mii_info->phyinfo->config_intr(mii_info); -} - - -/* Writes MII_ADVERTISE with the appropriate values, after - * sanitizing advertise to make sure only supported features - * are advertised - */ -static void config_genmii_advert(struct gfar_mii_info *mii_info) -{ - u32 advertise; - u16 adv; - - /* Only allow advertising what this PHY supports */ - mii_info->advertising &= mii_info->phyinfo->features; - advertise = mii_info->advertising; - - /* Setup standard advertisement */ - adv = phy_read(mii_info, MII_ADVERTISE); - adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4); - if (advertise & ADVERTISED_10baseT_Half) - adv |= ADVERTISE_10HALF; - if (advertise & ADVERTISED_10baseT_Full) - adv |= ADVERTISE_10FULL; - if (advertise & ADVERTISED_100baseT_Half) - adv |= ADVERTISE_100HALF; - if (advertise & ADVERTISED_100baseT_Full) - adv |= ADVERTISE_100FULL; - phy_write(mii_info, MII_ADVERTISE, adv); -} - -static void 
genmii_setup_forced(struct gfar_mii_info *mii_info) -{ - u16 ctrl; - u32 features = mii_info->phyinfo->features; - - ctrl = phy_read(mii_info, MII_BMCR); - - ctrl &= ~(BMCR_FULLDPLX|BMCR_SPEED100|BMCR_SPEED1000|BMCR_ANENABLE); - ctrl |= BMCR_RESET; - - switch(mii_info->speed) { - case SPEED_1000: - if(features & (SUPPORTED_1000baseT_Half - | SUPPORTED_1000baseT_Full)) { - ctrl |= BMCR_SPEED1000; - break; - } - mii_info->speed = SPEED_100; - case SPEED_100: - if (features & (SUPPORTED_100baseT_Half - | SUPPORTED_100baseT_Full)) { - ctrl |= BMCR_SPEED100; - break; - } - mii_info->speed = SPEED_10; - case SPEED_10: - if (features & (SUPPORTED_10baseT_Half - | SUPPORTED_10baseT_Full)) - break; - default: /* Unsupported speed! */ - printk(KERN_ERR "%s: Bad speed!\n", - mii_info->dev->name); - break; - } - - phy_write(mii_info, MII_BMCR, ctrl); -} - - -/* Enable and Restart Autonegotiation */ -static void genmii_restart_aneg(struct gfar_mii_info *mii_info) -{ - u16 ctl; - - ctl = phy_read(mii_info, MII_BMCR); - ctl |= (BMCR_ANENABLE | BMCR_ANRESTART); - phy_write(mii_info, MII_BMCR, ctl); -} - - -static int gbit_config_aneg(struct gfar_mii_info *mii_info) -{ - u16 adv; - u32 advertise; - - if(mii_info->autoneg) { - /* Configure the ADVERTISE register */ - config_genmii_advert(mii_info); - advertise = mii_info->advertising; - - adv = phy_read(mii_info, MII_1000BASETCONTROL); - adv &= ~(MII_1000BASETCONTROL_FULLDUPLEXCAP | - MII_1000BASETCONTROL_HALFDUPLEXCAP); - if (advertise & SUPPORTED_1000baseT_Half) - adv |= MII_1000BASETCONTROL_HALFDUPLEXCAP; - if (advertise & SUPPORTED_1000baseT_Full) - adv |= MII_1000BASETCONTROL_FULLDUPLEXCAP; - phy_write(mii_info, MII_1000BASETCONTROL, adv); - - /* Start/Restart aneg */ - genmii_restart_aneg(mii_info); - } else - genmii_setup_forced(mii_info); - - return 0; -} - -static int marvell_config_aneg(struct gfar_mii_info *mii_info) -{ - /* The Marvell PHY has an errata which requires - * that certain registers get written in order - * 
to restart autonegotiation */ - phy_write(mii_info, MII_BMCR, BMCR_RESET); - - phy_write(mii_info, 0x1d, 0x1f); - phy_write(mii_info, 0x1e, 0x200c); - phy_write(mii_info, 0x1d, 0x5); - phy_write(mii_info, 0x1e, 0); - phy_write(mii_info, 0x1e, 0x100); - - gbit_config_aneg(mii_info); - - return 0; -} -static int genmii_config_aneg(struct gfar_mii_info *mii_info) -{ - if (mii_info->autoneg) { - config_genmii_advert(mii_info); - genmii_restart_aneg(mii_info); - } else - genmii_setup_forced(mii_info); - - return 0; -} - - -static int genmii_update_link(struct gfar_mii_info *mii_info) -{ - u16 status; - - /* Do a fake read */ - phy_read(mii_info, MII_BMSR); - - /* Read link and autonegotiation status */ - status = phy_read(mii_info, MII_BMSR); - if ((status & BMSR_LSTATUS) == 0) - mii_info->link = 0; - else - mii_info->link = 1; - - /* If we are autonegotiating, and not done, - * return an error */ - if (mii_info->autoneg && !(status & BMSR_ANEGCOMPLETE)) - return -EAGAIN; - - return 0; -} - -static int genmii_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - if (mii_info->autoneg) { - status = phy_read(mii_info, MII_LPA); - - if (status & (LPA_10FULL | LPA_100FULL)) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - if (status & (LPA_100FULL | LPA_100HALF)) - mii_info->speed = SPEED_100; - else - mii_info->speed = SPEED_10; - mii_info->pause = 0; - } - /* On non-aneg, we assume what we put in BMCR is the speed, - * though magic-aneg shouldn't prevent this case from occurring - */ - - return 0; -} -static int marvell_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - /* If the link is up, read the speed and duplex */ - /* If we aren't 
autonegotiating, assume speeds - * are as set */ - if (mii_info->autoneg && mii_info->link) { - int speed; - status = phy_read(mii_info, MII_M1011_PHY_SPEC_STATUS); - -#if 0 - /* If speed and duplex aren't resolved, - * return an error. Isn't this handled - * by checking aneg? - */ - if ((status & MII_M1011_PHY_SPEC_STATUS_RESOLVED) == 0) - return -EAGAIN; -#endif - - /* Get the duplexity */ - if (status & MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - - /* Get the speed */ - speed = status & MII_M1011_PHY_SPEC_STATUS_SPD_MASK; - switch(speed) { - case MII_M1011_PHY_SPEC_STATUS_1000: - mii_info->speed = SPEED_1000; - break; - case MII_M1011_PHY_SPEC_STATUS_100: - mii_info->speed = SPEED_100; - break; - default: - mii_info->speed = SPEED_10; - break; - } - mii_info->pause = 0; - } - - return 0; -} - - -static int cis820x_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - /* If the link is up, read the speed and duplex */ - /* If we aren't autonegotiating, assume speeds - * are as set */ - if (mii_info->autoneg && mii_info->link) { - int speed; - - status = phy_read(mii_info, MII_CIS8201_AUX_CONSTAT); - if (status & MII_CIS8201_AUXCONSTAT_DUPLEX) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - - speed = status & MII_CIS8201_AUXCONSTAT_SPEED; - - switch (speed) { - case MII_CIS8201_AUXCONSTAT_GBIT: - mii_info->speed = SPEED_1000; - break; - case MII_CIS8201_AUXCONSTAT_100: - mii_info->speed = SPEED_100; - break; - default: - mii_info->speed = SPEED_10; - break; - } - } - - return 0; -} - -static int marvell_ack_interrupt(struct gfar_mii_info *mii_info) -{ - /* Clear the interrupts by reading the reg */ - phy_read(mii_info, MII_M1011_IEVENT); - - return 0; -} - -static int marvell_config_intr(struct gfar_mii_info *mii_info) 
-{ - if(mii_info->interrupts == MII_INTERRUPT_ENABLED) - phy_write(mii_info, MII_M1011_IMASK, MII_M1011_IMASK_INIT); - else - phy_write(mii_info, MII_M1011_IMASK, MII_M1011_IMASK_CLEAR); - - return 0; -} - -static int cis820x_init(struct gfar_mii_info *mii_info) -{ - phy_write(mii_info, MII_CIS8201_AUX_CONSTAT, - MII_CIS8201_AUXCONSTAT_INIT); - phy_write(mii_info, MII_CIS8201_EXT_CON1, - MII_CIS8201_EXTCON1_INIT); - - return 0; -} - -static int cis820x_ack_interrupt(struct gfar_mii_info *mii_info) -{ - phy_read(mii_info, MII_CIS8201_ISTAT); - - return 0; -} - -static int cis820x_config_intr(struct gfar_mii_info *mii_info) -{ - if(mii_info->interrupts == MII_INTERRUPT_ENABLED) - phy_write(mii_info, MII_CIS8201_IMASK, MII_CIS8201_IMASK_MASK); - else - phy_write(mii_info, MII_CIS8201_IMASK, 0); - - return 0; -} - -#define DM9161_DELAY 10 - -static int dm9161_read_status(struct gfar_mii_info *mii_info) -{ - u16 status; - int err; - - /* Update the link, but return if there - * was an error */ - err = genmii_update_link(mii_info); - if (err) - return err; - - /* If the link is up, read the speed and duplex */ - /* If we aren't autonegotiating, assume speeds - * are as set */ - if (mii_info->autoneg && mii_info->link) { - status = phy_read(mii_info, MII_DM9161_SCSR); - if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_100H)) - mii_info->speed = SPEED_100; - else - mii_info->speed = SPEED_10; - - if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_10F)) - mii_info->duplex = DUPLEX_FULL; - else - mii_info->duplex = DUPLEX_HALF; - } - - return 0; -} - - -static int dm9161_config_aneg(struct gfar_mii_info *mii_info) -{ - struct dm9161_private *priv = mii_info->priv; - - if(0 == priv->resetdone) - return -EAGAIN; - - return 0; -} - -static void dm9161_timer(unsigned long data) -{ - struct gfar_mii_info *mii_info = (struct gfar_mii_info *)data; - struct dm9161_private *priv = mii_info->priv; - u16 status = phy_read(mii_info, MII_BMSR); - - if (status & BMSR_ANEGCOMPLETE) { 
- priv->resetdone = 1; - } else - mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); -} - -static int dm9161_init(struct gfar_mii_info *mii_info) -{ - struct dm9161_private *priv; - - /* Allocate the private data structure */ - priv = kmalloc(sizeof(struct dm9161_private), GFP_KERNEL); - - if (NULL == priv) - return -ENOMEM; - - mii_info->priv = priv; - - /* Reset is not done yet */ - priv->resetdone = 0; - - /* Isolate the PHY */ - phy_write(mii_info, MII_BMCR, BMCR_ISOLATE); - - /* Do not bypass the scrambler/descrambler */ - phy_write(mii_info, MII_DM9161_SCR, MII_DM9161_SCR_INIT); - - /* Clear 10BTCSR to default */ - phy_write(mii_info, MII_DM9161_10BTCSR, MII_DM9161_10BTCSR_INIT); - - /* Reconnect the PHY, and enable Autonegotiation */ - phy_write(mii_info, MII_BMCR, BMCR_ANENABLE); - - /* Start a timer for DM9161_DELAY seconds to wait - * for the PHY to be ready */ - init_timer(&priv->timer); - priv->timer.function = &dm9161_timer; - priv->timer.data = (unsigned long) mii_info; - mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); - - return 0; -} - -static void dm9161_close(struct gfar_mii_info *mii_info) -{ - struct dm9161_private *priv = mii_info->priv; - - del_timer_sync(&priv->timer); - kfree(priv); -} - -#if 0 -static int dm9161_ack_interrupt(struct gfar_mii_info *mii_info) -{ - phy_read(mii_info, MII_DM9161_INTR); - - return 0; -} -#endif - -/* Cicada 820x */ -static struct phy_info phy_info_cis820x = { - 0x000fc440, - "Cicada Cis8204", - 0x000fffc0, - .features = MII_GBIT_FEATURES, - .init = &cis820x_init, - .config_aneg = &gbit_config_aneg, - .read_status = &cis820x_read_status, - .ack_interrupt = &cis820x_ack_interrupt, - .config_intr = &cis820x_config_intr, -}; - -static struct phy_info phy_info_dm9161 = { - .phy_id = 0x0181b880, - .name = "Davicom DM9161E", - .phy_id_mask = 0x0ffffff0, - .init = dm9161_init, - .config_aneg = dm9161_config_aneg, - .read_status = dm9161_read_status, - .close = dm9161_close, -}; - -static struct phy_info 
phy_info_marvell = { - .phy_id = 0x01410c00, - .phy_id_mask = 0xffffff00, - .name = "Marvell 88E1101", - .features = MII_GBIT_FEATURES, - .config_aneg = &marvell_config_aneg, - .read_status = &marvell_read_status, - .ack_interrupt = &marvell_ack_interrupt, - .config_intr = &marvell_config_intr, -}; - -static struct phy_info phy_info_genmii= { - .phy_id = 0x00000000, - .phy_id_mask = 0x00000000, - .name = "Generic MII", - .features = MII_BASIC_FEATURES, - .config_aneg = genmii_config_aneg, - .read_status = genmii_read_status, -}; - -static struct phy_info *phy_info[] = { - &phy_info_cis820x, - &phy_info_marvell, - &phy_info_dm9161, - &phy_info_genmii, - NULL -}; - -u16 phy_read(struct gfar_mii_info *mii_info, u16 regnum) -{ - u16 retval; - unsigned long flags; - - spin_lock_irqsave(&mii_info->mdio_lock, flags); - retval = mii_info->mdio_read(mii_info->dev, mii_info->mii_id, regnum); - spin_unlock_irqrestore(&mii_info->mdio_lock, flags); - - return retval; -} - -void phy_write(struct gfar_mii_info *mii_info, u16 regnum, u16 val) -{ - unsigned long flags; - - spin_lock_irqsave(&mii_info->mdio_lock, flags); - mii_info->mdio_write(mii_info->dev, - mii_info->mii_id, - regnum, val); - spin_unlock_irqrestore(&mii_info->mdio_lock, flags); -} - -/* Use the PHY ID registers to determine what type of PHY is attached - * to device dev. return a struct phy_info structure describing that PHY - */ -struct phy_info * get_phy_info(struct gfar_mii_info *mii_info) -{ - u16 phy_reg; - u32 phy_ID; - int i; - struct phy_info *theInfo = NULL; - struct net_device *dev = mii_info->dev; - - /* Grab the bits from PHYIR1, and put them in the upper half */ - phy_reg = phy_read(mii_info, MII_PHYSID1); - phy_ID = (phy_reg & 0xffff) << 16; - - /* Grab the bits from PHYIR2, and put them in the lower half */ - phy_reg = phy_read(mii_info, MII_PHYSID2); - phy_ID |= (phy_reg & 0xffff); - - /* loop through all the known PHY types, and find one that */ - /* matches the ID we read from the PHY. 
*/ - for (i = 0; phy_info[i]; i++) - if (phy_info[i]->phy_id == - (phy_ID & phy_info[i]->phy_id_mask)) { - theInfo = phy_info[i]; - break; - } - - /* This shouldn't happen, as we have generic PHY support */ - if (theInfo == NULL) { - printk("%s: PHY id %x is not supported!\n", dev->name, phy_ID); - return NULL; - } else { - printk("%s: PHY is %s (%x)\n", dev->name, theInfo->name, - phy_ID); - } - - return theInfo; -} diff -Nru a/drivers/net/gianfar_phy.h b/drivers/net/gianfar_phy.h --- a/drivers/net/gianfar_phy.h 2004-12-23 12:39:15 -06:00 +++ /dev/null Wed Dec 31 16:00:00 196900 @@ -1,213 +0,0 @@ -/* - * drivers/net/gianfar_phy.h - * - * Gianfar Ethernet Driver -- PHY handling - * Driver for FEC on MPC8540 and TSEC on MPC8540/MPC8560 - * Based on 8260_io/fcc_enet.c - * - * Author: Andy Fleming - * Maintainer: Kumar Gala (kumar.gala@freescale.com) - * - * Copyright (c) 2002-2004 Freescale Semiconductor, Inc. - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License as published by the - * Free Software Foundation; either version 2 of the License, or (at your - * option) any later version. 
- * - */ -#ifndef __GIANFAR_PHY_H -#define __GIANFAR_PHY_H - -#define MII_end ((u32)-2) -#define MII_read ((u32)-1) - -#define MIIMIND_BUSY 0x00000001 -#define MIIMIND_NOTVALID 0x00000004 - -#define GFAR_AN_TIMEOUT 2000 - -/* 1000BT control (Marvell & BCM54xx at least) */ -#define MII_1000BASETCONTROL 0x09 -#define MII_1000BASETCONTROL_FULLDUPLEXCAP 0x0200 -#define MII_1000BASETCONTROL_HALFDUPLEXCAP 0x0100 - -/* Cicada Extended Control Register 1 */ -#define MII_CIS8201_EXT_CON1 0x17 -#define MII_CIS8201_EXTCON1_INIT 0x0000 - -/* Cicada Interrupt Mask Register */ -#define MII_CIS8201_IMASK 0x19 -#define MII_CIS8201_IMASK_IEN 0x8000 -#define MII_CIS8201_IMASK_SPEED 0x4000 -#define MII_CIS8201_IMASK_LINK 0x2000 -#define MII_CIS8201_IMASK_DUPLEX 0x1000 -#define MII_CIS8201_IMASK_MASK 0xf000 - -/* Cicada Interrupt Status Register */ -#define MII_CIS8201_ISTAT 0x1a -#define MII_CIS8201_ISTAT_STATUS 0x8000 -#define MII_CIS8201_ISTAT_SPEED 0x4000 -#define MII_CIS8201_ISTAT_LINK 0x2000 -#define MII_CIS8201_ISTAT_DUPLEX 0x1000 - -/* Cicada Auxiliary Control/Status Register */ -#define MII_CIS8201_AUX_CONSTAT 0x1c -#define MII_CIS8201_AUXCONSTAT_INIT 0x0004 -#define MII_CIS8201_AUXCONSTAT_DUPLEX 0x0020 -#define MII_CIS8201_AUXCONSTAT_SPEED 0x0018 -#define MII_CIS8201_AUXCONSTAT_GBIT 0x0010 -#define MII_CIS8201_AUXCONSTAT_100 0x0008 - -/* 88E1011 PHY Status Register */ -#define MII_M1011_PHY_SPEC_STATUS 0x11 -#define MII_M1011_PHY_SPEC_STATUS_1000 0x8000 -#define MII_M1011_PHY_SPEC_STATUS_100 0x4000 -#define MII_M1011_PHY_SPEC_STATUS_SPD_MASK 0xc000 -#define MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX 0x2000 -#define MII_M1011_PHY_SPEC_STATUS_RESOLVED 0x0800 -#define MII_M1011_PHY_SPEC_STATUS_LINK 0x0400 - -#define MII_M1011_IEVENT 0x13 -#define MII_M1011_IEVENT_CLEAR 0x0000 - -#define MII_M1011_IMASK 0x12 -#define MII_M1011_IMASK_INIT 0x6400 -#define MII_M1011_IMASK_CLEAR 0x0000 - -#define MII_DM9161_SCR 0x10 -#define MII_DM9161_SCR_INIT 0x0610 - -/* DM9161 Specified Configuration 
and Status Register */ -#define MII_DM9161_SCSR 0x11 -#define MII_DM9161_SCSR_100F 0x8000 -#define MII_DM9161_SCSR_100H 0x4000 -#define MII_DM9161_SCSR_10F 0x2000 -#define MII_DM9161_SCSR_10H 0x1000 - -/* DM9161 Interrupt Register */ -#define MII_DM9161_INTR 0x15 -#define MII_DM9161_INTR_PEND 0x8000 -#define MII_DM9161_INTR_DPLX_MASK 0x0800 -#define MII_DM9161_INTR_SPD_MASK 0x0400 -#define MII_DM9161_INTR_LINK_MASK 0x0200 -#define MII_DM9161_INTR_MASK 0x0100 -#define MII_DM9161_INTR_DPLX_CHANGE 0x0010 -#define MII_DM9161_INTR_SPD_CHANGE 0x0008 -#define MII_DM9161_INTR_LINK_CHANGE 0x0004 -#define MII_DM9161_INTR_INIT 0x0000 -#define MII_DM9161_INTR_STOP \ -(MII_DM9161_INTR_DPLX_MASK | MII_DM9161_INTR_SPD_MASK \ - | MII_DM9161_INTR_LINK_MASK | MII_DM9161_INTR_MASK) - -/* DM9161 10BT Configuration/Status */ -#define MII_DM9161_10BTCSR 0x12 -#define MII_DM9161_10BTCSR_INIT 0x7800 - -#define MII_BASIC_FEATURES (SUPPORTED_10baseT_Half | \ - SUPPORTED_10baseT_Full | \ - SUPPORTED_100baseT_Half | \ - SUPPORTED_100baseT_Full | \ - SUPPORTED_Autoneg | \ - SUPPORTED_TP | \ - SUPPORTED_MII) - -#define MII_GBIT_FEATURES (MII_BASIC_FEATURES | \ - SUPPORTED_1000baseT_Half | \ - SUPPORTED_1000baseT_Full) - -#define MII_READ_COMMAND 0x00000001 - -#define MII_INTERRUPT_DISABLED 0x0 -#define MII_INTERRUPT_ENABLED 0x1 -/* Taken from mii_if_info and sungem_phy.h */ -struct gfar_mii_info { - /* Information about the PHY type */ - /* And management functions */ - struct phy_info *phyinfo; - - /* forced speed & duplex (no autoneg) - * partner speed & duplex & pause (autoneg) - */ - int speed; - int duplex; - int pause; - - /* The most recently read link state */ - int link; - - /* Enabled Interrupts */ - u32 interrupts; - - u32 advertising; - int autoneg; - int mii_id; - - /* private data pointer */ - /* For use by PHYs to maintain extra state */ - void *priv; - - /* Provided by host chip */ - struct net_device *dev; - - /* A lock to ensure that only one thing can read/write - * the MDIO 
bus at a time */ - spinlock_t mdio_lock; - - /* Provided by ethernet driver */ - int (*mdio_read) (struct net_device *dev, int mii_id, int reg); - void (*mdio_write) (struct net_device *dev, int mii_id, int reg, int val); -}; - -/* struct phy_info: a structure which defines attributes for a PHY - * - * id will contain a number which represents the PHY. During - * startup, the driver will poll the PHY to find out what its - * UID--as defined by registers 2 and 3--is. The 32-bit result - * gotten from the PHY will be ANDed with phy_id_mask to - * discard any bits which may change based on revision numbers - * unimportant to functionality - * - * There are 6 commands which take a gfar_mii_info structure. - * Each PHY must declare config_aneg, and read_status. - */ -struct phy_info { - u32 phy_id; - char *name; - unsigned int phy_id_mask; - u32 features; - - /* Called to initialize the PHY */ - int (*init)(struct gfar_mii_info *mii_info); - - /* Called to suspend the PHY for power */ - int (*suspend)(struct gfar_mii_info *mii_info); - - /* Reconfigures autonegotiation (or disables it) */ - int (*config_aneg)(struct gfar_mii_info *mii_info); - - /* Determines the negotiated speed and duplex */ - int (*read_status)(struct gfar_mii_info *mii_info); - - /* Clears any pending interrupts */ - int (*ack_interrupt)(struct gfar_mii_info *mii_info); - - /* Enables or disables interrupts */ - int (*config_intr)(struct gfar_mii_info *mii_info); - - /* Clears up any memory if needed */ - void (*close)(struct gfar_mii_info *mii_info); -}; - -struct phy_info *get_phy_info(struct gfar_mii_info *mii_info); -int read_phy_reg(struct net_device *dev, int mii_id, int regnum); -void write_phy_reg(struct net_device *dev, int mii_id, int regnum, int value); -void mii_clear_phy_interrupt(struct gfar_mii_info *mii_info); -void mii_configure_phy_interrupt(struct gfar_mii_info *mii_info, u32 interrupts); - -struct dm9161_private { - struct timer_list timer; - int resetdone; -}; - -#endif /* 
GIANFAR_PHY_H */ diff -Nru a/include/asm-ppc/fsl_ocp.h b/include/asm-ppc/fsl_ocp.h --- a/include/asm-ppc/fsl_ocp.h 2004-12-23 12:39:16 -06:00 +++ b/include/asm-ppc/fsl_ocp.h 2004-12-23 12:39:16 -06:00 @@ -17,6 +17,7 @@ #ifndef __ASM_FS_OCP_H__ #define __ASM_FS_OCP_H__ +#define GFAR_MII_OFFSET 0x520 /* A table of information for supporting the Gianfar Ethernet Controller * This helps identify which enet controller we are dealing with, * and what type of enet controller it is @@ -25,20 +26,18 @@ uint interruptTransmit; uint interruptError; uint interruptReceive; - uint interruptPHY; uint flags; - uint phyid; - uint phyregidx; + char *bus_id; unsigned char mac_addr[6]; }; /* Flags in the flags field */ -#define GFAR_HAS_COALESCE 0x20 -#define GFAR_HAS_RMON 0x10 -#define GFAR_HAS_MULTI_INTR 0x08 -#define GFAR_FIRM_SET_MACADDR 0x04 -#define GFAR_HAS_PHY_INTR 0x02 /* if not set use a timer */ -#define GFAR_HAS_GIGABIT 0x01 +#define GIANFAR_HAS_COALESCE 0x20 +#define GIANFAR_HAS_RMON 0x10 +#define GIANFAR_HAS_MULTI_INTR 0x08 +#define GIANFAR_FIRM_SET_MACADDR 0x04 +#define GIANFAR_HAS_PHY_INTR 0x02 /* if not set use a timer */ +#define GIANFAR_HAS_GIGABIT 0x01 /* Data structure for I2C support. 
Just contains a couple flags * to distinguish various I2C implementations*/ diff -Nru a/include/asm-ppc/mpc85xx.h b/include/asm-ppc/mpc85xx.h --- a/include/asm-ppc/mpc85xx.h 2004-12-23 12:39:16 -06:00 +++ b/include/asm-ppc/mpc85xx.h 2004-12-23 12:39:16 -06:00 @@ -103,8 +103,18 @@ #define MPC85xx_CPM_SIZE (0x40000) #define MPC85xx_DMA_OFFSET (0x21000) #define MPC85xx_DMA_SIZE (0x01000) +#define MPC85xx_DMA0_OFFSET (0x21100) +#define MPC85xx_DMA0_SIZE (0x00080) +#define MPC85xx_DMA1_OFFSET (0x21180) +#define MPC85xx_DMA1_SIZE (0x00080) +#define MPC85xx_DMA2_OFFSET (0x21200) +#define MPC85xx_DMA2_SIZE (0x00080) +#define MPC85xx_DMA3_OFFSET (0x21280) +#define MPC85xx_DMA3_SIZE (0x00080) #define MPC85xx_ENET1_OFFSET (0x24000) #define MPC85xx_ENET1_SIZE (0x01000) +#define MPC85xx_MIIM_OFFSET (0x24520) +#define MPC85xx_MIIM_SIZE (0x00018) #define MPC85xx_ENET2_OFFSET (0x25000) #define MPC85xx_ENET2_SIZE (0x01000) #define MPC85xx_ENET3_OFFSET (0x26000) @@ -138,6 +148,18 @@ #else #define CCSRBAR BOARD_CCSRBAR #endif + +enum fsl_devices { + MPC85xx_TSEC1, + MPC85xx_TSEC2, + MPC85xx_FEC, + MPC85xx_IIC1, + MPC85xx_DMA0, + MPC85xx_DMA1, + MPC85xx_DMA2, + MPC85xx_DMA3, + MPC85xx_MDIO, +}; #endif /* CONFIG_85xx */ #endif /* __ASM_MPC85xx_H__ */ diff -Nru a/include/linux/device.h b/include/linux/device.h --- a/include/linux/device.h 2004-12-23 12:39:16 -06:00 +++ b/include/linux/device.h 2004-12-23 12:39:16 -06:00 @@ -382,6 +382,8 @@ extern struct resource *platform_get_resource(struct platform_device *, unsigned int, unsigned int); extern int platform_get_irq(struct platform_device *, unsigned int); +extern struct resource *platform_get_resource_byname(struct platform_device *, unsigned int, char *); +extern int platform_get_irq_byname(struct platform_device *, char *); extern int platform_add_devices(struct platform_device **, int); extern struct platform_device *platform_device_register_simple(char *, unsigned int, struct resource *, unsigned int); --Apple-Mail-2-974361640 
Content-Transfer-Encoding: 7bit Content-Type: application/octet-stream; x-unix-mode=0644; name="phy_12232004.patch" Content-Disposition: attachment; filename=phy_12232004.patch diff -Nru a/drivers/net/Kconfig b/drivers/net/Kconfig --- a/drivers/net/Kconfig 2004-12-23 12:39:16 -06:00 +++ b/drivers/net/Kconfig 2004-12-23 12:39:16 -06:00 @@ -155,6 +155,8 @@ source "drivers/net/arcnet/Kconfig" endif +source "drivers/net/phy/Kconfig" + # # Ethernet # @@ -188,14 +190,6 @@ kernel: saying N will just cause the configurator to skip all the questions about Ethernet network cards. If unsure, say N. -config MII - tristate "Generic Media Independent Interface device support" - depends on NET_ETHERNET - help - Most ethernet controllers have MII transceiver either as an external - or internal device. It is safe to say Y or M here even if your - ethernet card lack MII. - source "drivers/net/arm/Kconfig" config MACE @@ -2079,17 +2073,6 @@ To compile this driver as a module, choose M here: the module will be called tg3. This is recommended. -config GIANFAR - tristate "Gianfar Ethernet" - depends on 85xx - help - This driver supports the Gigabit TSEC on the MPC85xx - family of chips, and the FEC on the 8540 - -config GFAR_NAPI - bool "NAPI Support" - depends on GIANFAR - config MV643XX_ETH tristate "MV-643XX Ethernet support" depends on MOMENCO_OCELOT_C || MOMENCO_JAGUAR_ATX @@ -2117,6 +2100,18 @@ help This enables support for Port 2 of the Marvell MV643XX Gigabit Ethernet. 
+ +config GIANFAR + tristate "Gianfar Ethernet" + depends on 85xx + depends on MII + help + This driver supports the Gigabit TSEC on the MPC85xx + family of chips, and the FEC on the 8540 + +config GFAR_NAPI + bool "NAPI Support" + depends on GIANFAR endmenu diff -Nru a/drivers/net/Makefile b/drivers/net/Makefile --- a/drivers/net/Makefile 2004-12-23 12:39:15 -06:00 +++ b/drivers/net/Makefile 2004-12-23 12:39:15 -06:00 @@ -12,7 +12,7 @@ obj-$(CONFIG_BONDING) += bonding/ obj-$(CONFIG_GIANFAR) += gianfar_driver.o -gianfar_driver-objs := gianfar.o gianfar_ethtool.o gianfar_phy.o +gianfar_driver-objs := gianfar.o gianfar_ethtool.o gianfar_mii.o # # link order important here @@ -63,6 +63,7 @@ # obj-$(CONFIG_MII) += mii.o +obj-$(CONFIG_MII) += phy/ obj-$(CONFIG_SUNDANCE) += sundance.o obj-$(CONFIG_HAMACHI) += hamachi.o diff -Nru a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/Kconfig 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,45 @@ +# +# PHY Layer Configuration +# + +menu "MII support" + +config MII + bool "Generic Media Independent Interface device support" + depends on NET_ETHERNET + help + Most ethernet controllers have an MII transceiver either as an + external or internal device. It is safe to say Y here even if + your ethernet card lacks MII. This code provides functions + for managing these devices, and infrastructure. 
+ +comment "MII PHY device drivers" + depends on MII + +config MARVELL_PHY + bool "Drivers for Marvell PHYs" + ---help--- + Currently has a driver for the 88E1011S + +config DAVICOM_PHY + bool "Drivers for Davicom PHYs" + ---help--- + Currently supports dm9161e and dm9131 + +config QSEMI_PHY + bool "Drivers for Quality Semiconductor PHYs" + ---help--- + Currently supports the qs6612 + +config LXT_PHY + bool "Drivers for the Intel LXT PHYs" + ---help--- + Currently supports the lxt970, lxt971 + +config CICADA_PHY + bool "Drivers for the Cicada PHYs" + ---help--- + Currently supports the cis8204 + +endmenu + diff -Nru a/drivers/net/phy/Makefile b/drivers/net/phy/Makefile --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/Makefile 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,9 @@ +# Makefile for Linux PHY drivers + +obj-$(CONFIG_MII) += phy.o phy_device.o mdio_bus.o + +obj-$(CONFIG_MARVELL_PHY) += marvell.o +obj-$(CONFIG_DAVICOM_PHY) += davicom.o +obj-$(CONFIG_CICADA_PHY) += cicada.o +obj-$(CONFIG_LXT_PHY) += lxt.o +obj-$(CONFIG_QSEMI_PHY) += qsemi.o diff -Nru a/drivers/net/phy/cicada.c b/drivers/net/phy/cicada.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/cicada.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,165 @@ +/* + * drivers/net/phy/cicada.c + * + * Driver for Cicada PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* Cicada Extended Control Register 1 */ +#define MII_CIS8201_EXT_CON1 0x17 +#define MII_CIS8201_EXTCON1_INIT 0x0000 + +/* Cicada Interrupt Mask Register */ +#define MII_CIS8201_IMASK 0x19 +#define MII_CIS8201_IMASK_IEN 0x8000 +#define MII_CIS8201_IMASK_SPEED 0x4000 +#define MII_CIS8201_IMASK_LINK 0x2000 +#define MII_CIS8201_IMASK_DUPLEX 0x1000 +#define MII_CIS8201_IMASK_MASK 0xf000 + +/* Cicada Interrupt Status Register */ +#define MII_CIS8201_ISTAT 0x1a +#define MII_CIS8201_ISTAT_STATUS 0x8000 +#define MII_CIS8201_ISTAT_SPEED 0x4000 +#define MII_CIS8201_ISTAT_LINK 0x2000 +#define MII_CIS8201_ISTAT_DUPLEX 0x1000 + +/* Cicada Auxiliary Control/Status Register */ +#define MII_CIS8201_AUX_CONSTAT 0x1c +#define MII_CIS8201_AUXCONSTAT_INIT 0x0004 +#define MII_CIS8201_AUXCONSTAT_DUPLEX 0x0020 +#define MII_CIS8201_AUXCONSTAT_SPEED 0x0018 +#define MII_CIS8201_AUXCONSTAT_GBIT 0x0010 +#define MII_CIS8201_AUXCONSTAT_100 0x0008 + +static int cis820x_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + int speed; + + status = phy_read(phydev, MII_CIS8201_AUX_CONSTAT); + if (status & MII_CIS8201_AUXCONSTAT_DUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + speed = status & MII_CIS8201_AUXCONSTAT_SPEED; + + switch (speed) { + case MII_CIS8201_AUXCONSTAT_GBIT: + phydev->speed = SPEED_1000; + break; + case MII_CIS8201_AUXCONSTAT_100: + phydev->speed = SPEED_100; + break; + default: + phydev->speed = SPEED_10; + 
break; + } + } + + return 0; +} + +static int cis820x_probe(struct phy_device *phydev) +{ + phy_write(phydev, MII_CIS8201_AUX_CONSTAT, + MII_CIS8201_AUXCONSTAT_INIT); + phy_write(phydev, MII_CIS8201_EXT_CON1, + MII_CIS8201_EXTCON1_INIT); + + return 0; +} + +static int cis820x_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_CIS8201_ISTAT); + + return 0; +} + +static int cis820x_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_CIS8201_IMASK, MII_CIS8201_IMASK_MASK); + else + phy_write(phydev, MII_CIS8201_IMASK, 0); + + return 0; +} + +/* Cicada 820x */ +static struct phy_driver cis8204_driver = { + 0x000fc440, + "Cicada Cis8204", + 0x000fffc0, + .features = PHY_GBIT_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .probe = &cis820x_probe, + .config_aneg = &gbit_config_aneg, + .read_status = &cis820x_read_status, + .ack_interrupt = &cis820x_ack_interrupt, + .config_intr = &cis820x_config_intr, +}; + +int __init cis8204_init(void) +{ + int retval; + + retval = phy_driver_register(&cis8204_driver); + + return retval; +} + +static void __exit cis8204_exit(void) +{ + phy_driver_unregister(&cis8204_driver); +} + +module_init(cis8204_init); +module_exit(cis8204_exit); diff -Nru a/drivers/net/phy/davicom.c b/drivers/net/phy/davicom.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/davicom.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,277 @@ +/* + * drivers/net/phy/davicom.c + * + * Driver for Davicom PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#define MII_DM9161_SCR 0x10 +#define MII_DM9161_SCR_INIT 0x0610 + +/* DM9161 Specified Configuration and Status Register */ +#define MII_DM9161_SCSR 0x11 +#define MII_DM9161_SCSR_100F 0x8000 +#define MII_DM9161_SCSR_100H 0x4000 +#define MII_DM9161_SCSR_10F 0x2000 +#define MII_DM9161_SCSR_10H 0x1000 + +/* DM9161 Interrupt Register */ +#define MII_DM9161_INTR 0x15 +#define MII_DM9161_INTR_PEND 0x8000 +#define MII_DM9161_INTR_DPLX_MASK 0x0800 +#define MII_DM9161_INTR_SPD_MASK 0x0400 +#define MII_DM9161_INTR_LINK_MASK 0x0200 +#define MII_DM9161_INTR_MASK 0x0100 +#define MII_DM9161_INTR_DPLX_CHANGE 0x0010 +#define MII_DM9161_INTR_SPD_CHANGE 0x0008 +#define MII_DM9161_INTR_LINK_CHANGE 0x0004 +#define MII_DM9161_INTR_INIT 0x0000 +#define MII_DM9161_INTR_STOP \ +(MII_DM9161_INTR_DPLX_MASK | MII_DM9161_INTR_SPD_MASK \ + | MII_DM9161_INTR_LINK_MASK | MII_DM9161_INTR_MASK) + +/* DM9161 10BT Configuration/Status */ +#define MII_DM9161_10BTCSR 0x12 +#define MII_DM9161_10BTCSR_INIT 0x7800 + +struct dm9161_private { + struct timer_list timer; + int resetdone; +}; + +#define DM9161_DELAY 1 +int dm9161_config_intr(struct phy_device *phydev) +{ + u16 temp; + + temp = phy_read(phydev, MII_DM9161_INTR); + + if(PHY_INTERRUPT_ENABLED == phydev->interrupts ) + temp &= ~(MII_DM9161_INTR_STOP); + else + temp |= MII_DM9161_INTR_STOP; + + phy_write(phydev, MII_DM9161_INTR, temp); + + return 0; +} + + +static int dm9161_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && 
phydev->link) { + status = phy_read(phydev, MII_DM9161_SCSR); + if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_100H)) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + + if (status & (MII_DM9161_SCSR_100F | MII_DM9161_SCSR_10F)) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + } + + return 0; +} + + +static void dm9161_timer(unsigned long data) +{ + struct phy_device *phydev = (struct phy_device *)data; + struct dm9161_private *priv = phydev->priv; + u16 status = phy_read(phydev, MII_BMSR); + + spin_lock(&phydev->lock); + if (status & BMSR_ANEGCOMPLETE) { + if (PHY_PENDING == phydev->state) + phydev->state = PHY_UP; + else + phydev->state = PHY_READY; + } else + mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); + + spin_unlock(&phydev->lock); +} + + +static int dm9161_config_aneg(struct phy_device *phydev) +{ + /* Isolate the PHY */ + phy_write(phydev, MII_BMCR, BMCR_ISOLATE); + + /* Configure the new settings */ + genphy_config_advert(phydev); + + /* Reconnect the PHY, and enable Autonegotiation */ + phy_write(phydev, MII_BMCR, BMCR_ANENABLE); + +#if 0 + /* Start a timer for DM9161_DELAY seconds to wait + * for the PHY to be ready */ + init_timer(&priv->timer); + priv->timer.function = &dm9161_timer; + priv->timer.data = (unsigned long) phydev; + mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); +#endif + + return 0; +} + +static int dm9161_probe(struct phy_device *phydev) +{ + struct dm9161_private *priv; + + /* Allocate the private data structure */ + priv = kmalloc(sizeof(struct dm9161_private), GFP_KERNEL); + + if (NULL == priv) + return -ENOMEM; + + phydev->priv = priv; + + /* Reset is not done yet */ + priv->resetdone = 0; + + /* Isolate the PHY */ + phy_write(phydev, MII_BMCR, BMCR_ISOLATE); + + /* Do not bypass the scrambler/descrambler */ + phy_write(phydev, MII_DM9161_SCR, MII_DM9161_SCR_INIT); + + /* Clear 10BTCSR to default */ + phy_write(phydev, MII_DM9161_10BTCSR, MII_DM9161_10BTCSR_INIT); + + /* 
Reconnect the PHY, and enable Autonegotiation */ + phy_write(phydev, MII_BMCR, BMCR_ANENABLE); + + phydev->state = PHY_STARTING; + + /* Start a timer for DM9161_DELAY seconds to wait + * for the PHY to be ready */ + init_timer(&priv->timer); + priv->timer.function = &dm9161_timer; + priv->timer.data = (unsigned long) phydev; + mod_timer(&priv->timer, jiffies + DM9161_DELAY * HZ); + + printk(KERN_INFO "Bringing up a Davicom PHY, this could take" + " a while...\n"); + return 0; +} + +static void dm9161_remove(struct phy_device *phydev) +{ + struct dm9161_private *priv = phydev->priv; + + del_timer_sync(&priv->timer); + kfree(priv); +} + +static int dm9161_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_DM9161_INTR); + + return 0; +} + +static struct phy_driver dm9161_driver = { + .phy_id = 0x0181b880, + .name = "Davicom DM9161E", + .phy_id_mask = 0x0ffffff0, + .features = PHY_BASIC_FEATURES, + .probe = dm9161_probe, + .config_aneg = dm9161_config_aneg, + .read_status = dm9161_read_status, + .remove = dm9161_remove, +}; + +static struct phy_driver dm9131_driver = { + .phy_id = 0x00181b80, + .name = "Davicom DM9131", + .phy_id_mask = 0x0ffffff0, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = genphy_config_aneg, + .read_status = dm9161_read_status, + .ack_interrupt = dm9161_ack_interrupt, + .config_intr = dm9161_config_intr, +}; + +int __init dm9161_init(void) +{ + int retval; + + retval = phy_driver_register(&dm9161_driver); + + return retval; +} + +static void __exit dm9161_exit(void) +{ + phy_driver_unregister(&dm9161_driver); +} + +module_init(dm9161_init); +module_exit(dm9161_exit); + +int __init dm9131_init(void) +{ + int retval; + + retval = phy_driver_register(&dm9131_driver); + + return retval; +} + +static void __exit dm9131_exit(void) +{ + phy_driver_unregister(&dm9131_driver); +} + +module_init(dm9131_init); +module_exit(dm9131_exit); diff -Nru a/drivers/net/phy/lxt.c b/drivers/net/phy/lxt.c --- /dev/null 
Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/lxt.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,237 @@ +/* + * drivers/net/phy/lxt.c + * + * Driver for Intel LXT PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* The Level one LXT970 is used by many boards */ + +#define MII_LXT970_MIRROR 16 /* Mirror register */ +#define MII_LXT970_IER 17 /* Interrupt Enable Register */ + +#define MII_LXT970_IER_IEN 0x0002 + +#define MII_LXT970_ISR 18 /* Interrupt Status Register */ + +#define MII_LXT970_CONFIG 19 /* Configuration Register */ +#define MII_LXT970_CSR 20 /* Chip Status Register */ + +#define MII_LXT970_CSR_DUPLEX 0x1000 +#define MII_LXT970_CSR_SPEED 0x0800 + +/* ------------------------------------------------------------------------- */ +/* The Level one LXT971 is used on some of my custom boards */ + +/* register definitions for the 971 */ + +#define MII_LXT971_PCR 16 /* Port Control Register */ + +#define MII_LXT971_SR2 17 /* Status Register 2 */ +#define MII_LXT971_SR2_DUPLEX 0x0200 +#define MII_LXT971_SR2_SPEED 0x4000 + +#define MII_LXT971_IER 18 /* Interrupt Enable Register */ +#define MII_LXT971_IER_IEN 0x00f2 + +#define MII_LXT971_ISR 19 /* Interrupt Status Register */ + +#define MII_LXT971_LCR 20 /* LED Control Register */ + +#define MII_LXT971_TCR 30 /* Transmit Control Register */ + + +static int lxt970_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an 
error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_LXT970_CSR); + if (status & MII_LXT970_CSR_DUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + if (status & MII_LXT970_CSR_SPEED) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + } + + return 0; +} + +static int lxt970_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_BMSR); + phy_read(phydev, MII_LXT970_ISR); + + return 0; +} + +static int lxt970_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_LXT970_IER, MII_LXT970_IER_IEN); + else + phy_write(phydev, MII_LXT970_IER, 0); + + return 0; +} + +static int lxt970_probe(struct phy_device *phydev) +{ + phy_write(phydev, MII_LXT970_CONFIG, 0); + + return 0; +} + + +static int lxt971_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_LXT971_SR2); + if (status & MII_LXT971_SR2_DUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex= DUPLEX_HALF; + + if (status & MII_LXT971_SR2_SPEED) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + } + + return 0; +} + +static int lxt971_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_LXT971_ISR); + + return 0; +} + +static int lxt971_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_LXT971_IER, MII_LXT971_IER_IEN); + else + phy_write(phydev, MII_LXT971_IER, 0); 
+ + return 0; +} + +static struct phy_driver lxt970_driver = { + .phy_id = 0x07810000, + .name = "LXT970", + .phy_id_mask = 0x0fffffff, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .probe = lxt970_probe, + .config_aneg = genphy_config_aneg, + .read_status = lxt970_read_status, + .ack_interrupt = lxt970_ack_interrupt, + .config_intr = lxt970_config_intr, +}; + +static struct phy_driver lxt971_driver = { + .phy_id = 0x0001378e, + .name = "LXT971", + .phy_id_mask = 0x0fffffff, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = genphy_config_aneg, + .read_status = lxt971_read_status, + .ack_interrupt = lxt971_ack_interrupt, + .config_intr = lxt971_config_intr, +}; + +int __init lxt970_init(void) +{ + int retval; + + retval = phy_driver_register(&lxt970_driver); + + return retval; +} + +static void __exit lxt970_exit(void) +{ + phy_driver_unregister(&lxt970_driver); +} + +module_init(lxt970_init); +module_exit(lxt970_exit); + +int __init lxt971_init(void) +{ + int retval; + + retval = phy_driver_register(&lxt971_driver); + + return retval; +} + +static void __exit lxt971_exit(void) +{ + phy_driver_unregister(&lxt971_driver); +} + +module_init(lxt971_init); +module_exit(lxt971_exit); diff -Nru a/drivers/net/phy/marvell.c b/drivers/net/phy/marvell.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/marvell.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,173 @@ +/* + * drivers/net/phy/marvell.c + * + * Driver for Marvell PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* 88E1011 PHY Status Register */ +#define MII_M1011_PHY_SPEC_STATUS 0x11 +#define MII_M1011_PHY_SPEC_STATUS_1000 0x8000 +#define MII_M1011_PHY_SPEC_STATUS_100 0x4000 +#define MII_M1011_PHY_SPEC_STATUS_SPD_MASK 0xc000 +#define MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX 0x2000 +#define MII_M1011_PHY_SPEC_STATUS_RESOLVED 0x0800 +#define MII_M1011_PHY_SPEC_STATUS_LINK 0x0400 + +#define MII_M1011_IEVENT 0x13 +#define MII_M1011_IEVENT_CLEAR 0x0000 + +#define MII_M1011_IMASK 0x12 +#define MII_M1011_IMASK_INIT 0x6400 +#define MII_M1011_IMASK_CLEAR 0x0000 + +static int marvell_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + int speed; + status = phy_read(phydev, MII_M1011_PHY_SPEC_STATUS); + +#if 0 + /* If speed and duplex aren't resolved, + * return an error. Isn't this handled + * by checking aneg? 
+ */ + if ((status & MII_M1011_PHY_SPEC_STATUS_RESOLVED) == 0) + return -EAGAIN; +#endif + + /* Get the duplexity */ + if (status & MII_M1011_PHY_SPEC_STATUS_FULLDUPLEX) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + /* Get the speed */ + speed = status & MII_M1011_PHY_SPEC_STATUS_SPD_MASK; + switch(speed) { + case MII_M1011_PHY_SPEC_STATUS_1000: + phydev->speed = SPEED_1000; + break; + case MII_M1011_PHY_SPEC_STATUS_100: + phydev->speed = SPEED_100; + break; + default: + phydev->speed = SPEED_10; + break; + } + phydev->pause = 0; + } + + return 0; +} + +static int marvell_ack_interrupt(struct phy_device *phydev) +{ + /* Clear the interrupts by reading the reg */ + phy_read(phydev, MII_M1011_IEVENT); + + return 0; +} + +static int marvell_config_intr(struct phy_device *phydev) +{ + if(phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_M1011_IMASK, MII_M1011_IMASK_INIT); + else + phy_write(phydev, MII_M1011_IMASK, MII_M1011_IMASK_CLEAR); + + return 0; +} + +static int marvell_config_aneg(struct phy_device *phydev) +{ + /* The Marvell PHY has an errata which requires + * that certain registers get written in order + * to restart autonegotiation */ + phy_write(phydev, MII_BMCR, BMCR_RESET); + + phy_write(phydev, 0x1d, 0x1f); + phy_write(phydev, 0x1e, 0x200c); + phy_write(phydev, 0x1d, 0x5); + phy_write(phydev, 0x1e, 0); + phy_write(phydev, 0x1e, 0x100); + + gbit_config_aneg(phydev); + + return 0; +} + + +static struct phy_driver m88e1101_driver = { + .phy_id = 0x01410c00, + .phy_id_mask = 0xffffff00, + .name = "Marvell 88E1101", + .features = PHY_GBIT_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .config_aneg = &marvell_config_aneg, + .read_status = &marvell_read_status, + .ack_interrupt = &marvell_ack_interrupt, + .config_intr = &marvell_config_intr, +}; + +int __init marvell_init(void) +{ + int retval; + + retval = phy_driver_register(&m88e1101_driver); + + return retval; +} + +static void __exit marvell_exit(void) +{ + 
phy_driver_unregister(&m88e1101_driver); +} + +module_init(marvell_init); +module_exit(marvell_exit); diff -Nru a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/mdio_bus.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,173 @@ +/* + * drivers/net/phy/mdio_bus.c + * + * MDIO Bus interface + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* register_mdiobus + * bus: The bus being registered + * + * description: Called by a bus driver to bring up all the PHYs + * on the bus, and attach them to the bus + */ +int register_mdiobus(struct mii_bus *bus) +{ + int i; + int err = 0; + + if (NULL == bus || NULL == bus->name || + NULL == bus->read || + NULL == bus->write) + return -EINVAL; + + spin_lock_init(&bus->mdio_lock); + + if (bus->reset) + bus->reset(bus); + + for (i=0; i < PHY_MAX_ADDR; i++) { + struct phy_device *phydev; + + phydev = get_phy_device(bus, i); + + /* There's a PHY at this address + * We need to set: + * 1) IRQ + * 2) bus_id + * 3) parent + * 4) bus + * 5) mii_bus + * And, we need to register it */ + if (phydev) { + phydev->irq = bus->irq[i]; + + phydev->dev.parent = bus->dev; + + phydev->dev.bus = &mdio_bus_type; + + phydev->bus = bus; + + sprintf(phydev->dev.bus_id, "phy%d:%d", bus->id, i); + + err = device_register(&phydev->dev); + + if (err) + printk("phy %d did not register (%d)\n", + i, err); + + bus->phy_map[i] = phydev; + } + } + + pr_info("%s: probed\n",
bus->name); + + return err; +} +EXPORT_SYMBOL(register_mdiobus); + +void unregister_mdiobus(struct mii_bus *bus) +{ + int i; + + for (i=0; i < PHY_MAX_ADDR; i++) + if (bus->phy_map[i]) { + device_unregister(&bus->phy_map[i]->dev); + kfree(bus->phy_map[i]); + } + +} +EXPORT_SYMBOL(unregister_mdiobus); + +/* mdio_bus_match + * dev: a PHY device + * drv: a PHY driver + * + * description: Given a PHY device, and a PHY driver, return 1 if + * the driver supports the device. Otherwise, return 0 + */ +int mdio_bus_match(struct device *dev, struct device_driver *drv) +{ + struct phy_device *phydev = to_phy_device(dev); + struct phy_driver *phydrv = to_phy_driver(drv); + + return (phydrv->phy_id == (phydev->phy_id & phydrv->phy_id_mask)); +} + +/* Suspend and resume. Copied from platform_suspend and + * platform_resume + */ +static int mdio_bus_suspend(struct device * dev, u32 state) +{ + int ret = 0; + + if (dev->driver && dev->driver->suspend) { + ret = dev->driver->suspend(dev, state, SUSPEND_DISABLE); + if (ret == 0) + ret = dev->driver->suspend(dev, state, SUSPEND_SAVE_STATE); + if (ret == 0) + ret = dev->driver->suspend(dev, state, SUSPEND_POWER_DOWN); + } + return ret; +} + +static int mdio_bus_resume(struct device * dev) +{ + int ret = 0; + + if (dev->driver && dev->driver->resume) { + ret = dev->driver->resume(dev, RESUME_POWER_ON); + if (ret == 0) + ret = dev->driver->resume(dev, RESUME_RESTORE_STATE); + if (ret == 0) + ret = dev->driver->resume(dev, RESUME_ENABLE); + } + return ret; +} + +struct bus_type mdio_bus_type = { + .name = "mdio_bus", + .match = mdio_bus_match, + .suspend= mdio_bus_suspend, + .resume = mdio_bus_resume, +}; + +int __init mdio_bus_init(void) +{ + return bus_register(&mdio_bus_type); +} + +subsys_initcall(mdio_bus_init); diff -Nru a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/phy.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,512 @@ +/* + * drivers/net/phy/phy.c + * + * Framework for 
configuring and reading PHY devices + * Based on code in sungem_phy.c and gianfar_phy.c + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +u16 phy_read(struct phy_device *phydev, u16 regnum); +void phy_write(struct phy_device *phydev, u16 regnum, u16 val); + +/* Convenience functions for reading a given PHY register. + * This MUST NOT be called from interrupt context, + * because the bus read function may sleep + * or generally lock up. */ +u16 phy_read(struct phy_device *phydev, u16 regnum) +{ + u16 retval; + struct mii_bus *bus = phydev->bus; + + spin_lock_bh(&bus->mdio_lock); + retval = bus->read(bus, phydev->addr, regnum); + spin_unlock_bh(&bus->mdio_lock); + + return retval; +} + +void phy_write(struct phy_device *phydev, u16 regnum, u16 val) +{ + struct mii_bus *bus = phydev->bus; + + spin_lock_bh(&bus->mdio_lock); + bus->write(bus, phydev->addr, regnum, val); + spin_unlock_bh(&bus->mdio_lock); +} + + +void phy_clear_interrupt(struct phy_device *phydev) +{ + if (phydev->drv->ack_interrupt) + phydev->drv->ack_interrupt(phydev); +} + + +void phy_config_interrupt(struct phy_device *phydev, + u32 interrupts) +{ + phydev->interrupts = interrupts; + if (phydev->drv->config_intr) + phydev->drv->config_intr(phydev); +} + + +static inline void phy_read_status(struct phy_device *phydev) +{ + phydev->drv->read_status(phydev); +} + +static inline int phy_aneg_done(struct phy_device *phydev) +{ + return (phy_read(phydev, MII_BMSR) & 
BMSR_ANEGCOMPLETE); +} + +/* phy_start_aneg + * phydev: The PHY on which to initiate auto-negotiation + * + * description: Calls the PHY driver's config_aneg, and then + * sets the PHY state to PHY_AN if auto-negotiation is enabled, + * and to PHY_FORCING if auto-negotiation is disabled. The state + * is left unchanged if the PHY is currently HALTED. + */ +void phy_start_aneg(struct phy_device *phydev) +{ + spin_lock(&phydev->lock); + + if (AUTONEG_DISABLE == phydev->autoneg) + phy_sanitize_settings(phydev); + + phydev->drv->config_aneg(phydev); + + if (phydev->state != PHY_HALTED) { + if (AUTONEG_ENABLE == phydev->autoneg) { + phydev->state = PHY_AN; + phydev->link_timeout = PHY_AN_TIMEOUT; + } else { + phydev->state = PHY_FORCING; + phydev->link_timeout = PHY_FORCE_TIMEOUT; + } + } + + spin_unlock(&phydev->lock); +} + + +/* phy_interrupt + * irq: Interrupt number + * phy_dat: PHY device which caused the interrupt (presumably) + * regs: -- + * + * description: When a PHY interrupt occurs, the handler disables + * interrupts, and schedules a work task to clear the interrupt. + */ +static irqreturn_t phy_interrupt(int irq, void *phy_dat, struct pt_regs *regs) +{ + struct phy_device *phydev = phy_dat; + + /* The MDIO bus is not allowed to be written in interrupt + * context, so we need to disable the irq here. A work + * queue will write the PHY to disable and clear the + * interrupt, and then reenable the irq line. */ + disable_irq_nosync(irq); + + schedule_work(&phydev->phy_queue); + + return IRQ_HANDLED; +} + +/* phy_start_interrupts + * phydev: The PHY whose interrupts are being enabled + * + * description: Request the interrupt for the given PHY. If + * this fails, then we set irq to -1 so that we do polling. + * Otherwise, we enable the interrupts. + * Returns 0 on success, -1 on error. 
+ */ +int phy_start_interrupts(struct phy_device *phydev) +{ + if (request_irq(phydev->irq, phy_interrupt, + SA_SHIRQ, + "phy_interrupt", + phydev) < 0) { + printk(KERN_ERR "%s: Can't get IRQ %d (PHY)\n", + phydev->bus->name, + phydev->irq); + phydev->irq = -1; + return -1; + } + + phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED); + + return 0; +} + +/* Scheduled by the phy_interrupt/timer to handle PHY changes */ +void phy_change(void *data) +{ + struct phy_device *phydev = data; + + /* Disable PHY interrupts */ + phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + + /* Clear the interrupt */ + phy_clear_interrupt(phydev); + + spin_lock(&phydev->lock); + if ((PHY_RUNNING == phydev->state) || (PHY_NOLINK == phydev->state)) + phydev->state = PHY_CHANGELINK; + spin_unlock(&phydev->lock); + + enable_irq(phydev->irq); + + /* Reenable interrupts, if needed */ + phy_config_interrupt(phydev, PHY_INTERRUPT_ENABLED); +} + +/* Bring down the PHY link, and stop checking the status. */ +void phy_stop(struct phy_device *phydev) +{ + spin_lock(&phydev->lock); + + if (PHY_HALTED == phydev->state) { + spin_unlock(&phydev->lock); + return; + } + + if (phydev->irq != -1) { + /* Clear any pending interrupts */ + phy_clear_interrupt(phydev); + + /* Disable PHY Interrupts */ + phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + } + + phydev->state = PHY_HALTED; + + spin_unlock(&phydev->lock); +} + + +/* phy_start + * phydev: The PHY device being started + * + * description: Indicates the attached device's readiness to + * handle PHY-related work. Used during startup to start the + * PHY, and after a call to phy_stop() to resume operation. 
+ */ +void phy_start(struct phy_device *phydev) +{ + spin_lock(&phydev->lock); + + switch (phydev->state) { + case PHY_STARTING: + phydev->state = PHY_PENDING; + break; + case PHY_READY: + phydev->state = PHY_UP; + break; + case PHY_HALTED: + phydev->state = PHY_RESUMING; + default: + break; + } + spin_unlock(&phydev->lock); +} +EXPORT_SYMBOL(phy_stop); +EXPORT_SYMBOL(phy_start); + +/* A structure for mapping a particular speed and duplex + * combination to a particular SUPPORTED and ADVERTISED value */ +struct phy_setting { + int speed; + int duplex; + u32 setting; +}; + +/* A mapping of all SUPPORTED settings to speed/duplex */ +static struct phy_setting settings[] = { + { .speed = 10000, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_10000baseT_Full, + }, + { .speed = SPEED_1000, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_1000baseT_Full, + }, + { .speed = SPEED_1000, .duplex = DUPLEX_HALF, + .setting = SUPPORTED_1000baseT_Half, + }, + { .speed = SPEED_100, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_100baseT_Full, + }, + { .speed = SPEED_100, .duplex = DUPLEX_HALF, + .setting = SUPPORTED_100baseT_Half, + }, + { .speed = SPEED_10, .duplex = DUPLEX_FULL, + .setting = SUPPORTED_10baseT_Full, + }, + { .speed = SPEED_10, .duplex = DUPLEX_HALF, + .setting = SUPPORTED_10baseT_Half, + }, +}; + +#define MAX_NUM_SETTINGS (sizeof(settings)/sizeof(struct phy_setting)) + +/* phy_find_setting + * speed: desired speed of setting + * duplex: desired duplex of setting + * + * description: Searches the settings array for the setting which + * matches the desired speed and duplex, and returns the index + * of that setting. Returns the index of the last setting if + * none of the others match. + */ +static inline int phy_find_setting(int speed, int duplex) +{ + int idx = 0; + + while (idx < MAX_NUM_SETTINGS && + (settings[idx].speed != speed || + settings[idx].duplex != duplex)) + idx++; + + return idx < MAX_NUM_SETTINGS ? 
idx : MAX_NUM_SETTINGS - 1; +} + +/* phy_find_valid + * idx: The first index in settings[] to search + * features: A mask of the valid settings + * + * description: Returns the index of the first valid setting less + * than or equal to the one pointed to by idx, as determined by + * the mask in features. Returns the index of the last setting + * if nothing else matches. + */ +static inline int phy_find_valid(int idx, u32 features) +{ + while (idx < MAX_NUM_SETTINGS && !(settings[idx].setting & features)) + idx++; + + return idx < MAX_NUM_SETTINGS ? idx : MAX_NUM_SETTINGS - 1; +} + +/* phy_sanitize_settings + * phydev: The PHY in question + * + * description: Make sure the PHY is set to supported speeds and + * duplexes. Drop down by one in this order: 1000/FULL, + * 1000/HALF, 100/FULL, 100/HALF, 10/FULL, 10/HALF + */ +void phy_sanitize_settings(struct phy_device *phydev) +{ + u32 features = phydev->supported; + int idx; + + /* Sanitize settings based on PHY capabilities */ + if ((features & SUPPORTED_Autoneg) == 0) + phydev->autoneg = 0; + + idx = phy_find_valid(phy_find_setting(phydev->speed, phydev->duplex), + features); + + phydev->speed = settings[idx].speed; + phydev->duplex = settings[idx].duplex; +} + +/* phy_force_reduction + * phydev: The PHY in question + * + * description: Reduces the speed/duplex settings by + * one notch. The order is so: + * 1000/FULL, 1000/HALF, 100/FULL, 100/HALF, + * 10/FULL, 10/HALF. The function bottoms out at 10/HALF. + */ +void phy_force_reduction(struct phy_device *phydev) +{ + int idx; + + idx = phy_find_setting(phydev->speed, phydev->duplex); + + idx++; + + idx = phy_find_valid(idx, phydev->supported); + + phydev->speed = settings[idx].speed; + phydev->duplex = settings[idx].duplex; + + printk(KERN_INFO "Trying %d/%s\n", phydev->speed, + DUPLEX_FULL == phydev->duplex ? 
"FULL" : "HALF"); +} + +/* PHY timer which handles the state machine */ +void phy_timer(unsigned long data) +{ + struct phy_device *phydev = (struct phy_device *)data; + int needs_aneg = 0; + + spin_lock(&phydev->lock); + + if (phydev->adjust_state) + phydev->adjust_state(phydev->attached_dev); + + switch(phydev->state) { + case PHY_DOWN: + case PHY_STARTING: + case PHY_READY: + case PHY_PENDING: + break; + case PHY_UP: + needs_aneg = 1; + + phydev->link_timeout = PHY_AN_TIMEOUT; + + if (phydev->irq != -1) + phy_start_interrupts(phydev); + + break; + case PHY_AN: + /* Check if negotiation is done. If so, + * we change to either RUNNING, or NOLINK */ + if (phy_aneg_done(phydev)) { + phy_read_status(phydev); + + if (phydev->link) + phydev->state = PHY_RUNNING; + else + phydev->state = PHY_NOLINK; + + phydev->adjust_link(phydev->attached_dev); + break; + } + + /* The counter expired, so either we + * switch to forced mode, or the + * magic_aneg bit exists, and we try aneg + * again */ + if (0 == phydev->link_timeout--) { + if (!(phydev->drv->flags & PHY_HAS_MAGICANEG)) { + int idx; + + /* We'll start from the + * fastest speed, and work + * our way down */ + idx = phy_find_valid(0, + phydev->supported); + + phydev->speed = settings[idx].speed; + phydev->duplex = settings[idx].duplex; + + phydev->autoneg = AUTONEG_DISABLE; + phydev->state = PHY_FORCING; + phydev->link_timeout = + PHY_FORCE_TIMEOUT; + + pr_info("Trying %d/%s\n", phydev->speed, + DUPLEX_FULL == + phydev->duplex ? 
+ "FULL" : "HALF"); + } + + needs_aneg = 1; + } + break; + case PHY_NOLINK: + phy_read_status(phydev); + + if (phydev->link) { + phydev->state = PHY_RUNNING; + phydev->adjust_link(phydev->attached_dev); + } + break; + case PHY_FORCING: + phy_read_status(phydev); + + if (phydev->link) { + phydev->state = PHY_RUNNING; + } else { + if (0 == phydev->link_timeout--) { + phy_force_reduction(phydev); + needs_aneg = 1; + } + } + + phydev->adjust_link(phydev->attached_dev); + break; + case PHY_RUNNING: + /* Only register a CHANGE if we aren't + * using interrupts */ + if (-1 == phydev->irq) + phydev->state = PHY_CHANGELINK; + break; + case PHY_CHANGELINK: + phy_read_status(phydev); + + if (phydev->link) + phydev->state = PHY_RUNNING; + else { + phydev->state = PHY_NOLINK; + } + + phydev->adjust_link(phydev->attached_dev); + + if (-1 != phydev->irq) + phy_config_interrupt(phydev, + PHY_INTERRUPT_ENABLED); + break; + case PHY_HALTED: + if (phydev->link) { + phydev->link = 0; + phydev->adjust_link(phydev->attached_dev); + } + break; + case PHY_RESUMING: + if (AUTONEG_ENABLE == phydev->autoneg) { + if (phy_aneg_done(phydev)) { + phydev->state = PHY_RUNNING; + } else { + phydev->state = PHY_AN; + phydev->link_timeout = PHY_AN_TIMEOUT; + } + } else + phydev->state = PHY_RUNNING; + break; + } + + spin_unlock(&phydev->lock); + + if (needs_aneg) + phy_start_aneg(phydev); + + mod_timer(&phydev->phy_timer, jiffies + PHY_STATE_TIME * HZ); +} diff -Nru a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/phy_device.c 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,598 @@ +/* + * drivers/net/phy/phy_device.c + * + * Framework for finding and configuring PHYs. + * Also contains generic PHY driver + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* get_phy_device + * bus: The bus the PHY is on + * addr: The address of the desired PHY device + * + * description: Reads the ID registers of the desired PHY, + * then allocates and returns the phy_device which + * represents it. + */ +struct phy_device * get_phy_device(struct mii_bus *bus, uint addr) +{ + u16 phy_reg; + u32 phy_id; + struct phy_device *dev = NULL; + + /* Grab the bits from PHYIR1, and put them + * in the upper half */ + phy_reg = bus->read(bus, addr, MII_PHYSID1); + phy_id = (phy_reg & 0xffff) << 16; + + /* Grab the bits from PHYIR2, and put them in the lower half */ + phy_reg = bus->read(bus, addr, MII_PHYSID2); + phy_id |= (phy_reg & 0xffff); + + /* If the phy_id is all Fs, there is no device there */ + if (0xffffffff == phy_id) + return NULL; + + /* Otherwise, we allocate the device, and initialize the + * default values */ + dev = kmalloc(sizeof(*dev), GFP_KERNEL); + + if (NULL == dev) { + errno = -ENOMEM; + return NULL; + } + + memset(dev, 0, sizeof(*dev)); + + dev->speed = 0; + dev->duplex = -1; + dev->pause = 0; + dev->link = 1; + + dev->autoneg = AUTONEG_ENABLE; + + dev->addr = addr; + dev->phy_id = phy_id; + dev->bus = bus; + + dev->state = PHY_DOWN; + + spin_lock_init(&dev->lock); + + INIT_WORK(&dev->phy_queue, phy_change, dev); + + return dev; +} + +/* phy_prepare_link: + * phydev: The PHY device whose link is being prepped + * adjust_link: The link change handler for the controller + * + * description: Tells the PHY infrastructure to 
handle the + * gory details on monitoring link status (whether through + * polling or an interrupt), and to call back to the + * connected device driver when the link status changes. + * If you want to monitor your own link state, don't call + * this function */ +void phy_prepare_link(struct phy_device *phydev, + void (*handler)(struct device *)) +{ + if (handler) + phydev->adjust_link = handler; + else + phydev->adjust_link = NULL; +} + +/* phy_start_machine: + * phydev: The PHY device whose state machine is being started + * handler: The callback function for state change notifications. + * + * description: The PHY infrastructure can run a state machine + * which tracks whether the PHY is starting up, negotiating, + * etc. This function starts the timer which tracks the state + * of the PHY. If you want to be notified when the state + * changes, pass in the callback, otherwise, pass NULL. If you + * want to maintain your own state machine, do not call this + * function. */ +void phy_start_machine(struct phy_device *phydev, + void (*handler)(struct device *)) +{ + if (handler) + phydev->adjust_state = handler; + else + phydev->adjust_state = NULL; + + init_timer(&phydev->phy_timer); + phydev->phy_timer.function = &phy_timer; + phydev->phy_timer.data = (unsigned long) phydev; + mod_timer(&phydev->phy_timer, jiffies + HZ); +} + +/* phy_stop_machine + * + * description: Stops the state machine timer, sets the state to + * UP (unless it wasn't up yet), and then frees the interrupt, + * if it is in use. This function must be called BEFORE + * phy_detach. 
+ */ +void phy_stop_machine(struct phy_device *phydev) +{ + del_timer_sync(&phydev->phy_timer); + + spin_lock(&phydev->lock); + if (phydev->state > PHY_UP) + phydev->state = PHY_UP; + spin_unlock(&phydev->lock); + + if (phydev->irq != -1) { + phy_config_interrupt(phydev, PHY_INTERRUPT_DISABLED); + phy_clear_interrupt(phydev); + free_irq(phydev->irq, phydev); + } + + phydev->adjust_state = NULL; +} + +/* phy_attach: + * dev: The requesting device + * phy_id: The name of the requested PHY device + * + * description: Called by drivers to attach to a particular PHY + * device. The phy_device is found, and properly hooked up + * to the phy_driver. If no driver is attached, then the + * genphy_driver is used. The phy_device is given a ptr to + * the attaching device, and given a callback for link status + * change. The phy_device is returned to the attaching + * driver. + */ +struct phy_device *phy_attach(struct device *dev, char *phy_id) +{ + struct phy_device *phydev = NULL; + struct bus_type *bus = &mdio_bus_type; + struct list_head *entry; + + /* Search the list of PHY devices on the mdio bus for the + * PHY with the requested name */ + list_for_each(entry, &bus->devices.list) + { + struct device *d = container_of(entry, struct device, bus_list); + + if (!strcmp(phy_id, d->bus_id)) { + phydev = to_phy_device(d); + break; + } + } + + if (NULL == phydev) { + printk(KERN_ERR "%s not found\n", phy_id); + errno = -ENODEV; + return NULL; + } + + /* Assume that if there is no driver, that it doesn't + * exist, and we should use the genphy driver. 
*/ + if (NULL == phydev->dev.driver) { + down_write(&phydev->dev.bus->subsys.rwsem); + phydev->dev.driver = &genphy_driver.driver; + + device_bind_driver(&phydev->dev); + up_write(&phydev->dev.bus->subsys.rwsem); + } + + if (phydev->attached_dev) { + printk(KERN_ERR "%s: %s already attached\n", + dev->bus_id, phy_id); + errno = -EBUSY; + return NULL; + } + + phydev->attached_dev = dev; + + return phydev; +} +EXPORT_SYMBOL(phy_attach); + +/* phy_connect: + * dev: The requesting device + * phy_id: The name of the requested PHY device + * handler: A callback function for handling link status + * changes + * + * description: Convenience function for connecting ethernet (or + * other) devices to PHY devices. The default behavior is for + * the PHY infrastructure to handle everything, and only notify + * the connected driver when the link status changes. If you + * don't want, or can't use, the provided functionality, you may + * choose to call only the subset of functions which provide + * the desired functionality. + */ +struct phy_device * phy_connect(struct device *dev, char *phy_id, + void (*handler)(struct device *)) +{ + struct phy_device *phydev; + + phydev = phy_attach(dev, phy_id); + + if (NULL == phydev) + return phydev; + + phy_prepare_link(phydev, handler); + + phy_start_machine(phydev, NULL); + + return phydev; +} +EXPORT_SYMBOL(phy_connect); + +void phy_disconnect(struct phy_device *phydev) +{ + phy_stop_machine(phydev); + + phydev->adjust_link = NULL; + + phy_detach(phydev); +} +EXPORT_SYMBOL(phy_disconnect); + +void phy_detach(struct phy_device *phydev) +{ + phydev->attached_dev = NULL; + + /* If the device had no specific driver before (i.e. 
- it + * was using the generic driver), we unbind the device + * from the generic driver so that there's a chance a + * real driver could be loaded */ + if (phydev->dev.driver == &genphy_driver.driver) { + down_write(&phydev->dev.bus->subsys.rwsem); + device_release_driver(&phydev->dev); + up_write(&phydev->dev.bus->subsys.rwsem); + } +} +EXPORT_SYMBOL(phy_detach); + + +/* Generic PHY support and helper functions */ + +/* genphy_config_advert + * + * description: Writes MII_ADVERTISE with the appropriate values, + * after sanitizing the values to make sure we only advertise + * what is supported + */ +void genphy_config_advert(struct phy_device *phydev) +{ + u32 advertise; + u16 adv; + + /* Only allow advertising what + * this PHY supports */ + phydev->advertising &= phydev->supported; + advertise = phydev->advertising; + + /* Setup standard advertisement */ + adv = phy_read(phydev, MII_ADVERTISE); + + adv &= ~(ADVERTISE_ALL | ADVERTISE_100BASE4); + if (advertise & ADVERTISED_10baseT_Half) + adv |= ADVERTISE_10HALF; + if (advertise & ADVERTISED_10baseT_Full) + adv |= ADVERTISE_10FULL; + if (advertise & ADVERTISED_100baseT_Half) + adv |= ADVERTISE_100HALF; + if (advertise & ADVERTISED_100baseT_Full) + adv |= ADVERTISE_100FULL; + + phy_write(phydev, MII_ADVERTISE, adv); +} + + +/* genphy_setup_forced + * + * description: Configures MII_BMCR to force speed/duplex + * to the values in phydev. Assumes that the values are valid. 
+ * Please see phy_sanitize_settings() */ +void genphy_setup_forced(struct phy_device *phydev) +{ + u16 ctrl = phy_read(phydev, MII_BMCR); + + ctrl &= ~(BMCR_FULLDPLX|BMCR_SPEED100|BMCR_SPEED1000|BMCR_ANENABLE); + ctrl |= BMCR_RESET; + + if (SPEED_1000 == phydev->speed) + ctrl |= BMCR_SPEED1000; + else if (SPEED_100 == phydev->speed) + ctrl |= BMCR_SPEED100; + + if (DUPLEX_FULL == phydev->duplex) + ctrl |= BMCR_FULLDPLX; + + phy_write(phydev, MII_BMCR, ctrl); +} + + +/* Enable and Restart Autonegotiation */ +void genphy_restart_aneg(struct phy_device *phydev) +{ + u16 ctl; + + ctl = phy_read(phydev, MII_BMCR); + ctl |= (BMCR_ANENABLE | BMCR_ANRESTART); + phy_write(phydev, MII_BMCR, ctl); +} + + +/* gbit_config_aneg + * + * description: Does the same thing as genphy_config_advert() + * (it even calls it), but also properly configures + * MII_1000BASETCONTROL. Should only be called for + * gigabit-capable PHYs + */ +int gbit_config_aneg(struct phy_device *phydev) +{ + u16 adv; + u32 advertise; + + if (AUTONEG_ENABLE == phydev->autoneg) { + /* Configure the ADVERTISE register */ + genphy_config_advert(phydev); + advertise = phydev->advertising; + + adv = phy_read(phydev, MII_1000BASETCONTROL); + adv &= ~(MII_1000BASETCONTROL_FULLDUPLEXCAP | + MII_1000BASETCONTROL_HALFDUPLEXCAP); + if (advertise & SUPPORTED_1000baseT_Half) + adv |= MII_1000BASETCONTROL_HALFDUPLEXCAP; + if (advertise & SUPPORTED_1000baseT_Full) + adv |= MII_1000BASETCONTROL_FULLDUPLEXCAP; + phy_write(phydev, MII_1000BASETCONTROL, adv); + + /* Start/Restart aneg */ + genphy_restart_aneg(phydev); + } else + genphy_setup_forced(phydev); + + return 0; +} + + +/* genphy_config_aneg + * + * description: If auto-negotiation is enabled, we configure the + * advertising, and then restart auto-negotiation. 
If it is not + * enabled, then we write the BMCR + */ +int genphy_config_aneg(struct phy_device *phydev) +{ + if (AUTONEG_ENABLE == phydev->autoneg) { + genphy_config_advert(phydev); + genphy_restart_aneg(phydev); + } else + genphy_setup_forced(phydev); + + return 0; +} + + +/* genphy_update_link + * + * description: Update the value in phydev->link to reflect the + * current link value. In order to do this, we need to read + * the status register twice, keeping the second value + */ +int genphy_update_link(struct phy_device *phydev) +{ + u16 status; + + /* Do a fake read */ + phy_read(phydev, MII_BMSR); + + /* Read link and autonegotiation status */ + status = phy_read(phydev, MII_BMSR); + if ((status & BMSR_LSTATUS) == 0) + phydev->link = 0; + else + phydev->link = 1; + + return 0; +} + +/* genphy_read_status + * + * description: Check the link, then figure out the current state + * by comparing what we advertise with what the link partner + * advertises. This is a bit silly, since pretty much every + * PHY has actual status fields to tell you what the result + * was, but if you don't want to implement that, this should + * work. 
+ */ +int genphy_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + if (AUTONEG_ENABLE == phydev->autoneg) { + status = phy_read(phydev, MII_LPA); + + status &= phy_read(phydev, MII_ADVERTISE); + + /* If we can do 100, set it so */ + if (status & (LPA_100FULL | LPA_100HALF)) + phydev->speed = SPEED_100; + else + phydev->speed = SPEED_10; + + /* If we have 100 full, it's full */ + if (status & (LPA_100FULL)) + phydev->duplex = DUPLEX_FULL; + + /* It's also full if we have 10 full, but not 100 half */ + else if ((status & (LPA_100HALF|LPA_10FULL)) == LPA_10FULL) + phydev->duplex = DUPLEX_FULL; + else + phydev->duplex = DUPLEX_HALF; + + phydev->pause = 0; + } + /* On non-aneg, we assume what we put in BMCR is the speed, + * though magic-aneg shouldn't prevent this case from occurring + */ + + return 0; +} + + +/* phy_probe + * dev: The device belonging to a PHY device + * + * description: Take care of setting up the phy_device structure, + * set the state to READY (the driver's probe function should + * set it to STARTING if needed). + */ +int phy_probe(struct device *dev) +{ + struct phy_device *phydev; + struct phy_driver *phydrv; + struct device_driver *drv; + int err = 0; + + phydev = to_phy_device(dev); + + /* Make sure the driver is held. + * XXX -- Is this correct? */ + drv = get_driver(phydev->dev.driver); + phydrv = to_phy_driver(drv); + phydev->drv = phydrv; + + /* Disable the interrupt if the PHY doesn't support it */ + if (!(phydrv->flags & PHY_HAS_INTERRUPT)) + phydev->irq = -1; + + /* Start out supporting everything. 
Eventually, + * a controller will attach, and may modify one + * or both of these values */ + phydev->supported = phydrv->features; + phydev->advertising = phydrv->features; + + spin_lock(&phydev->lock); + + /* Set the state to READY by default */ + phydev->state = PHY_READY; + + if (phydev->drv->probe) + err = phydev->drv->probe(phydev); + + spin_unlock(&phydev->lock); + + return err; +} + +int phy_remove(struct device *dev) +{ + struct phy_device *phydev; + + phydev = to_phy_device(dev); + + spin_lock(&phydev->lock); + phydev->state = PHY_DOWN; + spin_unlock(&phydev->lock); + + if (phydev->drv->remove) + phydev->drv->remove(phydev); + + put_driver(phydev->dev.driver); + phydev->drv = NULL; + + return 0; +} + +int phy_driver_register(struct phy_driver *new_driver) +{ + int retval; + + memset(&new_driver->driver, 0, sizeof(new_driver->driver)); + new_driver->driver.name = new_driver->name; + new_driver->driver.bus = &mdio_bus_type; + new_driver->driver.probe = phy_probe; + new_driver->driver.remove = phy_remove; + + retval = driver_register(&new_driver->driver); + + if (!retval) + pr_info("%s: Registered new driver\n", new_driver->name); + else + printk(KERN_ERR "%s: Error %d in registering driver\n", + new_driver->name, retval); + + return retval; +} + +void phy_driver_unregister(struct phy_driver *drv) +{ + driver_unregister(&drv->driver); +} + +struct phy_driver genphy_driver = { + .phy_id = 0x00000000, + .phy_id_mask = 0xffffffff, + .name = "Generic PHY", + .features = PHY_BASIC_FEATURES, + .config_aneg = genphy_config_aneg, + .read_status = genphy_read_status, +}; + +static int __init genphy_init(void) +{ + int retval; + + retval = phy_driver_register(&genphy_driver); + + return retval; +} + +static void __exit genphy_exit(void) +{ + phy_driver_unregister(&genphy_driver); +} + +module_init(genphy_init); +module_exit(genphy_exit); diff -Nru a/drivers/net/phy/qsemi.c b/drivers/net/phy/qsemi.c --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/drivers/net/phy/qsemi.c 
2004-12-23 12:39:16 -06:00 @@ -0,0 +1,183 @@ +/* + * drivers/net/phy/qsemi.c + * + * Driver for Quality Semiconductor PHYs + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +/* ------------------------------------------------------------------------- */ +/* The Quality Semiconductor QS6612 is used on the RPX CLLF */ + +/* register definitions */ + +#define MII_QS6612_MCR 17 /* Mode Control Register */ +#define MII_QS6612_FTR 27 /* Factory Test Register */ +#define MII_QS6612_MCO 28 /* Misc. Control Register */ +#define MII_QS6612_ISR 29 /* Interrupt Source Register */ +#define MII_QS6612_IMR 30 /* Interrupt Mask Register */ +#define MII_QS6612_IMR_INIT 0x003a +#define MII_QS6612_PCR 31 /* 100BaseTx PHY Control Reg. 
*/ + +#define QS6612_PCR_AN_COMPLETE 0x1000 +#define QS6612_PCR_RLBEN 0x0200 +#define QS6612_PCR_DCREN 0x0100 +#define QS6612_PCR_4B5BEN 0x0040 +#define QS6612_PCR_TX_ISOLATE 0x0020 +#define QS6612_PCR_OPMODE_MASK 0x001c +#define QS6612_PCR_MLT3_DIS 0x0002 +#define QS6612_PCR_SCRM_DESCRM 0x0001 + +enum qs6612_opmode { + still_an=0, + up10_half, + up100_half, + repeater, + reserved, + up10_full, + up100_full, + isolate_noaneg +}; + +static int qs6612_read_status(struct phy_device *phydev) +{ + u16 status; + int err; + + /* Update the link, but return if there + * was an error */ + err = genphy_update_link(phydev); + if (err) + return err; + + /* If the link is up, read the speed and duplex */ + /* If we aren't autonegotiating, assume speeds + * are as set */ + if (phydev->autoneg && phydev->link) { + status = phy_read(phydev, MII_QS6612_PCR); + /* The opmode field is bits 4:2 of the PCR: mask it + * first, then shift it down to match enum qs6612_opmode */ + switch((status & QS6612_PCR_OPMODE_MASK) >> 2) { + case up10_half: + phydev->speed = SPEED_10; + phydev->duplex = DUPLEX_HALF; + break; + case up100_half: + phydev->speed = SPEED_100; + phydev->duplex = DUPLEX_HALF; + break; + case up10_full: + phydev->speed = SPEED_10; + phydev->duplex = DUPLEX_FULL; + break; + case up100_full: + phydev->speed = SPEED_100; + phydev->duplex = DUPLEX_FULL; + break; + default: + /* Do nothing in the other states */ + break; + } + } + + return 0; +} + +int qs6612_probe(struct phy_device *phydev) +{ + /* The PHY powers up isolated on the RPX, + * so send a command to allow operation. + * XXX - My docs indicate this should be 0x0940 + * ...or something. The current value sets three + * reserved bits, bit 11, which specifies it should be + * set to one, bit 10, which specifies it should be set + * to 0, and bit 7, which doesn't specify. However, my + * docs are preliminary, and I will leave it like this + * until someone more knowledgeable corrects me or it. 
+ * -- Andy Fleming + */ + phy_write(phydev, MII_QS6612_PCR, 0x0dc0); + + return 0; +} + +int qs6612_ack_interrupt(struct phy_device *phydev) +{ + phy_read(phydev, MII_QS6612_ISR); + phy_read(phydev, MII_BMSR); + phy_read(phydev, MII_EXPANSION); + + return 0; +} + +int qs6612_config_intr(struct phy_device *phydev) +{ + if (phydev->interrupts == PHY_INTERRUPT_ENABLED) + phy_write(phydev, MII_QS6612_IMR, + MII_QS6612_IMR_INIT); + else + phy_write(phydev, MII_QS6612_IMR, 0); + + return 0; + +} + +static struct phy_driver qs6612_driver = { + .phy_id = 0x00181440, + .name = "QS6612", + .phy_id_mask = 0xfffffff0, + .features = PHY_BASIC_FEATURES, + .flags = PHY_HAS_INTERRUPT, + .probe = qs6612_probe, + .config_aneg = genphy_config_aneg, + .read_status = qs6612_read_status, + .ack_interrupt = qs6612_ack_interrupt, + .config_intr = qs6612_config_intr, +}; + +int __init qs6612_init(void) +{ + int retval; + + retval = phy_driver_register(&qs6612_driver); + + return retval; +} + +static void __exit qs6612_exit(void) +{ + phy_driver_unregister(&qs6612_driver); +} + +module_init(qs6612_init); +module_exit(qs6612_exit); diff -Nru a/include/linux/phy.h b/include/linux/phy.h --- /dev/null Wed Dec 31 16:00:00 196900 +++ b/include/linux/phy.h 2004-12-23 12:39:16 -06:00 @@ -0,0 +1,355 @@ +/* + * include/linux/phy.h + * + * Framework and drivers for configuring and reading different PHYs + * Based on code in sungem_phy.c and gianfar_phy.c + * + * Author: Andy Fleming + * + * Copyright (c) 2004 Freescale Semiconductor, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. 
+ *
+ */
+
+#ifndef __PHY_H
+#define __PHY_H
+
+#include
+#include
+
+/* 1000BT control (Marvell & BCM54xx at least) */
+#define MII_1000BASETCONTROL			0x09
+#define MII_1000BASETCONTROL_FULLDUPLEXCAP	0x0200
+#define MII_1000BASETCONTROL_HALFDUPLEXCAP	0x0100
+
+#define PHY_BASIC_FEATURES	(SUPPORTED_10baseT_Half | \
+				 SUPPORTED_10baseT_Full | \
+				 SUPPORTED_100baseT_Half | \
+				 SUPPORTED_100baseT_Full | \
+				 SUPPORTED_Autoneg | \
+				 SUPPORTED_TP | \
+				 SUPPORTED_MII)
+
+#define PHY_GBIT_FEATURES	(PHY_BASIC_FEATURES | \
+				 SUPPORTED_1000baseT_Half | \
+				 SUPPORTED_1000baseT_Full)
+
+#define PHY_HAS_INTERRUPT	0x00000001
+#define PHY_HAS_MAGICANEG	0x00000002
+
+#define MII_BUS_MAX 4
+
+
+#define PHY_INIT_TIMEOUT 100000
+#define PHY_STATE_TIME 1
+#define PHY_FORCE_TIMEOUT 10
+#define PHY_AN_TIMEOUT 10
+
+#define PHY_MAX_ADDR 32
+
+/* The Bus class for PHYs. Devices which provide access to
+ * PHYs should register using this structure */
+struct mii_bus {
+	const char *name;
+	int id;
+	void *priv;
+	u16 (*read)(struct mii_bus *bus, int phy_id, int regnum);
+	void (*write)(struct mii_bus *bus, int phy_id, int regnum, u16 val);
+	int (*reset)(struct mii_bus *bus);
+
+	/* A lock to ensure that only one thing can read/write
+	 * the MDIO bus at a time */
+	spinlock_t mdio_lock;
+
+	struct device *dev;
+
+	/* list of all PHYs on bus */
+	struct phy_device *phy_map[PHY_MAX_ADDR];
+
+	/* Pointer to an array of interrupts, each PHY's
+	 * interrupt at the index matching its address */
+	int *irq;
+};
+
+#define PHY_INTERRUPT_DISABLED 0x0
+#define PHY_INTERRUPT_ENABLED 0x80000000
+
+/* PHY state machine states:
+ *
+ * DOWN: PHY device and driver are not ready for anything. probe
+ * should be called if and only if the PHY is in this state,
+ * given that the PHY device exists.
+ * - PHY driver probe function will, depending on the PHY, set
+ * the state to STARTING or READY
+ *
+ * STARTING: PHY device is coming up, and the ethernet driver is
+ * not ready.
PHY drivers may set this in the probe function.
+ * If they do, they are responsible for making sure the state is
+ * eventually set to indicate whether the PHY is UP or READY,
+ * depending on the state when the PHY is done starting up.
+ * - PHY driver will set the state to READY
+ * - start will set the state to PENDING
+ *
+ * READY: PHY is ready to send and receive packets, but the
+ * controller is not. By default, PHYs which do not implement
+ * probe will be set to this state by phy_probe(). If the PHY
+ * driver knows the PHY is ready, and the PHY state is STARTING,
+ * then it sets this STATE.
+ * - start will set the state to UP
+ *
+ * PENDING: PHY device is coming up, but the ethernet driver is
+ * ready. phy_start will set this state if the PHY state is
+ * STARTING.
+ * - PHY driver will set the state to UP when the PHY is ready
+ *
+ * UP: The PHY and attached device are ready to do work.
+ * Interrupts should be started here.
+ * - timer moves to AN
+ *
+ * AN: The PHY is currently negotiating the link state. Link is
+ * therefore down for now. phy_timer will set this state when it
+ * detects the state is UP. config_aneg will set this state
+ * whenever called with phydev->autoneg set to AUTONEG_ENABLE.
+ * - If autonegotiation finishes, but there's no link, it sets
+ * the state to NOLINK.
+ * - If aneg finishes with link, it sets the state to RUNNING,
+ * and calls adjust_link
+ * - If autonegotiation did not finish after an arbitrary amount
+ * of time, autonegotiation should be tried again if the PHY
+ * supports "magic" autonegotiation (back to AN)
+ * - If it didn't finish, and no magic_aneg, move to FORCING.
+ *
+ * NOLINK: PHY is up, but not currently plugged in.
+ * - If the timer notes that the link comes back, we move to RUNNING
+ * - config_aneg moves to AN
+ * - phy_stop moves to HALTED
+ *
+ * FORCING: PHY is being configured with forced settings
+ * - if link is up, move to RUNNING
+ * - If link is down, we drop to the next highest setting, and
+ * retry (FORCING) after a timeout
+ * - phy_stop moves to HALTED
+ *
+ * RUNNING: PHY is currently up, running, and possibly sending
+ * and/or receiving packets
+ * - timer will set CHANGELINK if we're polling (this ensures the
+ * link state is polled every other cycle of this state machine,
+ * which makes it every other second)
+ * - irq will set CHANGELINK
+ * - config_aneg will set AN
+ * - phy_stop moves to HALTED
+ *
+ * CHANGELINK: PHY experienced a change in link state
+ * - timer moves to RUNNING if link
+ * - timer moves to NOLINK if the link is down
+ * - phy_stop moves to HALTED
+ *
+ * HALTED: PHY is up, but no polling or interrupts are done.
+ * Brings the link down.
+ * - phy_start moves to RESUMING
+ *
+ * RESUMING: PHY was halted, but now wants to run again.
+ * - If we are forcing, or aneg is done, timer moves to RUNNING
+ * - If aneg is not done, timer moves to AN
+ * - phy_stop moves to HALTED
+ */
+enum phy_state {
+	PHY_DOWN=0,
+	PHY_STARTING,
+	PHY_READY,
+	PHY_PENDING,
+	PHY_UP,
+	PHY_AN,
+	PHY_RUNNING,
+	PHY_NOLINK,
+	PHY_FORCING,
+	PHY_CHANGELINK,
+	PHY_HALTED,
+	PHY_RESUMING
+};
+
+/* phy_device: An instance of a PHY
+ *
+ * drv: Pointer to the driver for this PHY instance
+ * bus: Pointer to the bus this PHY is on
+ * dev: driver model device structure for this PHY
+ * phy_id: UID for this device found during discovery
+ * state: state of the PHY for management purposes
+ * addr: Bus address of PHY
+ * link_timeout: The number of timer firings to wait before
+ * giving up on the current attempt at acquiring a link
+ * irq: IRQ number of the PHY's interrupt (-1 if none)
+ * phy_timer: The timer for handling the state machine
+ * phy_queue: A work_queue for the interrupt
+ * attached_dev: The attached enet driver's device instance ptr
+ * adjust_link: Callback for the enet controller to respond to
+ * changes in the link state.
+ * adjust_state: Callback for the enet driver to respond to
+ * changes in the state machine.
+ *
+ * speed, duplex, pause, supported, advertising, and
+ * autoneg are used like in mii_if_info
+ *
+ * interrupts currently only supports enabled or disabled,
+ * but could be changed in the future to support enabling
+ * and disabling specific interrupts
+ *
+ * Contains some infrastructure for polling and interrupt
+ * handling, as well as handling shifts in PHY hardware state
+ */
+struct phy_device {
+	/* Information about the PHY type */
+	/* And management functions */
+	struct phy_driver *drv;
+
+	struct mii_bus *bus;
+
+	struct device dev;
+
+	u32 phy_id;
+
+	enum phy_state state;
+
+	/* Bus address of the PHY (0-31) */
+	int addr;
+
+	/* forced speed & duplex (no autoneg)
+	 * partner speed & duplex & pause (autoneg)
+	 */
+	int speed;
+	int duplex;
+	int pause;
+
+	/* The most recently read link state */
+	int link;
+
+	/* Enabled Interrupts */
+	u32 interrupts;
+
+	/* Union of PHY and Attached devices' supported modes */
+	/* See mii.h for more info */
+	u32 supported;
+	u32 advertising;
+
+	int autoneg;
+
+	int link_timeout;
+
+	/* Interrupt number for this PHY
+	 * -1 means no interrupt */
+	int irq;
+
+	/* private data pointer */
+	/* For use by PHYs to maintain extra state */
+	void *priv;
+
+	/* Interrupt and Polling infrastructure */
+	struct work_struct phy_queue;
+	struct timer_list phy_timer;
+
+	spinlock_t lock;
+
+	struct device *attached_dev;
+
+	void (*adjust_link)(struct device *dev);
+
+	void (*adjust_state)(struct device *dev);
+};
+#define to_phy_device(d) container_of(d, struct phy_device, dev)
+
+/* struct phy_driver: Driver structure for a particular PHY type
+ *
+ * phy_id: The result of reading the UID registers of this PHY
+ * type, and ANDing them with the phy_id_mask.
This driver
+ * only works for PHYs with IDs which match this field
+ * name: The friendly name of this PHY type
+ * phy_id_mask: Defines the important bits of the phy_id
+ * features: A list of features (speed, duplex, etc) supported
+ * by this PHY
+ * flags: A bitfield defining certain other features this PHY
+ * supports (like interrupts)
+ *
+ * The drivers must implement config_aneg and read_status. All
+ * other functions are optional. Note that none of these
+ * functions should be called from interrupt time. The goal is
+ * for the bus read/write functions to be able to block when the
+ * bus transaction is happening, and be freed up by an interrupt
+ * (The MPC85xx has this ability, though it is not currently
+ * supported in the driver).
+ */
+struct phy_driver {
+	u32 phy_id;
+	char *name;
+	unsigned int phy_id_mask;
+	u32 features;
+	u32 flags;
+
+	/* Called to initialize the PHY */
+	int (*probe)(struct phy_device *phydev);
+
+	/* PHY Power Management */
+	int (*suspend)(struct phy_device *phydev);
+	int (*resume)(struct phy_device *phydev);
+
+	/* Configures the advertisement and resets
+	 * autonegotiation if phydev->autoneg is on,
+	 * forces the speed to the current settings in phydev
+	 * if phydev->autoneg is off */
+	int (*config_aneg)(struct phy_device *phydev);
+
+	/* Determines the negotiated speed and duplex */
+	int (*read_status)(struct phy_device *phydev);
+
+	/* Clears any pending interrupts */
+	int (*ack_interrupt)(struct phy_device *phydev);
+
+	/* Enables or disables interrupts */
+	int (*config_intr)(struct phy_device *phydev);
+
+	/* Clears up any memory if needed */
+	void (*remove)(struct phy_device *phydev);
+
+	struct device_driver driver;
+};
+#define to_phy_driver(d) container_of(d, struct phy_driver, driver)
+
+u16 phy_read(struct phy_device *phydev, u16 regnum);
+void phy_write(struct phy_device *phydev, u16 regnum, u16 val);
+struct phy_device* get_phy_device(struct mii_bus *bus, uint addr);
+void phy_clear_interrupt(struct
phy_device *phydev);
+void phy_config_interrupt(struct phy_device *phydev, u32 interrupts);
+struct phy_device * phy_attach(struct device *dev, char *phy_id);
+struct phy_device * phy_connect(struct device *dev, char *phy_id,
+		void (*handler)(struct device *));
+void phy_disconnect(struct phy_device *phydev);
+void phy_detach(struct phy_device *phydev);
+void phy_start(struct phy_device *phydev);
+void phy_stop(struct phy_device *phydev);
+void phy_start_aneg(struct phy_device *phydev);
+int register_mdiobus(struct mii_bus *bus);
+void phy_change(void *data);
+void phy_timer(unsigned long data);
+void phy_sanitize_settings(struct phy_device *phydev);
+
+void genphy_config_advert(struct phy_device *phydev);
+void genphy_setup_forced(struct phy_device *phydev);
+void genphy_restart_aneg(struct phy_device *phydev);
+int gbit_config_aneg(struct phy_device *phydev);
+int genphy_config_aneg(struct phy_device *phydev);
+int genphy_update_link(struct phy_device *phydev);
+int genphy_read_status(struct phy_device *phydev);
+void phy_driver_unregister(struct phy_driver *drv);
+int phy_driver_register(struct phy_driver *new_driver);
+void phy_prepare_link(struct phy_device *phydev,
+		void (*adjust_link)(struct device *));
+void phy_start_machine(struct phy_device *phydev,
+		void (*handler)(struct device *));
+void phy_stop_machine(struct phy_device *phydev);
+
+extern struct bus_type mdio_bus_type;
+extern struct phy_driver genphy_driver;
+#endif /* __PHY_H */

--Apple-Mail-2-974361640
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII; format=flowed

On Dec 23, 2004, at 13:01, Andy Fleming wrote:

> Adds a Phy Abstraction Layer which allows ethernet controllers to
> manage their PHYs without knowing the details of how the particular
> PHY device operates. This code steals heavily from BenH's sungem
> driver, and got some stuff out of Jason McMullan's patch.
>
> Primary features of the code:
> * Allows drivers to only use what they want (to a degree).
If you
> want to handle it all yourself, but use some of the data structures
> and functions, that's ok. If you want to handle your own interrupts,
> that's fine. However, it also allows you to minimize PHY management
> code. See the gianfar driver patches (included for reference).
> * Integrates with current ethtool/mii defined fields.
> * Uses the driver model to manage binding PHY drivers to PHY devices,
> and MDIO bus drivers to MDIO bus devices.
> * Doesn't affect drivers which don't use it.

--Apple-Mail-2-974361640--

From simon.roscic@chello.at Thu Dec 23 15:01:22 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 15:01:29 -0800 (PST)
Received: from idefix1.limbo.tikom.at (data.tikom.at [193.108.212.253] (may be forged)) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNN11HN016237 for ; Thu, 23 Dec 2004 15:01:22 -0800
Received: from idefix1.limbo.tikom.at (localhost [127.0.0.1]) by idefix1.limbo.tikom.at (8.12.10/8.12.10) with ESMTP id iBNN2Ppk029699 for ; Fri, 24 Dec 2004 00:02:25 +0100
Received: from proxy.econet.tikom.at (asterix.limbo.tikom.at [172.16.3.2]) by idefix1.limbo.tikom.at (8.12.10/8.12.10) with ESMTP id iBNN2PaT029694 for ; Fri, 24 Dec 2004 00:02:25 +0100
Received: from proxy.mpreis.at (IDENT:0@[192.168.16.1]) by proxy.econet.tikom.at (8.12.9/8.12.9) with ESMTP id iBNN2O8b024393 for ; Fri, 24 Dec 2004 00:02:25 +0100
Received: from adam.mpreis.at (adam.mpreis.at [10.1.1.15]) by proxy.mpreis.at (Sendmail) with ESMTP id AAA02828 for ; Fri, 24 Dec 2004 00:02:24 +0100
Received: from vpn03.mpreis.at ([10.1.1.130]) by adam.mpreis.at (Lotus Domino Release 5.0.11) with ESMTP id 2004122400022421:14618 ; Fri, 24 Dec 2004 00:02:24 +0100
From: Simon Roscic
Subject: Fwd: [2.6] ethertap and af_inet.c assertion failures
Date: Fri, 24 Dec 2004 00:02:13 +0100
User-Agent: KMail/1.7.1
To: netdev@oss.sgi.com
MIME-Version: 1.0
Message-Id: <200412240002.13206.simon.roscic@chello.at>
X-MIMETrack: Itemize by SMTP Server on Adam/Mpreis(Release 5.0.11
|September 30, 2002) at 24.12.2004 00:02:24, Serialize by Router on Adam/Mpreis(Release 5.0.11 |September 30, 2002) at 24.12.2004 00:02:24, Serialize complete at 24.12.2004 00:02:24
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13007
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: simon.roscic@chello.at
Precedence: bulk
X-list: netdev

(i forgot to cc this mail to this list, sorry)

---------- Forwarded Message ----------

Subject: [2.6] ethertap and af_inet.c assertion failures
Date: Wednesday 22 December 2004 19:53
From: Simon Roscic
To: linux-kernel@vger.kernel.org

(please cc me, as i'm not subscribed to lkml - thanks)

hi,

today i upgraded my kernel from 2.6.9-rc2 to 2.6.10-rc3-bk12. now i get the following assertion failures while using the (closed source) phion vpn client. the vpn client uses ethertap; there are no closed source kernel modules or the like:

KERNEL: assertion (!atomic_read(&sk->sk_wmem_alloc)) failed at net/ipv4/af_inet.c (150)

when the kernel prints out the above message, the connection for the program using the vpn gets stuck. it happens very often if i use rdesktop, but it also happens when i just use ssh, so the bug may be triggered more often when there is more traffic over the vpn tunnel.

i tried with some other 2.6 kernel releases:

up to and including 2.6.9-rc2: no problem
2.6.9-rc3: does not boot on my machine
2.6.9-rc4: assertion failed as explained above
2.6.9: assertion failed as explained above

so it seems ethertap got broken somewhere post 2.6.9-rc2. any ideas what got changed post 2.6.9-rc2 which might cause this?

thanks for looking at this problem. if i can provide more information, just contact me.

bye, simon.
-------------------------------------------------------

From tgraf@suug.ch Thu Dec 23 15:07:53 2004
Received: with ECARTIS (v1.0.0; list netdev); Thu, 23 Dec 2004 15:08:00 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBNN7WTA017006 for ; Thu, 23 Dec 2004 15:07:53 -0800
Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 3D8EB82 for ; Fri, 24 Dec 2004 00:08:40 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001) id D605D1C0EA; Fri, 24 Dec 2004 00:09:22 +0100 (CET)
Date: Fri, 24 Dec 2004 00:09:22 +0100
From: Thomas Graf
To: netdev@oss.sgi.com
Subject: Re: [RFC] Extended Generic Packet Classifier
Message-ID: <20041223230922.GH7884@postel.suug.ch>
References: <20041223194834.GF7884@postel.suug.ch>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20041223194834.GF7884@postel.suug.ch>
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13008
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

* Thomas Graf <20041223194834.GF7884@postel.suug.ch> 2004-12-23 20:48
> The patch including the userspace tools can be found at
> http://people.suug.ch/~tgr/egp/. Be warned, this is totally unfinished
> work, some parts are not fully implemented yet and it still contains a lot
> of known issues. Nevertheless, I think it is time to publish it.

A few remarks on the yet to be addressed issues:

 o The evaluation of values when reading one uses a recursive algorithm
   which must be transformed. It's not difficult, just needs to be done.
 o The design decision to transfer the complete configuration in a
   single netlink message was wrong; it sets strong limits on the size
   of configurations. Changing it to an architecture where keys/values
   can be added/changed/removed brings some problems with locking and
   makes validation difficult, since the number of values/filters/keys
   must be known when validating a key or value. A good solution would
   be to temporarily disable a classifier while changing it and require
   the first change request to submit all parameters required for
   validation.

 o The fact that a value can represent a key result introduces a nasty
   endless loop possibility. The key referenced by the value could
   itself again reference the value, or another key doing so. There is
   no simple solution for this problem.

Some notes on performance:

EGP generally is a bit slower than classifiers such as u32 and fw if
used in the same way. This is easily explained by the increased
complexity and the level of abstraction. However, one can reduce the
number of filters and required matches a lot, which in the end gives
better performance results. OTOH, it might be a real performance killer
if used in a wrong way.

My thoughts on inclusion and future of this:

 o I think it's a rather bad idea to include this as of now, even if
   the code itself is quite clean. Using the cls api is currently the
   only way, but it would definitely be better to replace the cls api
   with something like this. EGP could be used as a basis.

 o I'll wait for some comments, given someone is actually interested in
   it, and decide what to do afterwards. There is quite some stuff in
   the queue postponed over and over, such as netlink errors, moving
   sched/ out of rtnetlink, metamatch, etc. If we're going to do the
   move outside of rtnetlink, we could at the same time do fundamental
   changes to the API which would be required to take over some of
   egp's ideas. It might be a step too far for some, but we can give it
   a few hair cuts and shrink it down to what is really needed.
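
To make the endless loop issue from the remarks above a bit more concrete, here is a minimal sketch in plain C. It assumes a strongly simplified, hypothetical model (not EGP's real data structures): every key or value is a node that either holds a constant or takes over the result of exactly one other node. The names `struct node`, `EGP_NO_REF` and `resolve` are made up for illustration. The walk is iterative rather than recursive, and it treats "more hops than nodes" as proof of a loop. This does not settle the problem described above; references that change while the classifier is live are the hard part and are not modelled here.

```c
#include <assert.h>

/* Hypothetical marker: the node holds a constant, no reference. */
#define EGP_NO_REF (-1)

/* Hypothetical node: stands for either a key or a value. */
struct node {
	int ref;	/* index of the node whose result is used, or EGP_NO_REF */
	int constant;	/* result when ref == EGP_NO_REF */
};

/*
 * Follow references iteratively instead of recursively.
 * A loop-free chain through n nodes needs at most n - 1 hops, so if no
 * constant has been reached after examining n nodes, the references
 * must form a cycle.
 * Returns 0 and stores the result, or -1 if a loop was detected.
 */
int resolve(const struct node *nodes, int n, int start, int *result)
{
	int cur = start;
	int hops;

	for (hops = 0; hops < n; hops++) {
		assert(cur >= 0 && cur < n);	/* references must stay in range */
		if (nodes[cur].ref == EGP_NO_REF) {
			*result = nodes[cur].constant;
			return 0;
		}
		cur = nodes[cur].ref;
	}
	return -1;	/* value -> key -> ... -> value again */
}
```

A validation pass could call `resolve()` for every node and refuse the configuration when it returns -1, at the cost of needing the complete set of nodes at validation time, which is exactly the constraint mentioned in the first remark.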
Cheers
From pb@bieringer.de Fri Dec 24 01:43:42 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 24 Dec 2004 01:43:49 -0800 (PST)
Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBO9hLHe027880 for ; Fri, 24 Dec 2004 01:43:42 -0800
Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id 2AE61137EE; Fri, 24 Dec 2004 10:44:49 +0100 (CET)
X-AV-Checked: Fri Dec 24 10:44:49 2004 smtp2.aerasec.de
Received: from [192.168.17.65] (pD9E96625.dip.t-dialin.net [217.233.102.37]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id AE43A137EA; Fri, 24 Dec 2004 10:44:43 +0100 (CET)
Date: Fri, 24 Dec 2004 10:45:25 +0100
From: Peter Bieringer
To: USAGI core , Maillist netdev
Cc: Harald Welte
Subject: ip6tables: accept of IPv6 transport esp packages not possible - no rule matches
Message-ID: <019064D0423CE6C823CBF476@t1mobil.muc.aerasec.de>
X-Mailer: Mulberry/3.1.5 (Linux/x86)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13011
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: pb@bieringer.de
Precedence: bulk
X-list: netdev

Hi all,

(first a Merry Christmas to all)

I ran into a major problem here: 2 IPv6 hosts can successfully connect to each other in case of unencrypted traffic, and filtering with ip6tables works fine. Now I've set up encryption between these two hosts (setkey & racoon).
IKE phase 1 & 2 works perfectly. But now, no ip6tables ACCEPT rule matches anymore. I played around, but without success.

I got the following log message (some MAC, IPv4 and IPv6 addresses are changed for privacy):

Dec 24 10:22:27 gate kernel: extIN-FW6-default:IN=sit_sixxs OUT= MAC=00:11:22:33:44:01->00:11:22:33:44:02 TUNNEL=212.224. 0.188-> 84.000. 0. 12 SRC=2001:06f8:0900:0449:0000:0000:0000:0002 DST=2001:06f8:0900:0094:0000:0000:0000:0002 LEN=116 TC=0 HOPLIMIT=63 FLOWLBL=0 OPT ( ) PROTO=59

Caused by the following ruleset:

# ip6tables -vn -L extIN --line-num
Chain extIN (4 references)
num  pkts bytes target  prot    opt in  out  source                   destination
1       0     0 ACCEPT  all         *   *    2001:6f8:900:449::2/128  2001:6f8:900:94::2/128
2       0     0 ACCEPT  tcp         *   *    ::/0                     3ffe:400:100:f101::1/128tcp spts:1024:65535 dpt:80
3      27  2808 ACCEPT  icmpv6      *   *    ::/0                     ::/0
4       6   888 ACCEPT  udp         *   *    2001:6f8:900:449::2/128  2001:6f8:900:94::2/128udp spt:500 dpt:500
5       0     0 ACCEPT  esp         *   *    2001:6f8:900:449::2/128  2001:6f8:900:94::2/128
6       0     0 ACCEPT  59          *   *    2001:6f8:900:449::2/128  2001:6f8:900:94::2/128
                                                                      tcp spts:512:65535 dpt:22
10      0     0 ACCEPT  tcp         *   *    ::/0                     ::/0 tcp spts:1:65535 dpts:32768:60099 flags:!0x16/0x02
11      0     0 ACCEPT  udp         *   *    ::/0                     ::/0 udp spts:1:65535 dpts:32768:60099
12     13  1564 LOG     all         *   *    ::/0                     ::/0 limit: avg 5/min burst 5 LOG flags 0 level 7 prefix `extIN-FW6-default:'
13     13  1564 DROP    all         *   *    ::/0                     ::/0

As you see, neither rule 1 nor rule 6 matches, which is strange indeed - what's the reason? Why does the DROP rule (13) match, but not the global ACCEPT rule (1)?

Both sides are using Linux kernel 2.6.9-1.681_FC3 from Fedora Core 3 updates.

BTW: can someone fix the log statement?
TUNNEL=212.224. 0.188-> 84.128. 0. 12
-> leading spaces instead of leading zeros do not look very good.

Thank you very much.

Peter
-- Dr.
Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/

From pb@bieringer.de Fri Dec 24 07:57:18 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 24 Dec 2004 07:57:24 -0800 (PST)
Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBOFuuDj024797 for ; Fri, 24 Dec 2004 07:57:18 -0800
Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id ABCD4137EE; Fri, 24 Dec 2004 16:58:25 +0100 (CET)
X-AV-Checked: Fri Dec 24 16:58:25 2004 smtp2.aerasec.de
Received: from [192.168.17.65] (pD9E96625.dip.t-dialin.net [217.233.102.37]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id ABEDE137EA; Fri, 24 Dec 2004 16:58:24 +0100 (CET)
Date: Fri, 24 Dec 2004 16:59:07 +0100
From: Peter Bieringer
To: USAGI core , Maillist netdev
Cc: Harald Welte
Subject: Re: ip6tables: accept of IPv6 transport esp packages not possible - no rule matches
Message-ID: <5F6ACA5CEF52DBFBF11FBF94@t1mobil.muc.aerasec.de>
In-Reply-To: <019064D0423CE6C823CBF476@t1mobil.muc.aerasec.de>
References: <019064D0423CE6C823CBF476@t1mobil.muc.aerasec.de>
X-Mailer: Mulberry/3.1.6 (Linux/x86)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13013
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: pb@bieringer.de
Precedence: bulk
X-list: netdev

Hi again,

one update (after playing now with openswan):

> Dec 24 10:22:27 gate kernel: extIN-FW6-default:IN=sit_sixxs OUT=
> MAC=00:11:22:33:44:01->00:11:22:33:44:02 TUNNEL=212.224. 0.188-> 84.000.
> 0. 12 SRC=2001:06f8:0900:0449:0000:0000:0000:0002
> DST=2001:06f8:0900:0094:0000:0000:0000:0002 LEN=116 TC=0 HOPLIMIT=63
> FLOWLBL=0 OPT ( ) PROTO=59

I found a difference in the handling of the following rules:

#1 ip6tables -A extIN -p all -s 2001:6f8:900:94::2 -d 2001:6f8:900:449::2 -j ACCEPT
#2 ip6tables -A extIN -s 2001:6f8:900:94::2 -d 2001:6f8:900:449::2 -j ACCEPT

Rule #1 doesn't match that strangeness, while rule #2 does (and - partially - solves my problem now)!

Looks like there is something going wrong in the protocol matching algorithm in netfilter6.

So at the moment, I can't filter the traffic, but the connection is encrypted.

Perhaps of interest: I am using openswan from Fedora Core 3 with the following very simple configuration:

conn ipv6-location1-location2
        connaddrfamily=ipv6
        left=2001:6f8:900:94::2
        right=2001:6f8:900:449::2
        authby=secret
        type=transport

Peter
-- Dr.
Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/ From eric.lemoine@gmail.com Fri Dec 24 08:09:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 24 Dec 2004 08:09:40 -0800 (PST) Received: from wproxy.gmail.com (wproxy.gmail.com [64.233.184.193]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBOG9DQd025960 for ; Fri, 24 Dec 2004 08:09:34 -0800 Received: by wproxy.gmail.com with SMTP id 71so161740wra for ; Fri, 24 Dec 2004 08:10:39 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=RPJI48dglD8YeBJ46kWN+HcSYDXuZzBK0eRBKUhaeAwuhky7aczYeTZ0Pud/QjsnD52fn+zitTjRd42LJY43rxgj2gbB4l2uy/iO2biGJGYObHOWnCWR8LwjEPhzCzZTeBdWOXQ8OUlS3UDgCjZXeQPQk+m1CkEHyW7/7usk/X0= Received: by 10.54.10.22 with SMTP id 22mr308549wrj; Fri, 24 Dec 2004 08:10:38 -0800 (PST) Received: by 10.54.30.8 with HTTP; Fri, 24 Dec 2004 08:10:38 -0800 (PST) Message-ID: <5cac192f04122408102129af43@mail.gmail.com> Date: Fri, 24 Dec 2004 17:10:38 +0100 From: Eric Lemoine Reply-To: Eric Lemoine To: Patrick McHardy Subject: Re: LLTX and netif_stop_queue Cc: "David S. 
Miller" , hadi@cyberus.ca, roland@topspin.com, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <41CAF444.3000305@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> <20041222202919.057b8331.davem@davemloft.net> <5cac192f0412230110628749e3@mail.gmail.com> <41CAF444.3000305@trash.net> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13014 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: eric.lemoine@gmail.com Precedence: bulk X-list: netdev

On Thu, 23 Dec 2004 17:37:24 +0100, Patrick McHardy wrote:
> Eric Lemoine wrote:
> > I still have one concern with the LLTX code (and it may be that the
> > correct patch is Jamal's) :
> >
> > Without LLTX we do : lock(queue_lock), lock(xmit_lock),
> > release(queue_lock), release(xmit_lock). With LLTX (without Jamal's
> > patch) we do : lock(queue_lock), release(queue_lock), lock(tx_lock),
> > release(tx_lock). LLTX doesn't look correct because it creates a race
> > condition window between the two lock-protected sections. So you
> > may want to reconsider Jamal's patch or pull out LLTX...
>
> You're right, it can cause packet reordering if something like this
> happens:
>
> CPU1                    CPU2
> lock(queue_lock)
> dequeue
> unlock(queue_lock)
>                         lock(queue_lock)
>                         dequeue
>                         unlock(queue_lock)
>                         lock(xmit_lock)
>                         hard_start_xmit
>                         unlock(xmit_lock)
> lock(xmit_lock)
> hard_start_xmit
> unlock(xmit_lock)
>
> Jamal's patch should fix this.

Yes, but requiring drivers to release a lock that they should not even be aware of doesn't sound good.
Another way would be to keep dev->queue_lock grabbed when entering start_xmit() and let the driver drop it (and re-acquire it before it returns) only if it wishes to. Although I don't like this too much either, that's the best way I can think of up to now...

-- 
Eric

From linux.lover2004@gmail.com Fri Dec 24 21:17:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 24 Dec 2004 21:17:27 -0800 (PST) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.192]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBP5H07D007783 for ; Fri, 24 Dec 2004 21:17:20 -0800 Received: by rproxy.gmail.com with SMTP id f1so90802rne for ; Fri, 24 Dec 2004 21:18:19 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding; b=fzbiyJK9XeeXS/LCitUZMSj5KhaR1cBIt+0LjpSk5noS19en6I+8eZ1dGq9OkmCuxEew/b73k8/a3B83hDcYmj63rgXtWn6R9ZtrnbmDkkiqCkEn110yAf5gx7N2szrovVnMAxK1JTZj4mHuuIMDionEAQu9Tgf4SqCzaKRc5Lg= Received: by 10.38.67.12 with SMTP id p12mr249842rna; Fri, 24 Dec 2004 21:18:18 -0800 (PST) Received: by 10.38.207.9 with HTTP; Fri, 24 Dec 2004 21:18:18 -0800 (PST) Message-ID: <72c6e37904122421186c8d4d67@mail.gmail.com> Date: Sat, 25 Dec 2004 10:48:18 +0530 From: linux lover Reply-To: linux lover To: netdev@oss.sgi.com Subject: tcp sequence no. query Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13015 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux.lover2004@gmail.com Precedence: bulk X-list: netdev

Hi all,

I am analysing TCP sequence numbers on a single PC by running a TCP socket program on an Ethernet card with IP 192.168.1.200.
For that I added debug messages in tcp_v4_rcv and also in tcp_transmit_skb to check which sequence numbers are used by kernel 2.4.24. The results I found by running dmesg on the console are:

-----------OUTPUT INTERFACE FOR TCP PACKET CONNECT----------
TCP_CONNECT
After push tcp header in tcp_transmit_skb th->seq=-745717026
After push tcp header in tcp_transmit_skb th->ack_seq=0
After pull tcp header in tcp_v4_rcv th->seq=-745717026
After pull tcp header in tcp_v4_rcv th->ack_seq=0

But that happens only about once in 10 times. Why do I get the negative sequence numbers shown above?

regards,
linux.lover

From pranav@nodeinfotech.com Sat Dec 25 03:25:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 03:25:47 -0800 (PST) Received: from localhost.localdomain (dialpool-210-214-17-202.maa.sify.net [210.214.17.202]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPBPIlB025209 for ; Sat, 25 Dec 2004 03:25:40 -0800 Received: from pranav ([192.168.10.220]) by localhost.localdomain (8.11.2/8.11.2) with SMTP id iBPBhci00811; Sat, 25 Dec 2004 17:13:40 +0530 Reply-To: From: "Pranav" To: Cc: Subject: Wish you all a Merry Christmas Date: Sat, 25 Dec 2004 16:56:44 +0530 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0) In-Reply-To: <20041222171836.GL5974@waste.org> Importance: Normal X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13016 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pranav@nodeinfotech.com Precedence: bulk X-list: netdev

Hi everyone,

Wishing you all A Prosperous Merry ChristMas.
Hope Coming years brings Peace,Happiness,blessings of CHRISTmas to you all
,your family and this World.
With Regards, Pranav. From jengelh@linux01.gwdg.de Sat Dec 25 03:29:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 03:29:47 -0800 (PST) Received: from linux01.gwdg.de (root@linux01.gwdg.de [134.76.13.21]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPBTH8V025730 for ; Sat, 25 Dec 2004 03:29:38 -0800 Received: from linux01.gwdg.de (localhost [127.0.0.1]) by linux01.gwdg.de (8.12.7/8.12.7/SuSE Linux 0.6) with ESMTP id iBPBUjrg026358; Sat, 25 Dec 2004 12:30:45 +0100 Received: from localhost (jengelh@localhost) by linux01.gwdg.de (8.12.7/8.12.7/Submit) with ESMTP id iBPBUjX1026354; Sat, 25 Dec 2004 12:30:45 +0100 Date: Sat, 25 Dec 2004 12:30:44 +0100 (MET) From: Jan Engelhardt To: Pranav cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: Wish you all a Merry Christmas In-Reply-To: Message-ID: References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13017 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jengelh@linux01.gwdg.de Precedence: bulk X-list: netdev >Wishing you all A Prosperous Merry ChristMas. >Hope Coming years brings Peace,Happiness,blessings of CHRISTmas to you all >,your family and this World. > >With Regards, >Pranav. I don't see how this is related to linux-kernel. 
Jan Engelhardt -- ENOSPC From laforge@gnumonks.org Sat Dec 25 03:53:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 03:53:21 -0800 (PST) Received: from ganesha.gnumonks.org (Debian-exim@ganesha.gnumonks.org [213.95.27.120]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPBqrwW027032 for ; Sat, 25 Dec 2004 03:53:13 -0800 Received: from dsl-082-082-096-220.arcor-ip.net ([82.82.96.220] helo=sunbeam.gnumonks.org) by ganesha.gnumonks.org with asmtp (TLS-1.0:RSA_ARCFOUR_SHA:16) (Exim 4.34) id 1CiAV4-0003Pv-7T; Sat, 25 Dec 2004 12:54:22 +0100 Received: from laforge by sunbeam.gnumonks.org with local (Exim 4.34) id 1CiAUx-0007Ed-PH; Sat, 25 Dec 2004 12:54:15 +0100 Date: Sat, 25 Dec 2004 12:54:15 +0100 From: Harald Welte To: "YOSHIFUJI Hideaki / ?$B5HF#1QL@" Cc: davem@davemloft.net, pekkas@netcore.fi, netdev@oss.sgi.com, usagi-users@linux-ipv6.org Subject: Re: 2.6.8.1 IPv6 Routing Problem Message-ID: <20041225115415.GI24142@sunbeam.de.gnumonks.org> References: <20040920.152012.114156249.yoshfuji@linux-ipv6.org> <20040921195752.015e3d1d.davem@davemloft.net> <20040922.120630.96674716.yoshfuji@linux-ipv6.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="kGQlNN4Ir6FkfZg7" Content-Disposition: inline In-Reply-To: <20040922.120630.96674716.yoshfuji@linux-ipv6.org> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13018 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: laforge@gnumonks.org Precedence: bulk X-list: netdev --kGQlNN4Ir6FkfZg7 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Getting back to an ooooold thread... 
On Wed, Sep 22, 2004 at 12:06:30PM +0900, YOSHIFUJI Hideaki / 吉藤英明 wrote:
> In article <20040921195752.015e3d1d.davem@davemloft.net> (at Tue, 21 Sep 2004 19:57:52 -0700), "David S. Miller" says:
>
> > On Mon, 20 Sep 2004 15:20:12 +0900 (JST)
> > YOSHIFUJI Hideaki / 吉藤英明 wrote:
> >
> > > This behavior lives for years (AFAIK),
> > > and we haven't got so many reports
> > > because people usually bring loopback device first.
> > >
> > > I think the following message will help people, anyway.
> >
> > If ipv6 has a dependency upon this, why doesn't it just
> > bring the device up itself at this moment if necessary?
>
> Okay, I'll make a patch for this.

As far as I can see (please correct me) no such patch was included so
far, at least with 2.6.10-rc3 I still have the old behaviour.

Could you please include the proposed printk-warning-patch until the
issue gets resolved?

Thanks!
-- 
- Harald Welte http://www.gnumonks.org/
============================================================================
Programming is like sex: One mistake and you have to support it your lifetime

--kGQlNN4Ir6FkfZg7
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: Digital signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFBzVTnXaXGVTD0i/8RAnPVAKCMsgFcBs6mnznNGik38Tad4gDFaQCfRxvx
jefOfynXR/pSGzB0CYcEIsA=
=96sV
-----END PGP SIGNATURE-----

--kGQlNN4Ir6FkfZg7--

From domen@coderock.org Sat Dec 25 06:13:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 06:14:04 -0800 (PST) Received: from golobica.uni-mb.si (golobica.uni-mb.si [164.8.100.4]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPEDYHd007150 for ; Sat, 25 Dec 2004 06:13:55 -0800 Received: from localhost (unknown [127.0.0.1]) by
golobica.uni-mb.si (Postfix) with ESMTP id 9646C4DC0CE; Sat, 25 Dec 2004 15:15:03 +0100 (CET) Received: from localhost ([127.0.0.1]) by localhost (golobica.uni-mb.si [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 30305-01-38; Sat, 25 Dec 2004 15:14:57 +0100 (CET) Received: from localhost.localdomain (um-sd06-229-2.uni-mb.si [164.8.233.107]) by golobica.uni-mb.si (Postfix) with ESMTP id 00AB84DC0B3; Sat, 25 Dec 2004 15:14:57 +0100 (CET) Subject: [patch 1/1] net/3c59x: module_param conversions To: jgarzik@pobox.com Cc: netdev@oss.sgi.com, domen@coderock.org From: domen@coderock.org Date: Sat, 25 Dec 2004 15:15:06 +0100 Message-Id: <20041225141457.00AB84DC0B3@golobica.uni-mb.si> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new / Sophos+Sophie & ClamAV at golobica.uni-mb.si X-Virus-Status: Clean X-archive-position: 13019 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: domen@coderock.org Precedence: bulk X-list: netdev Convert 3c59x.c to new module parameter api. Variables are moved before module_param*. Compile and run tested. Signed-off-by: Domen Puncer --- kj-domen/drivers/net/3c59x.c | 72 ++++++++++++++++++++++--------------------- 1 files changed, 37 insertions(+), 35 deletions(-) diff -puN drivers/net/3c59x.c~module_param-drivers_net_3c59x.c drivers/net/3c59x.c --- kj/drivers/net/3c59x.c~module_param-drivers_net_3c59x.c 2004-12-25 01:35:27.000000000 +0100 +++ kj-domen/drivers/net/3c59x.c 2004-12-25 01:35:27.000000000 +0100 @@ -273,26 +273,48 @@ static int vortex_debug = 1; static char version[] __devinitdata = DRV_NAME ": Donald Becker and others. www.scyld.com/network/vortex.html\n"; + +/* This driver uses 'options' to pass the media type, full-duplex flag, etc. */ +/* Option count limit only -- unlimited interfaces are supported. 
*/ +#define MAX_UNITS 8 +static int options[MAX_UNITS] = { -1, -1, -1, -1, -1, -1, -1, -1,}; +static int full_duplex[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; +static int hw_checksums[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; +static int flow_ctrl[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; +static int enable_wol[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; +static int global_options = -1; +static int global_full_duplex = -1; +static int global_enable_wol = -1; + +/* #define dev_alloc_skb dev_alloc_skb_debug */ + +/* Variables to work-around the Compaq PCI BIOS32 problem. */ +static int compaq_ioaddr, compaq_irq, compaq_device_id = 0x5900; +static struct net_device *compaq_net_device; + +static int vortex_cards_found; + + MODULE_AUTHOR("Donald Becker "); MODULE_DESCRIPTION("3Com 3c59x/3c9xx ethernet driver " DRV_VERSION " " DRV_RELDATE); MODULE_LICENSE("GPL"); -MODULE_PARM(debug, "i"); -MODULE_PARM(global_options, "i"); -MODULE_PARM(options, "1-" __MODULE_STRING(8) "i"); -MODULE_PARM(global_full_duplex, "i"); -MODULE_PARM(full_duplex, "1-" __MODULE_STRING(8) "i"); -MODULE_PARM(hw_checksums, "1-" __MODULE_STRING(8) "i"); -MODULE_PARM(flow_ctrl, "1-" __MODULE_STRING(8) "i"); -MODULE_PARM(global_enable_wol, "i"); -MODULE_PARM(enable_wol, "1-" __MODULE_STRING(8) "i"); -MODULE_PARM(rx_copybreak, "i"); -MODULE_PARM(max_interrupt_work, "i"); -MODULE_PARM(compaq_ioaddr, "i"); -MODULE_PARM(compaq_irq, "i"); -MODULE_PARM(compaq_device_id, "i"); -MODULE_PARM(watchdog, "i"); +module_param(debug, int, 0644); +module_param(global_options, int, 0); +module_param_array(options, int, NULL, 0); +module_param(global_full_duplex, int, 0); +module_param_array(full_duplex, int, NULL, 0); +module_param_array(hw_checksums, int, NULL, 0); +module_param_array(flow_ctrl, int, NULL, 0); +module_param(global_enable_wol, int, 0); +module_param_array(enable_wol, int, NULL, 0); +module_param(rx_copybreak, int, 0); +module_param(max_interrupt_work, int, 0); 
+module_param(compaq_ioaddr, int, 0); +module_param(compaq_irq, int, 0); +module_param(compaq_device_id, int, 0); +module_param(watchdog, int, 0); MODULE_PARM_DESC(debug, "3c59x debug level (0-6)"); MODULE_PARM_DESC(options, "3c59x: Bits 0-3: media type, bit 4: bus mastering, bit 9: full duplex"); MODULE_PARM_DESC(global_options, "3c59x: same as options, but applies to all NICs if options is unset"); @@ -909,26 +931,6 @@ static void acpi_set_WOL(struct net_devi static struct ethtool_ops vortex_ethtool_ops; static void set_8021q_mode(struct net_device *dev, int enable); - -/* This driver uses 'options' to pass the media type, full-duplex flag, etc. */ -/* Option count limit only -- unlimited interfaces are supported. */ -#define MAX_UNITS 8 -static int options[MAX_UNITS] = { -1, -1, -1, -1, -1, -1, -1, -1,}; -static int full_duplex[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; -static int hw_checksums[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; -static int flow_ctrl[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; -static int enable_wol[MAX_UNITS] = {-1, -1, -1, -1, -1, -1, -1, -1}; -static int global_options = -1; -static int global_full_duplex = -1; -static int global_enable_wol = -1; - -/* #define dev_alloc_skb dev_alloc_skb_debug */ - -/* Variables to work-around the Compaq PCI BIOS32 problem. 
*/ -static int compaq_ioaddr, compaq_irq, compaq_device_id = 0x5900; -static struct net_device *compaq_net_device; - -static int vortex_cards_found; #ifdef CONFIG_NET_POLL_CONTROLLER static void poll_vortex(struct net_device *dev) _ From kaber@trash.net Sat Dec 25 07:46:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 07:46:20 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPFjpSY020073 for ; Sat, 25 Dec 2004 07:46:12 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CiE8c-0003bU-Gm; Sat, 25 Dec 2004 16:47:26 +0100 Message-ID: <41CD8B4F.6010402@trash.net> Date: Sat, 25 Dec 2004 16:46:23 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Peter Bieringer CC: USAGI core , Maillist netdev , Harald Welte , Netfilter development mailing list Subject: Re: ip6tables: accept of IPv6 transport esp packages not possible - no rule matches References: <019064D0423CE6C823CBF476@t1mobil.muc.aerasec.de> <5F6ACA5CEF52DBFBF11FBF94@t1mobil.muc.aerasec.de> In-Reply-To: <5F6ACA5CEF52DBFBF11FBF94@t1mobil.muc.aerasec.de> Content-Type: multipart/mixed; boundary="------------070905080408060602080109" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13020 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------070905080408060602080109 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Peter Bieringer wrote: > Looks like there is something going wrong in the protocol matching > algorithm in netfilter6. Does this patch fix the problem ? 
Regards
Patrick

--------------070905080408060602080109
Content-Type: text/plain; name="x"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="x"

===== net/ipv6/netfilter/ip6_tables.c 1.34 vs edited =====
--- 1.34/net/ipv6/netfilter/ip6_tables.c 2004-11-10 01:44:26 +01:00
+++ edited/net/ipv6/netfilter/ip6_tables.c 2004-12-25 16:42:21 +01:00
@@ -234,7 +234,7 @@
   * we will change the return 0 to 1*/
  if ((currenthdr == IPPROTO_NONE) ||
      (currenthdr == IPPROTO_ESP))
-     return 0;
+     break;
  hp = skb_header_pointer(skb, ptr, sizeof(_hdr), &_hdr);
  BUG_ON(hp == NULL);

--------------070905080408060602080109--

From pb@bieringer.de Sat Dec 25 09:46:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 09:46:17 -0800 (PST) Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPHjnSJ027924 for ; Sat, 25 Dec 2004 09:46:10 -0800 Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id 9F6B1137EE; Sat, 25 Dec 2004 18:47:17 +0100 (CET) X-AV-Checked: Sat Dec 25 18:47:17 2004 smtp2.aerasec.de Received: from [192.168.17.65] (pD9E96855.dip.t-dialin.net [217.233.104.85]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id B67F3137EA; Sat, 25 Dec 2004 18:47:15 +0100 (CET) Date: Sat, 25 Dec 2004 18:47:52 +0100 From: Peter Bieringer To: Maillist netdev , Maillist USAGI-users Cc: Harald Welte , Patrick McHardy Subject: netfilter6: ICMPv6 type 143 doesn't match Message-ID: <6050E336B1A0D7D8E70C66F3@t1mobil.muc.aerasec.de> X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13021 X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev

Hi,

playing around with DHCPv6 (running on a heavily secured box that also has an outgoing netfilter ruleset) I found that something is going wrong with the ICMPv6 matcher.

The LOG rule reports:

Dec 25 18:31:01 gatepbg kernel: OUTPUT-FW6/cleanup:IN= OUT=eth0 SRC=0000:0000:0000:0000:0000:0000:0000:0000 DST=ff02:0000:0000:0000:0000:0000:0000:0016 LEN=96 TC=0 HOPLIMIT=1 FLOWLBL=0 OPT ( ) PROTO=ICMPv6 TYPE=143 CODE=0

I tried several rules (don't be surprised by the odd order; it was trial and error with -I inserts, so the topmost rule was inserted last):

# ip6tables -vn -L OUTPUT
Chain OUTPUT (policy DROP 4 packets, 4872 bytes)
 pkts bytes target  prot   opt in  out   source  destination
    2   192 ACCEPT  all        *   eth0  ::/0    ::/0
    0     0 ACCEPT  icmpv6     *   *     ::/0    ::/0
    0     0 ACCEPT  icmpv6     *   *     ::/0    ::/0          ipv6-icmp type 143
    0     0 ACCEPT  icmpv6     *   *     ::/0    ff02::/16     ipv6-icmp type 143
    0     0 ACCEPT  icmpv6     *   *     ::/0    ff02::/16     ipv6-icmp type 143
    0     0 ACCEPT  icmpv6     *   *     ::/0    ff02::16/128  ipv6-icmp type 143

Packet dump:

18:46:07.984044 :: > ff02::16: HBH (rtalert: 0x0000) (padn)[icmp6 sum ok] icmp6: type-#143 [hlim 1] (len 56)
0x0000: 6000 0000 0038 0001 0000 0000 0000 0000 `....8..........
0x0010: 0000 0000 0000 0000 ff02 0000 0000 0000 ................
0x0020: 0000 0000 0000 0016 3a00 0502 0000 0100 ........:.......
0x0030: 8f00 6b6a 0000 0002 0400 0000 ff05 0000 ..kj............
0x0040: 0000 0000 0000 0000 0001 0003 0400 0000 ................
0x0050: ff02 0000 0000 0000 0000 0000 0001 0002 ................

I am surprised that only the proto "all" rule matches such a packet.

BTW: does it make sense that ip6tables remembers whether I used "-p all" on insert or not?

# ip6tables -I OUTPUT -p all -o eth0 -j ACCEPT
# ip6tables -D OUTPUT -o eth0 -j ACCEPT
ip6tables: Bad rule (does a matching rule exist in that chain?)
# ip6tables -D OUTPUT -p all -o eth0 -j ACCEPT
(ok)

The same happens the other way round:

# ip6tables -I OUTPUT -o eth0 -j ACCEPT
# ip6tables -D OUTPUT -p all -o eth0 -j ACCEPT
ip6tables: Bad rule (does a matching rule exist in that chain?)

Strange... I didn't really expect such behaviour as a "newbie" ;-)

        Peter
-- 
Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/

From pb@bieringer.de Sat Dec 25 11:41:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 11:41:11 -0800 (PST) Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPJehjh031934 for ; Sat, 25 Dec 2004 11:41:04 -0800 Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id 73F96137EE; Sat, 25 Dec 2004 20:42:12 +0100 (CET) X-AV-Checked: Sat Dec 25 20:42:12 2004 smtp2.aerasec.de Received: from [192.168.17.65] (p54873414.dip.t-dialin.net [84.135.52.20]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id AAAA0137EA; Sat, 25 Dec 2004 20:42:11 +0100 (CET) Date: Sat, 25 Dec 2004 20:42:49 +0100 From: Peter Bieringer To: Maillist netdev , Maillist USAGI-users Subject: IPv6: removal of the autogenerated link-local address of an interface still possible Message-ID: <8251764896D7138E21068580@t1mobil.muc.aerasec.de> X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13022 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev

Hi,

used kernel:
2.6.9-1.681_FC3

Is this behavior still by design? I'm not happy about it, because once the link-local address is removed it can't easily be added again.

"howto":

# ip -6 addr show dev eth1
3: eth1: mtu 1500 qlen 1000
    inet6 fe80::200:cbff:fe23:32bb/64 scope link
       valid_lft forever preferred_lft forever
# ip -6 addr del fe80::200:cbff:fe23:32bb/64 dev eth1
# ip -6 addr show dev eth1
Device "eth1" does not exist.
# ip addr show dev eth1
3: eth1: mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:00:cb:23:32:bb brd ff:ff:ff:ff:ff:ff

Would it not be better to prevent user-space tools from removing the (one and only) autogenerated link-local address?

        Peter
-- 
Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/

From davem@davemloft.net Sat Dec 25 13:56:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 13:56:31 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBPLu1qo007273 for ; Sat, 25 Dec 2004 13:56:21 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CiJu7-0004bu-00; Sat, 25 Dec 2004 13:56:51 -0800 Date: Sat, 25 Dec 2004 13:56:50 -0800 From: "David S.
Miller" To: Harald Welte Cc: yoshfuji@linux-ipv6.org, pekkas@netcore.fi, netdev@oss.sgi.com, usagi-users@linux-ipv6.org Subject: Re: 2.6.8.1 IPv6 Routing Problem Message-Id: <20041225135650.55dca29f.davem@davemloft.net> In-Reply-To: <20041225115415.GI24142@sunbeam.de.gnumonks.org> References: <20040920.152012.114156249.yoshfuji@linux-ipv6.org> <20040921195752.015e3d1d.davem@davemloft.net> <20040922.120630.96674716.yoshfuji@linux-ipv6.org> <20041225115415.GI24142@sunbeam.de.gnumonks.org> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13023 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sat, 25 Dec 2004 12:54:15 +0100 Harald Welte wrote: > As far as I can see (please correct me) no such patch was included so > far, at least with 2.6.10-rc3 I still have the old behaviour. This specific issue was not resolved, but the issue of loading the ipv6 module after bringing up your interfaces was fixed via this patch below. I've also discussed this with Herbert Xu a bit, and no one solution is really clear yet. IPV4 has similar issues, just in a slightly different form. # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/11/15 14:09:05-08:00 davem@nuts.davemloft.net # [IPV6]: Temp fix for ipv6 link-local address problem. # # Make sure loopback_dev, if up, has the ipv6 bits # for it setup before the addrconf netdev notifier # is registered. # # Signed-off-by: David S. 
Miller # # net/ipv6/addrconf.c # 2004/11/15 14:08:09-08:00 davem@nuts.davemloft.net +23 -0 # [IPV6]: Temp fix for ipv6 link-local address problem. # diff -Nru a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c --- a/net/ipv6/addrconf.c 2004-12-25 13:26:13 -08:00 +++ b/net/ipv6/addrconf.c 2004-12-25 13:26:13 -08:00 @@ -3387,6 +3387,29 @@ void __init addrconf_init(void) { + /* The addrconf netdev notifier requires that loopback_dev + * has it's ipv6 private information allocated and setup + * before it can bring up and give link-local addresses + * to other devices which are up. + * + * Unfortunately, loopback_dev is not necessarily the first + * entry in the global dev_base list of net devices. In fact, + * it is likely to be the very last entry on that list. + * So this causes the notifier registry below to try and + * give link-local addresses to all devices besides loopback_dev + * first, then loopback_dev, which cases all the non-loopback_dev + * devices to fail to get a link-local address. + * + * So, as a temporary fix, register loopback_dev first by hand. + * Longer term, all of the dependencies ipv6 has upon the loopback + * device and it being up should be removed. 
+ */ + rtnl_lock(); + addrconf_notify(&ipv6_dev_notf, NETDEV_REGISTER, &loopback_dev); + if (loopback_dev.flags & IFF_UP) + addrconf_notify(&ipv6_dev_notf, NETDEV_UP, &loopback_dev); + rtnl_unlock(); + register_netdevice_notifier(&ipv6_dev_notf); #ifdef CONFIG_IPV6_PRIVACY From acme@conectiva.com.br Sat Dec 25 17:49:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 17:49:16 -0800 (PST) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ1mjY9018357 for ; Sat, 25 Dec 2004 17:49:06 -0800 Received: by perninha.conectiva.com.br (Postfix, from userid 568) id 5731E472D4; Sat, 25 Dec 2004 23:50:13 -0200 (BRST) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id 2394E47371 for ; Sat, 25 Dec 2004 23:50:12 -0200 (BRST) Received: (qmail 12051 invoked by uid 0); 26 Dec 2004 02:46:36 -0000 Received: from mapi8.distro.conectiva (HELO oops.ghostprotocols.net) (10.0.16.10) by burns.conectiva with SMTP; 26 Dec 2004 02:46:36 -0000 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id C8EB11463D; Sat, 25 Dec 2004 23:50:03 -0200 (BRST) Message-ID: <41CE198B.7040005@conectiva.com.br> Date: Sat, 25 Dec 2004 23:53:15 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH][INET] move inet_sock into inet_opt and rename it to inet_sock Content-Type: multipart/mixed; boundary="------------060000020103000505000009" X-Bogosity: No, tests=bogofilter, spamicity=0.499925, version=0.16.3 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13024 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060000020103000505000009 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi David, Now that 2.6.10 is out, please take a look if this is acceptable, the following patches will deal with udp_sock, tcp_sock, etc. This is the start of the series of patches that will introduce struct connection_sock, reducing the memory used by non connected protocols, such as UDP. It is available at: bk://kernel.bkbits.net/acme/connection_sock-2.6 Best Regards, - Arnaldo --------------060000020103000505000009 Content-Type: text/plain; name="inet_sock.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="inet_sock.patch" =================================================================== ChangeSet@1.2221, 2004-12-23 22:46:12-02:00, acme@conectiva.com.br [INET] move inet_sock into inet_opt and rename it to inet_sock With this we can remove all the cut'n'pasted layouts in all inet_sock derived classes, such as tcp_sock, udp_sock, sctp_sock, etc. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller include/linux/ip.h | 24 ++++++++++-------------- include/linux/ipv6.h | 14 ++++---------- include/linux/tcp.h | 6 +----- include/linux/udp.h | 6 +----- include/net/icmp.h | 8 +------- include/net/sctp/sctp.h | 12 +++--------- include/net/tcp.h | 2 +- net/ipv4/af_inet.c | 10 +++++----- net/ipv4/datagram.c | 2 +- net/ipv4/icmp.c | 4 ++-- net/ipv4/igmp.c | 16 ++++++++-------- net/ipv4/ip_output.c | 16 ++++++++-------- net/ipv4/ip_sockglue.c | 12 ++++++------ net/ipv4/ipvs/ip_vs_sync.c | 6 +++--- net/ipv4/netfilter/ip_conntrack_core.c | 2 +- net/ipv4/raw.c | 14 +++++++------- net/ipv4/tcp.c | 4 ++-- net/ipv4/tcp_diag.c | 14 +++++++------- net/ipv4/tcp_input.c | 2 +- net/ipv4/tcp_ipv4.c | 30 +++++++++++++++--------------- net/ipv4/tcp_minisocks.c | 2 +- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_timer.c | 2 +- net/ipv4/udp.c | 20 ++++++++++---------- net/ipv6/af_inet6.c | 10 ++++------ net/ipv6/datagram.c | 4 ++-- net/ipv6/ip6_output.c | 6 +++--- net/ipv6/raw.c | 10 +++++----- net/ipv6/tcp_ipv6.c | 20 ++++++++++---------- net/ipv6/udp.c | 12 ++++++------ net/sctp/input.c | 2 +- net/sctp/ipv6.c | 6 +++--- net/sctp/protocol.c | 4 ++-- security/selinux/avc.c | 4 ++-- 34 files changed, 138 insertions(+), 170 deletions(-) diff -Nru a/include/linux/ip.h b/include/linux/ip.h --- a/include/linux/ip.h 2004-12-25 23:46:53 -02:00 +++ b/include/linux/ip.h 2004-12-25 23:46:53 -02:00 @@ -107,7 +107,14 @@ #define optlength(opt) (sizeof(struct ip_options) + opt->optlen) -struct inet_opt { +struct ipv6_pinfo; + +struct inet_sock { + /* sk and pinet6 has to be the first two members of inet_sock */ + struct sock sk; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + struct ipv6_pinfo *pinet6; +#endif /* Socket demultiplex comparisons on incoming packets. 
*/ __u32 daddr; /* Foreign IPv4 addr */ __u32 rcv_saddr; /* Bound local IPv4 addr */ @@ -146,20 +153,9 @@ #define IPCORK_OPT 1 /* ip-options has been held in ipcork.opt */ -struct ipv6_pinfo; - -/* WARNING: don't change the layout of the members in inet_sock! */ -struct inet_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; -}; - -static inline struct inet_opt * inet_sk(const struct sock *__sk) +static inline struct inet_sock *inet_sk(const struct sock *sk) { - return &((struct inet_sock *)__sk)->inet; + return (struct inet_sock *)sk; } #endif diff -Nru a/include/linux/ipv6.h b/include/linux/ipv6.h --- a/include/linux/ipv6.h 2004-12-25 23:46:53 -02:00 +++ b/include/linux/ipv6.h 2004-12-25 23:46:53 -02:00 @@ -256,32 +256,26 @@ /* WARNING: don't change the layout of the members in {raw,udp,tcp}6_sock! */ struct raw6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct raw6_opt raw6; struct ipv6_pinfo inet6; }; struct udp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct udp_opt udp; struct ipv6_pinfo inet6; }; struct tcp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct tcp_opt tcp; struct ipv6_pinfo inet6; }; static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk) { - return ((struct raw6_sock *)__sk)->pinet6; + return inet_sk(__sk)->pinet6; } static inline struct raw6_opt * raw6_sk(const struct sock *__sk) diff -Nru a/include/linux/tcp.h b/include/linux/tcp.h --- a/include/linux/tcp.h 2004-12-25 23:46:53 -02:00 +++ b/include/linux/tcp.h 2004-12-25 23:46:53 -02:00 @@ -440,11 +440,7 @@ /* WARNING: don't change the layout of the members in tcp_sock! 
*/ struct tcp_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; + struct inet_sock inet; struct tcp_opt tcp; }; diff -Nru a/include/linux/udp.h b/include/linux/udp.h --- a/include/linux/udp.h 2004-12-25 23:46:53 -02:00 +++ b/include/linux/udp.h 2004-12-25 23:46:53 -02:00 @@ -53,11 +53,7 @@ /* WARNING: don't change the layout of the members in udp_sock! */ struct udp_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; + struct inet_sock inet; struct udp_opt udp; }; diff -Nru a/include/net/icmp.h b/include/net/icmp.h --- a/include/net/icmp.h 2004-12-25 23:46:53 -02:00 +++ b/include/net/icmp.h 2004-12-25 23:46:53 -02:00 @@ -50,15 +50,9 @@ struct icmp_filter filter; }; -struct ipv6_pinfo; - /* WARNING: don't change the layout of the members in raw_sock! */ struct raw_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; + struct inet_sock inet; struct raw_opt raw4; }; diff -Nru a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h --- a/include/net/sctp/sctp.h 2004-12-25 23:46:53 -02:00 +++ b/include/net/sctp/sctp.h 2004-12-25 23:46:53 -02:00 @@ -584,26 +584,20 @@ /* WARNING: Do not change the layout of the members in sctp_sock! 
*/ struct sctp_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif /* CONFIG_IPV6 */ - struct inet_opt inet; + struct inet_sock inet; struct sctp_opt sctp; }; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) struct sctp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct sctp_opt sctp; struct ipv6_pinfo inet6; }; #endif /* CONFIG_IPV6 */ #define sctp_sk(__sk) (&((struct sctp_sock *)__sk)->sctp) -#define sctp_opt2sk(__sp) &container_of(__sp, struct sctp_sock, sctp)->sk +#define sctp_opt2sk(__sp) &container_of(__sp, struct sctp_sock, sctp)->inet.sk /* Is a socket of this style? */ #define sctp_style(sk, style) __sctp_style((sk), (SCTP_SOCKET_##style)) diff -Nru a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h 2004-12-25 23:46:53 -02:00 +++ b/include/net/tcp.h 2004-12-25 23:46:53 -02:00 @@ -196,7 +196,7 @@ unsigned char tw_rcv_wscale; __u16 tw_sport; /* Socket demultiplex comparisons on incoming packets. */ - /* these five are in inet_opt */ + /* these five are in inet_sock */ __u32 tw_daddr __attribute__((aligned(TCP_ADDRCMP_ALIGN_BYTES))); __u32 tw_rcv_saddr; diff -Nru a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c --- a/net/ipv4/af_inet.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/af_inet.c 2004-12-25 23:46:53 -02:00 @@ -131,7 +131,7 @@ void inet_sock_destruct(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); __skb_queue_purge(&sk->sk_receive_queue); __skb_queue_purge(&sk->sk_error_queue); @@ -173,7 +173,7 @@ static int inet_autobind(struct sock *sk) { - struct inet_opt *inet; + struct inet_sock *inet; /* We may need to bind the socket. 
*/ lock_sock(sk); inet = inet_sk(sk); @@ -232,7 +232,7 @@ struct sock *sk; struct list_head *p; struct inet_protosw *answer; - struct inet_opt *inet; + struct inet_sock *inet; struct proto *answer_prot; unsigned char answer_flags; char answer_no_check; @@ -389,7 +389,7 @@ { struct sockaddr_in *addr = (struct sockaddr_in *)uaddr; struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); unsigned short snum; int chk_addr_ret; int err; @@ -623,7 +623,7 @@ int *uaddr_len, int peer) { struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)uaddr; sin->sin_family = AF_INET; diff -Nru a/net/ipv4/datagram.c b/net/ipv4/datagram.c --- a/net/ipv4/datagram.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/datagram.c 2004-12-25 23:46:53 -02:00 @@ -22,7 +22,7 @@ int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *usin = (struct sockaddr_in *) uaddr; struct rtable *rt; u32 saddr; diff -Nru a/net/ipv4/icmp.c b/net/ipv4/icmp.c --- a/net/ipv4/icmp.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/icmp.c 2004-12-25 23:46:53 -02:00 @@ -377,7 +377,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) { struct sock *sk = icmp_socket->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipcm_cookie ipc; struct rtable *rt = (struct rtable *)skb->dst; u32 daddr; @@ -1097,7 +1097,7 @@ void __init icmp_init(struct net_proto_family *ops) { - struct inet_opt *inet; + struct inet_sock *inet; int i; for (i = 0; i < NR_CPUS; i++) { diff -Nru a/net/ipv4/igmp.c b/net/ipv4/igmp.c --- a/net/ipv4/igmp.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/igmp.c 2004-12-25 23:46:53 -02:00 @@ -1617,7 +1617,7 @@ u32 addr = imr->imr_multiaddr.s_addr; struct ip_mc_socklist *iml, *i; struct 
in_device *in_dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int count = 0; if (!MULTICAST(addr)) @@ -1691,7 +1691,7 @@ int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *iml, **imlp; rtnl_lock(); @@ -1734,7 +1734,7 @@ u32 addr = mreqs->imr_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev = NULL; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; int i, j, rv; @@ -1852,7 +1852,7 @@ u32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *newpsl, *psl; if (!MULTICAST(addr)) @@ -1922,7 +1922,7 @@ u32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; if (!MULTICAST(addr)) @@ -1980,7 +1980,7 @@ struct sockaddr_in *psin; u32 addr; struct ip_mc_socklist *pmc; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; psin = (struct sockaddr_in *)&gsf->gf_group; @@ -2033,7 +2033,7 @@ */ int ip_mc_sf_allow(struct sock *sk, u32 loc_addr, u32 rmt_addr, int dif) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *pmc; struct ip_sf_socklist *psl; int i; @@ -2069,7 +2069,7 @@ void ip_mc_drop_socket(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *iml; if (inet->mc_list == NULL) diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c --- a/net/ipv4/ip_output.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/ip_output.c 2004-12-25 23:46:53 -02:00 @@ -115,7 +115,7 @@ return 0; } -static inline int ip_select_ttl(struct inet_opt *inet, 
struct dst_entry *dst) +static inline int ip_select_ttl(struct inet_sock *inet, struct dst_entry *dst) { int ttl = inet->uc_ttl; @@ -131,7 +131,7 @@ int ip_build_and_send_pkt(struct sk_buff *skb, struct sock *sk, u32 saddr, u32 daddr, struct ip_options *opt) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct rtable *rt = (struct rtable *)skb->dst; struct iphdr *iph; @@ -297,7 +297,7 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok) { struct sock *sk = skb->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_options *opt = inet->opt; struct rtable *rt; struct iphdr *iph; @@ -712,7 +712,7 @@ struct ipcm_cookie *ipc, struct rtable *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sk_buff *skb; struct ip_options *opt = NULL; @@ -973,7 +973,7 @@ ssize_t ip_append_page(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sk_buff *skb; struct rtable *rt; struct ip_options *opt = NULL; @@ -1112,7 +1112,7 @@ { struct sk_buff *skb, *tmp_skb; struct sk_buff **tail_skb; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_options *opt = NULL; struct rtable *rt = inet->cork.rt; struct iphdr *iph; @@ -1217,7 +1217,7 @@ */ void ip_flush_pending_frames(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sk_buff *skb; while ((skb = __skb_dequeue_tail(&sk->sk_write_queue)) != NULL) @@ -1260,7 +1260,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *arg, unsigned int len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct { struct ip_options opt; char data[40]; diff -Nru a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c --- a/net/ipv4/ip_sockglue.c 2004-12-25 23:46:53 -02:00 +++ 
b/net/ipv4/ip_sockglue.c 2004-12-25 23:46:53 -02:00 @@ -112,7 +112,7 @@ void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb) { - struct inet_opt *inet = inet_sk(skb->sk); + struct inet_sock *inet = inet_sk(skb->sk); unsigned flags = inet->cmsg_flags; /* Ordered by supposed usage frequency */ @@ -234,7 +234,7 @@ void ip_icmp_error(struct sock *sk, struct sk_buff *skb, int err, u16 port, u32 info, u8 *payload) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sock_exterr_skb *serr; if (!inet->recverr) @@ -263,7 +263,7 @@ void ip_local_error(struct sock *sk, int err, u32 daddr, u16 port, u32 info) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sock_exterr_skb *serr; struct iphdr *iph; struct sk_buff *skb; @@ -342,7 +342,7 @@ sin = &errhdr.offender; sin->sin_family = AF_UNSPEC; if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); sin->sin_family = AF_INET; sin->sin_addr.s_addr = skb->nh.iph->saddr; @@ -383,7 +383,7 @@ int ip_setsockopt(struct sock *sk, int level, int optname, char __user *optval, int optlen) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int val=0,err; if (level != SOL_IP) @@ -875,7 +875,7 @@ int ip_getsockopt(struct sock *sk, int level, int optname, char __user *optval, int __user *optlen) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int val; int len; diff -Nru a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c --- a/net/ipv4/ipvs/ip_vs_sync.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/ipvs/ip_vs_sync.c 2004-12-25 23:46:53 -02:00 @@ -343,7 +343,7 @@ */ static void set_mcast_loop(struct sock *sk, u_char loop) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); /* setsockopt(sock, SOL_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)); */ lock_sock(sk); @@ -356,7 +356,7 @@ */ static void 
set_mcast_ttl(struct sock *sk, u_char ttl) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); /* setsockopt(sock, SOL_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)); */ lock_sock(sk); @@ -370,7 +370,7 @@ static int set_mcast_if(struct sock *sk, char *ifname) { struct net_device *dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if ((dev = __dev_get_by_name(ifname)) == NULL) return -ENODEV; diff -Nru a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c --- a/net/ipv4/netfilter/ip_conntrack_core.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/netfilter/ip_conntrack_core.c 2004-12-25 23:46:53 -02:00 @@ -1229,7 +1229,7 @@ static int getorigdst(struct sock *sk, int optval, void __user *user, int *len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_conntrack_tuple_hash *h; struct ip_conntrack_tuple tuple; diff -Nru a/net/ipv4/raw.c b/net/ipv4/raw.c --- a/net/ipv4/raw.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/raw.c 2004-12-25 23:46:53 -02:00 @@ -109,7 +109,7 @@ struct hlist_node *node; sk_for_each_from(sk, node) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == num && !(inet->daddr && inet->daddr != raddr) && @@ -181,7 +181,7 @@ void raw_err (struct sock *sk, struct sk_buff *skb, u32 info) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int type = skb->h.icmph->type; int code = skb->h.icmph->code; int err = 0; @@ -263,7 +263,7 @@ struct rtable *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int hh_len; struct iphdr *iph; struct sk_buff *skb; @@ -374,7 +374,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipcm_cookie ipc; struct rtable *rt = NULL; int free = 0; @@ -537,7 
+537,7 @@ /* This gets rid of all the nasties in af_inet. -DaveM */ static int raw_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *addr = (struct sockaddr_in *) uaddr; int ret = -EINVAL; int chk_addr_ret; @@ -565,7 +565,7 @@ int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); size_t copied = 0; int err = -EOPNOTSUPP; struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; @@ -802,7 +802,7 @@ static __inline__ char *get_raw_sock(struct sock *sp, char *tmpbuf, int i) { - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr, src = inet->rcv_saddr; __u16 destp = 0, diff -Nru a/net/ipv4/tcp.c b/net/ipv4/tcp.c --- a/net/ipv4/tcp.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp.c 2004-12-25 23:46:53 -02:00 @@ -460,7 +460,7 @@ int tcp_listen_start(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct tcp_listen_opt *lopt; @@ -1772,7 +1772,7 @@ int tcp_disconnect(struct sock *sk, int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); int err = 0; int old_state = sk->sk_state; diff -Nru a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c --- a/net/ipv4/tcp_diag.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp_diag.c 2004-12-25 23:46:53 -02:00 @@ -55,7 +55,7 @@ static int tcpdiag_fill(struct sk_buff *skb, struct sock *sk, int ext, u32 pid, u32 seq, u16 nlmsg_flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct tcpdiagmsg *r; struct nlmsghdr *nlh; @@ -427,7 +427,7 @@ if (cb->nlh->nlmsg_len > 4 + NLMSG_SPACE(sizeof(*r))) { struct 
tcpdiag_entry entry; struct rtattr *bc = (struct rtattr *)(r + 1); - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); entry.family = sk->sk_family; #ifdef CONFIG_IP_TCPDIAG_IPV6 @@ -458,7 +458,7 @@ struct open_request *req, u32 pid, u32 seq) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); unsigned char *b = skb->tail; struct tcpdiagmsg *r; struct nlmsghdr *nlh; @@ -515,7 +515,7 @@ struct tcp_opt *tp = tcp_sk(sk); struct tcp_listen_opt *lopt; struct rtattr *bc = NULL; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int j, s_j; int reqnum, s_reqnum; int err = 0; @@ -609,7 +609,7 @@ num = 0; sk_for_each(sk, node, &tcp_listening_hash[i]) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (num < s_num) { num++; @@ -670,7 +670,7 @@ num = 0; sk_for_each(sk, node, &head->chain) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (num < s_num) goto next_normal; @@ -692,7 +692,7 @@ if (r->tcpdiag_states&TCPF_TIME_WAIT) { sk_for_each(sk, node, &tcp_ehash[i + tcp_ehash_size].chain) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (num < s_num) goto next_dying; diff -Nru a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp_input.c 2004-12-25 23:46:53 -02:00 @@ -1647,7 +1647,7 @@ #if FASTRETRANS_DEBUG > 1 static void DBGUNDO(struct sock *sk, struct tcp_opt *tp, const char *msg) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); printk(KERN_DEBUG "Undo %s %u.%u.%u.%u/%u c%u l%u ss%u/%u p%u\n", msg, NIPQUAD(inet->daddr), ntohs(inet->dport), diff -Nru a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c --- a/net/ipv4/tcp_ipv4.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp_ipv4.c 2004-12-25 23:46:53 -02:00 @@ -115,7 +115,7 @@ static __inline__ int tcp_sk_hashfn(struct sock *sk) { - struct inet_opt *inet = 
inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); __u32 laddr = inet->rcv_saddr; __u16 lport = inet->num; __u32 faddr = inet->daddr; @@ -300,7 +300,7 @@ */ static void __tcp_put_port(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_bind_hashbucket *head = &tcp_bhash[tcp_bhashfn(inet->num)]; struct tcp_bind_bucket *tb; @@ -420,7 +420,7 @@ hiscore=-1; sk_for_each(sk, node, head) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == hnum && !ipv6_only_sock(sk)) { __u32 rcv_saddr = inet->rcv_saddr; @@ -457,7 +457,7 @@ read_lock(&tcp_lhash_lock); head = &tcp_listening_hash[tcp_lhashfn(hnum)]; if (!hlist_empty(head)) { - struct inet_opt *inet = inet_sk((sk = __sk_head(head))); + struct inet_sock *inet = inet_sk((sk = __sk_head(head))); if (inet->num == hnum && !sk->sk_node.next && (!inet->rcv_saddr || inet->rcv_saddr == daddr) && @@ -549,7 +549,7 @@ static int __tcp_v4_check_established(struct sock *sk, __u16 lport, struct tcp_tw_bucket **twp) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); u32 daddr = inet->rcv_saddr; u32 saddr = inet->daddr; int dif = sk->sk_bound_dev_if; @@ -755,7 +755,7 @@ /* This will initiate an outgoing connection. 
*/ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct sockaddr_in *usin = (struct sockaddr_in *)uaddr; struct rtable *rt; @@ -929,7 +929,7 @@ u32 mtu) { struct dst_entry *dst; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); /* We are not interested in TCP_LISTEN and open_requests (SYN-ACKs @@ -992,7 +992,7 @@ struct iphdr *iph = (struct iphdr *)skb->data; struct tcphdr *th = (struct tcphdr *)(skb->data + (iph->ihl << 2)); struct tcp_opt *tp; - struct inet_opt *inet; + struct inet_sock *inet; int type = skb->h.icmph->type; int code = skb->h.icmph->code; struct sock *sk; @@ -1139,7 +1139,7 @@ void tcp_v4_send_check(struct sock *sk, struct tcphdr *th, int len, struct sk_buff *skb) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (skb->ip_summed == CHECKSUM_HW) { th->check = ~tcp_v4_check(th, len, inet->saddr, inet->daddr, 0); @@ -1561,7 +1561,7 @@ struct open_request *req, struct dst_entry *dst) { - struct inet_opt *newinet; + struct inet_sock *newinet; struct tcp_opt *newtp; struct sock *newsk; @@ -1868,7 +1868,7 @@ static int tcp_v4_reselect_saddr(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int err; struct rtable *rt; __u32 old_saddr = inet->saddr; @@ -1919,7 +1919,7 @@ int tcp_v4_rebuild_header(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct rtable *rt = (struct rtable *)__sk_dst_check(sk, 0); u32 daddr; int err; @@ -1968,7 +1968,7 @@ static void v4_addr2sockaddr(struct sock *sk, struct sockaddr * uaddr) { struct sockaddr_in *sin = (struct sockaddr_in *) uaddr; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); sin->sin_family = AF_INET; sin->sin_addr.s_addr = inet->daddr; @@ -1983,7 
+1983,7 @@ int tcp_v4_remember_stamp(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct rtable *rt = (struct rtable *)__sk_dst_get(sk); struct inet_peer *peer = NULL; @@ -2486,7 +2486,7 @@ int timer_active; unsigned long timer_expires; struct tcp_opt *tp = tcp_sk(sp); - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr; unsigned int src = inet->rcv_saddr; __u16 destp = ntohs(inet->dport); diff -Nru a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c --- a/net/ipv4/tcp_minisocks.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp_minisocks.c 2004-12-25 23:46:53 -02:00 @@ -337,7 +337,7 @@ tw = kmem_cache_alloc(tcp_timewait_cachep, SLAB_ATOMIC); if(tw != NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int rto = (tp->rto<<2) - (tp->rto>>1); /* Give us an identity. */ diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c --- a/net/ipv4/tcp_output.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp_output.c 2004-12-25 23:46:53 -02:00 @@ -266,7 +266,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb) { if (skb != NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); int tcp_header_size = tp->tcp_header_len; diff -Nru a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c --- a/net/ipv4/tcp_timer.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/tcp_timer.c 2004-12-25 23:46:53 -02:00 @@ -332,7 +332,7 @@ */ #ifdef TCP_DEBUG if (net_ratelimit()) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); printk(KERN_DEBUG "TCP: Treason uncloaked! Peer %u.%u.%u.%u:%u/%u shrinks window %u:%u. 
Repaired.\n", NIPQUAD(inet->daddr), htons(inet->dport), inet->num, tp->snd_una, tp->snd_nxt); diff -Nru a/net/ipv4/udp.c b/net/ipv4/udp.c --- a/net/ipv4/udp.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv4/udp.c 2004-12-25 23:46:53 -02:00 @@ -124,7 +124,7 @@ { struct hlist_node *node; struct sock *sk2; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); write_lock_bh(&udp_hash_lock); if (snum == 0) { @@ -171,7 +171,7 @@ } else { sk_for_each(sk2, node, &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_opt *inet2 = inet_sk(sk2); + struct inet_sock *inet2 = inet_sk(sk2); if (inet2->num == snum && sk2 != sk && @@ -227,7 +227,7 @@ int badness = -1; sk_for_each(sk, node, &udp_hash[hnum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == hnum && !ipv6_only_sock(sk)) { int score = (sk->sk_family == PF_INET ? 1 : 0); @@ -285,7 +285,7 @@ unsigned short hnum = ntohs(loc_port); sk_for_each_from(s, node) { - struct inet_opt *inet = inet_sk(s); + struct inet_sock *inet = inet_sk(s); if (inet->num != hnum || (inet->daddr && inet->daddr != rmt_addr) || @@ -316,7 +316,7 @@ void udp_err(struct sk_buff *skb, u32 info) { - struct inet_opt *inet; + struct inet_sock *inet; struct iphdr *iph = (struct iphdr*)skb->data; struct udphdr *uh = (struct udphdr*)(skb->data+(iph->ihl<<2)); int type = skb->h.icmph->type; @@ -398,7 +398,7 @@ */ static int udp_push_pending_frames(struct sock *sk, struct udp_opt *up) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct flowi *fl = &inet->cork.fl; struct sk_buff *skb; struct udphdr *uh; @@ -480,7 +480,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct udp_opt *up = udp_sk(sk); int ulen = len; struct ipcm_cookie ipc; @@ -773,7 +773,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, 
struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; struct sk_buff *skb; int copied, err; @@ -864,7 +864,7 @@ int udp_disconnect(struct sock *sk, int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); /* * 1003.1g - break association. */ @@ -1503,7 +1503,7 @@ /* ------------------------------------------------------------------------ */ static void udp4_format_sock(struct sock *sp, char *tmpbuf, int bucket) { - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr; unsigned int src = inet->rcv_saddr; __u16 destp = ntohs(inet->dport); diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c --- a/net/ipv6/af_inet6.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv6/af_inet6.c 2004-12-25 23:46:53 -02:00 @@ -114,10 +114,9 @@ static int inet6_create(struct socket *sock, int protocol) { - struct inet_opt *inet; + struct inet_sock *inet; struct ipv6_pinfo *np; struct sock *sk; - struct tcp6_sock* tcp6sk; struct list_head *p; struct inet_protosw *answer; struct proto *answer_prot; @@ -196,8 +195,7 @@ sk->sk_backlog_rcv = answer->prot->backlog_rcv; - tcp6sk = (struct tcp6_sock *)sk; - tcp6sk->pinet6 = np = inet6_sk_generic(sk); + inet_sk(sk)->pinet6 = np = inet6_sk_generic(sk); np->hop_limit = -1; np->mcast_hops = -1; np->mc_loop = 1; @@ -252,7 +250,7 @@ { struct sockaddr_in6 *addr=(struct sockaddr_in6 *)uaddr; struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); __u32 v4addr = 0; unsigned short snum; @@ -410,7 +408,7 @@ { struct sockaddr_in6 *sin=(struct sockaddr_in6 *)uaddr; struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); sin->sin6_family = 
AF_INET6; diff -Nru a/net/ipv6/datagram.c b/net/ipv6/datagram.c --- a/net/ipv6/datagram.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv6/datagram.c 2004-12-25 23:46:53 -02:00 @@ -36,7 +36,7 @@ int ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *daddr, *final_p = NULL, final; struct dst_entry *dst; @@ -335,7 +335,7 @@ if (ipv6_addr_type(&sin->sin6_addr) & IPV6_ADDR_LINKLOCAL) sin->sin6_scope_id = IP6CB(skb)->iif; } else { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); ipv6_addr_set(&sin->sin6_addr, 0, 0, htonl(0xffff), diff -Nru a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c --- a/net/ipv6/ip6_output.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv6/ip6_output.c 2004-12-25 23:46:53 -02:00 @@ -809,7 +809,7 @@ int hlimit, struct ipv6_txoptions *opt, struct flowi *fl, struct rt6_info *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sk_buff *skb; unsigned int maxfraglen, fragheaderlen; @@ -1087,7 +1087,7 @@ struct sk_buff *skb, *tmp_skb; struct sk_buff **tail_skb; struct in6_addr final_dst_buf, *final_dst = &final_dst_buf; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct ipv6hdr *hdr; struct ipv6_txoptions *opt = np->cork.opt; @@ -1165,7 +1165,7 @@ void ip6_flush_pending_frames(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sk_buff *skb; diff -Nru a/net/ipv6/raw.c b/net/ipv6/raw.c --- a/net/ipv6/raw.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv6/raw.c 2004-12-25 23:46:53 -02:00 @@ -178,7 +178,7 @@ /* This cleans up af_inet6 a bit. 
-DaveM */ static int rawv6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sockaddr_in6 *addr = (struct sockaddr_in6 *) uaddr; __u32 v4addr = 0; @@ -253,7 +253,7 @@ struct inet6_skb_parm *opt, int type, int code, int offset, u32 info) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); int err; int harderr; @@ -314,7 +314,7 @@ */ int rawv6_rcv(struct sock *sk, struct sk_buff *skb) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct raw6_opt *raw_opt = raw6_sk(sk); if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) { @@ -505,7 +505,7 @@ struct flowi *fl, struct rt6_info *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6hdr *iph; struct sk_buff *skb; unsigned int hh_len; @@ -607,7 +607,7 @@ struct ipv6_txoptions opt_space; struct sockaddr_in6 * sin6 = (struct sockaddr_in6 *) msg->msg_name; struct in6_addr *daddr, *final_p = NULL, final; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct raw6_opt *raw_opt = raw6_sk(sk); struct ipv6_txoptions *opt = NULL; diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c --- a/net/ipv6/tcp_ipv6.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv6/tcp_ipv6.c 2004-12-25 23:46:53 -02:00 @@ -89,7 +89,7 @@ static __inline__ int tcp_v6_sk_hashfn(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *laddr = &np->rcv_saddr; struct in6_addr *faddr = &np->daddr; @@ -443,7 +443,7 @@ static int tcp_v6_check_established(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *daddr = 
&np->rcv_saddr; struct in6_addr *saddr = &np->daddr; @@ -549,7 +549,7 @@ int addr_len) { struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct in6_addr *saddr = NULL, *final_p = NULL, final; @@ -785,7 +785,7 @@ dst = __sk_dst_check(sk, np->dst_cookie); if (dst == NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct flowi fl; /* BUGGG_FUTURE: Again, it is not clear how @@ -1281,7 +1281,7 @@ { struct ipv6_pinfo *newnp, *np = inet6_sk(sk); struct tcp6_sock *newtcp6sk; - struct inet_opt *newinet; + struct inet_sock *newinet; struct tcp_opt *newtp; struct sock *newsk; struct ipv6_txoptions *opt; @@ -1297,7 +1297,7 @@ return NULL; newtcp6sk = (struct tcp6_sock *)newsk; - newtcp6sk->pinet6 = &newtcp6sk->inet6; + newtcp6sk->inet.pinet6 = &newtcp6sk->inet6; newinet = inet_sk(newsk); newnp = inet6_sk(newsk); @@ -1390,7 +1390,7 @@ ~(NETIF_F_IP_CSUM | NETIF_F_TSO); newtcp6sk = (struct tcp6_sock *)newsk; - newtcp6sk->pinet6 = &newtcp6sk->inet6; + newtcp6sk->inet.pinet6 = &newtcp6sk->inet6; newtp = tcp_sk(newsk); newinet = inet_sk(newsk); @@ -1754,7 +1754,7 @@ dst = __sk_dst_check(sk, np->dst_cookie); if (dst == NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct in6_addr *final_p = NULL, final; struct flowi fl; @@ -1800,7 +1800,7 @@ static int tcp_v6_xmit(struct sk_buff *skb, int ipfragok) { struct sock *sk = skb->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct flowi fl; struct dst_entry *dst; @@ -2006,7 +2006,7 @@ __u16 destp, srcp; int timer_active; unsigned long timer_expires; - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); struct tcp_opt *tp = tcp_sk(sp); struct ipv6_pinfo *np = inet6_sk(sp); diff -Nru a/net/ipv6/udp.c 
b/net/ipv6/udp.c --- a/net/ipv6/udp.c 2004-12-25 23:46:53 -02:00 +++ b/net/ipv6/udp.c 2004-12-25 23:46:53 -02:00 @@ -160,7 +160,7 @@ read_lock(&udp_hash_lock); sk_for_each(sk, node, &udp_hash[hnum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == hnum && sk->sk_family == PF_INET6) { struct ipv6_pinfo *np = inet6_sk(sk); @@ -269,7 +269,7 @@ sin6->sin6_scope_id = 0; if (skb->protocol == htons(ETH_P_IP)) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); ipv6_addr_set(&sin6->sin6_addr, 0, 0, htonl(0xffff), skb->nh.iph->saddr); @@ -386,7 +386,7 @@ unsigned short num = ntohs(loc_port); sk_for_each_from(s, node) { - struct inet_opt *inet = inet_sk(s); + struct inet_sock *inet = inet_sk(s); if (inet->num == num && s->sk_family == PF_INET6) { struct ipv6_pinfo *np = inet6_sk(s); @@ -566,7 +566,7 @@ { struct sk_buff *skb; struct udphdr *uh; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct flowi *fl = &inet->cork.fl; int err = 0; @@ -624,7 +624,7 @@ { struct ipv6_txoptions opt_space; struct udp_opt *up = udp_sk(sk); - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *) msg->msg_name; struct in6_addr *daddr, *final_p = NULL, final; @@ -970,7 +970,7 @@ static void udp6_sock_seq_show(struct seq_file *seq, struct sock *sp, int bucket) { - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); struct ipv6_pinfo *np = inet6_sk(sp); struct in6_addr *dest, *src; __u16 destp, srcp; diff -Nru a/net/sctp/input.c b/net/sctp/input.c --- a/net/sctp/input.c 2004-12-25 23:46:53 -02:00 +++ b/net/sctp/input.c 2004-12-25 23:46:53 -02:00 @@ -393,7 +393,7 @@ struct sctp_endpoint *ep; struct sctp_association *asoc; struct sctp_transport *transport; - struct inet_opt *inet; + struct inet_sock *inet; char *saveip, *savesctp; 
int err; diff -Nru a/net/sctp/ipv6.c b/net/sctp/ipv6.c --- a/net/sctp/ipv6.c 2004-12-25 23:46:53 -02:00 +++ b/net/sctp/ipv6.c 2004-12-25 23:46:53 -02:00 @@ -580,9 +580,9 @@ struct sock *sctp_v6_create_accept_sk(struct sock *sk, struct sctp_association *asoc) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sock *newsk; - struct inet_opt *newinet; + struct inet_sock *newinet; struct ipv6_pinfo *newnp, *np = inet6_sk(sk); struct sctp6_sock *newsctp6sk; @@ -608,7 +608,7 @@ newsk->sk_shutdown = sk->sk_shutdown; newsctp6sk = (struct sctp6_sock *)newsk; - newsctp6sk->pinet6 = &newsctp6sk->inet6; + newsctp6sk->inet.pinet6 = &newsctp6sk->inet6; newinet = inet_sk(newsk); newnp = inet6_sk(newsk); diff -Nru a/net/sctp/protocol.c b/net/sctp/protocol.c --- a/net/sctp/protocol.c 2004-12-25 23:46:53 -02:00 +++ b/net/sctp/protocol.c 2004-12-25 23:46:53 -02:00 @@ -551,8 +551,8 @@ struct sctp_association *asoc) { struct sock *newsk; - struct inet_opt *inet = inet_sk(sk); - struct inet_opt *newinet; + struct inet_sock *inet = inet_sk(sk); + struct inet_sock *newinet; newsk = sk_alloc(PF_INET, GFP_KERNEL, sk->sk_prot->slab_obj_size, sk->sk_prot->slab); diff -Nru a/security/selinux/avc.c b/security/selinux/avc.c --- a/security/selinux/avc.c 2004-12-25 23:46:53 -02:00 +++ b/security/selinux/avc.c 2004-12-25 23:46:53 -02:00 @@ -566,7 +566,7 @@ switch (sk->sk_family) { case AF_INET: { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); avc_print_ipv4_addr(ab, inet->rcv_saddr, inet->sport, @@ -577,7 +577,7 @@ break; } case AF_INET6: { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *inet6 = inet6_sk(sk); avc_print_ipv6_addr(ab, &inet6->rcv_saddr, --------------060000020103000505000009-- From sekiya@wide.ad.jp Sat Dec 25 18:16:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 18:16:11 -0800 (PST) Received: from yui.nc.u-tokyo.ac.jp (yui.nc.u-tokyo.ac.jp 
[130.69.251.116]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ2Fgts019703 for ; Sat, 25 Dec 2004 18:16:03 -0800 Received: from anzu.nc.u-tokyo.ac.jp (anzu.nc.u-tokyo.ac.jp [130.69.251.114]) (authenticated bits=0) by yui.nc.u-tokyo.ac.jp (8.12.10/8.12.3/Debian-6.4) with ESMTP id iBQ2HBjF002225; Sun, 26 Dec 2004 11:17:11 +0900 Date: Sun, 26 Dec 2004 11:17:01 +0900 Message-ID: From: Yuji Sekiya To: usagi-users@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: (usagi-users 03168) IPv6: removal of the autogenerated link-local address of an interface still possible In-Reply-To: <8251764896D7138E21068580@t1mobil.muc.aerasec.de> References: <8251764896D7138E21068580@t1mobil.muc.aerasec.de> User-Agent: Wanderlust/2.10.1 (Watching The Wheels) SEMI/1.14.6 (=?ISO-2022-JP?B?GyRCNF0yLBsoQg==?=) FLIM/1.14.6 (=?ISO-2022-JP?B?GyRCNF1CQEQuGyhC?=) APEL/10.6 Emacs/21.3 (i686-pc-linux-gnu) MULE/5.0 (=?ISO-2022-JP?B?GyRCOC1MWhsoQg==?=) Organization: The University of Tokyo MIME-Version: 1.0 (generated by SEMI 1.14.6 - =?ISO-2022-JP?B?IhskQjRdGyhC?= =?ISO-2022-JP?B?GyRCMiwbKEIi?=) Content-Type: text/plain; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13025 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sekiya@wide.ad.jp Precedence: bulk X-list: netdev At Sat, 25 Dec 2004 20:42:49 +0100, Peter Bieringer wrote: Hello, > Would it be not better to prevent user space tools from removal of the (one > and only) autogenerated link-local address? I don't agree. Because some user may want to remove an autogenerated link-local address and add an link-local address manually. 
-- Yuji Sekiya
From herbert@gondor.apana.org.au Sat Dec 25 21:40:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 21:41:01 -0800 (PST) Received: from arnor.apana.org.au (c211-30-229-77.rivrw4.nsw.optusnet.com.au [211.30.229.77]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ5eSYa030357 for ; Sat, 25 Dec 2004 21:40:51 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1CiR9A-0006wL-00; Sun, 26 Dec 2004 16:40:52 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1CiR4p-0004vc-00; Sun, 26 Dec 2004 16:36:23 +1100 From: Herbert Xu To: tommy.christensen@tpack.net (Tommy Christensen) Subject: Re: [patch 4/10] s390: network driver. Cc: hadi@cyberus.ca, thomas.spatzier@de.ibm.com, davem@davemloft.net, hasso@estpak.ee, herbert@gondor.apana.org.au, jgarzik@pobox.com, netdev@oss.sgi.com, paul@clubi.ie Organization: Core In-Reply-To: <41C612BC.5070909@tpack.net> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Sun, 26 Dec 2004 16:36:23 +1100 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13027 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev Tommy Christensen wrote: > > OK. So is this the recommendation for these pour souls? > > - Use a socket for each device. Please show us your program first. Then we can figure out whether this is necessary or not. 
-- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From herbert@gondor.apana.org.au Sat Dec 25 21:45:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 21:46:01 -0800 (PST) Received: from arnor.apana.org.au (c211-30-229-77.rivrw4.nsw.optusnet.com.au [211.30.229.77]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ5jYci030978 for ; Sat, 25 Dec 2004 21:45:54 -0800 Received: from gondolin.me.apana.org.au ([192.168.0.6] ident=mail) by arnor.apana.org.au with esmtp (Exim 3.35 #1 (Debian)) id 1CiREI-0006xI-00; Sun, 26 Dec 2004 16:46:10 +1100 Received: from herbert by gondolin.me.apana.org.au with local (Exim 3.36 #1 (Debian)) id 1CiRDA-0004wk-00; Sun, 26 Dec 2004 16:45:00 +1100 From: Herbert Xu To: davem@davemloft.net (David S. Miller) Subject: Re: 2.6.8.1 IPv6 Routing Problem Cc: laforge@gnumonks.org, yoshfuji@linux-ipv6.org, pekkas@netcore.fi, netdev@oss.sgi.com, usagi-users@linux-ipv6.org Organization: Core In-Reply-To: <20041225135650.55dca29f.davem@davemloft.net> X-Newsgroups: apana.lists.os.linux.netdev User-Agent: tin/1.7.4-20040225 ("Benbecula") (UNIX) (Linux/2.4.27-hx-1-686-smp (i686)) Message-Id: Date: Sun, 26 Dec 2004 16:45:00 +1100 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13028 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: herbert@gondor.apana.org.au Precedence: bulk X-list: netdev David S. Miller wrote: > > I've also discussed this with Herbert Xu a bit, and no one > solution is really clear yet. IPV4 has similar issues, just > in a slightly different form. Well I still reckon that the IPv4 behaviour is just fine as it is. In fact if anything we should make the IPv6 case as similar to Ipv4 as possible. 
Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt From linux_lover2004@yahoo.com Sat Dec 25 22:51:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Sat, 25 Dec 2004 22:51:08 -0800 (PST) Received: from web52207.mail.yahoo.com (web52207.mail.yahoo.com [206.190.39.89]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBQ6oeoG001094 for ; Sat, 25 Dec 2004 22:51:01 -0800 Received: (qmail 80982 invoked by uid 60001); 26 Dec 2004 06:52:05 -0000 Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=ZOgLePeN8xiJf48e0Ci9fs+VFJ3Q+zDQ/UkGAY8RKIFhQ7vrLU0NCBXhKjFnpvxLbTk76agjMt2nCADVNe4SIlnNRQ6CBNBaHYdizyRa1YRgerW2r/YGFMDQMJKgOfTUcY1fNl6109vawoCYT8G9ZCjEnNUM9//xSftdgU8BJIo= ; Message-ID: <20041226065205.80980.qmail@web52207.mail.yahoo.com> Received: from [202.56.231.117] by web52207.mail.yahoo.com via HTTP; Sat, 25 Dec 2004 22:52:05 PST Date: Sat, 25 Dec 2004 22:52:05 -0800 (PST) From: linux lover Subject: ipsec implementation kernel To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13029 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux_lover2004@yahoo.com Precedence: bulk X-list: netdev Hello, I want to know where can i find ipsec implementation files in linux kernel? Also how to activate it in kernel? Does AH/ESP header added by netfilter kernel module? regards, linux.lover 
From davem@davemloft.net Sun Dec 26 00:22:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 00:23:11 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ8MUld007298 for ; Sun, 26 Dec 2004 00:22:50 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CiTgQ-0005WN-00; Sun, 26 Dec 2004 00:23:22 -0800 Date: Sun, 26 Dec 2004 00:23:22 -0800 From: "David S. Miller" To: Yuji Sekiya Cc: usagi-users@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: (usagi-users 03168) IPv6: removal of the autogenerated link-local address of an interface still possible Message-Id: <20041226002322.2c2914e4.davem@davemloft.net> In-Reply-To: References: <8251764896D7138E21068580@t1mobil.muc.aerasec.de> X-Mailer: Sylpheed version 1.0.0beta3 (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13030 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 26 Dec 2004 11:17:01 +0900 Yuji Sekiya wrote: > At Sat, 25 Dec 2004 20:42:49 +0100, > Peter Bieringer wrote: > > > Would it be not better to prevent user space tools from removal of the (one > > and only) autogenerated link-local address? > > I don't agree. Because some user may want to remove an autogenerated > link-local address and add an link-local address manually. I completely agree. 
From yoshfuji@wide.ad.jp Sun Dec 26 01:07:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 01:08:05 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ97a8q008826 for ; Sun, 26 Dec 2004 01:07:57 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id F030A33CC2; Sun, 26 Dec 2004 18:09:16 +0900 (JST) Date: Sun, 26 Dec 2004 10:09:14 +0100 (CET) Message-Id: <20041226.100914.99805068.yoshfuji@wide.ad.jp> To: usagi-users@linux-ipv6.org Cc: sekiya@wide.ad.jp, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org, davem@davemloft.net Subject: Re: IPv6: removal of the autogenerated link-local address of an interface still possible From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <20041226002322.2c2914e4.davem@davemloft.net> References: <8251764896D7138E21068580@t1mobil.muc.aerasec.de> <20041226002322.2c2914e4.davem@davemloft.net> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13031 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@wide.ad.jp Precedence: bulk X-list: netdev Sorry for silence. In article <20041226002322.2c2914e4.davem@davemloft.net> (at Sun, 26 Dec 2004 00:23:22 -0800), "David S. 
Miller" says: > > > Would it be not better to prevent user space tools from removal of the (one > > > and only) autogenerated link-local address? > > > > I don't agree. Because some user may want to remove an autogenerated > > link-local address and add an link-local address manually. > > I completely agree. I agree (to allow users to remove auto-generated link-local address), too. --yoshfuji From pb@bieringer.de Sun Dec 26 01:12:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 01:12:34 -0800 (PST) Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ9C6af009370 for ; Sun, 26 Dec 2004 01:12:27 -0800 Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id A09E1137EE; Sun, 26 Dec 2004 10:13:35 +0100 (CET) X-AV-Checked: Sun Dec 26 10:13:35 2004 smtp2.aerasec.de Received: from [192.168.17.65] (pD9E97F75.dip.t-dialin.net [217.233.127.117]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id 82EB1137EA; Sun, 26 Dec 2004 10:13:34 +0100 (CET) Date: Sun, 26 Dec 2004 10:14:10 +0100 From: Peter Bieringer To: Maillist USAGI-users , Maillist netdev Subject: Major deadlock: unregister_netdevice: waiting for to become free. 
Usage count = 1 Message-ID: <8A6334DE39BB61513FFBD614@t1mobil.muc.aerasec.de> X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13032 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev Hi, this has now happened to me on 3 hosts :-((( and it leaves the boxes in a state where they can no longer be rebooted remotely (unless I trigger Alt-SysRq via serial console, which is not possible on all boxes). All of them are running newer kernels: 2 hosts: 2.6.9-1.681_FC3 (Fedora Core 3) 1 host : 2.6.9-1.6_FC2 (Fedora Core 2) On all 3 hosts the trigger was that the IPv6 initscripts (or ppp down) clean up IPv6 tunnels using e.g. /sbin/ip tunnel del sit_sixxs (the same happens with a created 6to4 device). The kernel tells me every few seconds: Dec 26 09:59:10 * kernel: unregister_netdevice: waiting for sit_sixxs to become free. Usage count = 1 Dec 26 09:59:50 * last message repeated 4 times Dec 26 10:01:00 * last message repeated 7 times Dec 26 10:02:10 * last message repeated 7 times Dec 26 10:03:20 * last message repeated 7 times There is no limit in the kernel, which means this problem locks the kernel indefinitely (even on a normal reboot, which never succeeded in this case because shutdown does not complete). This deadlock blocks all other netdevice-related commands, so I can't execute any "ifconfig" or "ip" command successfully. It looks like some network-related processes are blocked as well. 
I also can't kill any of those processes; most of them are in D state. Here is part of the current process table on one deadlocked box, which has ISDN remote login access (otherwise the box would already be completely lost): # ps -ax Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.3/FAQ PID TTY STAT TIME COMMAND 1 ? S 0:03 init [3] 2 ? SN 0:49 [ksoftirqd/0] 3 ? S< 0:07 [events/0] 4 ? S< 0:00 [khelper] 5 ? S< 0:00 [kacpid] 6 ? S< 0:07 [kblockd/0] 7 ? S 0:00 [khubd] 36 ? S< 0:00 [aio/0] 35 ? S 1:03 [kswapd0] 109 ? S 0:00 [kseriod] 195 ? S 0:00 [scsi_eh_0] 203 ? S 2:43 [kjournald] 739 ? S 5506 ? R 43:49 /sbin/ip tunnel del sit_sixxs 5534 ? Z 0:00 [pppoe] 29765 ? Zl 0:00 [dig] 12243 ? D 0:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t 12301 ? Zl 0:00 [dig] 12374 ? Zl 0:00 [dig] 12447 ? Zl 0:00 [dig] 12523 ? Zl 0:00 [dig] 12597 ? Zl 0:00 [dig] 12671 ? Zl 0:00 [dig] 12684 ? D 0:00 pickup -l -t fifo -u 12747 ? Zl 0:00 [dig] 12821 ? Zl 0:00 [dig] 12894 ? Zl 0:00 [dig] 12969 ? Zl 0:00 [dig] 13043 ? Zl 0:00 [dig] 13116 ? Zl 0:00 [dig] 13189 ? Zl 0:00 [dig] 13263 ? Zl 0:00 [dig] 13336 ? Zl 0:00 [dig] 13411 ? Zl 0:00 [dig] 13485 ? Zl 0:00 [dig] 13559 ? Zl 0:00 [dig] 13632 ? Zl 0:00 [dig] 13706 ? Zl 0:00 [dig] 13779 ? Zl 0:00 [dig] 13854 ? Zl 0:00 [dig] 13927 ? Zl 0:00 [dig] 14001 ? Zl 0:00 [dig] 14074 ? Zl 0:00 [dig] 14147 ? Zl 0:00 [dig] 14221 ? Zl 0:00 [dig] 14296 ? Zl 0:00 [dig] 14370 ? Zl 0:00 [dig] 14429 ? S 0:00 su - 14430 ? S 0:00 -bash 14483 ? D 0:00 ifconfig 14624 ? D 0:00 /sbin/ifconfig ppp0 14675 ? Zl 0:00 [dig] 14752 ? D 0:00 ip route list match 0/0 14783 ? Ss 0:00 /sbin/mgetty ttyI2 14881 ? D 0:00 /usr/sbin/postfix stop 15788 ttyI1 Ss 0:00 -bash 15814 ttyI1 S 0:00 su - 15815 ttyI1 S 0:00 -bash 15911 ttyI1 R+ 0:00 ps -ax The dig zombies were caused by a kill -9 of regular (but hanging) cron jobs. One of my major problems now is that I don't know how to issue a hard-boot command via an ISDN tty. Peter, very unhappy -- Dr. 
Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/ From pb@bieringer.de Sun Dec 26 01:24:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 01:24:52 -0800 (PST) Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ9OOD8010075 for ; Sun, 26 Dec 2004 01:24:45 -0800 Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id D3941137EE; Sun, 26 Dec 2004 10:25:53 +0100 (CET) X-AV-Checked: Sun Dec 26 10:25:53 2004 smtp2.aerasec.de Received: from [192.168.17.65] (pD9E97F75.dip.t-dialin.net [217.233.127.117]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id 0AA71137EA; Sun, 26 Dec 2004 10:25:52 +0100 (CET) Date: Sun, 26 Dec 2004 10:26:28 +0100 From: Peter Bieringer To: Maillist USAGI-users , Maillist netdev Subject: Re: Major deadlock: unregister_netdevice: waiting for to become free. Usage count = 1 Message-ID: In-Reply-To: <8A6334DE39BB61513FFBD614@t1mobil.muc.aerasec.de> References: <8A6334DE39BB61513FFBD614@t1mobil.muc.aerasec.de> X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13033 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev Hi again, --On Sonntag, Dezember 26, 2004 10:14:10 +0100 Peter Bieringer wrote: > Kernel tells me each some seconds: > > Dec 26 09:59:10 * kernel: unregister_netdevice: waiting for sit_sixxs to > become free. 
Usage count = 1 > Dec 26 09:59:50 * last message repeated 4 times > Dec 26 10:01:00 * last message repeated 7 times > Dec 26 10:02:10 * last message repeated 7 times > Dec 26 10:03:20 * last message repeated 7 times While killing further processes by hand, the problem suddenly went away. My last commands: 422 kill -9 5506 437 kill -9 5506 587 kill 23837 589 kill -9 15765 14675 14429 14430 14483 590 kill -9 15765 14675 14429 14430 14483 592 kill -9 12243 593 kill -9 12243 595 kill 11808 3880 23933 11739 596 service xfs stop 597 service privoxy stop 598 kill -9 2673 603 kill -9 5506 604 kill -9 2902 Because I repeatedly tried to kill the hanging "ip tunnel del" command, I believe one of the other network processes was blocking something: > 2673 ? Ss 0:01 rpc.statd > 2902 ? Ssl 0:00 /usr/sbin/named -u named > 5506 ? R 43:49 /sbin/ip tunnel del sit_sixxs If it happens again, I will check netstat to find more indicators of what causes the problem. Anyway, such a deadlock is not nice and did not appear in earlier kernels. Peter -- Dr. 
Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/ From pb@bieringer.de Sun Dec 26 01:28:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 01:28:38 -0800 (PST) Received: from smtp2.aerasec.de (gromit.aerasec.de [195.226.187.57]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQ9SAmZ010617 for ; Sun, 26 Dec 2004 01:28:30 -0800 Received: from localhost (localhost [127.0.0.1]) by smtp2.aerasec.de (Postfix) with SMTP id DCDE3137EE; Sun, 26 Dec 2004 10:29:39 +0100 (CET) X-AV-Checked: Sun Dec 26 10:29:39 2004 smtp2.aerasec.de Received: from [192.168.17.65] (pD9E97F75.dip.t-dialin.net [217.233.127.117]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (Client did not present a certificate) by smtp2.aerasec.de (Postfix) with ESMTP id 1018E137EA; Sun, 26 Dec 2004 10:29:38 +0100 (CET) Date: Sun, 26 Dec 2004 10:30:14 +0100 From: Peter Bieringer To: usagi-users@linux-ipv6.org, netdev@oss.sgi.com Subject: Re: (usagi-users 03172) Re: IPv6: removal of the autogenerated link-local address of an interface still possible Message-ID: In-Reply-To: <20041226002322.2c2914e4.davem@davemloft.net> References: <8251764896D7138E21068580@t1mobil.muc.aerasec.de> <20041226002322.2c2914e4.davem@davemloft.net> X-Mailer: Mulberry/3.1.6 (Linux/x86) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13034 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: pb@bieringer.de Precedence: bulk X-list: netdev --On Sonntag, Dezember 26, 2004 00:23:22 -0800 "David S. 
Miller" wrote: > On Sun, 26 Dec 2004 11:17:01 +0900 > Yuji Sekiya wrote: > >> At Sat, 25 Dec 2004 20:42:49 +0100, >> Peter Bieringer wrote: >> >> > Would it be not better to prevent user space tools from removal of the >> > (one and only) autogenerated link-local address? >> >> I don't agree. Because some user may want to remove an autogenerated >> link-local address and add an link-local address manually. > > I completely agree. Ok. The reason is understandable, therefore I have to look deeper here on one of my systems, why suddenly a link-local address was gone. Thank you very much for clarification. Peter -- Dr. Peter Bieringer http://www.bieringer.de/pb/ GPG/PGP Key 0x958F422D mailto: pb at bieringer dot de Deep Space 6 Co-Founder and Core Member http://www.deepspace6.net/ From marcel@holtmann.org Sun Dec 26 03:19:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 03:19:27 -0800 (PST) Received: from mail.holtmann.net (root@coyote.holtmann.net [217.160.111.169]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQBIwRl015790 for ; Sun, 26 Dec 2004 03:19:19 -0800 Received: from pegasus (p3EE2C3E3.dip.t-dialin.net [62.226.195.227]) by mail.holtmann.net (8.12.3/8.12.3/Debian-7.1) with ESMTP id iBQBKXLL025792; Sun, 26 Dec 2004 12:20:33 +0100 Subject: Re: [2.6 patch] net/bluetooth/: misc possible cleanups From: Marcel Holtmann To: Adrian Bunk Cc: Max Krasnyansky , bluez-devel@lists.sourceforge.net, Linux Kernel Mailing List , Network Development Mailing List In-Reply-To: <20041219160758.GY21288@stusta.de> References: <20041214041352.GZ23151@stusta.de> <1103009649.2143.65.camel@pegasus> <20041219160758.GY21288@stusta.de> Content-Type: text/plain Date: Sun, 26 Dec 2004 12:20:18 +0100 Message-Id: <1104060018.8758.51.camel@pegasus> Mime-Version: 1.0 X-Mailer: Evolution 2.0.3 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: ClamAV 0.80/631/Wed Dec 15 15:01:14 
2004 clamav-milter version 0.80j on coyote.holtmann.net X-Virus-Status: Clean X-Virus-Status: Clean X-archive-position: 13035 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: marcel@holtmann.org Precedence: bulk X-list: netdev Hi Adrian, > > > Please comment on which of these changes are correct and which conflict > > > with pending patches. > > > > Please send a separate patch for all the RFCOMM changes, because these > > conflicts with some pending patches and then it will make it easier for > > me to merge them. > > > > The rest of the changes are fine with me, but I like to see also a > > separate patch for the CMTP stuff and cmtp_send_capimsg() don't need a > > forward declaration. Simply move the function to another place in the > > source code. > > splitted patches follow as reply to this email. all of them are applied to my tree now. Thanks. Regards Marcel From domen@coderock.org Sun Dec 26 07:09:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 07:09:31 -0800 (PST) Received: from trashy.coderock.org (postfix@coderock.org [193.77.147.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQF907R030602 for ; Sun, 26 Dec 2004 07:09:20 -0800 Received: by trashy.coderock.org (Postfix, from userid 780) id 569ED1F127; Sun, 26 Dec 2004 16:10:26 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id 19FF61F125; Sun, 26 Dec 2004 16:10:25 +0100 (CET) Received: from localhost.localdomain (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id C042C1ECC0; Sun, 26 Dec 2004 16:10:17 +0100 (CET) Subject: [patch 1/1] delete unused file To: jgarzik@pobox.com Cc: netdev@oss.sgi.com, domen@coderock.org From: domen@coderock.org Date: Sun, 26 Dec 2004 16:10:32 +0100 Message-Id: <20041226151017.C042C1ECC0@trashy.coderock.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 
127.0.0.1 X-Virus-Status: Clean X-archive-position: 13036 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: domen@coderock.org Precedence: bulk X-list: netdev Remove nowhere referenced file. (egrep "filename\." didn't find anything) Signed-off-by: Domen Puncer --- kj/drivers/net/gt64240eth.h | 402 -------------------------------------------- 1 files changed, 402 deletions(-) diff -L drivers/net/gt64240eth.h -puN drivers/net/gt64240eth.h~remove_file-drivers_net_gt64240eth.h /dev/null --- kj/drivers/net/gt64240eth.h +++ /dev/null 2004-12-24 01:21:08.000000000 +0100 @@ -1,402 +0,0 @@ -/* - * This file is subject to the terms and conditions of the GNU General Public - * License. See the file "COPYING" in the main directory of this archive - * for more details. - * - * Copyright (C) 2001 Patton Electronics Company - * Copyright (C) 2002 Momentum Computer - * - * Copyright 2000 MontaVista Software Inc. - * Author: MontaVista Software, Inc. - * stevel@mvista.com or support@mvista.com - * - * This program is free software; you can distribute it and/or modify it - * under the terms of the GNU General Public License (Version 2) as - * published by the Free Software Foundation. - * - * This program is distributed in the hope it will be useful, but WITHOUT - * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or - * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License - * for more details. - * - * You should have received a copy of the GNU General Public License along - * with this program; if not, write to the Free Software Foundation, Inc., - * 59 Temple Place - Suite 330, Boston MA 02111-1307, USA. - * - * Ethernet driver definitions for the MIPS GT96100 Advanced - * Communication Controller. - * - * Modified for the Marvellous GT64240 Retarded Communication Controller. 
- */ -#ifndef _GT64240ETH_H -#define _GT64240ETH_H - -#include - -#define ETHERNET_PORTS_DIFFERENCE_OFFSETS 0x400 - -/* Translate those weanie names from Galileo/VxWorks header files: */ - -#define GT64240_MRR MAIN_ROUTING_REGISTER -#define GT64240_CIU_ARBITER_CONFIG COMM_UNIT_ARBITER_CONFIGURATION_REGISTER -#define GT64240_CIU_ARBITER_CONTROL COMM_UNIT_ARBITER_CONTROL -#define GT64240_MAIN_LOW_CAUSE LOW_INTERRUPT_CAUSE_REGISTER -#define GT64240_MAIN_HIGH_CAUSE HIGH_INTERRUPT_CAUSE_REGISTER -#define GT64240_CPU_LOW_MASK CPU_INTERRUPT_MASK_REGISTER_LOW -#define GT64240_CPU_HIGH_MASK CPU_INTERRUPT_MASK_REGISTER_HIGH -#define GT64240_CPU_SELECT_CAUSE CPU_SELECT_CAUSE_REGISTER - -#define GT64240_ETH_PHY_ADDR_REG ETHERNET_PHY_ADDRESS_REGISTER -#define GT64240_ETH_PORT_CONFIG ETHERNET0_PORT_CONFIGURATION_REGISTER -#define GT64240_ETH_PORT_CONFIG_EXT ETHERNET0_PORT_CONFIGURATION_EXTEND_REGISTER -#define GT64240_ETH_PORT_COMMAND ETHERNET0_PORT_COMMAND_REGISTER -#define GT64240_ETH_PORT_STATUS ETHERNET0_PORT_STATUS_REGISTER -#define GT64240_ETH_IO_SIZE ETHERNET_PORTS_DIFFERENCE_OFFSETS -#define GT64240_ETH_SMI_REG ETHERNET_SMI_REGISTER -#define GT64240_ETH_MIB_COUNT_BASE ETHERNET0_MIB_COUNTER_BASE -#define GT64240_ETH_SDMA_CONFIG ETHERNET0_SDMA_CONFIGURATION_REGISTER -#define GT64240_ETH_SDMA_COMM ETHERNET0_SDMA_COMMAND_REGISTER -#define GT64240_ETH_INT_MASK ETHERNET0_INTERRUPT_MASK_REGISTER -#define GT64240_ETH_INT_CAUSE ETHERNET0_INTERRUPT_CAUSE_REGISTER -#define GT64240_ETH_CURR_TX_DESC_PTR0 ETHERNET0_CURRENT_TX_DESCRIPTOR_POINTER0 -#define GT64240_ETH_CURR_TX_DESC_PTR1 ETHERNET0_CURRENT_TX_DESCRIPTOR_POINTER1 -#define GT64240_ETH_1ST_RX_DESC_PTR0 ETHERNET0_FIRST_RX_DESCRIPTOR_POINTER0 -#define GT64240_ETH_CURR_RX_DESC_PTR0 ETHERNET0_CURRENT_RX_DESCRIPTOR_POINTER0 -#define GT64240_ETH_HASH_TBL_PTR ETHERNET0_HASH_TABLE_POINTER_REGISTER - -/* Turn on NAPI by default */ - -#define GT64240_NAPI 1 - -/* Some 64240 settings that SHOULD eventually be setup in PROM monitor: */ 
-/* (Board-specific to the DSL3224 Rev A board ONLY!) */ -#define D3224_MPP_CTRL0_SETTING 0x66669900 -#define D3224_MPP_CTRL1_SETTING 0x00000000 -#define D3224_MPP_CTRL2_SETTING 0x00887700 -#define D3224_MPP_CTRL3_SETTING 0x00000044 -#define D3224_GPP_IO_CTRL_SETTING 0x0000e800 -#define D3224_GPP_LEVEL_CTRL_SETTING 0xf001f703 -#define D3224_GPP_VALUE_SETTING 0x00000000 - -/* Keep the ring sizes a power of two for efficiency. */ -//-#define TX_RING_SIZE 16 -#define TX_RING_SIZE 64 /* TESTING !!! */ -#define RX_RING_SIZE 32 -#define PKT_BUF_SZ 1536 /* Size of each temporary Rx buffer. */ - -#define RX_HASH_TABLE_SIZE 16384 -#define HASH_HOP_NUMBER 12 - -#define NUM_INTERFACES 3 - -#define GT64240ETH_TX_TIMEOUT HZ/4 - -#define MIPS_GT64240_BASE 0xf4000000 -#define GT64240_ETH0_BASE (MIPS_GT64240_BASE + GT64240_ETH_PORT_CONFIG) -#define GT64240_ETH1_BASE (GT64240_ETH0_BASE + GT64240_ETH_IO_SIZE) -#define GT64240_ETH2_BASE (GT64240_ETH1_BASE + GT64240_ETH_IO_SIZE) - -#if defined(CONFIG_MIPS_DSL3224) -#define GT64240_ETHER0_IRQ 4 -#define GT64240_ETHER1_IRQ 4 -#else -#define GT64240_ETHER0_IRQ -1 -#define GT64240_ETHER1_IRQ -1 -#endif - -#define REV_GT64240 0x1 -#define REV_GT64240A 0x10 - -#define GT64240ETH_READ(gp, offset) \ - GT_READ((gp)->port_offset + (offset)) - -#define GT64240ETH_WRITE(gp, offset, data) \ - GT_WRITE((gp)->port_offset + (offset), (data)) - -#define GT64240ETH_SETBIT(gp, offset, bits) \ - GT64240ETH_WRITE((gp), (offset), \ - GT64240ETH_READ((gp), (offset)) | (bits)) - -#define GT64240ETH_CLRBIT(gp, offset, bits) \ - GT64240ETH_WRITE((gp), (offset), \ - GT64240ETH_READ((gp), (offset)) & ~(bits)) - -#define GT64240_READ(ofs) GT_READ(ofs) -#define GT64240_WRITE(ofs, data) GT_WRITE((ofs), (data)) - -/* Bit definitions of the SMI Reg */ -enum { - smirDataMask = 0xffff, - smirPhyAdMask = 0x1f << 16, - smirPhyAdBit = 16, - smirRegAdMask = 0x1f << 21, - smirRegAdBit = 21, - smirOpCode = 1 << 26, - smirReadValid = 1 << 27, - smirBusy = 1 << 28 -}; - -/* 
Bit definitions of the Port Config Reg */ -enum pcr_bits { - pcrPM = 1 << 0, - pcrRBM = 1 << 1, - pcrPBF = 1 << 2, - pcrEN = 1 << 7, - pcrLPBKMask = 0x3 << 8, - pcrLPBKBit = 1 << 8, - pcrFC = 1 << 10, - pcrHS = 1 << 12, - pcrHM = 1 << 13, - pcrHDM = 1 << 14, - pcrHD = 1 << 15, - pcrISLMask = 0x7 << 28, - pcrISLBit = 28, - pcrACCS = 1 << 31 -}; - -/* Bit definitions of the Port Config Extend Reg */ -enum pcxr_bits { - pcxrIGMP = 1, - pcxrSPAN = 2, - pcxrPAR = 4, - pcxrPRIOtxMask = 0x7 << 3, - pcxrPRIOtxBit = 3, - pcxrPRIOrxMask = 0x3 << 6, - pcxrPRIOrxBit = 6, - pcxrPRIOrxOverride = 1 << 8, - pcxrDPLXen = 1 << 9, - pcxrFCTLen = 1 << 10, - pcxrFLP = 1 << 11, - pcxrFCTL = 1 << 12, - pcxrMFLMask = 0x3 << 14, - pcxrMFLBit = 14, - pcxrMIBclrMode = 1 << 16, - pcxrSpeed = 1 << 18, - pcxrSpeeden = 1 << 19, - pcxrRMIIen = 1 << 20, - pcxrDSCPen = 1 << 21 -}; - -/* Bit definitions of the Port Command Reg */ -enum pcmr_bits { - pcmrFJ = 1 << 15 -}; - - -/* Bit definitions of the Port Status Reg */ -enum psr_bits { - psrSpeed = 1, - psrDuplex = 2, - psrFctl = 4, - psrLink = 8, - psrPause = 1 << 4, - psrTxLow = 1 << 5, - psrTxHigh = 1 << 6, - psrTxInProg = 1 << 7 -}; - -/* Bit definitions of the SDMA Config Reg */ -enum sdcr_bits { - sdcrRCMask = 0xf << 2, - sdcrRCBit = 2, - sdcrBLMR = 1 << 6, - sdcrBLMT = 1 << 7, - sdcrPOVR = 1 << 8, - sdcrRIFB = 1 << 9, - sdcrBSZMask = 0x3 << 12, - sdcrBSZBit = 12 -}; - -/* Bit definitions of the SDMA Command Reg */ -enum sdcmr_bits { - sdcmrERD = 1 << 7, - sdcmrAR = 1 << 15, - sdcmrSTDH = 1 << 16, - sdcmrSTDL = 1 << 17, - sdcmrTXDH = 1 << 23, - sdcmrTXDL = 1 << 24, - sdcmrAT = 1 << 31 -}; - -/* Bit definitions of the Interrupt Cause Reg */ -enum icr_bits { - icrRxBuffer = 1, - icrTxBufferHigh = 1 << 2, - icrTxBufferLow = 1 << 3, - icrTxEndHigh = 1 << 6, - icrTxEndLow = 1 << 7, - icrRxError = 1 << 8, - icrTxErrorHigh = 1 << 10, - icrTxErrorLow = 1 << 11, - icrRxOVR = 1 << 12, - icrTxUdr = 1 << 13, - icrRxBufferQ0 = 1 << 16, - icrRxBufferQ1 = 1 
<< 17, - icrRxBufferQ2 = 1 << 18, - icrRxBufferQ3 = 1 << 19, - icrRxErrorQ0 = 1 << 20, - icrRxErrorQ1 = 1 << 21, - icrRxErrorQ2 = 1 << 22, - icrRxErrorQ3 = 1 << 23, - icrMIIPhySTC = 1 << 28, - icrSMIdone = 1 << 29, - icrEtherIntSum = 1 << 31 -}; - - -/* The Rx and Tx descriptor lists. */ -#ifdef __LITTLE_ENDIAN -typedef struct { - u32 cmdstat; - u16 reserved; //-prk21aug01 u32 reserved:16; - u16 byte_cnt; //-prk21aug01 u32 byte_cnt:16; - u32 buff_ptr; - u32 next; -} gt64240_td_t; - -typedef struct { - u32 cmdstat; - u16 byte_cnt; //-prk21aug01 u32 byte_cnt:16; - u16 buff_sz; //-prk21aug01 u32 buff_sz:16; - u32 buff_ptr; - u32 next; -} gt64240_rd_t; -#elif defined(__BIG_ENDIAN) -typedef struct { - u16 byte_cnt; //-prk21aug01 u32 byte_cnt:16; - u16 reserved; //-prk21aug01 u32 reserved:16; - u32 cmdstat; - u32 next; - u32 buff_ptr; -} gt64240_td_t; - -typedef struct { - u16 buff_sz; //-prk21aug01 u32 buff_sz:16; - u16 byte_cnt; //-prk21aug01 u32 byte_cnt:16; - u32 cmdstat; - u32 next; - u32 buff_ptr; -} gt64240_rd_t; -#else -#error Either __BIG_ENDIAN or __LITTLE_ENDIAN must be defined! -#endif - - -/* Values for the Tx command-status descriptor entry. */ -enum td_cmdstat { - txOwn = 1 << 31, - txAutoMode = 1 << 30, - txEI = 1 << 23, - txGenCRC = 1 << 22, - txPad = 1 << 18, - txFirst = 1 << 17, - txLast = 1 << 16, - txErrorSummary = 1 << 15, - txReTxCntMask = 0x0f << 10, - txReTxCntBit = 10, - txCollision = 1 << 9, - txReTxLimit = 1 << 8, - txUnderrun = 1 << 6, - txLateCollision = 1 << 5 -}; - - -/* Values for the Rx command-status descriptor entry. 
*/ -enum rd_cmdstat { - rxOwn = 1 << 31, - rxAutoMode = 1 << 30, - rxEI = 1 << 23, - rxFirst = 1 << 17, - rxLast = 1 << 16, - rxErrorSummary = 1 << 15, - rxIGMP = 1 << 14, - rxHashExpired = 1 << 13, - rxMissedFrame = 1 << 12, - rxFrameType = 1 << 11, - rxShortFrame = 1 << 8, - rxMaxFrameLen = 1 << 7, - rxOverrun = 1 << 6, - rxCollision = 1 << 4, - rxCRCError = 1 -}; - -/* Bit fields of a Hash Table Entry */ -enum hash_table_entry { - hteValid = 1, - hteSkip = 2, - hteRD = 4 -}; - -// The MIB counters -typedef struct { - u32 byteReceived; - u32 byteSent; - u32 framesReceived; - u32 framesSent; - u32 totalByteReceived; - u32 totalFramesReceived; - u32 broadcastFramesReceived; - u32 multicastFramesReceived; - u32 cRCError; - u32 oversizeFrames; - u32 fragments; - u32 jabber; - u32 collision; - u32 lateCollision; - u32 frames64; - u32 frames65_127; - u32 frames128_255; - u32 frames256_511; - u32 frames512_1023; - u32 frames1024_MaxSize; - u32 macRxError; - u32 droppedFrames; - u32 outMulticastFrames; - u32 outBroadcastFrames; - u32 undersizeFrames; -} mib_counters_t; - - -struct gt64240_private { - gt64240_rd_t *rx_ring; - gt64240_td_t *tx_ring; - // The Rx and Tx rings must be 16-byte aligned - dma_addr_t rx_ring_dma; - dma_addr_t tx_ring_dma; - char *hash_table; - // The Hash Table must be 8-byte aligned - dma_addr_t hash_table_dma; - int hash_mode; - - // The Rx buffers must be 8-byte aligned - char *rx_buff; - dma_addr_t rx_buff_dma; - // Tx buffers (tx_skbuff[i]->data) with less than 8 bytes - // of payload must be 8-byte aligned - struct sk_buff *tx_skbuff[TX_RING_SIZE]; - int rx_next_out; /* The next free ring entry to receive */ - int tx_next_in; /* The next free ring entry to send */ - int tx_next_out; /* The last ring entry the ISR processed */ - int tx_count; /* current # of pkts waiting to be sent in Tx ring */ - int intr_work_done; /* number of Rx and Tx pkts processed in the isr */ - int tx_full; /* Tx ring is full */ - - mib_counters_t mib; - struct 
net_device_stats stats; - - int io_size; - int port_num; // 0 or 1 - u32 port_offset; - - int phy_addr; // PHY address - u32 last_psr; // last value of the port status register - - int options; /* User-settable misc. driver options. */ - int drv_flags; - spinlock_t lock; /* Serialise access to device */ - struct mii_if_info mii_if; - - u32 msg_enable; -}; - -#endif /* _GT64240ETH_H */ _ From domen@coderock.org Sun Dec 26 07:10:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 07:11:03 -0800 (PST) Received: from trashy.coderock.org (postfix@coderock.org [193.77.147.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQFAZhk030797 for ; Sun, 26 Dec 2004 07:10:55 -0800 Received: by trashy.coderock.org (Postfix, from userid 780) id AAFD51F12A; Sun, 26 Dec 2004 16:12:04 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id 374BD1F126; Sun, 26 Dec 2004 16:12:03 +0100 (CET) Received: from localhost.localdomain (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id 200A61ECC0; Sun, 26 Dec 2004 16:11:56 +0100 (CET) Subject: [patch 1/2] delete unused file To: davem@redhat.com Cc: netdev@oss.sgi.com, domen@coderock.org From: domen@coderock.org Date: Sun, 26 Dec 2004 16:12:11 +0100 Message-Id: <20041226151156.200A61ECC0@trashy.coderock.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13037 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: domen@coderock.org Precedence: bulk X-list: netdev Remove nowhere referenced file. (egrep "filename\." 
didn't find anything) Signed-off-by: Domen Puncer --- kj/net/sunrpc/auth_gss/gss_pseudoflavors.c | 237 ----------------------------- 1 files changed, 237 deletions(-) diff -L net/sunrpc/auth_gss/gss_pseudoflavors.c -puN net/sunrpc/auth_gss/gss_pseudoflavors.c~remove_file-net_sunrpc_auth_gss_gss_pseudoflavors.c /dev/null --- kj/net/sunrpc/auth_gss/gss_pseudoflavors.c +++ /dev/null 2004-12-24 01:21:08.000000000 +0100 @@ -1,237 +0,0 @@ -/* - * linux/net/sunrpc/gss_union.c - * - * Adapted from MIT Kerberos 5-1.2.1 lib/gssapi/generic code - * - * Copyright (c) 2001 The Regents of the University of Michigan. - * All rights reserved. - * - * Andy Adamson - * - */ - -/* - * Copyright 1993 by OpenVision Technologies, Inc. - * - * Permission to use, copy, modify, distribute, and sell this software - * and its documentation for any purpose is hereby granted without fee, - * provided that the above copyright notice appears in all copies and - * that both that copyright notice and this permission notice appear in - * supporting documentation, and that the name of OpenVision not be used - * in advertising or publicity pertaining to distribution of the software - * without specific, written prior permission. OpenVision makes no - * representations about the suitability of this software for any - * purpose. It is provided "as is" without express or implied warranty. - * - * OPENVISION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, - * INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO - * EVENT SHALL OPENVISION BE LIABLE FOR ANY SPECIAL, INDIRECT OR - * CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF - * USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR - * OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR - * PERFORMANCE OF THIS SOFTWARE. 
- */ - -#include -#include -#include -#include -#include - -#ifdef RPC_DEBUG -# define RPCDBG_FACILITY RPCDBG_AUTH -#endif - -static LIST_HEAD(registered_triples); -static spinlock_t registered_triples_lock = SPIN_LOCK_UNLOCKED; - -/* The following must be called with spinlock held: */ -static struct sup_sec_triple * -do_lookup_triple_by_pseudoflavor(u32 pseudoflavor) -{ - struct sup_sec_triple *pos, *triple = NULL; - - list_for_each_entry(pos, &registered_triples, triples) { - if (pos->pseudoflavor == pseudoflavor) { - triple = pos; - break; - } - } - return triple; -} - -/* XXX Need to think about reference counting of triples and of mechs. - * Currently we do no reference counting of triples, and I think that's - * probably OK given the reference counting on mechs, but there's probably - * a better way to do all this. */ - -int -gss_register_triple(u32 pseudoflavor, struct gss_api_mech *mech, - u32 qop, u32 service) -{ - struct sup_sec_triple *triple; - - if (!(triple = kmalloc(sizeof(*triple), GFP_KERNEL))) { - printk("Alloc failed in gss_register_triple"); - goto err; - } - triple->pseudoflavor = pseudoflavor; - triple->mech = gss_mech_get_by_OID(&mech->gm_oid); - triple->qop = qop; - triple->service = service; - - spin_lock(&registered_triples_lock); - if (do_lookup_triple_by_pseudoflavor(pseudoflavor)) { - printk(KERN_WARNING "RPC: Registered pseudoflavor %d again\n", - pseudoflavor); - goto err_unlock; - } - list_add(&triple->triples, &registered_triples); - spin_unlock(&registered_triples_lock); - dprintk("RPC: registered pseudoflavor %d\n", pseudoflavor); - - return 0; - -err_unlock: - kfree(triple); - spin_unlock(&registered_triples_lock); -err: - return -1; -} - -int -gss_unregister_triple(u32 pseudoflavor) -{ - struct sup_sec_triple *triple; - - spin_lock(&registered_triples_lock); - if (!(triple = do_lookup_triple_by_pseudoflavor(pseudoflavor))) { - spin_unlock(&registered_triples_lock); - printk("Can't unregister unregistered pseudoflavor %d\n", - pseudoflavor); - return 
-1; - } - list_del(&triple->triples); - spin_unlock(&registered_triples_lock); - gss_mech_put(triple->mech); - kfree(triple); - return 0; - -} - -void -print_sec_triple(struct xdr_netobj *oid,u32 qop,u32 service) -{ - dprintk("RPC: print_sec_triple:\n"); - dprintk(" oid_len %d\n oid :\n",oid->len); - print_hexl((u32 *)oid->data,oid->len,0); - dprintk(" qop %d\n",qop); - dprintk(" service %d\n",service); -} - -/* Function: gss_get_cmp_triples - * - * Description: search sec_triples for a matching security triple - * return pseudoflavor if match, else 0 - * (Note that 0 is a valid pseudoflavor, but not for any gss pseudoflavor - * (0 means auth_null), so this shouldn't cause confusion.) - */ -u32 -gss_cmp_triples(u32 oid_len, char *oid_data, u32 qop, u32 service) -{ - struct sup_sec_triple *triple; - u32 pseudoflavor = 0; - struct xdr_netobj oid; - - oid.len = oid_len; - oid.data = oid_data; - - dprintk("RPC: gss_cmp_triples\n"); - print_sec_triple(&oid,qop,service); - - spin_lock(&registered_triples_lock); - list_for_each_entry(triple, &registered_triples, triples) { - if((g_OID_equal(&oid, &triple->mech->gm_oid)) - && (qop == triple->qop) - && (service == triple->service)) { - pseudoflavor = triple->pseudoflavor; - break; - } - } - spin_unlock(&registered_triples_lock); - dprintk("RPC: gss_cmp_triples return %d\n", pseudoflavor); - return pseudoflavor; -} - -u32 -gss_get_pseudoflavor(struct gss_ctx *ctx, u32 qop, u32 service) -{ - return gss_cmp_triples(ctx->mech_type->gm_oid.len, - ctx->mech_type->gm_oid.data, - qop, service); -} - -/* Returns nonzero iff the given pseudoflavor is in the supported list. - * (Note that without incrementing a reference count or anything, this - * doesn't give any guarantees.) */ -int -gss_pseudoflavor_supported(u32 pseudoflavor) -{ - struct sup_sec_triple *triple; - - spin_lock(&registered_triples_lock); - triple = do_lookup_triple_by_pseudoflavor(pseudoflavor); - spin_unlock(&registered_triples_lock); - return (triple ? 
1 : 0); -} - -u32 -gss_pseudoflavor_to_service(u32 pseudoflavor) -{ - struct sup_sec_triple *triple; - - spin_lock(&registered_triples_lock); - triple = do_lookup_triple_by_pseudoflavor(pseudoflavor); - spin_unlock(&registered_triples_lock); - if (!triple) { - dprintk("RPC: gss_pseudoflavor_to_service called with unsupported pseudoflavor %d\n", - pseudoflavor); - return 0; - } - return triple->service; -} - -struct gss_api_mech * -gss_pseudoflavor_to_mech(u32 pseudoflavor) { - struct sup_sec_triple *triple; - struct gss_api_mech *mech = NULL; - - spin_lock(&registered_triples_lock); - triple = do_lookup_triple_by_pseudoflavor(pseudoflavor); - spin_unlock(&registered_triples_lock); - if (triple) - mech = gss_mech_get(triple->mech); - else - dprintk("RPC: gss_pseudoflavor_to_mech called with unsupported pseudoflavor %d\n", - pseudoflavor); - return mech; -} - -int -gss_pseudoflavor_to_mechOID(u32 pseudoflavor, struct xdr_netobj * oid) -{ - struct gss_api_mech *mech; - - mech = gss_pseudoflavor_to_mech(pseudoflavor); - if (!mech) { - dprintk("RPC: gss_pseudoflavor_to_mechOID called with unsupported pseudoflavor %d\n", - pseudoflavor); - return -1; - } - oid->len = mech->gm_oid.len; - if (!(oid->data = kmalloc(oid->len, GFP_KERNEL))) - return -1; - memcpy(oid->data, mech->gm_oid.data, oid->len); - gss_mech_put(mech); - return 0; -} _ From domen@coderock.org Sun Dec 26 07:10:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 07:11:04 -0800 (PST) Received: from trashy.coderock.org (postfix@coderock.org [193.77.147.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQFAauW030798 for ; Sun, 26 Dec 2004 07:10:56 -0800 Received: by trashy.coderock.org (Postfix, from userid 780) id 27F741F127; Sun, 26 Dec 2004 16:12:05 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id 76DD01ECC0; Sun, 26 Dec 2004 16:12:03 +0100 (CET) Received: from localhost.localdomain (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) 
with ESMTP id 393C91F125; Sun, 26 Dec 2004 16:11:59 +0100 (CET) Subject: [patch 2/2] delete unused file To: davem@redhat.com Cc: netdev@oss.sgi.com, domen@coderock.org From: domen@coderock.org Date: Sun, 26 Dec 2004 16:12:15 +0100 Message-Id: <20041226151159.393C91F125@trashy.coderock.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13038 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: domen@coderock.org Precedence: bulk X-list: netdev Remove nowhere referenced file. (egrep "filename\." didn't find anything) Signed-off-by: Domen Puncer --- kj/net/sunrpc/auth_gss/sunrpcgss_syms.c | 37 -------------------------------- 1 files changed, 37 deletions(-) diff -L net/sunrpc/auth_gss/sunrpcgss_syms.c -puN net/sunrpc/auth_gss/sunrpcgss_syms.c~remove_file-net_sunrpc_auth_gss_sunrpcgss_syms.c /dev/null --- kj/net/sunrpc/auth_gss/sunrpcgss_syms.c +++ /dev/null 2004-12-24 01:21:08.000000000 +0100 @@ -1,37 +0,0 @@ -#include -#include - -#include -#include -#include -#include -#include - -#include -#include -#include -#include - -/* svcauth_gss.c: */ -EXPORT_SYMBOL(svcauth_gss_register_pseudoflavor); - -/* registering gss mechanisms to the mech switching code: */ -EXPORT_SYMBOL(gss_mech_register); -EXPORT_SYMBOL(gss_mech_unregister); -EXPORT_SYMBOL(gss_mech_get); -EXPORT_SYMBOL(gss_mech_get_by_pseudoflavor); -EXPORT_SYMBOL(gss_mech_get_by_name); -EXPORT_SYMBOL(gss_mech_put); -EXPORT_SYMBOL(gss_pseudoflavor_to_service); -EXPORT_SYMBOL(gss_service_to_auth_domain_name); - -/* generic functionality in gss code: */ -EXPORT_SYMBOL(g_make_token_header); -EXPORT_SYMBOL(g_verify_token_header); -EXPORT_SYMBOL(g_token_size); -EXPORT_SYMBOL(make_checksum); -EXPORT_SYMBOL(krb5_encrypt); -EXPORT_SYMBOL(krb5_decrypt); - -/* debug */ -EXPORT_SYMBOL(print_hexl); _ From chris@scorpion.nl Sun Dec 26 11:00:28 2004 Received: with 
ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 11:00:41 -0800 (PST) Received: from office.scorpion.nl (Debian-exim@36-71-dsl.ipact.nl [84.35.71.36]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQJ07kC008122 for ; Sun, 26 Dec 2004 11:00:28 -0800 Received: from speedy.scorpion.nl ([10.136.100.61]) by hannibal.scorpion.nl with esmtp (Exim 4.34) id 1Ciddo-0004mb-6a for netdev@oss.sgi.com; Sun, 26 Dec 2004 20:01:20 +0100 Message-ID: <41CF0A76.4060607@scorpion.nl> Date: Sun, 26 Dec 2004 20:01:10 +0100 From: Christiaan den Besten Reply-To: chris@scorpion.nl User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com Subject: packets displayed twice on ipsec interface ... Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-KM95-MailScanner: Found to be clean X-MailScanner-From: chris@scorpion.nl X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13039 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: chris@scorpion.nl Precedence: bulk X-list: netdev Hi all ! Not really sure this is a kernel, or a netfilter issue, but posting to the lkml resulted in no answers so far ;( After trying to determine the 'overhead' of my ipsec traffic, I hit a rather annoying 'feature'. (Using racoon ipsec with default debian-kernels 2.6.x kernels, but issue was with 2.4 as well if i remember correctly.) Traffic on the outgoing interface (eth0) shows both the encapsulated as well as the non-encapsulated packets. --- (tcpdump -i eth0 -n ) --- 15:24:20.003088 IP 172.20.40.45.45707 > 10.136.100.1.48193: . 297216:298592(1376) ack 1 win 5792 15:24:20.005095 IP 130.161.82.9 > 84.35.71.36: ESP(spi=0x080d4f70,seq=0x1de7c) 15:24:20.005095 IP 172.20.40.45.45707 > 10.136.100.1.48193: . 
298592:299968(1376) ack 1 win 5792 15:24:20.005223 IP 84.35.71.36 > 130.161.82.9: ESP(spi=0x0451e539,seq=0xee8e) --- Using default tools a la 'iptraf' counts them both, so it would look like my adsl-line is doing 11Mbit :) (which is rather nice since the telco has limited it to 6Mbit ...) Is there any way to prevent the kernel from showing the data inside the tunnel ? (172.20.40.45 <> 10.136.100.1 is the tunneled traffic). bye, Chris ( Not a member of the list, so a cc would be very nice ) From ahu@outpost.ds9a.nl Sun Dec 26 15:00:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 15:00:41 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBQN0DW3017867 for ; Sun, 26 Dec 2004 15:00:34 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 706E2441C; Mon, 27 Dec 2004 00:01:33 +0100 (CET) Date: Mon, 27 Dec 2004 00:01:33 +0100 From: bert hubert To: Peter Bieringer Cc: Maillist USAGI-users , Maillist netdev Subject: Re: Major deadlock: unregister_netdevice: waiting for to become free. Usage count = 1 Message-ID: <20041226230133.GA1448@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Peter Bieringer , Maillist USAGI-users , Maillist netdev References: <8A6334DE39BB61513FFBD614@t1mobil.muc.aerasec.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13040 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev On Sun, Dec 26, 2004 at 10:26:28AM +0100, Peter Bieringer wrote: > >Dec 26 09:59:10 * kernel: unregister_netdevice: waiting for sit_sixxs to > >become free. Usage count = 1 Happens to me as well with 2.6.10-rc2 Debian sid. 
Haven't been able to debug it yet. -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO From yasuyuki.kozakai@toshiba.co.jp Sun Dec 26 20:17:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 20:17:14 -0800 (PST) Received: from inet-tsb.toshiba.co.jp (inet-tsb.toshiba.co.jp [202.33.96.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBR4GkC7029185 for ; Sun, 26 Dec 2004 20:17:07 -0800 Received: from tsb-wall.toshiba.co.jp ([133.199.160.134]) by inet-tsb.toshiba.co.jp with ESMTP id iBR4HuIf000427; Mon, 27 Dec 2004 13:17:56 +0900 (JST) Received: (from root@localhost) by tsb-wall.toshiba.co.jp id iBR4HuwJ008881; Mon, 27 Dec 2004 13:17:56 +0900 (JST) Received: from tis2 [133.199.160.66] by tsb-wall.toshiba.co.jp with SMTP id PAA08877 ; Mon, 27 Dec 2004 13:17:55 +0900 Received: from mx.toshiba.co.jp by tis2.tis.toshiba.co.jp id NAA22683; Mon, 27 Dec 2004 13:17:55 +0900 (JST) Received: by toshiba.co.jp id iBR4HZRG021429; Mon, 27 Dec 2004 13:17:35 +0900 (JST) Date: Mon, 27 Dec 2004 13:17:34 +0900 (JST) Message-Id: <200412270417.iBR4HZRG021429@toshiba.co.jp> To: pb@bieringer.de Cc: netdev@oss.sgi.com, usagi-users@linux-ipv6.org, laforge@gnumonks.org, kaber@trash.net, netfilter-devel@lists.netfilter.org Subject: Re: netfilter6: ICMPv6 type 143 doesn't match From: Yasuyuki Kozakai In-Reply-To: <6050E336B1A0D7D8E70C66F3@t1mobil.muc.aerasec.de> References: <6050E336B1A0D7D8E70C66F3@t1mobil.muc.aerasec.de> X-Mailer: Mew version 3.3 on Emacs 20.7 / Mule 4.0 (HANANOEN) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13041 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yasuyuki.kozakai@toshiba.co.jp Precedence: bulk X-list: 
netdev From: Peter Bieringer Date: Sat, 25 Dec 2004 18:47:52 +0100 > I tried several rules (don't wonder about the wrong order, it was a try and > error -I insert, uppest rule was inserted last): > > # ip6tables -vn -L OUTPUT > Chain OUTPUT (policy DROP 4 packets, 4872 bytes) > pkts bytes target prot opt in out source > destination > 2 192 ACCEPT all * eth0 ::/0 ::/0 > 0 0 ACCEPT icmpv6 * * ::/0 ::/0 > 0 0 ACCEPT icmpv6 * * ::/0 ::/0 > ipv6-icmp type 143 > 0 0 ACCEPT icmpv6 * * ::/0 > ff02::/16 ipv6-icmp type 143 > 0 0 ACCEPT icmpv6 * * ::/0 > ff02::/16 ipv6-icmp type 143 > 0 0 ACCEPT icmpv6 * * ::/0 > ff02::16/128 ipv6-icmp type 143 > > Packet dump: > > 18:46:07.984044 :: > ff02::16: HBH (rtalert: 0x0000) (padn)[icmp6 sum ok] > icmp6: type-#143 [hlim 1] (len 56) > 0x0000: 6000 0000 0038 0001 0000 0000 0000 0000 `....8.......... > 0x0010: 0000 0000 0000 0000 ff02 0000 0000 0000 ................ > 0x0020: 0000 0000 0000 0016 3a00 0502 0000 0100 ........:....... > 0x0030: 8f00 6b6a 0000 0002 0400 0000 ff05 0000 ..kj............ > 0x0040: 0000 0000 0000 0000 0001 0003 0400 0000 ................ > 0x0050: ff02 0000 0000 0000 0000 0000 0001 0002 ................ > > I wonder that only the proto "all" rule matches such packet. Well, for the Multicast Listener Report it seems that skb->data != skb->nh.ipv6h when the interface is up. But the IPv6 netfilter modules assume that skb->data == skb->nh.ipv6h, as the IPv4 netfilter modules do. Folks, is this a wrong or bad assumption? If so, I'll fix this problem in many modules as follows. --- linux-2.6.10/net/ipv6/netfilter/ip6_tables.c 2004-12-27 11:26:57.000000000 +0900 +++ linux-2.6.10-fixed/net/ipv6/netfilter/ip6_tables.c 2004-12-27 11:28:23.000000000 +0900 @@ -222,7 +222,7 @@ u_int16_t hdrlen; /* Header */ u_int16_t _fragoff = 0, *fp = NULL; - ptr = IPV6_HDR_LEN; + ptr = ((u8*)skb->nh.ipv6h - skb->data) + IPV6_HDR_LEN; while (ip6t_ext_hdr(currenthdr)) { /* Is there enough space for the next ext header? 
*/ Regards, ----------------------------------------------------------------- Yasuyuki KOZAKAI @ USAGI Project From ralf@linux-mips.org Sun Dec 26 20:59:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Sun, 26 Dec 2004 20:59:06 -0800 (PST) Received: from mail.linux-mips.net (p508B7C84.dip.t-dialin.net [80.139.124.132]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBR4wc4X001449 for ; Sun, 26 Dec 2004 20:58:59 -0800 Received: from fluff.linux-mips.net (localhost.localdomain [127.0.0.1]) by mail.linux-mips.net (8.13.1/8.13.1) with ESMTP id iBR5033N000994; Mon, 27 Dec 2004 06:00:03 +0100 Received: (from ralf@localhost) by fluff.linux-mips.net (8.13.1/8.13.1/Submit) id iBR4xui9000988; Mon, 27 Dec 2004 05:59:56 +0100 Date: Mon, 27 Dec 2004 05:59:55 +0100 From: Ralf Baechle To: domen@coderock.org Cc: jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: [patch 1/1] delete unused file Message-ID: <20041227045955.GA711@linux-mips.org> References: <20041226151017.C042C1ECC0@trashy.coderock.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041226151017.C042C1ECC0@trashy.coderock.org> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13042 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ralf@linux-mips.org Precedence: bulk X-list: netdev On Sun, Dec 26, 2004 at 04:10:32PM +0100, domen@coderock.org wrote: > Remove nowhere referenced file. (egrep "filename\." didn't find anything) > > Signed-off-by: Domen Puncer Great idea to send patches to delete everything that seems unused. NOT. Jeff, seems the gt96100 and gt64240 drivers got mixed up, will send a patch to fix this. 
Ralf
From yoshfuji@linux-ipv6.org Mon Dec 27 01:00:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 01:00:54 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBR90Ndb011307 for ; Mon, 27 Dec 2004 01:00:44 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id 8BCAD33CC2; Mon, 27 Dec 2004 18:02:05 +0900 (JST) Date: Mon, 27 Dec 2004 10:02:05 +0100 (CET) Message-Id: <20041227.100205.102356251.yoshfuji@linux-ipv6.org> To: yasuyuki.kozakai@toshiba.co.jp Cc: pb@bieringer.de, netdev@oss.sgi.com, usagi-users@linux-ipv6.org, laforge@gnumonks.org, kaber@trash.net, netfilter-devel@lists.netfilter.org, yoshfuji@linux-ipv6.org Subject: Re: netfilter6: ICMPv6 type 143 doesn't match From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <200412270417.iBR4HZRG021429@toshiba.co.jp> References: <6050E336B1A0D7D8E70C66F3@t1mobil.muc.aerasec.de> <200412270417.iBR4HZRG021429@toshiba.co.jp> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13044 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com 
X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <200412270417.iBR4HZRG021429@toshiba.co.jp> (at Mon, 27 Dec 2004 13:17:34 +0900 (JST)), Yasuyuki Kozakai says: > > - ptr = IPV6_HDR_LEN; > + ptr = ((u8*)skb->nh.ipv6h - skb->data) + IPV6_HDR_LEN; > IMHO, skb->nh.ipv6h does not point to the IPv6 header anymore; it should be skb->nh.raw in this case. --yoshfuji From jgarzik@pobox.com Mon Dec 27 01:33:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 01:33:43 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBR9XFbK012693 for ; Mon, 27 Dec 2004 01:33:35 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1CirH1-0003kr-HI; Mon, 27 Dec 2004 09:34:43 +0000 Message-ID: <41CFD72C.6090503@pobox.com> Date: Mon, 27 Dec 2004 04:34:36 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: David Dillow CC: Netdev Subject: Re: [BK netdev-2.6] Update Typhoon firmware References: <1103314025.4217.1.camel@ori.thedillows.org> In-Reply-To: <1103314025.4217.1.camel@ori.thedillows.org> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13045 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev David Dillow wrote: > Jeff, please do a > > bk pull http://typhoon.bkbits.net/typhoon-2.6-firmware [jgarzik@pretzel net-drivers-2.6]$ bk pull http://typhoon.bkbits.net/typhoon-2.6-firmware Pull http://typhoon.bkbits.net/typhoon-2.6-firmware -> file://garz/repo/net-drivers-2.6 
ERROR-cannot cd to typhoon-2.6-firmware (illegal, nonexistant, or not package root) From jgarzik@pobox.com Mon Dec 27 02:03:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 02:03:20 -0800 (PST) Received: from www.linux.org.uk (IDENT:93@parcelfarce.linux.theplanet.co.uk [195.92.249.252]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRA2pTR013986 for ; Mon, 27 Dec 2004 02:03:11 -0800 Received: from [69.134.152.124] (helo=[10.10.10.88]) by www.linux.org.uk with asmtp (TLSv1:AES256-SHA:256) (Exim 4.33) id 1Cirjd-0005Ll-IB; Mon, 27 Dec 2004 10:04:17 +0000 Message-ID: <41CFDE1D.6000802@pobox.com> Date: Mon, 27 Dec 2004 05:04:13 -0500 From: Jeff Garzik User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20040922 X-Accept-Language: en-us, en MIME-Version: 1.0 To: =?UTF-8?B?WU9TSElGVUpJIEhpZGVha2kgLyDlkInol6Toi7HmmI4=?= , Herbert Xu , Arnaldo Carvalho de Melo CC: Netdev Subject: Badness in dst_release (continuing saga) Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13046 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jgarzik@pobox.com Precedence: bulk X-list: netdev Ok, it's happened again. x86 SMP, 2.6.10-rc3-bk9. I have left the machine up without rebooting, those with accounts may ssh in. First message occurs on Dec 24 in /var/log/messages.1, and they continue in /var/log/messages and dmesg. ssh to gw.yyz.us. NOTE: it would be wise to "ssh -4" to avoid continuing log spam, when you ssh into the box. 
Jeff Badness in dst_release at include/net/dst.h:149 [] ip6_dst_check+0x64/0x6a [ipv6] [] ip6_dst_lookup+0x1a7/0x1c1 [ipv6] [] udpv6_sendmsg+0x297/0x931 [ipv6] [] udp_recvmsg+0x60/0x2e9 [] inet_sendmsg+0x4d/0x59 [] sock_sendmsg+0xe8/0x103 [] find_busiest_group+0xcf/0x2db [] load_balance_newidle+0x36/0xb4 [] copy_from_user+0x42/0x6e [] autoremove_wake_function+0x0/0x57 [] sys_sendmsg+0x189/0x1e6 [] __wake_up_common+0x3f/0x5e [] __wake_up+0x40/0x56 [] wake_futex+0x37/0x62 [] futex_wake+0x74/0xc4 [] copy_from_user+0x42/0x6e [] sys_socketcall+0x236/0x254 [] sysenter_past_esp+0x52/0x75 From domen@coderock.org Mon Dec 27 03:12:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 03:12:58 -0800 (PST) Received: from trashy.coderock.org (postfix@coderock.org [193.77.147.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRBCUhr018521 for ; Mon, 27 Dec 2004 03:12:51 -0800 Received: by trashy.coderock.org (Postfix, from userid 780) id 25C8C1F12C; Mon, 27 Dec 2004 12:13:59 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id 63B641F12A; Mon, 27 Dec 2004 12:13:58 +0100 (CET) Received: from localhost (coderock.org [193.77.147.115]) by trashy.coderock.org (Postfix) with ESMTP id C17C21F126; Mon, 27 Dec 2004 12:13:56 +0100 (CET) Date: Mon, 27 Dec 2004 12:14:19 +0100 From: Domen Puncer To: Ralf Baechle Cc: jgarzik@pobox.com, netdev@oss.sgi.com Subject: Re: delete unused file Message-ID: <20041227111419.GA3768@masina.coderock.org> References: <20041226151017.C042C1ECC0@trashy.coderock.org> <20041227045955.GA711@linux-mips.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041227045955.GA711@linux-mips.org> User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13047 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com 
Errors-to: netdev-bounce@oss.sgi.com X-original-sender: domen@coderock.org Precedence: bulk X-list: netdev On 27/12/04 05:59 +0100, Ralf Baechle wrote: > On Sun, Dec 26, 2004 at 04:10:32PM +0100, domen@coderock.org wrote: > > > Remove nowhere referenced file. (egrep "filename\." didn't find anything) > > > > Signed-off-by: Domen Puncer > > Great idea to send patches to delete everything that seems unused. NOT. It's the best way to make someone aware something might be wrong. ;-) > > Jeff, seems the gt96100 and gt64240 drivers got mixed up, will send a patch > to fix this. > > Ralf From tgraf@suug.ch Mon Dec 27 04:15:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 04:15:42 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRCF910021567 for ; Mon, 27 Dec 2004 04:15:29 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 2063FF; Mon, 27 Dec 2004 13:16:15 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 48DA71C0EA; Mon, 27 Dec 2004 13:16:58 +0100 (CET) Date: Mon, 27 Dec 2004 13:16:58 +0100 From: Thomas Graf To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 
Message-ID: <20041227121658.GI7884@postel.suug.ch> References: <200412270715.iBR7Fffe026855@hera.kernel.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200412270715.iBR7Fffe026855@hera.kernel.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13048 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev > ChangeSet 1.2055.37.1, 2004/11/17 16:08:01-08:00, util@deuroconsult.ro > > [PKT_SCHED]: Allow using nfmark as key in U32 classifier. > > Signed-off-by: Catalin(ux aka Dino) BOIE > Signed-off-by: David S. Miller I must have missed this one. This should have been implemented in the metadata action module to make it available to all classifiers. We should really stop adding more stuff to specific classifiers, which will have to be removed once we have metamatch. I've made a proposal on paper; I just need some more time to cook up a patch. 
> +#ifdef CONFIG_CLS_U32_MARK > + if (tb[TCA_U32_MARK-1]) { > + if (RTA_PAYLOAD(tb[TCA_U32_MARK-1]) < sizeof(struct tc_u32_mark)) > + return -EINVAL; > + mark = RTA_DATA(tb[TCA_U32_MARK-1]); > + memcpy(&n->mark, mark, sizeof(struct tc_u32_mark)); > + n->mark.success = 0; > + } > +#endif This should have gone into u32_set_params From l.bortot@inet.it Mon Dec 27 05:58:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 05:58:56 -0800 (PST) Received: from hal-5.inet.it (hal-5.inet.it [213.92.5.24]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRDwRYK029547 for ; Mon, 27 Dec 2004 05:58:48 -0800 Received: from bove.bortot.it [::ffff:213.92.7.98] by hal-5.inet.it via I-SMTP-5.2.1-520 id ::ffff:213.92.7.98+iamX8KmGsl2kG; Mon, 27 Dec 2004 14:59:55 +0100 Message-ID: <41D01562.4090606@inet.it> Date: Mon, 27 Dec 2004 15:00:02 +0100 From: Luca Bortot User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu , netdev@oss.sgi.com Subject: Re: jumbo on 8169 References: <41CFF27A.2070008@inet.it> <20041227123136.GA25187@electric-eye.fr.zoreil.com> In-Reply-To: <20041227123136.GA25187@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13049 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: l.bortot@inet.it Precedence: bulk X-list: netdev Francois Romieu wrote: > Luca Bortot : >>To put it short, could you please give me a link or a hint or whatever >>to let me set jumbo frames on r8169 (I'm currently running kernel >>6.9.10/i386)? 
> > > You can use any recent patch issued by Andrew Morton (-mm) or apply > http://www.fr.zoreil.com/people/francois/misc/20041218-2.6.10-rc3-r8169.c-test.patch > > I have not regenerated the whole patch against 2.6.10 yet. So if you want > to apply the aforementionned patch on top of 2.6.10, you will have to revert > (cd linux-2.6.10; patch -R -p1 -d. < ...) the attached patch first. > > Please note that you will be limited to ~7000 bytes frames at most (but it > is enough to make a noticeable difference). > > Success/failure report + description of the hardware (lspci -vx/dmesg) will > be welcome. It worked as it should: I applied the patch, recompiled and rebooted, and could now run ifconfig eth2 mtu 7000. Hardware in short: Intel P3 800 MHz, 384 MB RAM, m/b QDI Advance 9, NIC Hamlet HNNG32TX (Realtek 8169 based), running Fedora Core 3 / kernel 2.6.10 / NAPI enabled. I'm testing it together with a Windows box (which is directly connected via a crossover cable): Athlon XP 2600, 1 GB RAM, m/b Asus A7N8X, same NIC, Windows XP. Based on a simple TCP test I made (it writes zeroes to a socket in 32Kb blocks and prints the write speed), these are the results (Windows box CPU not reported, always under 10% load): BEFORE PATCH (mtu 1500): speed ~38 MB/s, cpu idle 10%, cpu system 90%. AFTER PATCH (mtu 7000): speed ~45 MB/s, cpu idle 40%, cpu system 60%. As requested, lspci -vx: 00:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. 
RTL-8169 Gigabit Ethernet Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 5 I/O ports at dc00 [size=256] Memory at e6603000 (32-bit, non-prefetchable) [size=256] Expansion ROM at e3000000 [disabled] [size=128K] Capabilities: [dc] Power Management version 2 00: ec 10 69 81 17 00 b0 02 10 00 00 02 08 40 00 00 10: 01 dc 00 00 00 30 60 e6 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 ec 10 69 81 30: 00 00 00 e3 dc 00 00 00 00 00 00 00 05 01 20 40 Thanks for helping Luca Bortot From kaber@trash.net Mon Dec 27 09:17:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 09:17:40 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRHHD4O007652 for ; Mon, 27 Dec 2004 09:17:33 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CiyW8-0005uP-JA; Mon, 27 Dec 2004 18:18:48 +0100 Message-ID: <41D043AC.2070203@trash.net> Date: Mon, 27 Dec 2004 18:17:32 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Alan Cox CC: torvalds@osdl.org, Linux Kernel Mailing List , Maillist netdev Subject: Re: PATCH: kmalloc packet slab References: <1104156983.20944.25.camel@localhost.localdomain> In-Reply-To: <1104156983.20944.25.camel@localhost.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13050 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Alan Cox wrote: > The networking world runs in 1514 byte packets pretty much all the time. 
> This adds a 1620 byte slab for such objects and is one of the internally > generated Red Hat patches we use on things like Fedora Core 3. Original: > Arjan van de Ven. > > Signed-off-by: Alan Cox Why 1620 bytes ? Most drivers allocate packet_size + 2 bytes. dev_alloc_skb adds another 16 bytes, finally alloc_skb adds sizeof(struct skb_shared_info). So we get: (32bit): 1514b + 2b + 16b + 160b = 1692b (64bit): 1514b + 2b + 16b + 312b = 1844b On paths using alloc_skb instead of dev_alloc_skb it's 16 bytes less, but 1620 bytes is still too small for full-sized packets. Regards Patrick From romieu@fr.zoreil.com Mon Dec 27 09:48:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 09:48:32 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRHm3h6009179 for ; Mon, 27 Dec 2004 09:48:24 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.1/8.12.1) with ESMTP id iBRHl5hb028785 for ; Mon, 27 Dec 2004 18:47:05 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.1/8.13.1/Submit) id iBRHl0CM028784 for netdev@oss.sgi.com; Mon, 27 Dec 2004 18:47:00 +0100 Resent-Message-Id: <200412271747.iBRHl0CM028784@electric-eye.fr.zoreil.com> Date: Mon, 27 Dec 2004 17:38:02 +0100 From: Francois Romieu To: Luca Bortot Cc: netdev@oss.sgi.com Subject: Re: jumbo on 8169 Message-ID: <20041227163802.GA27692@electric-eye.fr.zoreil.com> References: <41CFF27A.2070008@inet.it> <20041227123136.GA25187@electric-eye.fr.zoreil.com> <41D01562.4090606@inet.it> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41D01562.4090606@inet.it> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. 
Resent-From: romieu@fr.zoreil.com Resent-Date: Mon, 27 Dec 2004 18:47:00 +0100 Resent-To: netdev@oss.sgi.com X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13051 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev Luca Bortot : [...] > based on a simple tcp test I made (writes zeroes to a socket in 32Kb > blocks and prints the write speed), these are the results (win box cpu > not reported - always under 10% load): > > BEFORE PATCH (mtu 1500) > speed ~38 MB/s > cpu idle 10% > cpu system 90% > > AFTER PATCH (mtu 7000) > speed ~45MB/s > cpu idle 40% > cpu system 60% TSO may make a difference for a TCP test. See ethtool help to enable it. You can experiment with Tx csum/SG as well. I'll welcome a complete dmesg and lspci -vx as they pretty well describe the working combinations. -- Ueimor From tgraf@suug.ch Mon Dec 27 11:36:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 11:37:00 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRJaWlR012507 for ; Mon, 27 Dec 2004 11:36:53 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4ED06F; Mon, 27 Dec 2004 20:37:39 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 499161C0EA; Mon, 27 Dec 2004 20:38:22 +0100 (CET) Date: Mon, 27 Dec 2004 20:38:22 +0100 From: Thomas Graf To: "David S. 
Miller" Cc: netdev@oss.sgi.com Subject: [PATCH] PKT_SCHED: dsmark should ignore ECN bits Message-ID: <20041227193822.GK7884@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13052 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Taking ECN bits into account doesn't make sense. The two bits were still unused at the time the code was written so this brings back the old behaviour. Signed-off-by: Thomas Graf --- linux-2.6.10-bk1.orig/net/sched/sch_dsmark.c 2004-12-27 14:32:33.000000000 +0100 +++ linux-2.6.10-bk1/net/sched/sch_dsmark.c 2004-12-27 20:30:40.000000000 +0100 @@ -14,6 +14,7 @@ #include #include #include +#include #include @@ -198,10 +199,12 @@ /* FIXME: Safe with non-linear skbs? --RR */ switch (skb->protocol) { case __constant_htons(ETH_P_IP): - skb->tc_index = ipv4_get_dsfield(skb->nh.iph); + skb->tc_index = ipv4_get_dsfield(skb->nh.iph) + & ~INET_ECN_MASK; break; case __constant_htons(ETH_P_IPV6): - skb->tc_index = ipv6_get_dsfield(skb->nh.ipv6h); + skb->tc_index = ipv6_get_dsfield(skb->nh.ipv6h) + & ~INET_ECN_MASK; break; default: skb->tc_index = 0; From davem@davemloft.net Mon Dec 27 14:24:01 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 14:24:08 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRMNfUP020835 for ; Mon, 27 Dec 2004 14:24:01 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj3HK-0002GE-00; Mon, 27 Dec 2004 14:23:50 -0800 Date: Mon, 27 Dec 2004 14:23:50 -0800 From: "David S. 
Miller" To: Patrick McHardy Cc: alan@lxorguk.ukuu.org.uk, torvalds@osdl.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: PATCH: kmalloc packet slab Message-Id: <20041227142350.1cf444fe.davem@davemloft.net> In-Reply-To: <41D043AC.2070203@trash.net> References: <1104156983.20944.25.camel@localhost.localdomain> <41D043AC.2070203@trash.net> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13053 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 27 Dec 2004 18:17:32 +0100 Patrick McHardy wrote: > Alan Cox wrote: > > The networking world runs in 1514 byte packets pretty much all the time. > > This adds a 1620 byte slab for such objects and is one of the internally > > generated Red Hat patches we use on things like Fedora Core 3. Original: > > Arjan van de Ven. > > > > Signed-off-by: Alan Cox > > Why 1620 bytes ? Most drivers allocate packet_size + 2 bytes. > dev_alloc_skb adds another 16 bytes, finally alloc_skb adds > sizeof(struct skb_shared_info). So we get: > > (32bit): 1514b + 2b + 16b + 160b = 1692b > (64bit): 1514b + 2b + 16b + 312b = 1844b > > On paths using alloc_skb instead of dev_alloc_skb it's 16 bytes > less, but 1620 bytes is still too small for full-sized packets. Absolutely, there is no way this patch actually helps for full sized frames. Another thing in the above equations is that on output you have to add in MAX_TCP_HEADER which is 128 + MAX_HEADER. 
MAX_HEADER is variable sized based upon which link layer support is built into the kernel. Even on input, many ethernet device drivers add in their own amounts to the size for DMA and cache-line alignment. So this special slab would never be used on output even if it got the base equations correct. If we are really going to do something like this, it should be calculated properly and be determined per-interface type as netdevs are registered. Special casing ethernet is just ridiculous. From Valdis.Kletnieks@vt.edu Mon Dec 27 14:49:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 14:49:17 -0800 (PST) Received: from h80ad25c6.async.vt.edu (IDENT:root@h80ad25c6.async.vt.edu [128.173.37.198]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRMmmgx022032 for ; Mon, 27 Dec 2004 14:49:09 -0800 Received: from turing-police.cc.vt.edu (IDENT:valdis@turing-police.cc.vt.edu [127.0.0.1]) by turing-police.cc.vt.edu (8.13.2/8.13.2) with ESMTP id iBRMo2Qb011114; Mon, 27 Dec 2004 17:50:02 -0500 Message-Id: <200412272250.iBRMo2Qb011114@turing-police.cc.vt.edu> X-Mailer: exmh version 2.7.1 10/11/2004 with nmh-1.1-RC3 To: "David S. Miller" Cc: Patrick McHardy , alan@lxorguk.ukuu.org.uk, torvalds@osdl.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: PATCH: kmalloc packet slab In-Reply-To: Your message of "Mon, 27 Dec 2004 14:23:50 PST." 
<20041227142350.1cf444fe.davem@davemloft.net> From: Valdis.Kletnieks@vt.edu References: <1104156983.20944.25.camel@localhost.localdomain> <41D043AC.2070203@trash.net> <20041227142350.1cf444fe.davem@davemloft.net> Mime-Version: 1.0 Content-Type: multipart/signed; boundary="==_Exmh_-1419475951P"; micalg=pgp-sha1; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit Date: Mon, 27 Dec 2004 17:50:01 -0500 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13054 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Valdis.Kletnieks@vt.edu Precedence: bulk X-list: netdev --==_Exmh_-1419475951P Content-Type: text/plain; charset=us-ascii On Mon, 27 Dec 2004 14:23:50 PST, "David S. Miller" said: > If we are really going to do something like this, it should > be calculated properly and be determined per-interface > type as netdevs are registered. Would you prefer to see this done for all interface types if we do it at all, or would a special-case for 1 or 2 types that can use a slab without being wasteful be an acceptable solution? (Let's face it - if 3.95 objects fit in each slab, we may not want to do it...) 
--==_Exmh_-1419475951P Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.6 (GNU/Linux) Comment: Exmh version 2.5 07/13/2001 iD8DBQFB0JGZcC3lWbTT17ARAtkPAKDjhu4Ocy8aQbbY8GwpjCG9aaTu+wCgybsg s31h8DjurnRR1B6j4DuSr7Y= =3OVm -----END PGP SIGNATURE----- --==_Exmh_-1419475951P-- From davem@davemloft.net Mon Dec 27 14:58:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 14:58:20 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBRMvrhd022722 for ; Mon, 27 Dec 2004 14:58:13 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj3oV-0002PJ-00; Mon, 27 Dec 2004 14:58:07 -0800 Date: Mon, 27 Dec 2004 14:58:07 -0800 From: "David S. Miller" To: Valdis.Kletnieks@vt.edu Cc: kaber@trash.net, alan@lxorguk.ukuu.org.uk, torvalds@osdl.org, linux-kernel@vger.kernel.org, netdev@oss.sgi.com Subject: Re: PATCH: kmalloc packet slab Message-Id: <20041227145807.73803fa8.davem@davemloft.net> In-Reply-To: <200412272250.iBRMo2Qb011114@turing-police.cc.vt.edu> References: <1104156983.20944.25.camel@localhost.localdomain> <41D043AC.2070203@trash.net> <20041227142350.1cf444fe.davem@davemloft.net> <200412272250.iBRMo2Qb011114@turing-police.cc.vt.edu> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13055 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: 
bulk X-list: netdev On Mon, 27 Dec 2004 17:50:01 -0500 Valdis.Kletnieks@vt.edu wrote: > On Mon, 27 Dec 2004 14:23:50 PST, "David S. Miller" said: > > > If we are really going to do something like this, it should > > be calculated properly and be determined per-interface > > type as netdevs are registered. > > Would you prefer to see this done for all interface types if we do it > at all, or would a special-case for 1 or 2 types that can use a slab > without being wasteful be an acceptable solution? (Let's face it - if > 3.95 objects fit in each slab, we may not want to do it...) It's not even just device MTU based (which can change dynamically at run time), it's also based upon the PMTU for various paths. As for wastefulness, that's a good question. Adding a mechanism to do kmalloc slabs dynamically doesn't sound all that wise. That would undo all the inlining tricks. Probably a better idea is to provide a way to attach a slab to an SKB's data area so that we can have per-device SLABs for this kind of stuff (and if all "ethernet" devices want to share the same SLAB, that's fine too, but it won't help all ethernet drivers for reasons outlined in my previous email). We added something similar for the Xen folks, and it's in Linus's BK tree right now. It's named alloc_skb_from_cache(). What I'd really like to see is device based determination of the correct slab to use. Unfortunately, dev_alloc_skb() doesn't take a netdev argument, which is truly offensive. Otherwise we could just stick the necessary logic there. 
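A minimal sketch of the size arithmetic raised earlier in this thread, using only the figures Patrick McHardy quoted (a 1514-byte frame, 2 bytes of driver padding, 16 bytes added by dev_alloc_skb, and 160 or 312 bytes for struct skb_shared_info). The function names, the 4096-byte page size and the one-page-per-slab packing model are illustrative assumptions, not kernel behaviour, and the sketch leaves out MAX_TCP_HEADER and the driver alignment padding mentioned above.

```python
# Figures below are the ones quoted in the thread, not read from kernel headers.
PACKET_SIZE = 1514          # full-sized Ethernet frame
DRIVER_PAD = 2              # "+ 2 bytes" that most drivers allocate
DEV_ALLOC_SKB_PAD = 16      # extra bytes added by dev_alloc_skb
SHARED_INFO_SIZE = {"32bit": 160, "64bit": 312}  # sizeof(struct skb_shared_info) as quoted
PROPOSED_SLAB_SIZE = 1620   # object size of the proposed slab
PAGE_SIZE = 4096            # assumption: 4 KiB pages, one page per slab, no per-object overhead


def alloc_size(arch, via_dev_alloc_skb=True):
    """Bytes requested for one full-sized packet buffer."""
    size = PACKET_SIZE + DRIVER_PAD + SHARED_INFO_SIZE[arch]
    if via_dev_alloc_skb:
        size += DEV_ALLOC_SKB_PAD
    return size


def objects_per_page(obj_size, page_size=PAGE_SIZE):
    """Return (objects that fit, bytes left over) under the one-page model."""
    count = page_size // obj_size
    return count, page_size - count * obj_size


for arch in ("32bit", "64bit"):
    for via_dev in (True, False):
        size = alloc_size(arch, via_dev)
        fits = size <= PROPOSED_SLAB_SIZE
        count, waste = objects_per_page(size)
        path = "dev_alloc_skb" if via_dev else "alloc_skb"
        print(f"{arch} {path}: {size} bytes, fits in {PROPOSED_SLAB_SIZE}-byte slab: {fits}, "
              f"{count} per page, {waste} bytes unused")
```

Under these assumptions every combination exceeds 1620 bytes, which matches the objection that the proposed slab is too small for full-sized packets. The per-page figures are only a rough indication of the wastefulness question, since a real slab cache may span several pages.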
From romieu@fr.zoreil.com Mon Dec 27 16:16:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 16:16:51 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS0GN1C025002 for ; Mon, 27 Dec 2004 16:16:43 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.1/8.12.1) with ESMTP id iBS0GpTC006841; Tue, 28 Dec 2004 01:16:51 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.1/8.13.1/Submit) id iBS0Gk2V006840; Tue, 28 Dec 2004 01:16:46 +0100 Date: Tue, 28 Dec 2004 01:16:46 +0100 From: Francois Romieu To: linux-kernel@vger.kernel.org Cc: Luca Bortot , netdev@oss.sgi.com Subject: [RFT] r8169 changes in 2.6.x Message-ID: <20041228001646.GA29512@electric-eye.fr.zoreil.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13056 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev The r8169 changes which Jeff kindly hosted in -netdev are now in 2.6.10-bk1. This release should definitely improve the buzzword compliance of the r8169 driver as it now features Tx/Rx IP checksumming, TSO, VLAN, netconsole and more complete ethtool support. Though available in -mm, large frames support is not included in 2.6.10-bk so far. It is available on top of 2.6.10-bk1 via: - http://www.fr.zoreil.com/~romieu/misc/20041228-2.6.10-bk1-r8169.c-test.patch (single patch) or: - http://www.fr.zoreil.com/linux/kernel/2.6.x/2.6.10-bk1/r8169/patches/ (a series of smaller patches for review) Large frames roughly means 7200 bytes frames at most. 
Test results for regressions/improvements/stability will be welcome. A backport is available for 2.4.x (x >= 28) at: - http://www.fr.zoreil.com/~romieu/misc/20041209-2.4.28-r8169.c-test.patch or: - http://www.fr.zoreil.com/linux/kernel/2.4.x/2.4.28/r8169/patches Thank you for your attention. -- Ueimor From acme@conectiva.com.br Mon Dec 27 17:02:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 17:02:35 -0800 (PST) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS126nK030768 for ; Mon, 27 Dec 2004 17:02:27 -0800 Received: by perninha.conectiva.com.br (Postfix, from userid 568) id 7EB0F4780A; Mon, 27 Dec 2004 23:03:34 -0200 (BRST) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id 39488477F5 for ; Mon, 27 Dec 2004 23:03:34 -0200 (BRST) Received: (qmail 1216 invoked by uid 0); 28 Dec 2004 02:00:00 -0000 Received: from mapi8.distro.conectiva (HELO oops.ghostprotocols.net) (10.0.16.10) by burns.conectiva with SMTP; 28 Dec 2004 02:00:00 -0000 Received: from amd64.kerneljanitors.org (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 3CA6B74894; Mon, 27 Dec 2004 23:03:25 -0200 (BRST) From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. 
To: davem@davemloft.net Subject: Re: [PATCH][INET] move inet_sock into inet_opt and rename it to inet_sock Date: Mon, 27 Dec 2004 23:06:57 -0200 User-Agent: KMail/1.7.2 References: <41CE198B.7040005@conectiva.com.br> In-Reply-To: <41CE198B.7040005@conectiva.com.br> Cc: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Disposition: inline Message-Id: <200412272306.58203.acme@conectiva.com.br> X-Bogosity: No, tests=bogofilter, spamicity=0.108924, version=0.16.3 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBS126nK030768 X-archive-position: 13057 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev On Sat 25 Dec 2004 23:53, you wrote: > Hi David, > > Now that 2.6.10 is out, please take a look if this is acceptable, > the following patches will deal with udp_sock, tcp_sock, etc. > > This is the start of the series of patches that will introduce > struct connection_sock, reducing the memory used by non connected > protocols, such as UDP. > > It is available at: > > bk://kernel.bkbits.net/acme/connection_sock-2.6 Dave, wait a while, I've just tried it with current BK, after Linus merged your 2.6.11 netdev queue and it breaks with things like Stephen's TCP ephemeral port changesets... 
- Arnaldo From alan@lxorguk.ukuu.org.uk Mon Dec 27 17:54:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 17:54:37 -0800 (PST) Received: from localhost.localdomain (clock-tower.bc.nu [81.2.110.250] (may be forged)) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS1s9Yb032437 for ; Mon, 27 Dec 2004 17:54:30 -0800 Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.12.11/8.12.11) with ESMTP id iBS0pWse022359; Tue, 28 Dec 2004 00:51:32 GMT Received: (from alan@localhost) by localhost.localdomain (8.12.11/8.12.11/Submit) id iBS0pSQc022358; Tue, 28 Dec 2004 00:51:28 GMT X-Authentication-Warning: localhost.localdomain: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: PATCH: kmalloc packet slab From: Alan Cox To: "David S. Miller" Cc: Patrick McHardy , torvalds@osdl.org, Linux Kernel Mailing List , netdev@oss.sgi.com In-Reply-To: <20041227142350.1cf444fe.davem@davemloft.net> References: <1104156983.20944.25.camel@localhost.localdomain> <41D043AC.2070203@trash.net> <20041227142350.1cf444fe.davem@davemloft.net> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1104195085.20898.62.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Tue, 28 Dec 2004 00:51:28 +0000 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13058 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Mon, 2004-12-27 at 22:23, David S. Miller wrote: > If we are really going to do something like this, it should > be calculated properly and be determined per-interface > type as netdevs are registered. Fine by me, I'm just going through plausible looking changes in the Red Hat tree. 
You might want to slightly injure someone internally until they drop that too 8) Alan From davem@davemloft.net Mon Dec 27 18:19:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:19:13 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2Ihiv001030 for ; Mon, 27 Dec 2004 18:19:03 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj6wm-0002hs-00; Mon, 27 Dec 2004 18:18:52 -0800 Date: Mon, 27 Dec 2004 18:18:52 -0800 From: "David S. Miller" To: Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: Re: [PATCH][INET] move inet_sock into inet_opt and rename it to inet_sock Message-Id: <20041227181852.036b6a62.davem@davemloft.net> In-Reply-To: <200412272306.58203.acme@conectiva.com.br> References: <41CE198B.7040005@conectiva.com.br> <200412272306.58203.acme@conectiva.com.br> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13059 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 27 Dec 2004 23:06:57 -0200 Arnaldo Carvalho de Melo wrote: > Dave, wait a while, I've just tried it with current BK, after Linus merged > your 2.6.11 netdev queue and it breaks with things like Stephen's TCP > ephemeral port changesets... Ok, no problem. 
From acme@conectiva.com.br Mon Dec 27 18:23:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:24:03 -0800 (PST) Received: from perninha.conectiva.com.br (perninha.conectiva.com.br [200.140.247.100]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2NVBf001602 for ; Mon, 27 Dec 2004 18:23:51 -0800 Received: by perninha.conectiva.com.br (Postfix, from userid 568) id 5EB4647808; Tue, 28 Dec 2004 00:24:58 -0200 (BRST) Received: from burns.conectiva (burns.conectiva [10.0.0.4]) by perninha.conectiva.com.br (Postfix) with SMTP id 25BF447884 for ; Tue, 28 Dec 2004 00:24:55 -0200 (BRST) Received: (qmail 10554 invoked by uid 0); 28 Dec 2004 03:21:20 -0000 Received: from mapi8.distro.conectiva (HELO oops.ghostprotocols.net) (10.0.16.10) by burns.conectiva with SMTP; 28 Dec 2004 03:21:20 -0000 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 86532748EC; Tue, 28 Dec 2004 00:24:44 -0200 (BRST) Message-ID: <41D0C4C2.6030000@conectiva.com.br> Date: Tue, 28 Dec 2004 00:28:18 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH][INET] move inet_sock into inet_opt and rename it to inet_sock References: <41CE198B.7040005@conectiva.com.br> In-Reply-To: <41CE198B.7040005@conectiva.com.br> Content-Type: multipart/mixed; boundary="------------000008060804040405080203" X-Bogosity: No, tests=bogofilter, spamicity=0.499802, version=0.16.3 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13060 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev This is a multi-part message in MIME format. 
--------------000008060804040405080203 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi David, Now that 2.6.10 is out and your 2.6.11 queue was merged, please take a look and see if this is acceptable; the following patches will deal with udp_sock, tcp_sock, etc. This is the start of the series of patches that will introduce struct connection_sock, reducing the memory used by non-connected protocols, such as UDP. It is available at: bk://kernel.bkbits.net/acme/connection_sock-2.6 Best Regards, - Arnaldo --------------000008060804040405080203 Content-Type: text/plain; name="inet_sock.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="inet_sock.patch" =================================================================== ChangeSet@1.2194, 2004-12-27 23:57:10-02:00, acme@conectiva.com.br [INET] move inet_sock into inet_opt and rename it to inet_sock With this we can remove all the cut'n'pasted layouts in all inet_sock derived classes, such as tcp_sock, udp_sock, sctp_sock, etc. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. 
Miller include/linux/ip.h | 24 ++++++++++-------------- include/linux/ipv6.h | 14 ++++---------- include/linux/tcp.h | 6 +----- include/linux/udp.h | 6 +----- include/net/icmp.h | 8 +------- include/net/sctp/sctp.h | 12 +++--------- include/net/tcp.h | 2 +- net/ipv4/af_inet.c | 10 +++++----- net/ipv4/datagram.c | 2 +- net/ipv4/icmp.c | 4 ++-- net/ipv4/igmp.c | 16 ++++++++-------- net/ipv4/ip_output.c | 16 ++++++++-------- net/ipv4/ip_sockglue.c | 12 ++++++------ net/ipv4/ipvs/ip_vs_sync.c | 6 +++--- net/ipv4/netfilter/ip_conntrack_core.c | 2 +- net/ipv4/raw.c | 14 +++++++------- net/ipv4/tcp.c | 4 ++-- net/ipv4/tcp_diag.c | 14 +++++++------- net/ipv4/tcp_input.c | 2 +- net/ipv4/tcp_ipv4.c | 32 ++++++++++++++++---------------- net/ipv4/tcp_minisocks.c | 2 +- net/ipv4/tcp_output.c | 2 +- net/ipv4/tcp_timer.c | 2 +- net/ipv4/udp.c | 20 ++++++++++---------- net/ipv6/af_inet6.c | 10 ++++------ net/ipv6/datagram.c | 4 ++-- net/ipv6/ip6_output.c | 6 +++--- net/ipv6/raw.c | 10 +++++----- net/ipv6/tcp_ipv6.c | 20 ++++++++++---------- net/ipv6/udp.c | 12 ++++++------ net/sctp/input.c | 2 +- net/sctp/ipv6.c | 6 +++--- net/sctp/protocol.c | 4 ++-- security/selinux/avc.c | 4 ++-- 34 files changed, 139 insertions(+), 171 deletions(-) diff -Nru a/include/linux/ip.h b/include/linux/ip.h --- a/include/linux/ip.h 2004-12-28 00:19:40 -02:00 +++ b/include/linux/ip.h 2004-12-28 00:19:40 -02:00 @@ -107,7 +107,14 @@ #define optlength(opt) (sizeof(struct ip_options) + opt->optlen) -struct inet_opt { +struct ipv6_pinfo; + +struct inet_sock { + /* sk and pinet6 has to be the first two members of inet_sock */ + struct sock sk; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + struct ipv6_pinfo *pinet6; +#endif /* Socket demultiplex comparisons on incoming packets. 
*/ __u32 daddr; /* Foreign IPv4 addr */ __u32 rcv_saddr; /* Bound local IPv4 addr */ @@ -146,20 +153,9 @@ #define IPCORK_OPT 1 /* ip-options has been held in ipcork.opt */ -struct ipv6_pinfo; - -/* WARNING: don't change the layout of the members in inet_sock! */ -struct inet_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; -}; - -static inline struct inet_opt * inet_sk(const struct sock *__sk) +static inline struct inet_sock *inet_sk(const struct sock *sk) { - return &((struct inet_sock *)__sk)->inet; + return (struct inet_sock *)sk; } #endif diff -Nru a/include/linux/ipv6.h b/include/linux/ipv6.h --- a/include/linux/ipv6.h 2004-12-28 00:19:40 -02:00 +++ b/include/linux/ipv6.h 2004-12-28 00:19:40 -02:00 @@ -256,32 +256,26 @@ /* WARNING: don't change the layout of the members in {raw,udp,tcp}6_sock! */ struct raw6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct raw6_opt raw6; struct ipv6_pinfo inet6; }; struct udp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct udp_opt udp; struct ipv6_pinfo inet6; }; struct tcp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct tcp_opt tcp; struct ipv6_pinfo inet6; }; static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk) { - return ((struct raw6_sock *)__sk)->pinet6; + return inet_sk(__sk)->pinet6; } static inline struct raw6_opt * raw6_sk(const struct sock *__sk) diff -Nru a/include/linux/tcp.h b/include/linux/tcp.h --- a/include/linux/tcp.h 2004-12-28 00:19:40 -02:00 +++ b/include/linux/tcp.h 2004-12-28 00:19:40 -02:00 @@ -440,11 +440,7 @@ /* WARNING: don't change the layout of the members in tcp_sock! 
*/ struct tcp_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; + struct inet_sock inet; struct tcp_opt tcp; }; diff -Nru a/include/linux/udp.h b/include/linux/udp.h --- a/include/linux/udp.h 2004-12-28 00:19:40 -02:00 +++ b/include/linux/udp.h 2004-12-28 00:19:40 -02:00 @@ -53,11 +53,7 @@ /* WARNING: don't change the layout of the members in udp_sock! */ struct udp_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; + struct inet_sock inet; struct udp_opt udp; }; diff -Nru a/include/net/icmp.h b/include/net/icmp.h --- a/include/net/icmp.h 2004-12-28 00:19:40 -02:00 +++ b/include/net/icmp.h 2004-12-28 00:19:40 -02:00 @@ -50,15 +50,9 @@ struct icmp_filter filter; }; -struct ipv6_pinfo; - /* WARNING: don't change the layout of the members in raw_sock! */ struct raw_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif - struct inet_opt inet; + struct inet_sock inet; struct raw_opt raw4; }; diff -Nru a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h --- a/include/net/sctp/sctp.h 2004-12-28 00:19:40 -02:00 +++ b/include/net/sctp/sctp.h 2004-12-28 00:19:40 -02:00 @@ -584,26 +584,20 @@ /* WARNING: Do not change the layout of the members in sctp_sock! 
*/ struct sctp_sock { - struct sock sk; -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) - struct ipv6_pinfo *pinet6; -#endif /* CONFIG_IPV6 */ - struct inet_opt inet; + struct inet_sock inet; struct sctp_opt sctp; }; #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) struct sctp6_sock { - struct sock sk; - struct ipv6_pinfo *pinet6; - struct inet_opt inet; + struct inet_sock inet; struct sctp_opt sctp; struct ipv6_pinfo inet6; }; #endif /* CONFIG_IPV6 */ #define sctp_sk(__sk) (&((struct sctp_sock *)__sk)->sctp) -#define sctp_opt2sk(__sp) &container_of(__sp, struct sctp_sock, sctp)->sk +#define sctp_opt2sk(__sp) &container_of(__sp, struct sctp_sock, sctp)->inet.sk /* Is a socket of this style? */ #define sctp_style(sk, style) __sctp_style((sk), (SCTP_SOCKET_##style)) diff -Nru a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h 2004-12-28 00:19:40 -02:00 +++ b/include/net/tcp.h 2004-12-28 00:19:40 -02:00 @@ -196,7 +196,7 @@ unsigned char tw_rcv_wscale; __u16 tw_sport; /* Socket demultiplex comparisons on incoming packets. */ - /* these five are in inet_opt */ + /* these five are in inet_sock */ __u32 tw_daddr __attribute__((aligned(TCP_ADDRCMP_ALIGN_BYTES))); __u32 tw_rcv_saddr; diff -Nru a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c --- a/net/ipv4/af_inet.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/af_inet.c 2004-12-28 00:19:40 -02:00 @@ -131,7 +131,7 @@ void inet_sock_destruct(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); __skb_queue_purge(&sk->sk_receive_queue); __skb_queue_purge(&sk->sk_error_queue); @@ -173,7 +173,7 @@ static int inet_autobind(struct sock *sk) { - struct inet_opt *inet; + struct inet_sock *inet; /* We may need to bind the socket. 
*/ lock_sock(sk); inet = inet_sk(sk); @@ -232,7 +232,7 @@ struct sock *sk; struct list_head *p; struct inet_protosw *answer; - struct inet_opt *inet; + struct inet_sock *inet; struct proto *answer_prot; unsigned char answer_flags; char answer_no_check; @@ -389,7 +389,7 @@ { struct sockaddr_in *addr = (struct sockaddr_in *)uaddr; struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); unsigned short snum; int chk_addr_ret; int err; @@ -623,7 +623,7 @@ int *uaddr_len, int peer) { struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)uaddr; sin->sin_family = AF_INET; diff -Nru a/net/ipv4/datagram.c b/net/ipv4/datagram.c --- a/net/ipv4/datagram.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/datagram.c 2004-12-28 00:19:40 -02:00 @@ -22,7 +22,7 @@ int ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *usin = (struct sockaddr_in *) uaddr; struct rtable *rt; u32 saddr; diff -Nru a/net/ipv4/icmp.c b/net/ipv4/icmp.c --- a/net/ipv4/icmp.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/icmp.c 2004-12-28 00:19:40 -02:00 @@ -377,7 +377,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) { struct sock *sk = icmp_socket->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipcm_cookie ipc; struct rtable *rt = (struct rtable *)skb->dst; u32 daddr; @@ -1097,7 +1097,7 @@ void __init icmp_init(struct net_proto_family *ops) { - struct inet_opt *inet; + struct inet_sock *inet; int i; for (i = 0; i < NR_CPUS; i++) { diff -Nru a/net/ipv4/igmp.c b/net/ipv4/igmp.c --- a/net/ipv4/igmp.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/igmp.c 2004-12-28 00:19:40 -02:00 @@ -1617,7 +1617,7 @@ u32 addr = imr->imr_multiaddr.s_addr; struct ip_mc_socklist *iml, *i; struct 
in_device *in_dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int count = 0; if (!MULTICAST(addr)) @@ -1691,7 +1691,7 @@ int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *iml, **imlp; rtnl_lock(); @@ -1734,7 +1734,7 @@ u32 addr = mreqs->imr_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev = NULL; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; int i, j, rv; @@ -1852,7 +1852,7 @@ u32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *newpsl, *psl; if (!MULTICAST(addr)) @@ -1922,7 +1922,7 @@ u32 addr = msf->imsf_multiaddr; struct ip_mc_socklist *pmc; struct in_device *in_dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; if (!MULTICAST(addr)) @@ -1980,7 +1980,7 @@ struct sockaddr_in *psin; u32 addr; struct ip_mc_socklist *pmc; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_sf_socklist *psl; psin = (struct sockaddr_in *)&gsf->gf_group; @@ -2033,7 +2033,7 @@ */ int ip_mc_sf_allow(struct sock *sk, u32 loc_addr, u32 rmt_addr, int dif) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *pmc; struct ip_sf_socklist *psl; int i; @@ -2069,7 +2069,7 @@ void ip_mc_drop_socket(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_mc_socklist *iml; if (inet->mc_list == NULL) diff -Nru a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c --- a/net/ipv4/ip_output.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/ip_output.c 2004-12-28 00:19:40 -02:00 @@ -115,7 +115,7 @@ return 0; } -static inline int ip_select_ttl(struct inet_opt *inet, 
struct dst_entry *dst) +static inline int ip_select_ttl(struct inet_sock *inet, struct dst_entry *dst) { int ttl = inet->uc_ttl; @@ -131,7 +131,7 @@ int ip_build_and_send_pkt(struct sk_buff *skb, struct sock *sk, u32 saddr, u32 daddr, struct ip_options *opt) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct rtable *rt = (struct rtable *)skb->dst; struct iphdr *iph; @@ -297,7 +297,7 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok) { struct sock *sk = skb->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_options *opt = inet->opt; struct rtable *rt; struct iphdr *iph; @@ -712,7 +712,7 @@ struct ipcm_cookie *ipc, struct rtable *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sk_buff *skb; struct ip_options *opt = NULL; @@ -973,7 +973,7 @@ ssize_t ip_append_page(struct sock *sk, struct page *page, int offset, size_t size, int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sk_buff *skb; struct rtable *rt; struct ip_options *opt = NULL; @@ -1112,7 +1112,7 @@ { struct sk_buff *skb, *tmp_skb; struct sk_buff **tail_skb; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_options *opt = NULL; struct rtable *rt = inet->cork.rt; struct iphdr *iph; @@ -1217,7 +1217,7 @@ */ void ip_flush_pending_frames(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sk_buff *skb; while ((skb = __skb_dequeue_tail(&sk->sk_write_queue)) != NULL) @@ -1260,7 +1260,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *arg, unsigned int len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct { struct ip_options opt; char data[40]; diff -Nru a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c --- a/net/ipv4/ip_sockglue.c 2004-12-28 00:19:40 -02:00 +++ 
b/net/ipv4/ip_sockglue.c 2004-12-28 00:19:40 -02:00 @@ -112,7 +112,7 @@ void ip_cmsg_recv(struct msghdr *msg, struct sk_buff *skb) { - struct inet_opt *inet = inet_sk(skb->sk); + struct inet_sock *inet = inet_sk(skb->sk); unsigned flags = inet->cmsg_flags; /* Ordered by supposed usage frequency */ @@ -234,7 +234,7 @@ void ip_icmp_error(struct sock *sk, struct sk_buff *skb, int err, u16 port, u32 info, u8 *payload) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sock_exterr_skb *serr; if (!inet->recverr) @@ -263,7 +263,7 @@ void ip_local_error(struct sock *sk, int err, u32 daddr, u16 port, u32 info) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sock_exterr_skb *serr; struct iphdr *iph; struct sk_buff *skb; @@ -342,7 +342,7 @@ sin = &errhdr.offender; sin->sin_family = AF_UNSPEC; if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); sin->sin_family = AF_INET; sin->sin_addr.s_addr = skb->nh.iph->saddr; @@ -383,7 +383,7 @@ int ip_setsockopt(struct sock *sk, int level, int optname, char __user *optval, int optlen) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int val=0,err; if (level != SOL_IP) @@ -875,7 +875,7 @@ int ip_getsockopt(struct sock *sk, int level, int optname, char __user *optval, int __user *optlen) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int val; int len; diff -Nru a/net/ipv4/ipvs/ip_vs_sync.c b/net/ipv4/ipvs/ip_vs_sync.c --- a/net/ipv4/ipvs/ip_vs_sync.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/ipvs/ip_vs_sync.c 2004-12-28 00:19:40 -02:00 @@ -343,7 +343,7 @@ */ static void set_mcast_loop(struct sock *sk, u_char loop) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); /* setsockopt(sock, SOL_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)); */ lock_sock(sk); @@ -356,7 +356,7 @@ */ static void 
set_mcast_ttl(struct sock *sk, u_char ttl) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); /* setsockopt(sock, SOL_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)); */ lock_sock(sk); @@ -370,7 +370,7 @@ static int set_mcast_if(struct sock *sk, char *ifname) { struct net_device *dev; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if ((dev = __dev_get_by_name(ifname)) == NULL) return -ENODEV; diff -Nru a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c --- a/net/ipv4/netfilter/ip_conntrack_core.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/netfilter/ip_conntrack_core.c 2004-12-28 00:19:40 -02:00 @@ -1229,7 +1229,7 @@ static int getorigdst(struct sock *sk, int optval, void __user *user, int *len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ip_conntrack_tuple_hash *h; struct ip_conntrack_tuple tuple; diff -Nru a/net/ipv4/raw.c b/net/ipv4/raw.c --- a/net/ipv4/raw.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/raw.c 2004-12-28 00:19:40 -02:00 @@ -109,7 +109,7 @@ struct hlist_node *node; sk_for_each_from(sk, node) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == num && !(inet->daddr && inet->daddr != raddr) && @@ -181,7 +181,7 @@ void raw_err (struct sock *sk, struct sk_buff *skb, u32 info) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int type = skb->h.icmph->type; int code = skb->h.icmph->code; int err = 0; @@ -263,7 +263,7 @@ struct rtable *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int hh_len; struct iphdr *iph; struct sk_buff *skb; @@ -374,7 +374,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipcm_cookie ipc; struct rtable *rt = NULL; int free = 0; @@ -537,7 
+537,7 @@ /* This gets rid of all the nasties in af_inet. -DaveM */ static int raw_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *addr = (struct sockaddr_in *) uaddr; int ret = -EINVAL; int chk_addr_ret; @@ -565,7 +565,7 @@ int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); size_t copied = 0; int err = -EOPNOTSUPP; struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; @@ -802,7 +802,7 @@ static __inline__ char *get_raw_sock(struct sock *sp, char *tmpbuf, int i) { - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr, src = inet->rcv_saddr; __u16 destp = 0, diff -Nru a/net/ipv4/tcp.c b/net/ipv4/tcp.c --- a/net/ipv4/tcp.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp.c 2004-12-28 00:19:40 -02:00 @@ -460,7 +460,7 @@ int tcp_listen_start(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct tcp_listen_opt *lopt; @@ -1772,7 +1772,7 @@ int tcp_disconnect(struct sock *sk, int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); int err = 0; int old_state = sk->sk_state; diff -Nru a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c --- a/net/ipv4/tcp_diag.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp_diag.c 2004-12-28 00:19:40 -02:00 @@ -55,7 +55,7 @@ static int tcpdiag_fill(struct sk_buff *skb, struct sock *sk, int ext, u32 pid, u32 seq, u16 nlmsg_flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct tcpdiagmsg *r; struct nlmsghdr *nlh; @@ -427,7 +427,7 @@ if (cb->nlh->nlmsg_len > 4 + NLMSG_SPACE(sizeof(*r))) { struct 
tcpdiag_entry entry; struct rtattr *bc = (struct rtattr *)(r + 1); - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); entry.family = sk->sk_family; #ifdef CONFIG_IP_TCPDIAG_IPV6 @@ -458,7 +458,7 @@ struct open_request *req, u32 pid, u32 seq) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); unsigned char *b = skb->tail; struct tcpdiagmsg *r; struct nlmsghdr *nlh; @@ -515,7 +515,7 @@ struct tcp_opt *tp = tcp_sk(sk); struct tcp_listen_opt *lopt; struct rtattr *bc = NULL; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int j, s_j; int reqnum, s_reqnum; int err = 0; @@ -609,7 +609,7 @@ num = 0; sk_for_each(sk, node, &tcp_listening_hash[i]) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (num < s_num) { num++; @@ -670,7 +670,7 @@ num = 0; sk_for_each(sk, node, &head->chain) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (num < s_num) goto next_normal; @@ -692,7 +692,7 @@ if (r->tcpdiag_states&TCPF_TIME_WAIT) { sk_for_each(sk, node, &tcp_ehash[i + tcp_ehash_size].chain) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (num < s_num) goto next_dying; diff -Nru a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp_input.c 2004-12-28 00:19:40 -02:00 @@ -1647,7 +1647,7 @@ #if FASTRETRANS_DEBUG > 1 static void DBGUNDO(struct sock *sk, struct tcp_opt *tp, const char *msg) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); printk(KERN_DEBUG "Undo %s %u.%u.%u.%u/%u c%u l%u ss%u/%u p%u\n", msg, NIPQUAD(inet->daddr), ntohs(inet->dport), diff -Nru a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c --- a/net/ipv4/tcp_ipv4.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp_ipv4.c 2004-12-28 00:19:40 -02:00 @@ -115,7 +115,7 @@ static __inline__ int tcp_sk_hashfn(struct sock *sk) { - struct inet_opt *inet = 
inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); __u32 laddr = inet->rcv_saddr; __u16 lport = inet->num; __u32 faddr = inet->daddr; @@ -300,7 +300,7 @@ */ static void __tcp_put_port(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_bind_hashbucket *head = &tcp_bhash[tcp_bhashfn(inet->num)]; struct tcp_bind_bucket *tb; @@ -420,7 +420,7 @@ hiscore=-1; sk_for_each(sk, node, head) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == hnum && !ipv6_only_sock(sk)) { __u32 rcv_saddr = inet->rcv_saddr; @@ -457,7 +457,7 @@ read_lock(&tcp_lhash_lock); head = &tcp_listening_hash[tcp_lhashfn(hnum)]; if (!hlist_empty(head)) { - struct inet_opt *inet = inet_sk((sk = __sk_head(head))); + struct inet_sock *inet = inet_sk((sk = __sk_head(head))); if (inet->num == hnum && !sk->sk_node.next && (!inet->rcv_saddr || inet->rcv_saddr == daddr) && @@ -549,7 +549,7 @@ static int __tcp_v4_check_established(struct sock *sk, __u16 lport, struct tcp_tw_bucket **twp) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); u32 daddr = inet->rcv_saddr; u32 saddr = inet->daddr; int dif = sk->sk_bound_dev_if; @@ -638,7 +638,7 @@ static inline u32 connect_port_offset(const struct sock *sk) { - const struct inet_opt *inet = inet_sk(sk); + const struct inet_sock *inet = inet_sk(sk); return secure_tcp_port_ephemeral(inet->rcv_saddr, inet->daddr, inet->dport); @@ -743,7 +743,7 @@ /* This will initiate an outgoing connection. 
*/ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct sockaddr_in *usin = (struct sockaddr_in *)uaddr; struct rtable *rt; @@ -917,7 +917,7 @@ u32 mtu) { struct dst_entry *dst; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); /* We are not interested in TCP_LISTEN and open_requests (SYN-ACKs @@ -980,7 +980,7 @@ struct iphdr *iph = (struct iphdr *)skb->data; struct tcphdr *th = (struct tcphdr *)(skb->data + (iph->ihl << 2)); struct tcp_opt *tp; - struct inet_opt *inet; + struct inet_sock *inet; int type = skb->h.icmph->type; int code = skb->h.icmph->code; struct sock *sk; @@ -1127,7 +1127,7 @@ void tcp_v4_send_check(struct sock *sk, struct tcphdr *th, int len, struct sk_buff *skb) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (skb->ip_summed == CHECKSUM_HW) { th->check = ~tcp_v4_check(th, len, inet->saddr, inet->daddr, 0); @@ -1549,7 +1549,7 @@ struct open_request *req, struct dst_entry *dst) { - struct inet_opt *newinet; + struct inet_sock *newinet; struct tcp_opt *newtp; struct sock *newsk; @@ -1856,7 +1856,7 @@ static int tcp_v4_reselect_saddr(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int err; struct rtable *rt; __u32 old_saddr = inet->saddr; @@ -1907,7 +1907,7 @@ int tcp_v4_rebuild_header(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct rtable *rt = (struct rtable *)__sk_dst_check(sk, 0); u32 daddr; int err; @@ -1956,7 +1956,7 @@ static void v4_addr2sockaddr(struct sock *sk, struct sockaddr * uaddr) { struct sockaddr_in *sin = (struct sockaddr_in *) uaddr; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); sin->sin_family = AF_INET; sin->sin_addr.s_addr = inet->daddr; @@ -1971,7 
+1971,7 @@ int tcp_v4_remember_stamp(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct rtable *rt = (struct rtable *)__sk_dst_get(sk); struct inet_peer *peer = NULL; @@ -2474,7 +2474,7 @@ int timer_active; unsigned long timer_expires; struct tcp_opt *tp = tcp_sk(sp); - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr; unsigned int src = inet->rcv_saddr; __u16 destp = ntohs(inet->dport); diff -Nru a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c --- a/net/ipv4/tcp_minisocks.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp_minisocks.c 2004-12-28 00:19:40 -02:00 @@ -337,7 +337,7 @@ tw = kmem_cache_alloc(tcp_timewait_cachep, SLAB_ATOMIC); if(tw != NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); int rto = (tp->rto<<2) - (tp->rto>>1); /* Give us an identity. */ diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c --- a/net/ipv4/tcp_output.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp_output.c 2004-12-28 00:19:40 -02:00 @@ -266,7 +266,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb) { if (skb != NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); int tcp_header_size = tp->tcp_header_len; diff -Nru a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c --- a/net/ipv4/tcp_timer.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/tcp_timer.c 2004-12-28 00:19:40 -02:00 @@ -332,7 +332,7 @@ */ #ifdef TCP_DEBUG if (net_ratelimit()) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); printk(KERN_DEBUG "TCP: Treason uncloaked! Peer %u.%u.%u.%u:%u/%u shrinks window %u:%u. 
Repaired.\n", NIPQUAD(inet->daddr), htons(inet->dport), inet->num, tp->snd_una, tp->snd_nxt); diff -Nru a/net/ipv4/udp.c b/net/ipv4/udp.c --- a/net/ipv4/udp.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv4/udp.c 2004-12-28 00:19:40 -02:00 @@ -124,7 +124,7 @@ { struct hlist_node *node; struct sock *sk2; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); write_lock_bh(&udp_hash_lock); if (snum == 0) { @@ -171,7 +171,7 @@ } else { sk_for_each(sk2, node, &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_opt *inet2 = inet_sk(sk2); + struct inet_sock *inet2 = inet_sk(sk2); if (inet2->num == snum && sk2 != sk && @@ -227,7 +227,7 @@ int badness = -1; sk_for_each(sk, node, &udp_hash[hnum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == hnum && !ipv6_only_sock(sk)) { int score = (sk->sk_family == PF_INET ? 1 : 0); @@ -285,7 +285,7 @@ unsigned short hnum = ntohs(loc_port); sk_for_each_from(s, node) { - struct inet_opt *inet = inet_sk(s); + struct inet_sock *inet = inet_sk(s); if (inet->num != hnum || (inet->daddr && inet->daddr != rmt_addr) || @@ -316,7 +316,7 @@ void udp_err(struct sk_buff *skb, u32 info) { - struct inet_opt *inet; + struct inet_sock *inet; struct iphdr *iph = (struct iphdr*)skb->data; struct udphdr *uh = (struct udphdr*)(skb->data+(iph->ihl<<2)); int type = skb->h.icmph->type; @@ -398,7 +398,7 @@ */ static int udp_push_pending_frames(struct sock *sk, struct udp_opt *up) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct flowi *fl = &inet->cork.fl; struct sk_buff *skb; struct udphdr *uh; @@ -480,7 +480,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct udp_opt *up = udp_sk(sk); int ulen = len; struct ipcm_cookie ipc; @@ -773,7 +773,7 @@ int udp_recvmsg(struct kiocb *iocb, struct sock *sk, 
struct msghdr *msg, size_t len, int noblock, int flags, int *addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sockaddr_in *sin = (struct sockaddr_in *)msg->msg_name; struct sk_buff *skb; int copied, err; @@ -864,7 +864,7 @@ int udp_disconnect(struct sock *sk, int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); /* * 1003.1g - break association. */ @@ -1503,7 +1503,7 @@ /* ------------------------------------------------------------------------ */ static void udp4_format_sock(struct sock *sp, char *tmpbuf, int bucket) { - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr; unsigned int src = inet->rcv_saddr; __u16 destp = ntohs(inet->dport); diff -Nru a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c --- a/net/ipv6/af_inet6.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv6/af_inet6.c 2004-12-28 00:19:40 -02:00 @@ -114,10 +114,9 @@ static int inet6_create(struct socket *sock, int protocol) { - struct inet_opt *inet; + struct inet_sock *inet; struct ipv6_pinfo *np; struct sock *sk; - struct tcp6_sock* tcp6sk; struct list_head *p; struct inet_protosw *answer; struct proto *answer_prot; @@ -196,8 +195,7 @@ sk->sk_backlog_rcv = answer->prot->backlog_rcv; - tcp6sk = (struct tcp6_sock *)sk; - tcp6sk->pinet6 = np = inet6_sk_generic(sk); + inet_sk(sk)->pinet6 = np = inet6_sk_generic(sk); np->hop_limit = -1; np->mcast_hops = -1; np->mc_loop = 1; @@ -252,7 +250,7 @@ { struct sockaddr_in6 *addr=(struct sockaddr_in6 *)uaddr; struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); __u32 v4addr = 0; unsigned short snum; @@ -410,7 +408,7 @@ { struct sockaddr_in6 *sin=(struct sockaddr_in6 *)uaddr; struct sock *sk = sock->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); sin->sin6_family = 
AF_INET6; diff -Nru a/net/ipv6/datagram.c b/net/ipv6/datagram.c --- a/net/ipv6/datagram.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv6/datagram.c 2004-12-28 00:19:40 -02:00 @@ -36,7 +36,7 @@ int ip6_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *daddr, *final_p = NULL, final; struct dst_entry *dst; @@ -335,7 +335,7 @@ if (ipv6_addr_type(&sin->sin6_addr) & IPV6_ADDR_LINKLOCAL) sin->sin6_scope_id = IP6CB(skb)->iif; } else { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); ipv6_addr_set(&sin->sin6_addr, 0, 0, htonl(0xffff), diff -Nru a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c --- a/net/ipv6/ip6_output.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv6/ip6_output.c 2004-12-28 00:19:40 -02:00 @@ -809,7 +809,7 @@ int hlimit, struct ipv6_txoptions *opt, struct flowi *fl, struct rt6_info *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sk_buff *skb; unsigned int maxfraglen, fragheaderlen; @@ -1087,7 +1087,7 @@ struct sk_buff *skb, *tmp_skb; struct sk_buff **tail_skb; struct in6_addr final_dst_buf, *final_dst = &final_dst_buf; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct ipv6hdr *hdr; struct ipv6_txoptions *opt = np->cork.opt; @@ -1165,7 +1165,7 @@ void ip6_flush_pending_frames(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sk_buff *skb; diff -Nru a/net/ipv6/raw.c b/net/ipv6/raw.c --- a/net/ipv6/raw.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv6/raw.c 2004-12-28 00:19:40 -02:00 @@ -178,7 +178,7 @@ /* This cleans up af_inet6 a bit. 
-DaveM */ static int rawv6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sockaddr_in6 *addr = (struct sockaddr_in6 *) uaddr; __u32 v4addr = 0; @@ -253,7 +253,7 @@ struct inet6_skb_parm *opt, int type, int code, int offset, u32 info) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); int err; int harderr; @@ -314,7 +314,7 @@ */ int rawv6_rcv(struct sock *sk, struct sk_buff *skb) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct raw6_opt *raw_opt = raw6_sk(sk); if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) { @@ -505,7 +505,7 @@ struct flowi *fl, struct rt6_info *rt, unsigned int flags) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6hdr *iph; struct sk_buff *skb; unsigned int hh_len; @@ -607,7 +607,7 @@ struct ipv6_txoptions opt_space; struct sockaddr_in6 * sin6 = (struct sockaddr_in6 *) msg->msg_name; struct in6_addr *daddr, *final_p = NULL, final; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct raw6_opt *raw_opt = raw6_sk(sk); struct ipv6_txoptions *opt = NULL; diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c --- a/net/ipv6/tcp_ipv6.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv6/tcp_ipv6.c 2004-12-28 00:19:40 -02:00 @@ -89,7 +89,7 @@ static __inline__ int tcp_v6_sk_hashfn(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *laddr = &np->rcv_saddr; struct in6_addr *faddr = &np->daddr; @@ -443,7 +443,7 @@ static int tcp_v6_check_established(struct sock *sk) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct in6_addr *daddr = 
&np->rcv_saddr; struct in6_addr *saddr = &np->daddr; @@ -549,7 +549,7 @@ int addr_len) { struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct tcp_opt *tp = tcp_sk(sk); struct in6_addr *saddr = NULL, *final_p = NULL, final; @@ -785,7 +785,7 @@ dst = __sk_dst_check(sk, np->dst_cookie); if (dst == NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct flowi fl; /* BUGGG_FUTURE: Again, it is not clear how @@ -1281,7 +1281,7 @@ { struct ipv6_pinfo *newnp, *np = inet6_sk(sk); struct tcp6_sock *newtcp6sk; - struct inet_opt *newinet; + struct inet_sock *newinet; struct tcp_opt *newtp; struct sock *newsk; struct ipv6_txoptions *opt; @@ -1297,7 +1297,7 @@ return NULL; newtcp6sk = (struct tcp6_sock *)newsk; - newtcp6sk->pinet6 = &newtcp6sk->inet6; + newtcp6sk->inet.pinet6 = &newtcp6sk->inet6; newinet = inet_sk(newsk); newnp = inet6_sk(newsk); @@ -1390,7 +1390,7 @@ ~(NETIF_F_IP_CSUM | NETIF_F_TSO); newtcp6sk = (struct tcp6_sock *)newsk; - newtcp6sk->pinet6 = &newtcp6sk->inet6; + newtcp6sk->inet.pinet6 = &newtcp6sk->inet6; newtp = tcp_sk(newsk); newinet = inet_sk(newsk); @@ -1754,7 +1754,7 @@ dst = __sk_dst_check(sk, np->dst_cookie); if (dst == NULL) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct in6_addr *final_p = NULL, final; struct flowi fl; @@ -1800,7 +1800,7 @@ static int tcp_v6_xmit(struct sk_buff *skb, int ipfragok) { struct sock *sk = skb->sk; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct flowi fl; struct dst_entry *dst; @@ -2006,7 +2006,7 @@ __u16 destp, srcp; int timer_active; unsigned long timer_expires; - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); struct tcp_opt *tp = tcp_sk(sp); struct ipv6_pinfo *np = inet6_sk(sp); diff -Nru a/net/ipv6/udp.c 
b/net/ipv6/udp.c --- a/net/ipv6/udp.c 2004-12-28 00:19:40 -02:00 +++ b/net/ipv6/udp.c 2004-12-28 00:19:40 -02:00 @@ -160,7 +160,7 @@ read_lock(&udp_hash_lock); sk_for_each(sk, node, &udp_hash[hnum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); if (inet->num == hnum && sk->sk_family == PF_INET6) { struct ipv6_pinfo *np = inet6_sk(sk); @@ -269,7 +269,7 @@ sin6->sin6_scope_id = 0; if (skb->protocol == htons(ETH_P_IP)) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); ipv6_addr_set(&sin6->sin6_addr, 0, 0, htonl(0xffff), skb->nh.iph->saddr); @@ -386,7 +386,7 @@ unsigned short num = ntohs(loc_port); sk_for_each_from(s, node) { - struct inet_opt *inet = inet_sk(s); + struct inet_sock *inet = inet_sk(s); if (inet->num == num && s->sk_family == PF_INET6) { struct ipv6_pinfo *np = inet6_sk(s); @@ -566,7 +566,7 @@ { struct sk_buff *skb; struct udphdr *uh; - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct flowi *fl = &inet->cork.fl; int err = 0; @@ -624,7 +624,7 @@ { struct ipv6_txoptions opt_space; struct udp_opt *up = udp_sk(sk); - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *) msg->msg_name; struct in6_addr *daddr, *final_p = NULL, final; @@ -970,7 +970,7 @@ static void udp6_sock_seq_show(struct seq_file *seq, struct sock *sp, int bucket) { - struct inet_opt *inet = inet_sk(sp); + struct inet_sock *inet = inet_sk(sp); struct ipv6_pinfo *np = inet6_sk(sp); struct in6_addr *dest, *src; __u16 destp, srcp; diff -Nru a/net/sctp/input.c b/net/sctp/input.c --- a/net/sctp/input.c 2004-12-28 00:19:40 -02:00 +++ b/net/sctp/input.c 2004-12-28 00:19:40 -02:00 @@ -393,7 +393,7 @@ struct sctp_endpoint *ep; struct sctp_association *asoc; struct sctp_transport *transport; - struct inet_opt *inet; + struct inet_sock *inet; char *saveip, *savesctp; 
int err; diff -Nru a/net/sctp/ipv6.c b/net/sctp/ipv6.c --- a/net/sctp/ipv6.c 2004-12-28 00:19:40 -02:00 +++ b/net/sctp/ipv6.c 2004-12-28 00:19:40 -02:00 @@ -580,9 +580,9 @@ struct sock *sctp_v6_create_accept_sk(struct sock *sk, struct sctp_association *asoc) { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct sock *newsk; - struct inet_opt *newinet; + struct inet_sock *newinet; struct ipv6_pinfo *newnp, *np = inet6_sk(sk); struct sctp6_sock *newsctp6sk; @@ -608,7 +608,7 @@ newsk->sk_shutdown = sk->sk_shutdown; newsctp6sk = (struct sctp6_sock *)newsk; - newsctp6sk->pinet6 = &newsctp6sk->inet6; + newsctp6sk->inet.pinet6 = &newsctp6sk->inet6; newinet = inet_sk(newsk); newnp = inet6_sk(newsk); diff -Nru a/net/sctp/protocol.c b/net/sctp/protocol.c --- a/net/sctp/protocol.c 2004-12-28 00:19:40 -02:00 +++ b/net/sctp/protocol.c 2004-12-28 00:19:40 -02:00 @@ -551,8 +551,8 @@ struct sctp_association *asoc) { struct sock *newsk; - struct inet_opt *inet = inet_sk(sk); - struct inet_opt *newinet; + struct inet_sock *inet = inet_sk(sk); + struct inet_sock *newinet; newsk = sk_alloc(PF_INET, GFP_KERNEL, sk->sk_prot->slab_obj_size, sk->sk_prot->slab); diff -Nru a/security/selinux/avc.c b/security/selinux/avc.c --- a/security/selinux/avc.c 2004-12-28 00:19:40 -02:00 +++ b/security/selinux/avc.c 2004-12-28 00:19:40 -02:00 @@ -566,7 +566,7 @@ switch (sk->sk_family) { case AF_INET: { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); avc_print_ipv4_addr(ab, inet->rcv_saddr, inet->sport, @@ -577,7 +577,7 @@ break; } case AF_INET6: { - struct inet_opt *inet = inet_sk(sk); + struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *inet6 = inet6_sk(sk); avc_print_ipv6_addr(ab, &inet6->rcv_saddr, --------------000008060804040405080203-- From davem@davemloft.net Mon Dec 27 18:34:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:34:37 -0800 (PST) Received: from cheetah.davemloft.net 
(mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2Y92B002324 for ; Mon, 27 Dec 2004 18:34:30 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Bo-0002jw-00; Mon, 27 Dec 2004 18:34:24 -0800 Date: Mon, 27 Dec 2004 18:34:24 -0800 From: "David S. Miller" To: Thomas Graf Cc: netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: validate policer configuration TLVs Message-Id: <20041227183424.17cf34cc.davem@davemloft.net> In-Reply-To: <20041208203942.GO1371@postel.suug.ch> References: <20041207172349.GG1371@postel.suug.ch> <20041207213234.257fd0d9.davem@davemloft.net> <20041208203942.GO1371@postel.suug.ch> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13061 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 8 Dec 2004 21:39:42 +0100 Thomas Graf wrote: > > Either these things are int's or u32's, they cannot be both :-) > > I know that size wise it's identical, but at least make the code > > look consistent. > > OK, I changed the dereferencing to use u32 as well and have it "casted" > while assigning the value since changing the structure datatypes > wouldn't make sense. > > Signed-off-by: Thomas Graf Applied, thanks Thomas. 
From davem@davemloft.net Mon Dec 27 18:36:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:36:33 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2a6SF002824 for ; Mon, 27 Dec 2004 18:36:26 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7DU-0002kH-00; Mon, 27 Dec 2004 18:36:08 -0800 Date: Mon, 27 Dec 2004 18:36:08 -0800 From: "David S. Miller" To: "chas williams - CONTRACTOR" Cc: netdev@oss.sgi.com, davem@redhat.com, bunk@stusta.de Subject: Re: [PATCH] [ATM]: small atm cleanups (from Adrian Bunk ) Message-Id: <20041227183608.296fa0ca.davem@davemloft.net> In-Reply-To: <200412090102.iB912w8T007866@ginger.cmf.nrl.navy.mil> References: <200412090102.iB912w8T007866@ginger.cmf.nrl.navy.mil> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13062 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 08 Dec 2004 20:02:59 -0500 "chas williams - CONTRACTOR" wrote: > please apply to 2.6 -- thanks! > > Signed-off-by: Chas Williams Applied, thanks Adrian and Chas. 
From davem@davemloft.net Mon Dec 27 18:41:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:41:07 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2ee51003385 for ; Mon, 27 Dec 2004 18:41:00 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Hm-0002lM-00; Mon, 27 Dec 2004 18:40:34 -0800 Date: Mon, 27 Dec 2004 18:40:34 -0800 From: "David S. Miller" To: Adrian Bunk Cc: andros@umich.edu, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove unused net/sunrpc/auth_gss/gss_pseudoflavors.c Message-Id: <20041227184034.6b805617.davem@davemloft.net> In-Reply-To: <20041212194750.GF22324@stusta.de> References: <20041212194750.GF22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13063 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 20:47:50 +0100 Adrian Bunk wrote: > I wasn't able to find any usage of this file. Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 18:48:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:48:24 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2lvtg003959 for ; Mon, 27 Dec 2004 18:48:17 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7M9-0002ld-00; Mon, 27 Dec 2004 18:45:05 -0800 Date: Mon, 27 Dec 2004 18:45:05 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove unused net/sunrpc/svcauth_des.c Message-Id: <20041227184505.13d026a6.davem@davemloft.net> In-Reply-To: <20041212194903.GG22324@stusta.de> References: <20041212194903.GG22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13064 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 20:49:03 +0100 Adrian Bunk wrote: > I wasn't able to find any usage of this file. Neither can I, and it doesn't even implement the auth_ops. Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 18:49:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:49:34 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2n8TK004254 for ; Mon, 27 Dec 2004 18:49:27 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7NJ-0002lo-00; Mon, 27 Dec 2004 18:46:17 -0800 Date: Mon, 27 Dec 2004 18:46:17 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove unused Message-Id: <20041227184617.4444fae4.davem@davemloft.net> In-Reply-To: <20041212195047.GH22324@stusta.de> References: <20041212195047.GH22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13065 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 20:50:47 +0100 Adrian Bunk wrote: > I wasn't able to find any usage of this file (it seems the > EXPORT_SYMBOL's were moved away, but deleting the file was forgotten). Applied, thanks. 
From davem@davemloft.net Mon Dec 27 18:52:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:52:46 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2qJMF004985 for ; Mon, 27 Dec 2004 18:52:39 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7QJ-0002me-00; Mon, 27 Dec 2004 18:49:23 -0800 Date: Mon, 27 Dec 2004 18:49:23 -0800 From: "David S. Miller" To: Adrian Bunk Cc: acme@conectiva.com.br, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/802/: some cleanups Message-Id: <20041227184923.5b26f5a0.davem@davemloft.net> In-Reply-To: <20041212201115.GU22324@stusta.de> References: <20041212201115.GU22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13066 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 21:11:15 +0100 Adrian Bunk wrote: > The patch below contains the following cleanups: > - make some needlessly global code static > - net/802/hippi.c: remove the unused global function hippi_net_init > - net/8021q/vlan.c: remove the global variable vlan_default_dev_flags > that was never changed > - drivers/net/net_init.c: remove four unneeded #include's drivers/net/net_init.c no longer exists in the source tree :) From davem@davemloft.net Mon Dec 27 18:53:59 2004 Received: 
with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:54:06 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2rg6I005465 for ; Mon, 27 Dec 2004 18:53:59 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Rg-0002mv-00; Mon, 27 Dec 2004 18:50:48 -0800 Date: Mon, 27 Dec 2004 18:50:48 -0800 From: "David S. Miller" To: Adrian Bunk Cc: acme@conectiva.com.br, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/appletalk/: make some code static Message-Id: <20041227185048.031f4924.davem@davemloft.net> In-Reply-To: <20041212211128.GW22324@stusta.de> References: <20041212211128.GW22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13067 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:11:28 +0100 Adrian Bunk wrote: > The patch below makes some needlessly global code static. Looks good, applied. Thanks Adrian. 
From davem@davemloft.net Mon Dec 27 18:55:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:55:26 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2t3L5005843 for ; Mon, 27 Dec 2004 18:55:19 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Si-0002n6-00; Mon, 27 Dec 2004 18:51:52 -0800 Date: Mon, 27 Dec 2004 18:51:51 -0800 From: "David S. Miller" To: Adrian Bunk Cc: ralf@linux-mips.org, linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] /net/ax25/: some cleanups Message-Id: <20041227185151.2a7ceb71.davem@davemloft.net> In-Reply-To: <20041212211339.GX22324@stusta.de> References: <20041212211339.GX22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13068 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:13:39 +0100 Adrian Bunk wrote: > The patch below contains the following cleanups: > - make two needlessly global functions static > - net/ax25/ax25_addr.c: remove the unused global function ax25digicmp Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 18:56:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 18:56:15 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS2tmhs006126 for ; Mon, 27 Dec 2004 18:56:08 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Tk-0002nK-00; Mon, 27 Dec 2004 18:52:56 -0800 Date: Mon, 27 Dec 2004 18:52:56 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/socket.c: make a function static Message-Id: <20041227185256.74f617f4.davem@davemloft.net> In-Reply-To: <20041212211506.GY22324@stusta.de> References: <20041212211506.GY22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13069 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:15:06 +0100 Adrian Bunk wrote: > The patch below makes a needlessly global function static. Applied, I think we used to use this for the compat layers. Thanks Adrian. 
From davem@davemloft.net Mon Dec 27 19:00:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:00:41 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS30D8o006942 for ; Mon, 27 Dec 2004 19:00:33 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Y2-0002oR-00; Mon, 27 Dec 2004 18:57:22 -0800 Date: Mon, 27 Dec 2004 18:57:21 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/sunrpc/: some cleanups Message-Id: <20041227185721.6adb9867.davem@davemloft.net> In-Reply-To: <20041212211907.GZ22324@stusta.de> References: <20041212211907.GZ22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13070 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:19:07 +0100 Adrian Bunk wrote: > The patch below contains the following cleanups: > - make some needlessly global code static > - remove the following unused global functions: > - net/sunrpc/auth_gss/gss_generic_token.c: g_get_mech_oid > - net/sunrpc/cache.c: cache_find > - net/sunrpc/cache.c: cache_drop > - net/sunrpc/xdr.c: xdr_decode_netobj_fixed > - net/sunrpc/xdr.c: xdr_shift_iovec > - net/sunrpc/xdr.c: xdr_kmap > - net/sunrpc/xdr.c: xdr_kunmap > - remove the following unused global structs: > - net/sunrpc/auth_gss/gss_krb5_mech.c: gss_mech_krb5_oid > - net/sunrpc/auth_gss/gss_spkm3_mech.c: gss_mech_spkm3_oid > - net/sunrpc/rpc_pipe.c: rpc_pipe_iops > - remove the EXPORT_SYMBOL(cache_clean) > > Please review this patch. Looks fine. Applied, thanks Adrian.
From davem@davemloft.net Mon Dec 27 19:01:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:01:40 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS31CJ2007091 for ; Mon, 27 Dec 2004 19:01:32 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7Yy-0002ob-00; Mon, 27 Dec 2004 18:58:20 -0800 Date: Mon, 27 Dec 2004 18:58:19 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/unix/: make some code static Message-Id: <20041227185819.0b13eb96.davem@davemloft.net> In-Reply-To: <20041212212022.GA22324@stusta.de> References: <20041212212022.GA22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13071 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:20:22 +0100 Adrian Bunk wrote: > The patch below makes some needlessly global code static. Applied, thanks Adrian.
From davem@davemloft.net Mon Dec 27 19:03:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:03:19 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS32q56007887 for ; Mon, 27 Dec 2004 19:03:12 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7aV-0002p1-00; Mon, 27 Dec 2004 18:59:56 -0800 Date: Mon, 27 Dec 2004 18:59:55 -0800 From: "David S. Miller" To: Adrian Bunk Cc: eis@baty.hanse.de, linux-x25@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/x25/: some cleanups Message-Id: <20041227185955.132a8693.davem@davemloft.net> In-Reply-To: <20041212212318.GB22324@stusta.de> References: <20041212212318.GB22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13072 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:23:18 +0100 Adrian Bunk wrote: > The patch below includes the following cleanups: > - make some needlessly global code static > - remove the following unused global functions: > - net/x25/x25_dev.c: x25_llc_receive_frame > - net/x25/x25_link.c: x25_transmit_diagnostic Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 19:04:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:04:54 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS34Q0Z008377 for ; Mon, 27 Dec 2004 19:04:46 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7c6-0002pP-00; Mon, 27 Dec 2004 19:01:34 -0800 Date: Mon, 27 Dec 2004 19:01:33 -0800 From: "David S. Miller" To: Adrian Bunk Cc: jmorris@intercode.com.au, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/xfrm/: some cleanups Message-Id: <20041227190133.0fdfe915.davem@davemloft.net> In-Reply-To: <20041212212523.GC22324@stusta.de> References: <20041212212523.GC22324@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13073 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Sun, 12 Dec 2004 22:25:23 +0100 Adrian Bunk wrote: > The patch below contains the following changes: > - make some needlessly global code static > - remove the EXPORT_SYMBOL_GPL'ed but unused function > xfrm_calg_get_byidx Looks fine, applied. Thanks Adrian. 
From davem@davemloft.net Mon Dec 27 19:05:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:06:02 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS35eYF008851 for ; Mon, 27 Dec 2004 19:05:56 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7dJ-0002qR-00; Mon, 27 Dec 2004 19:02:49 -0800 Date: Mon, 27 Dec 2004 19:02:49 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/core/: misc possible cleanups Message-Id: <20041227190249.6afda3df.davem@davemloft.net> In-Reply-To: <20041214045758.GA23151@stusta.de> References: <20041214045758.GA23151@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13074 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 14 Dec 2004 05:57:58 +0100 Adrian Bunk wrote: > - skbuff.c: skb_insert > - skbuff.c: skb_iter_first > - skbuff.c: skb_iter_next > - skbuff.c: skb_iter_abort These are actually planned to be used, let's keep them in for now. 
From davem@davemloft.net Mon Dec 27 19:07:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:07:12 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS36jip009304 for ; Mon, 27 Dec 2004 19:07:05 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7eL-0002qf-00; Mon, 27 Dec 2004 19:03:53 -0800 Date: Mon, 27 Dec 2004 19:03:53 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ethernet/eth.c: make a function static Message-Id: <20041227190353.69099ddf.davem@davemloft.net> In-Reply-To: <20041214134842.GD23151@stusta.de> References: <20041214134842.GD23151@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13075 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 14 Dec 2004 14:48:42 +0100 Adrian Bunk wrote: > The patch below makes a needlessly global function static. Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 19:09:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:09:08 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS38gxZ009852 for ; Mon, 27 Dec 2004 19:09:02 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7gF-0002rN-00; Mon, 27 Dec 2004 19:05:51 -0800 Date: Mon, 27 Dec 2004 19:05:50 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/key/af_key.c: make pfkey_table static Message-Id: <20041227190550.165862d0.davem@davemloft.net> In-Reply-To: <20041215004254.GG23151@stusta.de> References: <20041215004254.GG23151@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13076 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 01:42:54 +0100 Adrian Bunk wrote: > The patch below makes the needlessly global pfkey_table static. Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 19:11:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:11:48 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS3BLrD010355 for ; Mon, 27 Dec 2004 19:11:41 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7ia-0002sF-00; Mon, 27 Dec 2004 19:08:16 -0800 Date: Mon, 27 Dec 2004 19:08:16 -0800 From: "David S. Miller" To: hadi@cyberus.ca Cc: bunk@stusta.de, alan@redhat.com, kuznet@ms2.inr.ac.ru, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/netlink/af_netlink.c: possible cleanups Message-Id: <20041227190816.1d166ba2.davem@davemloft.net> In-Reply-To: <1103119623.1077.71.camel@jzny.localdomain> References: <20041215004604.GH23151@stusta.de> <1103119623.1077.71.camel@jzny.localdomain> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13077 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On 15 Dec 2004 09:07:03 -0500 jamal wrote: > I think this should be left alone for now; what we need to do is > deprecate NETLINK_DEV in case someone is still using it. > Else we could get rid of it totally including what Adrian is deleting > below. Any users of NETLINK_DEV? Maybe deleting the feature will get > someone whining? ;-> While I agree we should probably just kill NETLINK_DEV now and not look back, I have applied Adrian's patch for the time being.
From davem@davemloft.net Mon Dec 27 19:12:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:12:39 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS3CCpL010570 for ; Mon, 27 Dec 2004 19:12:32 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7jc-0002sX-00; Mon, 27 Dec 2004 19:09:20 -0800 Date: Mon, 27 Dec 2004 19:09:20 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/packet/af_packet.c: make some code static Message-Id: <20041227190920.5ac1964f.davem@davemloft.net> In-Reply-To: <20041215004745.GI23151@stusta.de> References: <20041215004745.GI23151@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13078 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 01:47:45 +0100 Adrian Bunk wrote: > The patch below makes some needlessly global code static. Looks good. Applied, thanks Adrian.
From davem@davemloft.net Mon Dec 27 19:18:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:18:54 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS3IQvc011401 for ; Mon, 27 Dec 2004 19:18:47 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7pf-0002tV-00; Mon, 27 Dec 2004 19:15:35 -0800 Date: Mon, 27 Dec 2004 19:15:35 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ipv4/: misc possible cleanups Message-Id: <20041227191535.5edce690.davem@davemloft.net> In-Reply-To: <20041215005139.GJ23151@stusta.de> References: <20041215005139.GJ23151@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13079 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 01:51:39 +0100 Adrian Bunk wrote: > - remove the following unneeded EXPORT_SYMBOL: > - tcp_timer.c: tcp_timer_bug_msg This one is wrong. It is needed for the ipv6 module via the call tcp_ipv6.c makes to tcp_done() which invokes tcp_clear_xmit_timer() which uses tcp_timer_bug_msg. 
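The dependency chain described above can be sketched in plain C. This illustrates the general mechanism only, under stated assumptions: `timer_bug_msg`, `clear_timer` and `module_path` are hypothetical stand-ins, not the real kernel functions. When an inline helper refers to a symbol, every object that calls the helper references that symbol itself, which is why a module calling such a helper would need the symbol exported. Whether tcp_done() actually reaches an inline helper of this kind has to be checked against the source.

```c
#include <stddef.h>

/* Stands in for a symbol defined in the core object (hypothetical name).
 * In the kernel, a module can only link against such a symbol if it is
 * exported. */
const char timer_bug_msg[] = "timer bug: unknown timer value";

/* Stands in for an inline helper living in a shared header. Its body is
 * compiled into every caller, so each caller's object file references
 * timer_bug_msg directly. */
static inline const char *clear_timer(int what)
{
    if (what < 0 || what > 2)
        return timer_bug_msg;   /* direct reference from the caller */
    return NULL;                /* valid timer index, nothing to report */
}

/* Stands in for code built as a separate module that calls the helper. */
const char *module_path(int what)
{
    return clear_timer(what);
}
```

If `clear_timer` were instead an ordinary out-of-line function in the core object, `module_path` would reference only that function and the message symbol could stay private to the core.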
From davem@davemloft.net Mon Dec 27 19:20:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:20:31 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS3K5Os011910 for ; Mon, 27 Dec 2004 19:20:25 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7qs-0002tm-00; Mon, 27 Dec 2004 19:16:50 -0800 Date: Mon, 27 Dec 2004 19:16:50 -0800 From: "David S. Miller" To: Adrian Bunk Cc: wensong@linux-vs.org, ja@ssi.bg, lvs-users@linuxvirtualserver.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ipvs/: make some code static Message-Id: <20041227191650.1bedf2a8.davem@davemloft.net> In-Reply-To: <20041215005801.GB11972@stusta.de> References: <20041215005801.GB11972@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13080 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 01:58:01 +0100 Adrian Bunk wrote: > The patch below makes some needlessly global code static. Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 19:25:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:25:16 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS3OnT9012521 for ; Mon, 27 Dec 2004 19:25:09 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj7vj-0002uo-00; Mon, 27 Dec 2004 19:21:51 -0800 Date: Mon, 27 Dec 2004 19:21:51 -0800 From: "David S. Miller" To: Adrian Bunk Cc: acme@conectiva.com.br, linux-net@vger.kernel.or, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ipx/: make some code static Message-Id: <20041227192151.77b5db4d.davem@davemloft.net> In-Reply-To: <20041215005925.GC11972@stusta.de> References: <20041215005925.GC11972@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13081 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 01:59:25 +0100 Adrian Bunk wrote: > The patch below makes some needlessly global code static. Applied, thanks Adrian. 
From bunk@stusta.de Mon Dec 27 19:35:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 19:35:23 -0800 (PST) Received: from mailout.stusta.mhn.de (mailout.stusta.mhn.de [141.84.69.5]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBS3Ysjp013139 for ; Mon, 27 Dec 2004 19:35:15 -0800 Received: (qmail 10808 invoked from network); 28 Dec 2004 03:36:18 -0000 Received: from r063144.stusta.swh.mhn.de (10.150.63.144) by mailhub.stusta.mhn.de with SMTP; 28 Dec 2004 03:36:18 -0000 Received: by r063144.stusta.swh.mhn.de (Postfix, from userid 1000) id 6E475BBEDC; Tue, 28 Dec 2004 04:36:17 +0100 (CET) Date: Tue, 28 Dec 2004 04:36:17 +0100 From: Adrian Bunk To: "David S. Miller" Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ipv4/: misc possible cleanups Message-ID: <20041228033617.GI5345@stusta.de> References: <20041215005139.GJ23151@stusta.de> <20041227191535.5edce690.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041227191535.5edce690.davem@davemloft.net> User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13082 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bunk@stusta.de Precedence: bulk X-list: netdev On Mon, Dec 27, 2004 at 07:15:35PM -0800, David S. Miller wrote: > On Wed, 15 Dec 2004 01:51:39 +0100 > Adrian Bunk wrote: > > > - remove the following unneeded EXPORT_SYMBOL: > > - tcp_timer.c: tcp_timer_bug_msg > > This one is wrong. It is needed for the ipv6 module > via the call tcp_ipv6.c makes to tcp_done() which > invokes tcp_clear_xmit_timer() which uses > tcp_timer_bug_msg. tcp_done doesn't call tcp_clear_xmit_timer, it calls tcp_clear_xmit_timers (note the s) which is not an inline but an EXPORT_SYMBOL'ed function in tcp_timer.c. 
cu Adrian -- "Is there not promise of rain?" Ling Tan asked suddenly out of the darkness. There had been need of rain for many days. "Only a promise," Lao Er said. Pearl S. Buck - Dragon Seed
From davem@davemloft.net Mon Dec 27 20:07:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 20:07:44 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS47H1o014964 for ; Mon, 27 Dec 2004 20:07:37 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj8au-0002zs-00; Mon, 27 Dec 2004 20:04:24 -0800 Date: Mon, 27 Dec 2004 20:04:24 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/ipv4/: misc possible cleanups Message-Id: <20041227200424.557a5902.davem@davemloft.net> In-Reply-To: <20041228033617.GI5345@stusta.de> References: <20041215005139.GJ23151@stusta.de> <20041227191535.5edce690.davem@davemloft.net> <20041228033617.GI5345@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13084 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 28 Dec 2004 04:36:17 +0100 Adrian Bunk wrote: > tcp_done doesn't call tcp_clear_xmit_timer, it calls > tcp_clear_xmit_timers (note the s) which is not an inline but an > EXPORT_SYMBOL'ed function in tcp_timer.c.
My bad, I'll re-review your patch and apply it unless I find some problem.
From davem@davemloft.net Mon Dec 27 21:00:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:00:24 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS4xvQW019777 for ; Mon, 27 Dec 2004 21:00:17 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9Sq-00034x-00; Mon, 27 Dec 2004 21:00:08 -0800 Date: Mon, 27 Dec 2004 21:00:07 -0800 From: "David S. Miller" To: yoshfuji@linux-ipv6.org Cc: roland@topspin.com, linux-kernel@vger.kernel.org, openib-general@openib.org, akpm@osdl.org, torvalds@osdl.org, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: [PATCH][v4][0/24] Second InfiniBand merge candidate patch set Message-Id: <20041227210007.398734cd.davem@davemloft.net> In-Reply-To: <20041220.153845.70996857.yoshfuji@linux-ipv6.org> References: <200412192214.KlDxQ7icOmxHYIf0@topspin.com> <20041220.153845.70996857.yoshfuji@linux-ipv6.org> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBS4xvQW019777 X-archive-position: 13085 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 20 Dec 2004 15:38:45 +0900 (JST) YOSHIFUJI Hideaki / 吉藤英明 wrote: > In article <200412192214.KlDxQ7icOmxHYIf0@topspin.com> (at Sun, 19 Dec 2004 22:14:43 -0800), Roland Dreier says: > > > The following series of patches is the latest version of the OpenIB > > InfiniBand drivers. We believe that this version is suitable for > > merging when 2.6.11 opens (or into -mm immediately), although of > > course we are willing to go through as many more iterations as > > required to fix any remaining issues. > > Maybe, via the net queue. David? If Roland can resubmit his patch queue to me with the fixes folks have recommended to him, I can start merging this stuff in. I leave for vacation Wednesday morning (PST time), so if it is submitted after that I'll get to it at the beginning of the new year.
From davem@davemloft.net Mon Dec 27 21:16:02 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:16:10 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5Fge8020616 for ; Mon, 27 Dec 2004 21:16:02 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9f1-00036r-00; Mon, 27 Dec 2004 21:12:43 -0800 Date: Mon, 27 Dec 2004 21:12:43 -0800 From: "David S. Miller" To: Adrian Bunk Cc: irda-users@lists.sourceforge.net, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/irda/: misc possible cleanups Message-Id: <20041227211243.7d05e891.davem@davemloft.net> In-Reply-To: <20041215010528.GA12937@stusta.de> References: <20041215010528.GA12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13086 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:05:28 +0100 Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make some needlessly global code static > - remove the following unused global functions: > - discovery.c: irlmp_find_device > - ircomm/ircomm_param.c: ircomm_param_flush > - irda_device.c: irda_device_set_dtr_rts > - irda_device.c: irda_device_change_speed > - irda_device.c: irda_device_set_mode > - iriap.c: iriap_getinfobasedetails_request > - iriap.c: iriap_getinfobasedetails_confirm > - iriap.c: iriap_getobjects_request > - iriap.c: iriap_getobjects_confirm > - iriap.c: iriap_getvalue > - irlan_client.c: irlan_client_reconnect_data_channel > - irlap_frame.c: irlap_send_frmr_frame > - irlmp.c: irlmp_status_request > - remove the following unused global variables: > - irnet/irnet_ppp.c: irnet_ppp_ops > - irsysctl.c: sysctl_compression > - qos.c: #ifndef CONFIG_IRDA_DYNAMIC_WINDOW irlap_requested_line_capacity Looks good, applied. Thanks Adrian.
From davem@davemloft.net Mon Dec 27 21:17:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:17:15 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5GlnG020820 for ; Mon, 27 Dec 2004 21:17:07 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9iw-00037n-00; Mon, 27 Dec 2004 21:16:46 -0800 Date: Mon, 27 Dec 2004 21:16:46 -0800 From: "David S. Miller" To: Adrian Bunk Cc: ralf@linux-mips.org, linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/rose/: misc possible cleanups Message-Id: <20041227211646.58326434.davem@davemloft.net> In-Reply-To: <20041215012347.GF12937@stusta.de> References: <20041215012347.GF12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13087 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:23:47 +0100 Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make some needlessly global code static > - remove the following unused global functions: > - rose_dev.c: rose_rx_ip > - rose_link.c: rose_transmit_diagnostic Applied, thanks Adrian.
From davem@davemloft.net Mon Dec 27 21:17:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:17:52 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5HNFa021036 for ; Mon, 27 Dec 2004 21:17:43 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9gY-00037B-00; Mon, 27 Dec 2004 21:14:18 -0800 Date: Mon, 27 Dec 2004 21:14:17 -0800 From: "David S. Miller" To: Adrian Bunk Cc: acme@conectiva.com.br, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/llc/: misc possible cleanups Message-Id: <20041227211417.2d077475.davem@davemloft.net> In-Reply-To: <20041215011211.GC12937@stusta.de> References: <20041215011211.GC12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13088 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:12:11 +0100 Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make some needlessly global code static > - remove the following unused global functions: > - lc_c_ac.c: llc_conn_ac_report_status > - lc_c_ac.c: llc_conn_ac_send_dm_rsp_f_set_f_flag > - lc_c_ac.c: llc_conn_ac_resend_i_cmd_p_set_1 > - lc_c_ac.c: llc_conn_ac_resend_i_cmd_p_set_1_or_send_rr > - lc_c_ac.c: llc_conn_ac_send_ack_cmd_p_set_1 > - lc_c_ac.c: 
llc_conn_ac_send_ua_rsp_f_set_f_flag > - lc_c_ac.c: llc_conn_ac_set_f_flag_p > - llc_c_ev.c: llc_conn_ev_conn_resp > - llc_c_ev.c: llc_conn_ev_rst_resp > - llc_c_ev.c: llc_conn_ev_rx_xxx_cmd_pbit_set_0 > - llc_c_ev.c: llc_conn_ev_rx_xxx_yyy > - llc_c_ev.c: llc_conn_ev_any_tmr_exp > - llc_c_ev.c: llc_conn_ev_qlfy_init_p_f_cycle > - llc_c_ev.c: llc_conn_ev_qlfy_set_status_impossible > - llc_c_ev.c: llc_conn_ev_qlfy_set_status_received > - llc_if.c: llc_build_and_send_reset_pkt > - llc_pdu.c: llc_pdu_decode_cr_bit > - remove the following unneeded EXPORT_SYMBOL: > - llc_core.c: llc_sap_list_lock Also looks good, applied. Thanks Adrian. From roland@topspin.com Mon Dec 27 21:18:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:18:27 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5Hwe7021531 for ; Mon, 27 Dec 2004 21:18:18 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:19:27 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:19:26 -0800 Received: from roland by eddore with local (Exim 4.34) id 1Cj9lW-0007g0-Hi; Mon, 27 Dec 2004 21:19:26 -0800 To: "David S. Miller" Cc: yoshfuji@linux-ipv6.org, linux-kernel@vger.kernel.org, openib-general@openib.org, akpm@osdl.org, torvalds@osdl.org, netdev@oss.sgi.com X-Message-Flag: Warning: May contain useful information References: <200412192214.KlDxQ7icOmxHYIf0@topspin.com> <20041220.153845.70996857.yoshfuji@linux-ipv6.org> <20041227210007.398734cd.davem@davemloft.net> From: Roland Dreier Date: Mon, 27 Dec 2004 21:19:26 -0800 In-Reply-To: <20041227210007.398734cd.davem@davemloft.net> (David S. 
Miller's message of "Mon, 27 Dec 2004 21:00:07 -0800") Message-ID: <52k6r3j8yp.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Corporate Culture, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [PATCH][v4][0/24] Second InfiniBand merge candidate patch set Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:19:26.0939 (UTC) FILETIME=[CCDE4AB0:01C4EC9C] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13089 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev David> If Roland can resubmit his patch queue to me with the fixes David> folks have recommended to him, I can start merging this David> stuff in. Andrew has indicated to me that he has merged the whole InfiniBand patch into -mm now. Dave, did you want to handle the entire merge of the whole IB stack, or just the net/ parts, which are pretty trivial and stand alone, since AF_INFINIBAND is already defined in the tree? I'm happy to do whatever it takes to get IB merged as expeditiously as possible so Dave & Andrew, please let me know what seems easiest and best to you. 
Thanks, Roland From davem@davemloft.net Mon Dec 27 21:19:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:19:22 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5IsVe022317 for ; Mon, 27 Dec 2004 21:19:14 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9i0-00037X-00; Mon, 27 Dec 2004 21:15:48 -0800 Date: Mon, 27 Dec 2004 21:15:48 -0800 From: "David S. Miller" To: Adrian Bunk Cc: ralf@linux-mips.org, linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/netrom/: make some code static Message-Id: <20041227211548.3d511579.davem@davemloft.net> In-Reply-To: <20041215012107.GE12937@stusta.de> References: <20041215012107.GE12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13090 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:21:07 +0100 Adrian Bunk wrote: > The patch below makes some needlessly global code static. Applied, thanks Adrian. 
From davem@davemloft.net Mon Dec 27 21:21:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:21:25 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5KwI7023069 for ; Mon, 27 Dec 2004 21:21:18 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9kC-00038Y-00; Mon, 27 Dec 2004 21:18:04 -0800 Date: Mon, 27 Dec 2004 21:18:04 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/rxrpc/: misc possible cleanups Message-Id: <20041227211804.4365c340.davem@davemloft.net> In-Reply-To: <20041215012612.GG12937@stusta.de> References: <20041215012612.GG12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13091 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:26:12 +0100 Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make some needlessly global code static > - remove the following unused global function: > - transport.c: rxrpc_clear_transport > - remove the following unneeded EXPORT_SYMBOL: > - rxrpc_syms.c: rxrpc_call_flush Looks good, applied. Thanks Adrian. 
From davem@davemloft.net Mon Dec 27 21:22:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:22:32 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5M5r5023449 for ; Mon, 27 Dec 2004 21:22:25 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9lH-00038s-00; Mon, 27 Dec 2004 21:19:11 -0800 Date: Mon, 27 Dec 2004 21:19:11 -0800 From: "David S. Miller" To: Adrian Bunk Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] net/sched/: possible cleanups Message-Id: <20041227211911.66c0b643.davem@davemloft.net> In-Reply-To: <20041215012754.GH12937@stusta.de> References: <20041215012754.GH12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13092 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:27:54 +0100 Adrian Bunk wrote: > The patch below contains the following possible cleanups: > - make some needlessly global code static > - sch_htb.c: #undef HTB_DEBUG Applied, thanks Adrian.
From davem@davemloft.net Mon Dec 27 21:23:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:23:53 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5NQQ9023882 for ; Mon, 27 Dec 2004 21:23:46 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9m8-000399-00; Mon, 27 Dec 2004 21:20:04 -0800 Date: Mon, 27 Dec 2004 21:20:04 -0800 From: "David S. Miller" To: Adrian Bunk Cc: wensong@linux-vs.org, ja@ssi.bg, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] remove subscribers-only ipvs mailing list Message-Id: <20041227212004.40f1cebb.davem@davemloft.net> In-Reply-To: <20041215013043.GI12937@stusta.de> References: <20041215013043.GI12937@stusta.de> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13093 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Wed, 15 Dec 2004 02:30:43 +0100 Adrian Bunk wrote: > It's generally agreed, that maintainers mustn't contain subscribers-only > lists. Agreed, and applied. Thanks Adrian. 
From davem@davemloft.net Mon Dec 27 21:26:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:26:33 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5Q5h8024537 for ; Mon, 27 Dec 2004 21:26:25 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1Cj9s8-0003AT-00; Mon, 27 Dec 2004 21:26:16 -0800 Date: Mon, 27 Dec 2004 21:26:15 -0800 From: "David S. Miller" To: Roland Dreier Cc: yoshfuji@linux-ipv6.org, linux-kernel@vger.kernel.org, openib-general@openib.org, akpm@osdl.org, torvalds@osdl.org, netdev@oss.sgi.com Subject: Re: [PATCH][v4][0/24] Second InfiniBand merge candidate patch set Message-Id: <20041227212615.0536c99f.davem@davemloft.net> In-Reply-To: <52k6r3j8yp.fsf@topspin.com> References: <200412192214.KlDxQ7icOmxHYIf0@topspin.com> <20041220.153845.70996857.yoshfuji@linux-ipv6.org> <20041227210007.398734cd.davem@davemloft.net> <52k6r3j8yp.fsf@topspin.com> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13094 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 27 Dec 2004 21:19:26 -0800 Roland Dreier wrote: > Dave, did you want to handle the entire merge of the whole IB stack, > or just the net/ parts, which are pretty trivial and stand alone, > since AF_INFINIBAND is already defined in the tree? Send it all over. 
From dave@thedillows.org Mon Dec 27 21:38:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:38:25 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBS5bvPM025282 for ; Mon, 27 Dec 2004 21:38:17 -0800 Received: (qmail 4042 invoked by uid 0); 28 Dec 2004 05:43:37 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp8.knology.net with SMTP; 28 Dec 2004 05:43:37 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBS5dQ8Z004372; Tue, 28 Dec 2004 00:39:26 -0500 Received: (from il1@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBS5dPi0004371; Tue, 28 Dec 2004 00:39:25 -0500 X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f Subject: Re: [BK netdev-2.6] Update Typhoon firmware From: David Dillow To: Jeff Garzik Cc: Netdev In-Reply-To: <41CFD72C.6090503@pobox.com> References: <1103314025.4217.1.camel@ori.thedillows.org> <41CFD72C.6090503@pobox.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Tue, 28 Dec 2004 00:39:25 -0500 Message-Id: <1104212365.4293.0.camel@ori.thedillows.org> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13095 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev On Mon, 2004-12-27 at 04:34 -0500, Jeff Garzik wrote: > David Dillow wrote: > > Jeff, please do a > > > > bk pull http://typhoon.bkbits.net/typhoon-2.6-firmware > > > [jgarzik@pretzel net-drivers-2.6]$ bk pull > http://typhoon.bkbits.net/typhoon-2.6-firmware > Pull http://typhoon.bkbits.net/typhoon-2.6-firmware > -> file://garz/repo/net-drivers-2.6 > 
ERROR-cannot cd to typhoon-2.6-firmware (illegal, nonexistant, or not > package root) Doh! Make that bk pull http://typhoon.bkbits.net/typhoon-2.6 -- David Dillow From roland@topspin.com Mon Dec 27 21:49:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:13 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDU025948 for ; Mon, 27 Dec 2004 21:49:40 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:57 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:56 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAFz-0000sp-Hr; Mon, 27 Dec 2004 21:50:56 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272150.vKuRYXlCFl5x8NAo@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:55 -0800 Message-Id: <200412272150.BzlME8aULSGdgnS3@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][3/24] Hook up drivers/infiniband Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:50:56.0829 (UTC) FILETIME=[33549ED0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13097 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add the appropriate lines to drivers/Kconfig and drivers/Makefile so that 
the kernel configuration and build systems know about drivers/infiniband. Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/Kconfig 2004-12-27 21:47:59.198084242 -0800 +++ linux-bk/drivers/Kconfig 2004-12-27 21:48:19.194140917 -0800 @@ -56,4 +56,6 @@ source "drivers/mmc/Kconfig" +source "drivers/infiniband/Kconfig" + endmenu --- linux-bk.orig/drivers/Makefile 2004-12-27 21:48:10.314447971 -0800 +++ linux-bk/drivers/Makefile 2004-12-27 21:48:19.194140917 -0800 @@ -59,5 +59,6 @@ obj-$(CONFIG_EISA) += eisa/ obj-$(CONFIG_CPU_FREQ) += cpufreq/ obj-$(CONFIG_MMC) += mmc/ +obj-$(CONFIG_INFINIBAND) += infiniband/ obj-y += firmware/ obj-$(CONFIG_CRYPTO) += crypto/ From roland@topspin.com Mon Dec 27 21:49:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:05 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDO025948 for ; Mon, 27 Dec 2004 21:49:39 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:48 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:48 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAFs-0000sP-0d; Mon, 27 Dec 2004 21:50:48 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: 20041227212615.0536c99f.davem@davemloft.net X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:47 -0800 Message-Id: <200412272150.IBRnA4AvjendsF8x@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][0/24] Latest IB patch queue Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) 
X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:50:48.0720 (UTC) FILETIME=[2E7F4900:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13096 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev >>>>> "David" == David S Miller writes: David> Send it all over. OK, you asked for it... here's our latest tree, which should incorporate all the feedback I've seen. (Individuals trimmed from CC list, since they probably don't want to get all 24 patches over again) Thanks, Roland Dreier From roland@topspin.com Mon Dec 27 21:49:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:30 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDw025948 for ; Mon, 27 Dec 2004 21:49:48 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:15 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:15 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGI-0000v6-LE; Mon, 27 Dec 2004 21:51:15 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.ZZGB77m9dy61uyrb@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:14 -0800 Message-Id: <200412272151.qKxQwjb8RXkS5kEQ@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][17/24] IPoIB IPv4 multicast Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 
4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:15.0064 (UTC) FILETIME=[3E330F80:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13099 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add ip_ib_mc_map() to convert IPv4 multicast addresses to IPoIB hardware addresses. Also add <linux/if_infiniband.h> so INFINIBAND_ALEN has a home. The mapping for multicast addresses is described in http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/include/linux/if_infiniband.h 2004-12-27 21:48:24.639339403 -0800 @@ -0,0 +1,29 @@ +/* + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available at + * , or the OpenIB.org BSD + * license, available in the LICENSE.TXT file accompanying this + * software. These details are also available at + * . + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ * + * $Id$ + */ + +#ifndef _LINUX_IF_INFINIBAND_H +#define _LINUX_IF_INFINIBAND_H + +#define INFINIBAND_ALEN 20 /* Octets in IPoIB HW addr */ + +#endif /* _LINUX_IF_INFINIBAND_H */ --- linux-bk.orig/include/net/ip.h 2004-12-27 21:47:47.982735072 -0800 +++ linux-bk/include/net/ip.h 2004-12-27 21:48:24.639339403 -0800 @@ -229,6 +229,39 @@ buf[3]=addr&0x7F; } +/* + * Map a multicast IP onto multicast MAC for type IP-over-InfiniBand. + * Leave P_Key as 0 to be filled in by driver. + */ + +static inline void ip_ib_mc_map(u32 addr, char *buf) +{ + buf[0] = 0; /* Reserved */ + buf[1] = 0xff; /* Multicast QPN */ + buf[2] = 0xff; + buf[3] = 0xff; + addr = ntohl(addr); + buf[4] = 0xff; + buf[5] = 0x12; /* link local scope */ + buf[6] = 0x40; /* IPv4 signature */ + buf[7] = 0x1b; + buf[8] = 0; /* P_Key */ + buf[9] = 0; + buf[10] = 0; + buf[11] = 0; + buf[12] = 0; + buf[13] = 0; + buf[14] = 0; + buf[15] = 0; + buf[19] = addr & 0xff; + addr >>= 8; + buf[18] = addr & 0xff; + addr >>= 8; + buf[17] = addr & 0xff; + addr >>= 8; + buf[16] = addr & 0x0f; +} + #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) #include #endif --- linux-bk.orig/net/ipv4/arp.c 2004-12-27 21:47:52.507069119 -0800 +++ linux-bk/net/ipv4/arp.c 2004-12-27 21:48:24.640339256 -0800 @@ -213,6 +213,9 @@ case ARPHRD_IEEE802_TR: ip_tr_mc_map(addr, haddr); return 0; + case ARPHRD_INFINIBAND: + ip_ib_mc_map(addr, haddr); + return 0; default: if (dir) { memcpy(haddr, dev->broadcast, dev->addr_len); From roland@topspin.com Mon Dec 27 21:49:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:32 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDc025948 for ; Mon, 27 Dec 2004 21:49:42 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:01 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com 
with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:01 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG4-0000tQ-QW; Mon, 27 Dec 2004 21:51:01 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272150.vaYM2cWv3KsWVYPV@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:00 -0800 Message-Id: <200412272151.7VvrtyAsRwgOK3mP@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][7/24] Add InfiniBand MAD SMI support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:01.0626 (UTC) FILETIME=[363095A0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13101 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add MAD layer SMI (Subnet Management Interface) code. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/smi.c 2004-12-27 21:48:20.566938847 -0800 @@ -0,0 +1,234 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: smi.c 1389 2004-12-27 22:56:47Z roland $ + */ + +#include + + +/* + * Fixup a directed route SMP for sending + * Return 0 if the SMP should be discarded + */ +int smi_handle_dr_smp_send(struct ib_smp *smp, + u8 node_type, + int port_num) +{ + u8 hop_ptr, hop_cnt; + + hop_ptr = smp->hop_ptr; + hop_cnt = smp->hop_cnt; + + /* See section 14.2.2.2, Vol 1 IB spec */ + if (!ib_get_smp_direction(smp)) { + /* C14-9:1 */ + if (hop_cnt && hop_ptr == 0) { + smp->hop_ptr++; + return (smp->initial_path[smp->hop_ptr] == + port_num); + } + + /* C14-9:2 */ + if (hop_ptr && hop_ptr < hop_cnt) { + if (node_type != IB_NODE_SWITCH) + return 0; + + /* smp->return_path set when received */ + smp->hop_ptr++; + return (smp->initial_path[smp->hop_ptr] == + port_num); + } + + /* C14-9:3 -- We're at the end of the DR segment of path */ + if (hop_ptr == hop_cnt) { + /* smp->return_path set when received */ + smp->hop_ptr++; + return (node_type == IB_NODE_SWITCH || + smp->dr_dlid == IB_LID_PERMISSIVE); + } + + /* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */ + /* C14-9:5 -- Fail unreasonable hop pointer */ + return (hop_ptr == hop_cnt + 1); + + } else { + /* C14-13:1 */ + if (hop_cnt && hop_ptr == hop_cnt + 1) { + smp->hop_ptr--; + return (smp->return_path[smp->hop_ptr] == + port_num); + } + + /* C14-13:2 */ + if (2 <= hop_ptr && hop_ptr <= hop_cnt) { + if (node_type != IB_NODE_SWITCH) + return 0; + + smp->hop_ptr--; + return (smp->return_path[smp->hop_ptr] == + port_num); + } + + /* C14-13:3 -- at the end of the DR segment of path */ + if (hop_ptr == 1) { + smp->hop_ptr--; + /* C14-13:3 -- SMPs destined for SM shouldn't be here */ + return (node_type == IB_NODE_SWITCH || + smp->dr_slid == IB_LID_PERMISSIVE); + } + + /* C14-13:4 -- hop_ptr = 0 -> should have gone to SM */ + if (hop_ptr == 0) + return 1; + + /* C14-13:5 -- Check for unreasonable hop pointer */ + return 0; + } +} + +/* + * Adjust information for a received SMP + * Return 0 if the SMP should be dropped + */ 
+int smi_handle_dr_smp_recv(struct ib_smp *smp, + u8 node_type, + int port_num, + int phys_port_cnt) +{ + u8 hop_ptr, hop_cnt; + + hop_ptr = smp->hop_ptr; + hop_cnt = smp->hop_cnt; + + /* See section 14.2.2.2, Vol 1 IB spec */ + if (!ib_get_smp_direction(smp)) { + /* C14-9:1 -- sender should have incremented hop_ptr */ + if (hop_cnt && hop_ptr == 0) + return 0; + + /* C14-9:2 -- intermediate hop */ + if (hop_ptr && hop_ptr < hop_cnt) { + if (node_type != IB_NODE_SWITCH) + return 0; + + smp->return_path[hop_ptr] = port_num; + /* smp->hop_ptr updated when sending */ + return (smp->initial_path[hop_ptr+1] <= phys_port_cnt); + } + + /* C14-9:3 -- We're at the end of the DR segment of path */ + if (hop_ptr == hop_cnt) { + if (hop_cnt) + smp->return_path[hop_ptr] = port_num; + /* smp->hop_ptr updated when sending */ + + return (node_type == IB_NODE_SWITCH || + smp->dr_dlid == IB_LID_PERMISSIVE); + } + + /* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */ + /* C14-9:5 -- fail unreasonable hop pointer */ + return (hop_ptr == hop_cnt + 1); + + } else { + + /* C14-13:1 */ + if (hop_cnt && hop_ptr == hop_cnt + 1) { + smp->hop_ptr--; + return (smp->return_path[smp->hop_ptr] == + port_num); + } + + /* C14-13:2 */ + if (2 <= hop_ptr && hop_ptr <= hop_cnt) { + if (node_type != IB_NODE_SWITCH) + return 0; + + /* smp->hop_ptr updated when sending */ + return (smp->return_path[hop_ptr-1] <= phys_port_cnt); + } + + /* C14-13:3 -- We're at the end of the DR segment of path */ + if (hop_ptr == 1) { + if (smp->dr_slid == IB_LID_PERMISSIVE) { + /* giving SMP to SM - update hop_ptr */ + smp->hop_ptr--; + return 1; + } + /* smp->hop_ptr updated when sending */ + return (node_type == IB_NODE_SWITCH); + } + + /* C14-13:4 -- hop_ptr = 0 -> give to SM */ + /* C14-13:5 -- Check for unreasonable hop pointer */ + return (hop_ptr == 0); + } +} + +/* + * Return 1 if the received DR SMP should be forwarded to the send queue + * Return 0 if the SMP should be completed up the stack + */ +int 
smi_check_forward_dr_smp(struct ib_smp *smp) +{ + u8 hop_ptr, hop_cnt; + + hop_ptr = smp->hop_ptr; + hop_cnt = smp->hop_cnt; + + if (!ib_get_smp_direction(smp)) { + /* C14-9:2 -- intermediate hop */ + if (hop_ptr && hop_ptr < hop_cnt) + return 1; + + /* C14-9:3 -- at the end of the DR segment of path */ + if (hop_ptr == hop_cnt) + return (smp->dr_dlid == IB_LID_PERMISSIVE); + + /* C14-9:4 -- hop_ptr = hop_cnt + 1 -> give to SMA/SM */ + if (hop_ptr == hop_cnt + 1) + return 1; + } else { + /* C14-13:2 */ + if (2 <= hop_ptr && hop_ptr <= hop_cnt) + return 1; + + /* C14-13:3 -- at the end of the DR segment of path */ + if (hop_ptr == 1) + return (smp->dr_slid != IB_LID_PERMISSIVE); + } + return 0; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/smi.h 2004-12-27 21:48:20.592935020 -0800 @@ -0,0 +1,67 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: smi.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#ifndef __SMI_H_ +#define __SMI_H_ + +int smi_handle_dr_smp_recv(struct ib_smp *smp, + u8 node_type, + int port_num, + int phys_port_cnt); +extern int smi_check_forward_dr_smp(struct ib_smp *smp); +extern int smi_handle_dr_smp_send(struct ib_smp *smp, + u8 node_type, + int port_num); +extern int smi_check_local_dr_smp(struct ib_smp *smp, + struct ib_device *device, + int port_num); + +/* + * Return 1 if the SMP should be handled by the local SMA/SM via process_mad + */ +static inline int smi_check_local_smp(struct ib_mad_agent *mad_agent, + struct ib_smp *smp) +{ + /* C14-9:3 -- We're at the end of the DR segment of path */ + /* C14-9:4 -- Hop Pointer = Hop Count + 1 -> give to SMA/SM */ + return ((mad_agent->device->process_mad && + !ib_get_smp_direction(smp) && + (smp->hop_ptr == smp->hop_cnt + 1))); +} + +#endif /* __SMI_H_ */ From roland@topspin.com Mon Dec 27 21:49:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:29 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJE0025948 for ; Mon, 27 Dec 2004 21:49:49 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by 
umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:15 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:15 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGJ-0000vF-2j; Mon, 27 Dec 2004 21:51:15 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.qKxQwjb8RXkS5kEQ@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:15 -0800 Message-Id: <200412272151.6vXZnyn99yKN3sSD@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][18/24] IPoIB IPv6 support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:15.0595 (UTC) FILETIME=[3E8415B0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13098 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add ipv6_ib_mc_map() to convert IPv6 multicast addresses to IPoIB hardware addresses, and add support for autoconfiguration for devices with type ARPHRD_INFINIBAND. 
The mapping for multicast addresses is described in http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Signed-off-by: Nitin Hande Signed-off-by: Roland Dreier --- linux-bk.orig/include/net/if_inet6.h 2004-12-27 21:47:59.669014924 -0800 +++ linux-bk/include/net/if_inet6.h 2004-12-27 21:48:24.976289805 -0800 @@ -266,5 +266,20 @@ { buf[0] = 0x00; } + +static inline void ipv6_ib_mc_map(struct in6_addr *addr, char *buf) +{ + buf[0] = 0; /* Reserved */ + buf[1] = 0xff; /* Multicast QPN */ + buf[2] = 0xff; + buf[3] = 0xff; + buf[4] = 0xff; + buf[5] = 0x12; /* link local scope */ + buf[6] = 0x60; /* IPv6 signature */ + buf[7] = 0x1b; + buf[8] = 0; /* P_Key */ + buf[9] = 0; + memcpy(buf + 10, addr->s6_addr + 6, 10); +} #endif #endif --- linux-bk.orig/net/ipv6/addrconf.c 2004-12-27 21:47:59.159089982 -0800 +++ linux-bk/net/ipv6/addrconf.c 2004-12-27 21:48:24.978289511 -0800 @@ -48,6 +48,7 @@ #include #include #include +#include #include #include #include @@ -1095,6 +1096,12 @@ memset(eui, 0, 7); eui[7] = *(u8*)dev->dev_addr; return 0; + case ARPHRD_INFINIBAND: + if (dev->addr_len != INFINIBAND_ALEN) + return -1; + memcpy(eui, dev->dev_addr + 12, 8); + eui[0] |= 2; + return 0; } return -1; } @@ -1794,7 +1801,8 @@ if ((dev->type != ARPHRD_ETHER) && (dev->type != ARPHRD_FDDI) && (dev->type != ARPHRD_IEEE802_TR) && - (dev->type != ARPHRD_ARCNET)) { + (dev->type != ARPHRD_ARCNET) && + (dev->type != ARPHRD_INFINIBAND)) { /* Alas, we support only Ethernet autoconfiguration. 
*/ return; } --- linux-bk.orig/net/ipv6/ndisc.c 2004-12-27 21:47:44.031316692 -0800 +++ linux-bk/net/ipv6/ndisc.c 2004-12-27 21:48:24.979289364 -0800 @@ -260,6 +260,9 @@ case ARPHRD_ARCNET: ipv6_arcnet_mc_map(addr, buf); return 0; + case ARPHRD_INFINIBAND: + ipv6_ib_mc_map(addr, buf); + return 0; default: if (dir) { memcpy(buf, dev->broadcast, dev->addr_len); From roland@topspin.com Mon Dec 27 21:49:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:33 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDm025948 for ; Mon, 27 Dec 2004 21:49:45 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:09 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:08 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGC-0000uN-74; Mon, 27 Dec 2004 21:51:08 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.6amOd1o39RpEe1KK@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:08 -0800 Message-Id: <200412272151.abmqZdwtcCCwm2QW@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][12/24] Add Mellanox HCA low-level driver (EQ) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:08.0892 (UTC) FILETIME=[3A8549C0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13103 X-ecartis-version: Ecartis v1.0.0 
Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add event queue code for Mellanox HCA driver. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_eq.c 2004-12-27 21:48:22.766615062 -0800 @@ -0,0 +1,690 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_eq.c 1382 2004-12-24 02:21:02Z roland $ + */ + +#include +#include +#include +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" +#include "mthca_config_reg.h" + +enum { + MTHCA_NUM_ASYNC_EQE = 0x80, + MTHCA_NUM_CMD_EQE = 0x80, + MTHCA_EQ_ENTRY_SIZE = 0x20 +}; + +/* + * Must be packed because start is 64 bits but only aligned to 32 bits. + */ +struct mthca_eq_context { + u32 flags; + u64 start; + u32 logsize_usrpage; + u32 pd; + u8 reserved1[3]; + u8 intr; + u32 lost_count; + u32 lkey; + u32 reserved2[2]; + u32 consumer_index; + u32 producer_index; + u32 reserved3[4]; +} __attribute__((packed)); + +#define MTHCA_EQ_STATUS_OK ( 0 << 28) +#define MTHCA_EQ_STATUS_OVERFLOW ( 9 << 28) +#define MTHCA_EQ_STATUS_WRITE_FAIL (10 << 28) +#define MTHCA_EQ_OWNER_SW ( 0 << 24) +#define MTHCA_EQ_OWNER_HW ( 1 << 24) +#define MTHCA_EQ_FLAG_TR ( 1 << 18) +#define MTHCA_EQ_FLAG_OI ( 1 << 17) +#define MTHCA_EQ_STATE_ARMED ( 1 << 8) +#define MTHCA_EQ_STATE_FIRED ( 2 << 8) +#define MTHCA_EQ_STATE_ALWAYS_ARMED ( 3 << 8) + +enum { + MTHCA_EVENT_TYPE_COMP = 0x00, + MTHCA_EVENT_TYPE_PATH_MIG = 0x01, + MTHCA_EVENT_TYPE_COMM_EST = 0x02, + MTHCA_EVENT_TYPE_SQ_DRAINED = 0x03, + MTHCA_EVENT_TYPE_SRQ_LAST_WQE = 0x13, + MTHCA_EVENT_TYPE_CQ_ERROR = 0x04, + MTHCA_EVENT_TYPE_WQ_CATAS_ERROR = 0x05, + MTHCA_EVENT_TYPE_EEC_CATAS_ERROR = 0x06, + MTHCA_EVENT_TYPE_PATH_MIG_FAILED = 0x07, + MTHCA_EVENT_TYPE_WQ_INVAL_REQ_ERROR = 0x10, + MTHCA_EVENT_TYPE_WQ_ACCESS_ERROR = 0x11, + MTHCA_EVENT_TYPE_SRQ_CATAS_ERROR = 0x12, + MTHCA_EVENT_TYPE_LOCAL_CATAS_ERROR = 0x08, + MTHCA_EVENT_TYPE_PORT_CHANGE = 0x09, + MTHCA_EVENT_TYPE_EQ_OVERFLOW = 0x0f, + MTHCA_EVENT_TYPE_ECC_DETECT = 0x0e, + MTHCA_EVENT_TYPE_CMD = 0x0a +}; + +#define MTHCA_ASYNC_EVENT_MASK ((1ULL << MTHCA_EVENT_TYPE_PATH_MIG) | \ + (1ULL << MTHCA_EVENT_TYPE_COMM_EST) | \ + (1ULL << MTHCA_EVENT_TYPE_SQ_DRAINED) | \ + (1ULL << MTHCA_EVENT_TYPE_CQ_ERROR) | \ + (1ULL << MTHCA_EVENT_TYPE_WQ_CATAS_ERROR) | \ + (1ULL << 
MTHCA_EVENT_TYPE_EEC_CATAS_ERROR) | \ + (1ULL << MTHCA_EVENT_TYPE_PATH_MIG_FAILED) | \ + (1ULL << MTHCA_EVENT_TYPE_WQ_INVAL_REQ_ERROR) | \ + (1ULL << MTHCA_EVENT_TYPE_WQ_ACCESS_ERROR) | \ + (1ULL << MTHCA_EVENT_TYPE_LOCAL_CATAS_ERROR) | \ + (1ULL << MTHCA_EVENT_TYPE_PORT_CHANGE) | \ + (1ULL << MTHCA_EVENT_TYPE_ECC_DETECT)) +#define MTHCA_SRQ_EVENT_MASK (1ULL << MTHCA_EVENT_TYPE_SRQ_CATAS_ERROR) | \ + (1ULL << MTHCA_EVENT_TYPE_SRQ_LAST_WQE) +#define MTHCA_CMD_EVENT_MASK (1ULL << MTHCA_EVENT_TYPE_CMD) + +#define MTHCA_EQ_DB_INC_CI (1 << 24) +#define MTHCA_EQ_DB_REQ_NOT (2 << 24) +#define MTHCA_EQ_DB_DISARM_CQ (3 << 24) +#define MTHCA_EQ_DB_SET_CI (4 << 24) +#define MTHCA_EQ_DB_ALWAYS_ARM (5 << 24) + +struct mthca_eqe { + u8 reserved1; + u8 type; + u8 reserved2; + u8 subtype; + union { + u32 raw[6]; + struct { + u32 cqn; + } __attribute__((packed)) comp; + struct { + u16 reserved1; + u16 token; + u32 reserved2; + u8 reserved3[3]; + u8 status; + u64 out_param; + } __attribute__((packed)) cmd; + struct { + u32 qpn; + } __attribute__((packed)) qp; + struct { + u32 cqn; + u32 reserved1; + u8 reserved2[3]; + u8 syndrome; + } __attribute__((packed)) cq_err; + struct { + u32 reserved1[2]; + u32 port; + } __attribute__((packed)) port_change; + } event; + u8 reserved3[3]; + u8 owner; +} __attribute__((packed)); + +#define MTHCA_EQ_ENTRY_OWNER_SW (0 << 7) +#define MTHCA_EQ_ENTRY_OWNER_HW (1 << 7) + +static inline u64 async_mask(struct mthca_dev *dev) +{ + return dev->mthca_flags & MTHCA_FLAG_SRQ ? 
+ MTHCA_ASYNC_EVENT_MASK | MTHCA_SRQ_EVENT_MASK : + MTHCA_ASYNC_EVENT_MASK; +} + +static inline void set_eq_ci(struct mthca_dev *dev, int eqn, int ci) +{ + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32(MTHCA_EQ_DB_SET_CI | eqn); + doorbell[1] = cpu_to_be32(ci); + + mthca_write64(doorbell, + dev->kar + MTHCA_EQ_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); +} + +static inline void eq_req_not(struct mthca_dev *dev, int eqn) +{ + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32(MTHCA_EQ_DB_REQ_NOT | eqn); + doorbell[1] = 0; + + mthca_write64(doorbell, + dev->kar + MTHCA_EQ_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); +} + +static inline void disarm_cq(struct mthca_dev *dev, int eqn, int cqn) +{ + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32(MTHCA_EQ_DB_DISARM_CQ | eqn); + doorbell[1] = cpu_to_be32(cqn); + + mthca_write64(doorbell, + dev->kar + MTHCA_EQ_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); +} + +static inline struct mthca_eqe *get_eqe(struct mthca_eq *eq, int entry) +{ + return eq->page_list[entry * MTHCA_EQ_ENTRY_SIZE / PAGE_SIZE].buf + + (entry * MTHCA_EQ_ENTRY_SIZE) % PAGE_SIZE; +} + +static inline int next_eqe_sw(struct mthca_eq *eq) +{ + return !(MTHCA_EQ_ENTRY_OWNER_HW & + get_eqe(eq, eq->cons_index)->owner); +} + +static inline void set_eqe_hw(struct mthca_eq *eq, int entry) +{ + get_eqe(eq, entry)->owner = MTHCA_EQ_ENTRY_OWNER_HW; +} + +static void port_change(struct mthca_dev *dev, int port, int active) +{ + struct ib_event record; + + mthca_dbg(dev, "Port change to %s for port %d\n", + active ? "active" : "down", port); + + record.device = &dev->ib_dev; + record.event = active ? 
IB_EVENT_PORT_ACTIVE : IB_EVENT_PORT_ERR; + record.element.port_num = port; + + ib_dispatch_event(&record); +} + +static void mthca_eq_int(struct mthca_dev *dev, struct mthca_eq *eq) +{ + struct mthca_eqe *eqe; + int disarm_cqn; + + while (next_eqe_sw(eq)) { + int set_ci = 0; + eqe = get_eqe(eq, eq->cons_index); + + switch (eqe->type) { + case MTHCA_EVENT_TYPE_COMP: + disarm_cqn = be32_to_cpu(eqe->event.comp.cqn) & 0xffffff; + disarm_cq(dev, eq->eqn, disarm_cqn); + mthca_cq_event(dev, disarm_cqn); + break; + + case MTHCA_EVENT_TYPE_PATH_MIG: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_PATH_MIG); + break; + + case MTHCA_EVENT_TYPE_COMM_EST: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_COMM_EST); + break; + + case MTHCA_EVENT_TYPE_SQ_DRAINED: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_SQ_DRAINED); + break; + + case MTHCA_EVENT_TYPE_WQ_CATAS_ERROR: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_QP_FATAL); + break; + + case MTHCA_EVENT_TYPE_PATH_MIG_FAILED: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_PATH_MIG_ERR); + break; + + case MTHCA_EVENT_TYPE_WQ_INVAL_REQ_ERROR: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_QP_REQ_ERR); + break; + + case MTHCA_EVENT_TYPE_WQ_ACCESS_ERROR: + mthca_qp_event(dev, be32_to_cpu(eqe->event.qp.qpn) & 0xffffff, + IB_EVENT_QP_ACCESS_ERR); + break; + + case MTHCA_EVENT_TYPE_CMD: + mthca_cmd_event(dev, + be16_to_cpu(eqe->event.cmd.token), + eqe->event.cmd.status, + be64_to_cpu(eqe->event.cmd.out_param)); + /* + * cmd_event() may add more commands. + * The card will think the queue has overflowed if + * we don't tell it we've been processing events. 
+ */ + set_ci = 1; + break; + + case MTHCA_EVENT_TYPE_PORT_CHANGE: + port_change(dev, + (be32_to_cpu(eqe->event.port_change.port) >> 28) & 3, + eqe->subtype == 0x4); + break; + + case MTHCA_EVENT_TYPE_CQ_ERROR: + mthca_warn(dev, "CQ %s on CQN %08x\n", + eqe->event.cq_err.syndrome == 1 ? + "overrun" : "access violation", + be32_to_cpu(eqe->event.cq_err.cqn)); + break; + + case MTHCA_EVENT_TYPE_EQ_OVERFLOW: + mthca_warn(dev, "EQ overrun on EQN %d\n", eq->eqn); + break; + + case MTHCA_EVENT_TYPE_EEC_CATAS_ERROR: + case MTHCA_EVENT_TYPE_SRQ_CATAS_ERROR: + case MTHCA_EVENT_TYPE_LOCAL_CATAS_ERROR: + case MTHCA_EVENT_TYPE_ECC_DETECT: + default: + mthca_warn(dev, "Unhandled event %02x(%02x) on EQ %d\n", + eqe->type, eqe->subtype, eq->eqn); + break; + }; + + set_eqe_hw(eq, eq->cons_index); + eq->cons_index = (eq->cons_index + 1) & (eq->nent - 1); + + if (set_ci) { + wmb(); /* see comment below */ + set_eq_ci(dev, eq->eqn, eq->cons_index); + set_ci = 0; + } + } + + /* + * This barrier makes sure that all updates to + * ownership bits done by set_eqe_hw() hit memory + * before the consumer index is updated. set_eq_ci() + * allows the HCA to possibly write more EQ entries, + * and we want to avoid the exceedingly unlikely + * possibility of the HCA writing an entry and then + * having set_eqe_hw() overwrite the owner field. 
+ */ + wmb(); + set_eq_ci(dev, eq->eqn, eq->cons_index); + eq_req_not(dev, eq->eqn); +} + +static irqreturn_t mthca_interrupt(int irq, void *dev_ptr, struct pt_regs *regs) +{ + struct mthca_dev *dev = dev_ptr; + u32 ecr; + int work = 0; + int i; + + if (dev->eq_table.clr_mask) + writel(dev->eq_table.clr_mask, dev->eq_table.clr_int); + + while ((ecr = readl(dev->hcr + MTHCA_ECR_OFFSET + 4)) != 0) { + work = 1; + + writel(ecr, dev->hcr + MTHCA_ECR_CLR_OFFSET + 4); + + for (i = 0; i < MTHCA_NUM_EQ; ++i) + if (ecr & dev->eq_table.eq[i].ecr_mask) + mthca_eq_int(dev, &dev->eq_table.eq[i]); + } + + return IRQ_RETVAL(work); +} + +static irqreturn_t mthca_msi_x_interrupt(int irq, void *eq_ptr, + struct pt_regs *regs) +{ + struct mthca_eq *eq = eq_ptr; + struct mthca_dev *dev = eq->dev; + + writel(eq->ecr_mask, dev->hcr + MTHCA_ECR_CLR_OFFSET + 4); + mthca_eq_int(dev, eq); + + /* MSI-X vectors always belong to us */ + return IRQ_HANDLED; +} + +static int __devinit mthca_create_eq(struct mthca_dev *dev, + int nent, + u8 intr, + struct mthca_eq *eq) +{ + int npages = (nent * MTHCA_EQ_ENTRY_SIZE + PAGE_SIZE - 1) / + PAGE_SIZE; + u64 *dma_list = NULL; + dma_addr_t t; + void *mailbox = NULL; + struct mthca_eq_context *eq_context; + int err = -ENOMEM; + int i; + u8 status; + + /* Make sure EQ size is aligned to a power of 2 size. 
*/ + for (i = 1; i < nent; i <<= 1) + ; /* nothing */ + nent = i; + + eq->dev = dev; + + eq->page_list = kmalloc(npages * sizeof *eq->page_list, + GFP_KERNEL); + if (!eq->page_list) + goto err_out; + + for (i = 0; i < npages; ++i) + eq->page_list[i].buf = NULL; + + dma_list = kmalloc(npages * sizeof *dma_list, GFP_KERNEL); + if (!dma_list) + goto err_out_free; + + mailbox = kmalloc(sizeof *eq_context + MTHCA_CMD_MAILBOX_EXTRA, + GFP_KERNEL); + if (!mailbox) + goto err_out_free; + eq_context = MAILBOX_ALIGN(mailbox); + + for (i = 0; i < npages; ++i) { + eq->page_list[i].buf = pci_alloc_consistent(dev->pdev, + PAGE_SIZE, &t); + if (!eq->page_list[i].buf) + goto err_out_free; + + dma_list[i] = t; + pci_unmap_addr_set(&eq->page_list[i], mapping, t); + + memset(eq->page_list[i].buf, 0, PAGE_SIZE); + } + + for (i = 0; i < nent; ++i) + set_eqe_hw(eq, i); + + eq->eqn = mthca_alloc(&dev->eq_table.alloc); + if (eq->eqn == -1) + goto err_out_free; + + err = mthca_mr_alloc_phys(dev, dev->driver_pd.pd_num, + dma_list, PAGE_SHIFT, npages, + 0, npages * PAGE_SIZE, + MTHCA_MPT_FLAG_LOCAL_WRITE | + MTHCA_MPT_FLAG_LOCAL_READ, + &eq->mr); + if (err) + goto err_out_free_eq; + + eq->nent = nent; + + memset(eq_context, 0, sizeof *eq_context); + eq_context->flags = cpu_to_be32(MTHCA_EQ_STATUS_OK | + MTHCA_EQ_OWNER_HW | + MTHCA_EQ_STATE_ARMED | + MTHCA_EQ_FLAG_TR); + eq_context->start = cpu_to_be64(0); + eq_context->logsize_usrpage = cpu_to_be32((ffs(nent) - 1) << 24 | + MTHCA_KAR_PAGE); + eq_context->pd = cpu_to_be32(dev->driver_pd.pd_num); + eq_context->intr = intr; + eq_context->lkey = cpu_to_be32(eq->mr.ibmr.lkey); + + err = mthca_SW2HW_EQ(dev, eq_context, eq->eqn, &status); + if (err) { + mthca_warn(dev, "SW2HW_EQ failed (%d)\n", err); + goto err_out_free_mr; + } + if (status) { + mthca_warn(dev, "SW2HW_EQ returned status 0x%02x\n", + status); + err = -EINVAL; + goto err_out_free_mr; + } + + kfree(dma_list); + kfree(mailbox); + + eq->ecr_mask = swab32(1 << eq->eqn); + eq->cons_index 
= 0; + + eq_req_not(dev, eq->eqn); + + mthca_dbg(dev, "Allocated EQ %d with %d entries\n", + eq->eqn, nent); + + return err; + + err_out_free_mr: + mthca_free_mr(dev, &eq->mr); + + err_out_free_eq: + mthca_free(&dev->eq_table.alloc, eq->eqn); + + err_out_free: + for (i = 0; i < npages; ++i) + if (eq->page_list[i].buf) + pci_free_consistent(dev->pdev, PAGE_SIZE, + eq->page_list[i].buf, + pci_unmap_addr(&eq->page_list[i], + mapping)); + + kfree(eq->page_list); + kfree(dma_list); + kfree(mailbox); + + err_out: + return err; +} + +static void mthca_free_eq(struct mthca_dev *dev, + struct mthca_eq *eq) +{ + void *mailbox = NULL; + int err; + u8 status; + int npages = (eq->nent * MTHCA_EQ_ENTRY_SIZE + PAGE_SIZE - 1) / + PAGE_SIZE; + int i; + + mailbox = kmalloc(sizeof (struct mthca_eq_context) + MTHCA_CMD_MAILBOX_EXTRA, + GFP_KERNEL); + if (!mailbox) + return; + + err = mthca_HW2SW_EQ(dev, MAILBOX_ALIGN(mailbox), + eq->eqn, &status); + if (err) + mthca_warn(dev, "HW2SW_EQ failed (%d)\n", err); + if (status) + mthca_warn(dev, "HW2SW_EQ returned status 0x%02x\n", + status); + + if (0) { + mthca_dbg(dev, "Dumping EQ context %02x:\n", eq->eqn); + for (i = 0; i < sizeof (struct mthca_eq_context) / 4; ++i) { + if (i % 4 == 0) + printk("[%02x] ", i * 4); + printk(" %08x", be32_to_cpup(MAILBOX_ALIGN(mailbox) + i * 4)); + if ((i + 1) % 4 == 0) + printk("\n"); + } + } + + + mthca_free_mr(dev, &eq->mr); + for (i = 0; i < npages; ++i) + pci_free_consistent(dev->pdev, PAGE_SIZE, + eq->page_list[i].buf, + pci_unmap_addr(&eq->page_list[i], mapping)); + + kfree(eq->page_list); + kfree(mailbox); +} + +static void mthca_free_irqs(struct mthca_dev *dev) +{ + int i; + + if (dev->eq_table.have_irq) + free_irq(dev->pdev->irq, dev); + for (i = 0; i < MTHCA_NUM_EQ; ++i) + if (dev->eq_table.eq[i].have_irq) + free_irq(dev->eq_table.eq[i].msi_x_vector, + dev->eq_table.eq + i); +} + +int __devinit mthca_init_eq_table(struct mthca_dev *dev) +{ + int err; + u8 status; + u8 intr; + int i; + + err = 
mthca_alloc_init(&dev->eq_table.alloc, + dev->limits.num_eqs, + dev->limits.num_eqs - 1, + dev->limits.reserved_eqs); + if (err) + return err; + + if (dev->mthca_flags & MTHCA_FLAG_MSI || + dev->mthca_flags & MTHCA_FLAG_MSI_X) { + dev->eq_table.clr_mask = 0; + } else { + dev->eq_table.clr_mask = + swab32(1 << (dev->eq_table.inta_pin & 31)); + dev->eq_table.clr_int = dev->clr_base + + (dev->eq_table.inta_pin < 31 ? 4 : 0); + } + + intr = (dev->mthca_flags & MTHCA_FLAG_MSI) ? + 128 : dev->eq_table.inta_pin; + + err = mthca_create_eq(dev, dev->limits.num_cqs, + (dev->mthca_flags & MTHCA_FLAG_MSI_X) ? 128 : intr, + &dev->eq_table.eq[MTHCA_EQ_COMP]); + if (err) + goto err_out_free; + + err = mthca_create_eq(dev, MTHCA_NUM_ASYNC_EQE, + (dev->mthca_flags & MTHCA_FLAG_MSI_X) ? 129 : intr, + &dev->eq_table.eq[MTHCA_EQ_ASYNC]); + if (err) + goto err_out_comp; + + err = mthca_create_eq(dev, MTHCA_NUM_CMD_EQE, + (dev->mthca_flags & MTHCA_FLAG_MSI_X) ? 130 : intr, + &dev->eq_table.eq[MTHCA_EQ_CMD]); + if (err) + goto err_out_async; + + if (dev->mthca_flags & MTHCA_FLAG_MSI_X) { + static const char *eq_name[] = { + [MTHCA_EQ_COMP] = DRV_NAME " (comp)", + [MTHCA_EQ_ASYNC] = DRV_NAME " (async)", + [MTHCA_EQ_CMD] = DRV_NAME " (cmd)" + }; + + for (i = 0; i < MTHCA_NUM_EQ; ++i) { + err = request_irq(dev->eq_table.eq[i].msi_x_vector, + mthca_msi_x_interrupt, 0, + eq_name[i], dev->eq_table.eq + i); + if (err) + goto err_out_cmd; + dev->eq_table.eq[i].have_irq = 1; + } + } else { + err = request_irq(dev->pdev->irq, mthca_interrupt, SA_SHIRQ, + DRV_NAME, dev); + if (err) + goto err_out_cmd; + dev->eq_table.have_irq = 1; + } + + err = mthca_MAP_EQ(dev, async_mask(dev), + 0, dev->eq_table.eq[MTHCA_EQ_ASYNC].eqn, &status); + if (err) + mthca_warn(dev, "MAP_EQ for async EQ %d failed (%d)\n", + dev->eq_table.eq[MTHCA_EQ_ASYNC].eqn, err); + if (status) + mthca_warn(dev, "MAP_EQ for async EQ %d returned status 0x%02x\n", + dev->eq_table.eq[MTHCA_EQ_ASYNC].eqn, status); + + err = 
mthca_MAP_EQ(dev, MTHCA_CMD_EVENT_MASK, + 0, dev->eq_table.eq[MTHCA_EQ_CMD].eqn, &status); + if (err) + mthca_warn(dev, "MAP_EQ for cmd EQ %d failed (%d)\n", + dev->eq_table.eq[MTHCA_EQ_CMD].eqn, err); + if (status) + mthca_warn(dev, "MAP_EQ for cmd EQ %d returned status 0x%02x\n", + dev->eq_table.eq[MTHCA_EQ_CMD].eqn, status); + + return 0; + +err_out_cmd: + mthca_free_irqs(dev); + mthca_free_eq(dev, &dev->eq_table.eq[MTHCA_EQ_CMD]); + +err_out_async: + mthca_free_eq(dev, &dev->eq_table.eq[MTHCA_EQ_ASYNC]); + +err_out_comp: + mthca_free_eq(dev, &dev->eq_table.eq[MTHCA_EQ_COMP]); + +err_out_free: + mthca_alloc_cleanup(&dev->eq_table.alloc); + return err; +} + +void __devexit mthca_cleanup_eq_table(struct mthca_dev *dev) +{ + u8 status; + int i; + + mthca_free_irqs(dev); + + mthca_MAP_EQ(dev, async_mask(dev), + 1, dev->eq_table.eq[MTHCA_EQ_ASYNC].eqn, &status); + mthca_MAP_EQ(dev, MTHCA_CMD_EVENT_MASK, + 1, dev->eq_table.eq[MTHCA_EQ_CMD].eqn, &status); + + for (i = 0; i < MTHCA_NUM_EQ; ++i) + mthca_free_eq(dev, &dev->eq_table.eq[i]); + + mthca_alloc_cleanup(&dev->eq_table.alloc); +} From roland@topspin.com Mon Dec 27 21:49:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:31 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDu025948 for ; Mon, 27 Dec 2004 21:49:48 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:14 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:14 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGH-0000ux-Kh; Mon, 27 Dec 2004 21:51:14 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.KJdDvki21LuSsIo9@topspin.com> X-Mailer: Roland's Patchbomber 
Date: Mon, 27 Dec 2004 21:51:13 -0800 Message-Id: <200412272151.ZZGB77m9dy61uyrb@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][16/24] Add Mellanox HCA low-level driver (MAD) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:14.0626 (UTC) FILETIME=[3DF03A20:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13100 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add MAD (management datagram) code for Mellanox HCA driver. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_mad.c 2004-12-27 21:48:24.331384733 -0800 @@ -0,0 +1,320 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_mad.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +enum { + MTHCA_VENDOR_CLASS1 = 0x9, + MTHCA_VENDOR_CLASS2 = 0xa +}; + +struct mthca_trap_mad { + struct ib_mad *mad; + DECLARE_PCI_UNMAP_ADDR(mapping) +}; + +static void update_sm_ah(struct mthca_dev *dev, + u8 port_num, u16 lid, u8 sl) +{ + struct ib_ah *new_ah; + struct ib_ah_attr ah_attr; + unsigned long flags; + + if (!dev->send_agent[port_num - 1][0]) + return; + + memset(&ah_attr, 0, sizeof ah_attr); + ah_attr.dlid = lid; + ah_attr.sl = sl; + ah_attr.port_num = port_num; + + new_ah = ib_create_ah(dev->send_agent[port_num - 1][0]->qp->pd, + &ah_attr); + if (IS_ERR(new_ah)) + return; + + spin_lock_irqsave(&dev->sm_lock, flags); + if (dev->sm_ah[port_num - 1]) + ib_destroy_ah(dev->sm_ah[port_num - 1]); + dev->sm_ah[port_num - 1] = new_ah; + spin_unlock_irqrestore(&dev->sm_lock, flags); +} + +/* + * Snoop SM MADs for port info and P_Key table sets, so we can + * synthesize LID change and P_Key change events. 
+ */ +static void smp_snoop(struct ib_device *ibdev, + u8 port_num, + struct ib_mad *mad) +{ + struct ib_event event; + + if ((mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED || + mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) && + mad->mad_hdr.method == IB_MGMT_METHOD_SET) { + if (mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO) { + update_sm_ah(to_mdev(ibdev), port_num, + be16_to_cpup((__be16 *) (mad->data + 58)), + (*(u8 *) (mad->data + 76)) & 0xf); + + event.device = ibdev; + event.event = IB_EVENT_LID_CHANGE; + event.element.port_num = port_num; + ib_dispatch_event(&event); + } + + if (mad->mad_hdr.attr_id == IB_SMP_ATTR_PKEY_TABLE) { + event.device = ibdev; + event.event = IB_EVENT_PKEY_CHANGE; + event.element.port_num = port_num; + ib_dispatch_event(&event); + } + } +} + +static void forward_trap(struct mthca_dev *dev, + u8 port_num, + struct ib_mad *mad) +{ + int qpn = mad->mad_hdr.mgmt_class != IB_MGMT_CLASS_SUBN_LID_ROUTED; + struct mthca_trap_mad *tmad; + struct ib_sge gather_list; + struct ib_send_wr *bad_wr, wr = { + .opcode = IB_WR_SEND, + .sg_list = &gather_list, + .num_sge = 1, + .send_flags = IB_SEND_SIGNALED, + .wr = { + .ud = { + .remote_qpn = qpn, + .remote_qkey = qpn ? 
IB_QP1_QKEY : 0, + .timeout_ms = 0 + } + } + }; + struct ib_mad_agent *agent = dev->send_agent[port_num - 1][qpn]; + int ret; + unsigned long flags; + + if (agent) { + tmad = kmalloc(sizeof *tmad, GFP_KERNEL); + if (!tmad) + return; + + tmad->mad = kmalloc(sizeof *tmad->mad, GFP_KERNEL); + if (!tmad->mad) { + kfree(tmad); + return; + } + + memcpy(tmad->mad, mad, sizeof *mad); + + wr.wr.ud.mad_hdr = &tmad->mad->mad_hdr; + wr.wr_id = (unsigned long) tmad; + + gather_list.addr = dma_map_single(agent->device->dma_device, + tmad->mad, + sizeof *tmad->mad, + DMA_TO_DEVICE); + gather_list.length = sizeof *tmad->mad; + gather_list.lkey = to_mpd(agent->qp->pd)->ntmr.ibmr.lkey; + pci_unmap_addr_set(tmad, mapping, gather_list.addr); + + /* + * We rely here on the fact that MLX QPs don't use the + * address handle after the send is posted (this is + * wrong following the IB spec strictly, but we know + * it's OK for our devices). + */ + spin_lock_irqsave(&dev->sm_lock, flags); + wr.wr.ud.ah = dev->sm_ah[port_num - 1]; + if (wr.wr.ud.ah) + ret = ib_post_send_mad(agent, &wr, &bad_wr); + else + ret = -EINVAL; + spin_unlock_irqrestore(&dev->sm_lock, flags); + + if (ret) { + dma_unmap_single(agent->device->dma_device, + pci_unmap_addr(tmad, mapping), + sizeof *tmad->mad, + DMA_TO_DEVICE); + kfree(tmad->mad); + kfree(tmad); + } + } +} + +int mthca_process_mad(struct ib_device *ibdev, + int mad_flags, + u8 port_num, + u16 slid, + struct ib_mad *in_mad, + struct ib_mad *out_mad) +{ + int err; + u8 status; + + /* Forward locally generated traps to the SM */ + if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && + slid == 0) { + forward_trap(to_mdev(ibdev), port_num, in_mad); + return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED; + } + + /* + * Only handle SM gets, sets and trap represses for SM class + * + * Only handle PMA and Mellanox vendor-specific class gets and + * sets for other classes. 
+ */ + if (in_mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED || + in_mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) { + if (in_mad->mad_hdr.method != IB_MGMT_METHOD_GET && + in_mad->mad_hdr.method != IB_MGMT_METHOD_SET && + in_mad->mad_hdr.method != IB_MGMT_METHOD_TRAP_REPRESS) + return IB_MAD_RESULT_SUCCESS; + + /* + * Don't process SMInfo queries or vendor-specific + * MADs -- the SMA can't handle them. + */ + if (in_mad->mad_hdr.attr_id == IB_SMP_ATTR_SM_INFO || + ((in_mad->mad_hdr.attr_id & IB_SMP_ATTR_VENDOR_MASK) == + IB_SMP_ATTR_VENDOR_MASK)) + return IB_MAD_RESULT_SUCCESS; + } else if (in_mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_PERF_MGMT || + in_mad->mad_hdr.mgmt_class == MTHCA_VENDOR_CLASS1 || + in_mad->mad_hdr.mgmt_class == MTHCA_VENDOR_CLASS2) { + if (in_mad->mad_hdr.method != IB_MGMT_METHOD_GET && + in_mad->mad_hdr.method != IB_MGMT_METHOD_SET) + return IB_MAD_RESULT_SUCCESS; + } else + return IB_MAD_RESULT_SUCCESS; + + err = mthca_MAD_IFC(to_mdev(ibdev), + !!(mad_flags & IB_MAD_IGNORE_MKEY), + port_num, in_mad, out_mad, + &status); + if (err) { + mthca_err(to_mdev(ibdev), "MAD_IFC failed\n"); + return IB_MAD_RESULT_FAILURE; + } + if (status == MTHCA_CMD_STAT_BAD_PKT) + return IB_MAD_RESULT_SUCCESS; + if (status) { + mthca_err(to_mdev(ibdev), "MAD_IFC returned status %02x\n", + status); + return IB_MAD_RESULT_FAILURE; + } + + if (!out_mad->mad_hdr.status) + smp_snoop(ibdev, port_num, in_mad); + + /* set return bit in status of directed route responses */ + if (in_mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) + out_mad->mad_hdr.status |= cpu_to_be16(1 << 15); + + if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP_REPRESS) + /* no response for trap repress */ + return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED; + + return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY; +} + +static void send_handler(struct ib_mad_agent *agent, + struct ib_mad_send_wc *mad_send_wc) +{ + struct mthca_trap_mad *tmad = + (void *) 
(unsigned long) mad_send_wc->wr_id; + + dma_unmap_single(agent->device->dma_device, + pci_unmap_addr(tmad, mapping), + sizeof *tmad->mad, + DMA_TO_DEVICE); + kfree(tmad->mad); + kfree(tmad); +} + +int mthca_create_agents(struct mthca_dev *dev) +{ + struct ib_mad_agent *agent; + int p, q; + + spin_lock_init(&dev->sm_lock); + + for (p = 0; p < dev->limits.num_ports; ++p) + for (q = 0; q <= 1; ++q) { + agent = ib_register_mad_agent(&dev->ib_dev, p + 1, + q ? IB_QPT_GSI : IB_QPT_SMI, + NULL, 0, send_handler, + NULL, NULL); + if (IS_ERR(agent)) + goto err; + dev->send_agent[p][q] = agent; + } + + return 0; + +err: + for (p = 0; p < dev->limits.num_ports; ++p) + for (q = 0; q <= 1; ++q) + if (dev->send_agent[p][q]) + ib_unregister_mad_agent(dev->send_agent[p][q]); + + return PTR_ERR(agent); +} + +void mthca_free_agents(struct mthca_dev *dev) +{ + struct ib_mad_agent *agent; + int p, q; + + for (p = 0; p < dev->limits.num_ports; ++p) { + for (q = 0; q <= 1; ++q) { + agent = dev->send_agent[p][q]; + dev->send_agent[p][q] = NULL; + ib_unregister_mad_agent(agent); + } + + if (dev->sm_ah[p]) + ib_destroy_ah(dev->sm_ah[p]); + } +} From roland@topspin.com Mon Dec 27 21:49:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:32 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDo025948 for ; Mon, 27 Dec 2004 21:49:45 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:10 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:09 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGC-0000uW-T9; Mon, 27 Dec 2004 21:51:10 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: 
<200412272151.abmqZdwtcCCwm2QW@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:08 -0800 Message-Id: <200412272151.MaHN86hIK0Tt84ws@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][13/24] Add Mellanox HCA low-level driver (initialization) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:10.0032 (UTC) FILETIME=[3B333D00:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13102 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add device initialization code for Mellanox HCA driver. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_profile.c 2004-12-27 21:48:23.120562962 -0800 @@ -0,0 +1,226 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_profile.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include + +#include "mthca_profile.h" + +static int default_profile[MTHCA_RES_NUM] = { + [MTHCA_RES_QP] = 1 << 16, + [MTHCA_RES_EQP] = 1 << 16, + [MTHCA_RES_CQ] = 1 << 16, + [MTHCA_RES_EQ] = 32, + [MTHCA_RES_RDB] = 1 << 18, + [MTHCA_RES_MCG] = 1 << 13, + [MTHCA_RES_MPT] = 1 << 17, + [MTHCA_RES_MTT] = 1 << 20, + [MTHCA_RES_UDAV] = 1 << 15 +}; + +enum { + MTHCA_RDB_ENTRY_SIZE = 32, + MTHCA_MTT_SEG_SIZE = 64 +}; + +enum { + MTHCA_NUM_PDS = 1 << 15 +}; + +int mthca_make_profile(struct mthca_dev *dev, + struct mthca_dev_lim *dev_lim, + struct mthca_init_hca_param *init_hca) +{ + /* just use default profile for now */ + struct mthca_resource { + u64 size; + u64 start; + int type; + int num; + int log_num; + }; + + u64 total_size = 0; + struct mthca_resource *profile; + struct mthca_resource tmp; + int i, j; + + default_profile[MTHCA_RES_UAR] = dev_lim->uar_size / PAGE_SIZE; + + profile = kmalloc(MTHCA_RES_NUM * sizeof *profile, GFP_KERNEL); + if (!profile) + return -ENOMEM; + + profile[MTHCA_RES_QP].size = dev_lim->qpc_entry_sz; + profile[MTHCA_RES_EEC].size = dev_lim->eec_entry_sz; + profile[MTHCA_RES_SRQ].size = dev_lim->srq_entry_sz; + profile[MTHCA_RES_CQ].size = 
dev_lim->cqc_entry_sz; + profile[MTHCA_RES_EQP].size = dev_lim->eqpc_entry_sz; + profile[MTHCA_RES_EEEC].size = dev_lim->eeec_entry_sz; + profile[MTHCA_RES_EQ].size = dev_lim->eqc_entry_sz; + profile[MTHCA_RES_RDB].size = MTHCA_RDB_ENTRY_SIZE; + profile[MTHCA_RES_MCG].size = MTHCA_MGM_ENTRY_SIZE; + profile[MTHCA_RES_MPT].size = MTHCA_MPT_ENTRY_SIZE; + profile[MTHCA_RES_MTT].size = MTHCA_MTT_SEG_SIZE; + profile[MTHCA_RES_UAR].size = dev_lim->uar_scratch_entry_sz; + profile[MTHCA_RES_UDAV].size = MTHCA_AV_SIZE; + + for (i = 0; i < MTHCA_RES_NUM; ++i) { + profile[i].type = i; + profile[i].num = default_profile[i]; + profile[i].log_num = max(ffs(default_profile[i]) - 1, 0); + profile[i].size *= default_profile[i]; + } + + /* + * Sort the resources in decreasing order of size. Since they + * all have sizes that are powers of 2, we'll be able to keep + * resources aligned to their size and pack them without gaps + * using the sorted order. + */ + for (i = MTHCA_RES_NUM; i > 0; --i) + for (j = 1; j < i; ++j) { + if (profile[j].size > profile[j - 1].size) { + tmp = profile[j]; + profile[j] = profile[j - 1]; + profile[j - 1] = tmp; + } + } + + for (i = 0; i < MTHCA_RES_NUM; ++i) { + if (profile[i].size) { + profile[i].start = dev->ddr_start + total_size; + total_size += profile[i].size; + } + if (total_size > dev->fw.tavor.fw_start - dev->ddr_start) { + mthca_err(dev, "Profile requires 0x%llx bytes; " + "won't fit between DDR start at 0x%016llx " + "and FW start at 0x%016llx.\n", + (unsigned long long) total_size, + (unsigned long long) dev->ddr_start, + (unsigned long long) dev->fw.tavor.fw_start); + kfree(profile); + return -ENOMEM; + } + + if (profile[i].size) + mthca_dbg(dev, "profile[%2d]--%2d/%2d @ 0x%16llx " + "(size 0x%8llx)\n", + i, profile[i].type, profile[i].log_num, + (unsigned long long) profile[i].start, + (unsigned long long) profile[i].size); + } + + mthca_dbg(dev, "HCA memory: allocated %d KB/%d KB (%d KB free)\n", + (int) (total_size >> 10), + (int) 
((dev->fw.tavor.fw_start - dev->ddr_start) >> 10), + (int) ((dev->fw.tavor.fw_start - dev->ddr_start - total_size) >> 10)); + + for (i = 0; i < MTHCA_RES_NUM; ++i) { + switch (profile[i].type) { + case MTHCA_RES_QP: + dev->limits.num_qps = profile[i].num; + init_hca->qpc_base = profile[i].start; + init_hca->log_num_qps = profile[i].log_num; + break; + case MTHCA_RES_EEC: + dev->limits.num_eecs = profile[i].num; + init_hca->eec_base = profile[i].start; + init_hca->log_num_eecs = profile[i].log_num; + break; + case MTHCA_RES_SRQ: + dev->limits.num_srqs = profile[i].num; + init_hca->srqc_base = profile[i].start; + init_hca->log_num_srqs = profile[i].log_num; + break; + case MTHCA_RES_CQ: + dev->limits.num_cqs = profile[i].num; + init_hca->cqc_base = profile[i].start; + init_hca->log_num_cqs = profile[i].log_num; + break; + case MTHCA_RES_EQP: + init_hca->eqpc_base = profile[i].start; + break; + case MTHCA_RES_EEEC: + init_hca->eeec_base = profile[i].start; + break; + case MTHCA_RES_EQ: + dev->limits.num_eqs = profile[i].num; + init_hca->eqc_base = profile[i].start; + init_hca->log_num_eqs = profile[i].log_num; + break; + case MTHCA_RES_RDB: + dev->limits.num_rdbs = profile[i].num; + init_hca->rdb_base = profile[i].start; + break; + case MTHCA_RES_MCG: + dev->limits.num_mgms = profile[i].num >> 1; + dev->limits.num_amgms = profile[i].num >> 1; + init_hca->mc_base = profile[i].start; + init_hca->log_mc_entry_sz = ffs(MTHCA_MGM_ENTRY_SIZE) - 1; + init_hca->log_mc_table_sz = profile[i].log_num; + init_hca->mc_hash_sz = 1 << (profile[i].log_num - 1); + break; + case MTHCA_RES_MPT: + dev->limits.num_mpts = profile[i].num; + init_hca->mpt_base = profile[i].start; + init_hca->log_mpt_sz = profile[i].log_num; + break; + case MTHCA_RES_MTT: + dev->limits.num_mtt_segs = profile[i].num; + dev->limits.mtt_seg_size = MTHCA_MTT_SEG_SIZE; + dev->mr_table.mtt_base = profile[i].start; + init_hca->mtt_base = profile[i].start; + init_hca->mtt_seg_sz = ffs(MTHCA_MTT_SEG_SIZE) - 7; + 
break; + case MTHCA_RES_UAR: + init_hca->uar_scratch_base = profile[i].start; + break; + case MTHCA_RES_UDAV: + dev->av_table.ddr_av_base = profile[i].start; + dev->av_table.num_ddr_avs = profile[i].num; + default: + break; + } + } + + /* + * PDs don't take any HCA memory, but we assign them as part + * of the HCA profile anyway. + */ + dev->limits.num_pds = MTHCA_NUM_PDS; + + kfree(profile); + return 0; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_profile.h 2004-12-27 21:48:23.154557958 -0800 @@ -0,0 +1,62 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_profile.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef MTHCA_PROFILE_H +#define MTHCA_PROFILE_H + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +enum { + MTHCA_RES_QP, + MTHCA_RES_EEC, + MTHCA_RES_SRQ, + MTHCA_RES_CQ, + MTHCA_RES_EQP, + MTHCA_RES_EEEC, + MTHCA_RES_EQ, + MTHCA_RES_RDB, + MTHCA_RES_MCG, + MTHCA_RES_MPT, + MTHCA_RES_MTT, + MTHCA_RES_UAR, + MTHCA_RES_UDAV, + MTHCA_RES_NUM +}; + +int mthca_make_profile(struct mthca_dev *mdev, + struct mthca_dev_lim *dev_lim, + struct mthca_init_hca_param *init_hca); + +#endif /* MTHCA_PROFILE_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_reset.c 2004-12-27 21:48:23.199551335 -0800 @@ -0,0 +1,232 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_reset.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include +#include +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +int mthca_reset(struct mthca_dev *mdev) +{ + int i; + int err = 0; + u32 *hca_header = NULL; + u32 *bridge_header = NULL; + struct pci_dev *bridge = NULL; + +#define MTHCA_RESET_OFFSET 0xf0010 +#define MTHCA_RESET_VALUE cpu_to_be32(1) + + /* + * Reset the chip. This is somewhat ugly because we have to + * save off the PCI header before reset and then restore it + * after the chip reboots. We skip config space offsets 22 + * and 23 since those have a special meaning. + * + * To make matters worse, for Tavor (PCI-X HCA) we have to + * find the associated bridge device and save off its PCI + * header as well. + */ + + if (mdev->hca_type == TAVOR) { + /* Look for the bridge -- its device ID will be 2 more + than HCA's device ID. */ + while ((bridge = pci_get_device(mdev->pdev->vendor, + mdev->pdev->device + 2, + bridge)) != NULL) { + if (bridge->hdr_type == PCI_HEADER_TYPE_BRIDGE && + bridge->subordinate == mdev->pdev->bus) { + mthca_dbg(mdev, "Found bridge: %s (%s)\n", + pci_pretty_name(bridge), pci_name(bridge)); + break; + } + } + + if (!bridge) { + /* + * Didn't find a bridge for a Tavor device -- + * assume we're in no-bridge mode and hope for + * the best. + */ + mthca_warn(mdev, "No bridge found for %s (%s)\n", + pci_pretty_name(mdev->pdev), pci_name(mdev->pdev)); + } + + } + + /* For Arbel do we need to save off the full 4K PCI Express header?? 
*/ + hca_header = kmalloc(256, GFP_KERNEL); + if (!hca_header) { + err = -ENOMEM; + mthca_err(mdev, "Couldn't allocate memory to save HCA " + "PCI header, aborting.\n"); + goto out; + } + + for (i = 0; i < 64; ++i) { + if (i == 22 || i == 23) + continue; + if (pci_read_config_dword(mdev->pdev, i * 4, hca_header + i)) { + err = -ENODEV; + mthca_err(mdev, "Couldn't save HCA " + "PCI header, aborting.\n"); + goto out; + } + } + + if (bridge) { + bridge_header = kmalloc(256, GFP_KERNEL); + if (!bridge_header) { + err = -ENOMEM; + mthca_err(mdev, "Couldn't allocate memory to save HCA " + "bridge PCI header, aborting.\n"); + goto out; + } + + for (i = 0; i < 64; ++i) { + if (i == 22 || i == 23) + continue; + if (pci_read_config_dword(bridge, i * 4, bridge_header + i)) { + err = -ENODEV; + mthca_err(mdev, "Couldn't save HCA bridge " + "PCI header, aborting.\n"); + goto out; + } + } + } + + /* actually hit reset */ + { + void __iomem *reset = ioremap(pci_resource_start(mdev->pdev, 0) + + MTHCA_RESET_OFFSET, 4); + + if (!reset) { + err = -ENOMEM; + mthca_err(mdev, "Couldn't map HCA reset register, " + "aborting.\n"); + goto out; + } + + writel(MTHCA_RESET_VALUE, reset); + iounmap(reset); + } + + /* Docs say to wait one second before accessing device */ + msleep(1000); + + /* Now wait for PCI device to start responding again */ + { + u32 v; + int c = 0; + + for (c = 0; c < 100; ++c) { + if (pci_read_config_dword(bridge ? bridge : mdev->pdev, 0, &v)) { + err = -ENODEV; + mthca_err(mdev, "Couldn't access HCA after reset, " + "aborting.\n"); + goto out; + } + + if (v != 0xffffffff) + goto good; + + msleep(100); + } + + err = -ENODEV; + mthca_err(mdev, "PCI device did not come back after reset, " + "aborting.\n"); + goto out; + } + +good: + /* Now restore the PCI headers */ + if (bridge) { + /* + * Bridge control register is at 0x3e, so we'll + * naturally restore it last in this loop. 
+ */ + for (i = 0; i < 16; ++i) { + if (i * 4 == PCI_COMMAND) + continue; + + if (pci_write_config_dword(bridge, i * 4, bridge_header[i])) { + err = -ENODEV; + mthca_err(mdev, "Couldn't restore HCA bridge reg %x, " + "aborting.\n", i); + goto out; + } + } + + if (pci_write_config_dword(bridge, PCI_COMMAND, + bridge_header[PCI_COMMAND / 4])) { + err = -ENODEV; + mthca_err(mdev, "Couldn't restore HCA bridge COMMAND, " + "aborting.\n"); + goto out; + } + } + + for (i = 0; i < 16; ++i) { + if (i * 4 == PCI_COMMAND) + continue; + + if (pci_write_config_dword(mdev->pdev, i * 4, hca_header[i])) { + err = -ENODEV; + mthca_err(mdev, "Couldn't restore HCA reg %x, " + "aborting.\n", i); + goto out; + } + } + + if (pci_write_config_dword(mdev->pdev, PCI_COMMAND, + hca_header[PCI_COMMAND / 4])) { + err = -ENODEV; + mthca_err(mdev, "Couldn't restore HCA COMMAND, " + "aborting.\n"); + goto out; + } + +out: + if (bridge) + pci_dev_put(bridge); + kfree(bridge_header); + kfree(hca_header); + + return err; +} From roland@topspin.com Mon Dec 27 21:49:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:43 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDe025948 for ; Mon, 27 Dec 2004 21:49:42 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:02 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:02 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG5-0000tZ-Kz; Mon, 27 Dec 2004 21:51:02 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.7VvrtyAsRwgOK3mP@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:01 -0800 Message-Id: <200412272151.h64HpFQTg9SpyhAM@topspin.com> 
Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][8/24] Add InfiniBand SA (Subnet Administration) query support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:02.0564 (UTC) FILETIME=[36BFB640:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13106 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add support for sending queries to the SA (Subnet Administration). In particular the PathRecord and MCMember (multicast group member) used by the IP-over-InfiniBand driver are implemented. Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/infiniband/core/Makefile 2004-12-27 21:48:19.838046137 -0800 +++ linux-bk/drivers/infiniband/core/Makefile 2004-12-27 21:48:20.847897490 -0800 @@ -1,8 +1,10 @@ EXTRA_CFLAGS += -Idrivers/infiniband/include -obj-$(CONFIG_INFINIBAND) += ib_core.o ib_mad.o +obj-$(CONFIG_INFINIBAND) += ib_core.o ib_mad.o ib_sa.o ib_core-y := packer.o ud_header.o verbs.o sysfs.o \ device.o fmr_pool.o cache.o ib_mad-y := mad.o smi.o agent.o + +ib_sa-y := sa_query.o --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/sa_query.c 2004-12-27 21:48:20.896890279 -0800 @@ -0,0 +1,866 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: sa_query.c 1389 2004-12-27 22:56:47Z roland $ + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("InfiniBand subnet administration query support"); +MODULE_LICENSE("Dual BSD/GPL"); + +/* + * These two structures must be packed because they have 64-bit fields + * that are only 32-bit aligned. 64-bit architectures will lay them + * out wrong otherwise. 
(And unfortunately they are sent on the wire + * so we can't change the layout) + */ +struct ib_sa_hdr { + u64 sm_key; + u16 attr_offset; + u16 reserved; + ib_sa_comp_mask comp_mask; +} __attribute__ ((packed)); + +struct ib_sa_mad { + struct ib_mad_hdr mad_hdr; + struct ib_rmpp_hdr rmpp_hdr; + struct ib_sa_hdr sa_hdr; + u8 data[200]; +} __attribute__ ((packed)); + +struct ib_sa_sm_ah { + struct ib_ah *ah; + struct kref ref; +}; + +struct ib_sa_port { + struct ib_mad_agent *agent; + struct ib_mr *mr; + struct ib_sa_sm_ah *sm_ah; + struct work_struct update_task; + spinlock_t ah_lock; + u8 port_num; +}; + +struct ib_sa_device { + int start_port, end_port; + struct ib_event_handler event_handler; + struct ib_sa_port port[0]; +}; + +struct ib_sa_query { + void (*callback)(struct ib_sa_query *, int, struct ib_sa_mad *); + void (*release)(struct ib_sa_query *); + struct ib_sa_port *port; + struct ib_sa_mad *mad; + struct ib_sa_sm_ah *sm_ah; + DECLARE_PCI_UNMAP_ADDR(mapping) + int id; +}; + +struct ib_sa_path_query { + void (*callback)(int, struct ib_sa_path_rec *, void *); + void *context; + struct ib_sa_query sa_query; +}; + +struct ib_sa_mcmember_query { + void (*callback)(int, struct ib_sa_mcmember_rec *, void *); + void *context; + struct ib_sa_query sa_query; +}; + +static void ib_sa_add_one(struct ib_device *device); +static void ib_sa_remove_one(struct ib_device *device); + +static struct ib_client sa_client = { + .name = "sa", + .add = ib_sa_add_one, + .remove = ib_sa_remove_one +}; + +static spinlock_t idr_lock; +static DEFINE_IDR(query_idr); + +static spinlock_t tid_lock; +static u32 tid; + +enum { + IB_SA_ATTR_CLASS_PORTINFO = 0x01, + IB_SA_ATTR_NOTICE = 0x02, + IB_SA_ATTR_INFORM_INFO = 0x03, + IB_SA_ATTR_NODE_REC = 0x11, + IB_SA_ATTR_PORT_INFO_REC = 0x12, + IB_SA_ATTR_SL2VL_REC = 0x13, + IB_SA_ATTR_SWITCH_REC = 0x14, + IB_SA_ATTR_LINEAR_FDB_REC = 0x15, + IB_SA_ATTR_RANDOM_FDB_REC = 0x16, + IB_SA_ATTR_MCAST_FDB_REC = 0x17, + IB_SA_ATTR_SM_INFO_REC = 0x18, + 
IB_SA_ATTR_LINK_REC = 0x20, + IB_SA_ATTR_GUID_INFO_REC = 0x30, + IB_SA_ATTR_SERVICE_REC = 0x31, + IB_SA_ATTR_PARTITION_REC = 0x33, + IB_SA_ATTR_RANGE_REC = 0x34, + IB_SA_ATTR_PATH_REC = 0x35, + IB_SA_ATTR_VL_ARB_REC = 0x36, + IB_SA_ATTR_MC_GROUP_REC = 0x37, + IB_SA_ATTR_MC_MEMBER_REC = 0x38, + IB_SA_ATTR_TRACE_REC = 0x39, + IB_SA_ATTR_MULTI_PATH_REC = 0x3a, + IB_SA_ATTR_SERVICE_ASSOC_REC = 0x3b +}; + +#define PATH_REC_FIELD(field) \ + .struct_offset_bytes = offsetof(struct ib_sa_path_rec, field), \ + .struct_size_bytes = sizeof ((struct ib_sa_path_rec *) 0)->field, \ + .field_name = "sa_path_rec:" #field + +static const struct ib_field path_rec_table[] = { + { RESERVED, + .offset_words = 0, + .offset_bits = 0, + .size_bits = 32 }, + { RESERVED, + .offset_words = 1, + .offset_bits = 0, + .size_bits = 32 }, + { PATH_REC_FIELD(dgid), + .offset_words = 2, + .offset_bits = 0, + .size_bits = 128 }, + { PATH_REC_FIELD(sgid), + .offset_words = 6, + .offset_bits = 0, + .size_bits = 128 }, + { PATH_REC_FIELD(dlid), + .offset_words = 10, + .offset_bits = 0, + .size_bits = 16 }, + { PATH_REC_FIELD(slid), + .offset_words = 10, + .offset_bits = 16, + .size_bits = 16 }, + { PATH_REC_FIELD(raw_traffic), + .offset_words = 11, + .offset_bits = 0, + .size_bits = 1 }, + { RESERVED, + .offset_words = 11, + .offset_bits = 1, + .size_bits = 3 }, + { PATH_REC_FIELD(flow_label), + .offset_words = 11, + .offset_bits = 4, + .size_bits = 20 }, + { PATH_REC_FIELD(hop_limit), + .offset_words = 11, + .offset_bits = 24, + .size_bits = 8 }, + { PATH_REC_FIELD(traffic_class), + .offset_words = 12, + .offset_bits = 0, + .size_bits = 8 }, + { PATH_REC_FIELD(reversible), + .offset_words = 12, + .offset_bits = 8, + .size_bits = 1 }, + { PATH_REC_FIELD(numb_path), + .offset_words = 12, + .offset_bits = 9, + .size_bits = 7 }, + { PATH_REC_FIELD(pkey), + .offset_words = 12, + .offset_bits = 16, + .size_bits = 16 }, + { RESERVED, + .offset_words = 13, + .offset_bits = 0, + .size_bits = 12 }, + { 
PATH_REC_FIELD(sl), + .offset_words = 13, + .offset_bits = 12, + .size_bits = 4 }, + { PATH_REC_FIELD(mtu_selector), + .offset_words = 13, + .offset_bits = 16, + .size_bits = 2 }, + { PATH_REC_FIELD(mtu), + .offset_words = 13, + .offset_bits = 18, + .size_bits = 6 }, + { PATH_REC_FIELD(rate_selector), + .offset_words = 13, + .offset_bits = 24, + .size_bits = 2 }, + { PATH_REC_FIELD(rate), + .offset_words = 13, + .offset_bits = 26, + .size_bits = 6 }, + { PATH_REC_FIELD(packet_life_time_selector), + .offset_words = 14, + .offset_bits = 0, + .size_bits = 2 }, + { PATH_REC_FIELD(packet_life_time), + .offset_words = 14, + .offset_bits = 2, + .size_bits = 6 }, + { PATH_REC_FIELD(preference), + .offset_words = 14, + .offset_bits = 8, + .size_bits = 8 }, + { RESERVED, + .offset_words = 14, + .offset_bits = 16, + .size_bits = 48 }, +}; + +#define MCMEMBER_REC_FIELD(field) \ + .struct_offset_bytes = offsetof(struct ib_sa_mcmember_rec, field), \ + .struct_size_bytes = sizeof ((struct ib_sa_mcmember_rec *) 0)->field, \ + .field_name = "sa_mcmember_rec:" #field + +static const struct ib_field mcmember_rec_table[] = { + { MCMEMBER_REC_FIELD(mgid), + .offset_words = 0, + .offset_bits = 0, + .size_bits = 128 }, + { MCMEMBER_REC_FIELD(port_gid), + .offset_words = 4, + .offset_bits = 0, + .size_bits = 128 }, + { MCMEMBER_REC_FIELD(qkey), + .offset_words = 8, + .offset_bits = 0, + .size_bits = 32 }, + { MCMEMBER_REC_FIELD(mlid), + .offset_words = 9, + .offset_bits = 0, + .size_bits = 16 }, + { MCMEMBER_REC_FIELD(mtu_selector), + .offset_words = 9, + .offset_bits = 16, + .size_bits = 2 }, + { MCMEMBER_REC_FIELD(mtu), + .offset_words = 9, + .offset_bits = 18, + .size_bits = 6 }, + { MCMEMBER_REC_FIELD(traffic_class), + .offset_words = 9, + .offset_bits = 24, + .size_bits = 8 }, + { MCMEMBER_REC_FIELD(pkey), + .offset_words = 10, + .offset_bits = 0, + .size_bits = 16 }, + { MCMEMBER_REC_FIELD(rate_selector), + .offset_words = 10, + .offset_bits = 16, + .size_bits = 2 }, + { 
MCMEMBER_REC_FIELD(rate), + .offset_words = 10, + .offset_bits = 18, + .size_bits = 6 }, + { MCMEMBER_REC_FIELD(packet_life_time_selector), + .offset_words = 10, + .offset_bits = 24, + .size_bits = 2 }, + { MCMEMBER_REC_FIELD(packet_life_time), + .offset_words = 10, + .offset_bits = 26, + .size_bits = 6 }, + { MCMEMBER_REC_FIELD(sl), + .offset_words = 11, + .offset_bits = 0, + .size_bits = 4 }, + { MCMEMBER_REC_FIELD(flow_label), + .offset_words = 11, + .offset_bits = 4, + .size_bits = 20 }, + { MCMEMBER_REC_FIELD(hop_limit), + .offset_words = 11, + .offset_bits = 24, + .size_bits = 8 }, + { MCMEMBER_REC_FIELD(scope), + .offset_words = 12, + .offset_bits = 0, + .size_bits = 4 }, + { MCMEMBER_REC_FIELD(join_state), + .offset_words = 12, + .offset_bits = 4, + .size_bits = 4 }, + { MCMEMBER_REC_FIELD(proxy_join), + .offset_words = 12, + .offset_bits = 8, + .size_bits = 1 }, + { RESERVED, + .offset_words = 12, + .offset_bits = 9, + .size_bits = 23 }, +}; + +static void free_sm_ah(struct kref *kref) +{ + struct ib_sa_sm_ah *sm_ah = container_of(kref, struct ib_sa_sm_ah, ref); + + ib_destroy_ah(sm_ah->ah); + kfree(sm_ah); +} + +static void update_sm_ah(void *port_ptr) +{ + struct ib_sa_port *port = port_ptr; + struct ib_sa_sm_ah *new_ah, *old_ah; + struct ib_port_attr port_attr; + struct ib_ah_attr ah_attr; + + if (ib_query_port(port->agent->device, port->port_num, &port_attr)) { + printk(KERN_WARNING "Couldn't query port\n"); + return; + } + + new_ah = kmalloc(sizeof *new_ah, GFP_KERNEL); + if (!new_ah) { + printk(KERN_WARNING "Couldn't allocate new SM AH\n"); + return; + } + + kref_init(&new_ah->ref); + + memset(&ah_attr, 0, sizeof ah_attr); + ah_attr.dlid = port_attr.sm_lid; + ah_attr.sl = port_attr.sm_sl; + ah_attr.port_num = port->port_num; + + new_ah->ah = ib_create_ah(port->agent->qp->pd, &ah_attr); + if (IS_ERR(new_ah->ah)) { + printk(KERN_WARNING "Couldn't create new SM AH\n"); + kfree(new_ah); + return; + } + + spin_lock_irq(&port->ah_lock); + old_ah = 
port->sm_ah; + port->sm_ah = new_ah; + spin_unlock_irq(&port->ah_lock); + + if (old_ah) + kref_put(&old_ah->ref, free_sm_ah); +} + +static void ib_sa_event(struct ib_event_handler *handler, struct ib_event *event) +{ + if (event->event == IB_EVENT_PORT_ERR || + event->event == IB_EVENT_PORT_ACTIVE || + event->event == IB_EVENT_LID_CHANGE || + event->event == IB_EVENT_PKEY_CHANGE || + event->event == IB_EVENT_SM_CHANGE) { + struct ib_sa_device *sa_dev = + ib_get_client_data(event->device, &sa_client); + + schedule_work(&sa_dev->port[event->element.port_num - + sa_dev->start_port].update_task); + } +} + +/** + * ib_sa_cancel_query - try to cancel an SA query + * @id:ID of query to cancel + * @query:query pointer to cancel + * + * Try to cancel an SA query. If the id and query don't match up or + * the query has already completed, nothing is done. Otherwise the + * query is canceled and will complete with a status of -EINTR. + */ +void ib_sa_cancel_query(int id, struct ib_sa_query *query) +{ + unsigned long flags; + struct ib_mad_agent *agent; + + spin_lock_irqsave(&idr_lock, flags); + if (idr_find(&query_idr, id) != query) { + spin_unlock_irqrestore(&idr_lock, flags); + return; + } + agent = query->port->agent; + spin_unlock_irqrestore(&idr_lock, flags); + + ib_cancel_mad(agent, id); +} +EXPORT_SYMBOL(ib_sa_cancel_query); + +static void init_mad(struct ib_sa_mad *mad, struct ib_mad_agent *agent) +{ + unsigned long flags; + + memset(mad, 0, sizeof *mad); + + mad->mad_hdr.base_version = IB_MGMT_BASE_VERSION; + mad->mad_hdr.mgmt_class = IB_MGMT_CLASS_SUBN_ADM; + mad->mad_hdr.class_version = IB_SA_CLASS_VERSION; + + spin_lock_irqsave(&tid_lock, flags); + mad->mad_hdr.tid = + cpu_to_be64(((u64) agent->hi_tid) << 32 | tid++); + spin_unlock_irqrestore(&tid_lock, flags); +} + +static int send_mad(struct ib_sa_query *query, int timeout_ms) +{ + struct ib_sa_port *port = query->port; + unsigned long flags; + int ret; + struct ib_sge gather_list; + struct ib_send_wr *bad_wr, wr 
= { + .opcode = IB_WR_SEND, + .sg_list = &gather_list, + .num_sge = 1, + .send_flags = IB_SEND_SIGNALED, + .wr = { + .ud = { + .mad_hdr = &query->mad->mad_hdr, + .remote_qpn = 1, + .remote_qkey = IB_QP1_QKEY, + .timeout_ms = timeout_ms + } + } + }; + +retry: + if (!idr_pre_get(&query_idr, GFP_ATOMIC)) + return -ENOMEM; + spin_lock_irqsave(&idr_lock, flags); + ret = idr_get_new(&query_idr, query, &query->id); + spin_unlock_irqrestore(&idr_lock, flags); + if (ret == -EAGAIN) + goto retry; + if (ret) + return ret; + + wr.wr_id = query->id; + + spin_lock_irqsave(&port->ah_lock, flags); + kref_get(&port->sm_ah->ref); + query->sm_ah = port->sm_ah; + wr.wr.ud.ah = port->sm_ah->ah; + spin_unlock_irqrestore(&port->ah_lock, flags); + + gather_list.addr = dma_map_single(port->agent->device->dma_device, + query->mad, + sizeof (struct ib_sa_mad), + DMA_TO_DEVICE); + gather_list.length = sizeof (struct ib_sa_mad); + gather_list.lkey = port->mr->lkey; + pci_unmap_addr_set(query, mapping, gather_list.addr); + + ret = ib_post_send_mad(port->agent, &wr, &bad_wr); + if (ret) { + dma_unmap_single(port->agent->device->dma_device, + pci_unmap_addr(query, mapping), + sizeof (struct ib_sa_mad), + DMA_TO_DEVICE); + kref_put(&query->sm_ah->ref, free_sm_ah); + spin_lock_irqsave(&idr_lock, flags); + idr_remove(&query_idr, query->id); + spin_unlock_irqrestore(&idr_lock, flags); + } + + return ret; +} + +static void ib_sa_path_rec_callback(struct ib_sa_query *sa_query, + int status, + struct ib_sa_mad *mad) +{ + struct ib_sa_path_query *query = + container_of(sa_query, struct ib_sa_path_query, sa_query); + + if (mad) { + struct ib_sa_path_rec rec; + + ib_unpack(path_rec_table, ARRAY_SIZE(path_rec_table), + mad->data, &rec); + query->callback(status, &rec, query->context); + } else + query->callback(status, NULL, query->context); +} + +static void ib_sa_path_rec_release(struct ib_sa_query *sa_query) +{ + kfree(sa_query->mad); + kfree(container_of(sa_query, struct ib_sa_path_query, sa_query)); +} 
+ +/** + * ib_sa_path_rec_get - Start a Path get query + * @device:device to send query on + * @port_num: port number to send query on + * @rec:Path Record to send in query + * @comp_mask:component mask to send in query + * @timeout_ms:time to wait for response + * @gfp_mask:GFP mask to use for internal allocations + * @callback:function called when query completes, times out or is + * canceled + * @context:opaque user context passed to callback + * @sa_query:query context, used to cancel query + * + * Send a Path Record Get query to the SA to look up a path. The + * callback function will be called when the query completes (or + * fails); status is 0 for a successful response, -EINTR if the query + * is canceled, -ETIMEDOUT if the query timed out, or -EIO if an error + * occurred sending the query. The resp parameter of the callback is + * only valid if status is 0. + * + * If the return value of ib_sa_path_rec_get() is negative, it is an + * error code. Otherwise it is a query ID that can be used to cancel + * the query. 
+ */ +int ib_sa_path_rec_get(struct ib_device *device, u8 port_num, + struct ib_sa_path_rec *rec, + ib_sa_comp_mask comp_mask, + int timeout_ms, int gfp_mask, + void (*callback)(int status, + struct ib_sa_path_rec *resp, + void *context), + void *context, + struct ib_sa_query **sa_query) +{ + struct ib_sa_path_query *query; + struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client); + struct ib_sa_port *port = &sa_dev->port[port_num - sa_dev->start_port]; + struct ib_mad_agent *agent = port->agent; + int ret; + + query = kmalloc(sizeof *query, gfp_mask); + if (!query) + return -ENOMEM; + query->sa_query.mad = kmalloc(sizeof *query->sa_query.mad, gfp_mask); + if (!query->sa_query.mad) { + kfree(query); + return -ENOMEM; + } + + query->callback = callback; + query->context = context; + + init_mad(query->sa_query.mad, agent); + + query->sa_query.callback = ib_sa_path_rec_callback; + query->sa_query.release = ib_sa_path_rec_release; + query->sa_query.port = port; + query->sa_query.mad->mad_hdr.method = IB_MGMT_METHOD_GET; + query->sa_query.mad->mad_hdr.attr_id = cpu_to_be16(IB_SA_ATTR_PATH_REC); + query->sa_query.mad->sa_hdr.comp_mask = comp_mask; + + ib_pack(path_rec_table, ARRAY_SIZE(path_rec_table), + rec, query->sa_query.mad->data); + + *sa_query = &query->sa_query; + ret = send_mad(&query->sa_query, timeout_ms); + if (ret) { + *sa_query = NULL; + kfree(query->sa_query.mad); + kfree(query); + } + + return ret ? 
ret : query->sa_query.id; +} +EXPORT_SYMBOL(ib_sa_path_rec_get); + +static void ib_sa_mcmember_rec_callback(struct ib_sa_query *sa_query, + int status, + struct ib_sa_mad *mad) +{ + struct ib_sa_mcmember_query *query = + container_of(sa_query, struct ib_sa_mcmember_query, sa_query); + + if (mad) { + struct ib_sa_mcmember_rec rec; + + ib_unpack(mcmember_rec_table, ARRAY_SIZE(mcmember_rec_table), + mad->data, &rec); + query->callback(status, &rec, query->context); + } else + query->callback(status, NULL, query->context); +} + +static void ib_sa_mcmember_rec_release(struct ib_sa_query *sa_query) +{ + kfree(sa_query->mad); + kfree(container_of(sa_query, struct ib_sa_mcmember_query, sa_query)); +} + +int ib_sa_mcmember_rec_query(struct ib_device *device, u8 port_num, + u8 method, + struct ib_sa_mcmember_rec *rec, + ib_sa_comp_mask comp_mask, + int timeout_ms, int gfp_mask, + void (*callback)(int status, + struct ib_sa_mcmember_rec *resp, + void *context), + void *context, + struct ib_sa_query **sa_query) +{ + struct ib_sa_mcmember_query *query; + struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client); + struct ib_sa_port *port = &sa_dev->port[port_num - sa_dev->start_port]; + struct ib_mad_agent *agent = port->agent; + int ret; + + query = kmalloc(sizeof *query, gfp_mask); + if (!query) + return -ENOMEM; + query->sa_query.mad = kmalloc(sizeof *query->sa_query.mad, gfp_mask); + if (!query->sa_query.mad) { + kfree(query); + return -ENOMEM; + } + + query->callback = callback; + query->context = context; + + init_mad(query->sa_query.mad, agent); + + query->sa_query.callback = ib_sa_mcmember_rec_callback; + query->sa_query.release = ib_sa_mcmember_rec_release; + query->sa_query.port = port; + query->sa_query.mad->mad_hdr.method = method; + query->sa_query.mad->mad_hdr.attr_id = cpu_to_be16(IB_SA_ATTR_MC_MEMBER_REC); + query->sa_query.mad->sa_hdr.comp_mask = comp_mask; + + ib_pack(mcmember_rec_table, ARRAY_SIZE(mcmember_rec_table), + rec, 
query->sa_query.mad->data); + + *sa_query = &query->sa_query; + ret = send_mad(&query->sa_query, timeout_ms); + if (ret) { + *sa_query = NULL; + kfree(query->sa_query.mad); + kfree(query); + } + + return ret ? ret : query->sa_query.id; +} +EXPORT_SYMBOL(ib_sa_mcmember_rec_query); + +static void send_handler(struct ib_mad_agent *agent, + struct ib_mad_send_wc *mad_send_wc) +{ + struct ib_sa_query *query; + unsigned long flags; + + spin_lock_irqsave(&idr_lock, flags); + query = idr_find(&query_idr, mad_send_wc->wr_id); + spin_unlock_irqrestore(&idr_lock, flags); + + if (!query) + return; + + switch (mad_send_wc->status) { + case IB_WC_SUCCESS: + /* No callback -- already got recv */ + break; + case IB_WC_RESP_TIMEOUT_ERR: + query->callback(query, -ETIMEDOUT, NULL); + break; + case IB_WC_WR_FLUSH_ERR: + query->callback(query, -EINTR, NULL); + break; + default: + query->callback(query, -EIO, NULL); + break; + } + + dma_unmap_single(agent->device->dma_device, + pci_unmap_addr(query, mapping), + sizeof (struct ib_sa_mad), + DMA_TO_DEVICE); + kref_put(&query->sm_ah->ref, free_sm_ah); + + query->release(query); + + spin_lock_irqsave(&idr_lock, flags); + idr_remove(&query_idr, mad_send_wc->wr_id); + spin_unlock_irqrestore(&idr_lock, flags); +} + +static void recv_handler(struct ib_mad_agent *mad_agent, + struct ib_mad_recv_wc *mad_recv_wc) +{ + struct ib_sa_query *query; + unsigned long flags; + + spin_lock_irqsave(&idr_lock, flags); + query = idr_find(&query_idr, mad_recv_wc->wc->wr_id); + spin_unlock_irqrestore(&idr_lock, flags); + + if (query) { + if (mad_recv_wc->wc->status == IB_WC_SUCCESS) + query->callback(query, + mad_recv_wc->recv_buf.mad->mad_hdr.status ? 
+ -EINVAL : 0, + (struct ib_sa_mad *) mad_recv_wc->recv_buf.mad); + else + query->callback(query, -EIO, NULL); + } + + ib_free_recv_mad(mad_recv_wc); +} + +static void ib_sa_add_one(struct ib_device *device) +{ + struct ib_sa_device *sa_dev; + int s, e, i; + + if (device->node_type == IB_NODE_SWITCH) + s = e = 0; + else { + s = 1; + e = device->phys_port_cnt; + } + + sa_dev = kmalloc(sizeof *sa_dev + + (e - s + 1) * sizeof (struct ib_sa_port), + GFP_KERNEL); + if (!sa_dev) + return; + + sa_dev->start_port = s; + sa_dev->end_port = e; + + for (i = 0; i <= e - s; ++i) { + sa_dev->port[i].mr = NULL; + sa_dev->port[i].sm_ah = NULL; + sa_dev->port[i].port_num = i + s; + spin_lock_init(&sa_dev->port[i].ah_lock); + + sa_dev->port[i].agent = + ib_register_mad_agent(device, i + s, IB_QPT_GSI, + NULL, 0, send_handler, + recv_handler, sa_dev); + if (IS_ERR(sa_dev->port[i].agent)) + goto err; + + sa_dev->port[i].mr = ib_get_dma_mr(sa_dev->port[i].agent->qp->pd, + IB_ACCESS_LOCAL_WRITE); + if (IS_ERR(sa_dev->port[i].mr)) { + ib_unregister_mad_agent(sa_dev->port[i].agent); + goto err; + } + + INIT_WORK(&sa_dev->port[i].update_task, + update_sm_ah, &sa_dev->port[i]); + } + + ib_set_client_data(device, &sa_client, sa_dev); + + /* + * We register our event handler after everything is set up, + * and then update our cached info after the event handler is + * registered to avoid any problems if a port changes state + * during our initialization. 
+ */ + + INIT_IB_EVENT_HANDLER(&sa_dev->event_handler, device, ib_sa_event); + if (ib_register_event_handler(&sa_dev->event_handler)) + goto err; + + for (i = 0; i <= e - s; ++i) + update_sm_ah(&sa_dev->port[i]); + + return; + +err: + while (--i >= 0) { + ib_dereg_mr(sa_dev->port[i].mr); + ib_unregister_mad_agent(sa_dev->port[i].agent); + } + + kfree(sa_dev); + + return; +} + +static void ib_sa_remove_one(struct ib_device *device) +{ + struct ib_sa_device *sa_dev = ib_get_client_data(device, &sa_client); + int i; + + if (!sa_dev) + return; + + ib_unregister_event_handler(&sa_dev->event_handler); + + for (i = 0; i <= sa_dev->end_port - sa_dev->start_port; ++i) { + ib_unregister_mad_agent(sa_dev->port[i].agent); + kref_put(&sa_dev->port[i].sm_ah->ref, free_sm_ah); + } + + kfree(sa_dev); +} + +static int __init ib_sa_init(void) +{ + int ret; + + spin_lock_init(&idr_lock); + spin_lock_init(&tid_lock); + + get_random_bytes(&tid, sizeof tid); + + ret = ib_register_client(&sa_client); + if (ret) + printk(KERN_ERR "Couldn't register ib_sa client\n"); + + return ret; +} + +static void __exit ib_sa_cleanup(void) +{ + ib_unregister_client(&sa_client); +} + +module_init(ib_sa_init); +module_exit(ib_sa_cleanup); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_sa.h 2004-12-27 21:48:20.923886305 -0800 @@ -0,0 +1,280 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ib_sa.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#ifndef IB_SA_H +#define IB_SA_H + +#include + +#include +#include + +enum { + IB_SA_CLASS_VERSION = 2, /* IB spec version 1.1/1.2 */ + + IB_SA_METHOD_DELETE = 0x15 +}; + +enum ib_sa_selector { + IB_SA_GTE = 0, + IB_SA_LTE = 1, + IB_SA_EQ = 2, + /* + * The meaning of "best" depends on the attribute: for + * example, for MTU best will return the largest available + * MTU, while for packet life time, best will return the + * smallest available life time. + */ + IB_SA_BEST = 3 +}; + +typedef u64 __bitwise ib_sa_comp_mask; + +#define IB_SA_COMP_MASK(n) ((__force ib_sa_comp_mask) cpu_to_be64(1ull << n)) + +/* + * Structures for SA records are named "struct ib_sa_xxx_rec." 
No + * attempt is made to pack structures to match the physical layout of + * SA records in SA MADs; all packing and unpacking is handled by the + * SA query code. + * + * For a record with structure ib_sa_xxx_rec, the naming convention + * for the component mask value for field yyy is IB_SA_XXX_REC_YYY (we + * never use different abbreviations or otherwise change the spelling + * of xxx/yyy between ib_sa_xxx_rec.yyy and IB_SA_XXX_REC_YYY). + * + * Reserved rows are indicated with comments to help maintainability. + */ + +/* reserved: 0 */ +/* reserved: 1 */ +#define IB_SA_PATH_REC_DGID IB_SA_COMP_MASK( 2) +#define IB_SA_PATH_REC_SGID IB_SA_COMP_MASK( 3) +#define IB_SA_PATH_REC_DLID IB_SA_COMP_MASK( 4) +#define IB_SA_PATH_REC_SLID IB_SA_COMP_MASK( 5) +#define IB_SA_PATH_REC_RAW_TRAFFIC IB_SA_COMP_MASK( 6) +/* reserved: 7 */ +#define IB_SA_PATH_REC_FLOW_LABEL IB_SA_COMP_MASK( 8) +#define IB_SA_PATH_REC_HOP_LIMIT IB_SA_COMP_MASK( 9) +#define IB_SA_PATH_REC_TRAFFIC_CLASS IB_SA_COMP_MASK(10) +#define IB_SA_PATH_REC_REVERSIBLE IB_SA_COMP_MASK(11) +#define IB_SA_PATH_REC_NUMB_PATH IB_SA_COMP_MASK(12) +#define IB_SA_PATH_REC_PKEY IB_SA_COMP_MASK(13) +/* reserved: 14 */ +#define IB_SA_PATH_REC_SL IB_SA_COMP_MASK(15) +#define IB_SA_PATH_REC_MTU_SELECTOR IB_SA_COMP_MASK(16) +#define IB_SA_PATH_REC_MTU IB_SA_COMP_MASK(17) +#define IB_SA_PATH_REC_RATE_SELECTOR IB_SA_COMP_MASK(18) +#define IB_SA_PATH_REC_RATE IB_SA_COMP_MASK(19) +#define IB_SA_PATH_REC_PACKET_LIFE_TIME_SELECTOR IB_SA_COMP_MASK(20) +#define IB_SA_PATH_REC_PACKET_LIFE_TIME IB_SA_COMP_MASK(21) +#define IB_SA_PATH_REC_PREFERENCE IB_SA_COMP_MASK(22) + +struct ib_sa_path_rec { + /* reserved */ + /* reserved */ + union ib_gid dgid; + union ib_gid sgid; + u16 dlid; + u16 slid; + int raw_traffic; + /* reserved */ + u32 flow_label; + u8 hop_limit; + u8 traffic_class; + int reversible; + u8 numb_path; + u16 pkey; + /* reserved */ + u8 sl; + u8 mtu_selector; + enum ib_mtu mtu; + u8 rate_selector; + u8 rate; + u8 
packet_life_time_selector; + u8 packet_life_time; + u8 preference; +}; + +#define IB_SA_MCMEMBER_REC_MGID IB_SA_COMP_MASK( 0) +#define IB_SA_MCMEMBER_REC_PORT_GID IB_SA_COMP_MASK( 1) +#define IB_SA_MCMEMBER_REC_QKEY IB_SA_COMP_MASK( 2) +#define IB_SA_MCMEMBER_REC_MLID IB_SA_COMP_MASK( 3) +#define IB_SA_MCMEMBER_REC_MTU_SELECTOR IB_SA_COMP_MASK( 4) +#define IB_SA_MCMEMBER_REC_MTU IB_SA_COMP_MASK( 5) +#define IB_SA_MCMEMBER_REC_TRAFFIC_CLASS IB_SA_COMP_MASK( 6) +#define IB_SA_MCMEMBER_REC_PKEY IB_SA_COMP_MASK( 7) +#define IB_SA_MCMEMBER_REC_RATE_SELECTOR IB_SA_COMP_MASK( 8) +#define IB_SA_MCMEMBER_REC_RATE IB_SA_COMP_MASK( 9) +#define IB_SA_MCMEMBER_REC_PACKET_LIFE_TIME_SELECTOR IB_SA_COMP_MASK(10) +#define IB_SA_MCMEMBER_REC_PACKET_LIFE_TIME IB_SA_COMP_MASK(11) +#define IB_SA_MCMEMBER_REC_SL IB_SA_COMP_MASK(12) +#define IB_SA_MCMEMBER_REC_FLOW_LABEL IB_SA_COMP_MASK(13) +#define IB_SA_MCMEMBER_REC_HOP_LIMIT IB_SA_COMP_MASK(14) +#define IB_SA_MCMEMBER_REC_SCOPE IB_SA_COMP_MASK(15) +#define IB_SA_MCMEMBER_REC_JOIN_STATE IB_SA_COMP_MASK(16) +#define IB_SA_MCMEMBER_REC_PROXY_JOIN IB_SA_COMP_MASK(17) + +struct ib_sa_mcmember_rec { + union ib_gid mgid; + union ib_gid port_gid; + u32 qkey; + u16 mlid; + u8 mtu_selector; + enum ib_mtu mtu; + u8 traffic_class; + u16 pkey; + u8 rate_selector; + u8 rate; + u8 packet_life_time_selector; + u8 packet_life_time; + u8 sl; + u32 flow_label; + u8 hop_limit; + u8 scope; + u8 join_state; + int proxy_join; +}; + +struct ib_sa_query; + +void ib_sa_cancel_query(int id, struct ib_sa_query *query); + +int ib_sa_path_rec_get(struct ib_device *device, u8 port_num, + struct ib_sa_path_rec *rec, + ib_sa_comp_mask comp_mask, + int timeout_ms, int gfp_mask, + void (*callback)(int status, + struct ib_sa_path_rec *resp, + void *context), + void *context, + struct ib_sa_query **query); + +int ib_sa_mcmember_rec_query(struct ib_device *device, u8 port_num, + u8 method, + struct ib_sa_mcmember_rec *rec, + ib_sa_comp_mask comp_mask, + int timeout_ms, 
int gfp_mask, + void (*callback)(int status, + struct ib_sa_mcmember_rec *resp, + void *context), + void *context, + struct ib_sa_query **query); + +/** + * ib_sa_mcmember_rec_set - Start an MCMember set query + * @device:device to send query on + * @port_num: port number to send query on + * @rec:MCMember Record to send in query + * @comp_mask:component mask to send in query + * @timeout_ms:time to wait for response + * @gfp_mask:GFP mask to use for internal allocations + * @callback:function called when query completes, times out or is + * canceled + * @context:opaque user context passed to callback + * @query:query context, used to cancel query + * + * Send an MCMember Set query to the SA (e.g. to join a multicast + * group). The callback function will be called when the query + * completes (or fails); status is 0 for a successful response, -EINTR + * if the query is canceled, -ETIMEDOUT if the query timed out, or + * -EIO if an error occurred sending the query. The resp parameter of + * the callback is only valid if status is 0. + * + * If the return value of ib_sa_mcmember_rec_set() is negative, it is + * an error code. Otherwise it is a query ID that can be used to + * cancel the query. 
+ */ +static inline int +ib_sa_mcmember_rec_set(struct ib_device *device, u8 port_num, + struct ib_sa_mcmember_rec *rec, + ib_sa_comp_mask comp_mask, + int timeout_ms, int gfp_mask, + void (*callback)(int status, + struct ib_sa_mcmember_rec *resp, + void *context), + void *context, + struct ib_sa_query **query) +{ + return ib_sa_mcmember_rec_query(device, port_num, + IB_MGMT_METHOD_SET, + rec, comp_mask, + timeout_ms, gfp_mask, callback, + context, query); +} + +/** + * ib_sa_mcmember_rec_delete - Start an MCMember delete query + * @device:device to send query on + * @port_num: port number to send query on + * @rec:MCMember Record to send in query + * @comp_mask:component mask to send in query + * @timeout_ms:time to wait for response + * @gfp_mask:GFP mask to use for internal allocations + * @callback:function called when query completes, times out or is + * canceled + * @context:opaque user context passed to callback + * @query:query context, used to cancel query + * + * Send an MCMember Delete query to the SA (e.g. to leave a multicast + * group). The callback function will be called when the query + * completes (or fails); status is 0 for a successful response, -EINTR + * if the query is canceled, -ETIMEDOUT if the query timed out, or + * -EIO if an error occurred sending the query. The resp parameter of + * the callback is only valid if status is 0. + * + * If the return value of ib_sa_mcmember_rec_delete() is negative, it + * is an error code. Otherwise it is a query ID that can be used to + * cancel the query. 
+ */ +static inline int +ib_sa_mcmember_rec_delete(struct ib_device *device, u8 port_num, + struct ib_sa_mcmember_rec *rec, + ib_sa_comp_mask comp_mask, + int timeout_ms, int gfp_mask, + void (*callback)(int status, + struct ib_sa_mcmember_rec *resp, + void *context), + void *context, + struct ib_sa_query **query) +{ + return ib_sa_mcmember_rec_query(device, port_num, + IB_SA_METHOD_DELETE, + rec, comp_mask, + timeout_ms, gfp_mask, callback, + context, query); +} + + +#endif /* IB_SA_H */ From roland@topspin.com Mon Dec 27 21:49:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:38 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDa025948 for ; Mon, 27 Dec 2004 21:49:41 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:00 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:00 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG3-0000tH-R3; Mon, 27 Dec 2004 21:51:00 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272150.vDVch42vu9imXkVM@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:59 -0800 Message-Id: <200412272150.vaYM2cWv3KsWVYPV@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][6/24] Add InfiniBand MAD (management datagram) support (private headers) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:00.0798 (UTC) FILETIME=[35B23DE0:01C4ECA1] X-Virus-Scanned: ClamAV 
0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13104 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add MAD layer private implementation headers. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/agent.h 2004-12-27 21:48:20.224989180 -0800 @@ -0,0 +1,55 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: agent.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#ifndef __AGENT_H_ +#define __AGENT_H_ + +extern spinlock_t ib_agent_port_list_lock; + +extern int ib_agent_port_open(struct ib_device *device, + int port_num); + +extern int ib_agent_port_close(struct ib_device *device, int port_num); + +extern int agent_send(struct ib_mad_private *mad, + struct ib_grh *grh, + struct ib_wc *wc, + struct ib_device *device, + int port_num); + +#endif /* __AGENT_H_ */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/agent_priv.h 2004-12-27 21:48:20.250985354 -0800 @@ -0,0 +1,64 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: agent_priv.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#ifndef __IB_AGENT_PRIV_H__ +#define __IB_AGENT_PRIV_H__ + +#include + +#define SPFX "ib_agent: " + +struct ib_agent_send_wr { + struct list_head send_list; + struct ib_ah *ah; + struct ib_mad_private *mad; + DECLARE_PCI_UNMAP_ADDR(mapping) +}; + +struct ib_agent_port_private { + struct list_head port_list; + struct list_head send_posted_list; + spinlock_t send_list_lock; + int port_num; + struct ib_mad_agent *dr_smp_agent; /* DR SM class */ + struct ib_mad_agent *lr_smp_agent; /* LR SM class */ + struct ib_mad_agent *perf_mgmt_agent; /* PerfMgmt class */ + struct ib_mr *mr; +}; + +#endif /* __IB_AGENT_PRIV_H__ */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/mad_priv.h 2004-12-27 21:48:20.321974904 -0800 @@ -0,0 +1,194 @@ +/* + * Copyright (c) 2004, Voltaire, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mad_priv.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#ifndef __IB_MAD_PRIV_H__ +#define __IB_MAD_PRIV_H__ + +#include +#include +#include +#include +#include + + +#define PFX "ib_mad: " + +#define IB_MAD_QPS_CORE 2 /* Always QP0 and QP1 as a minimum */ + +/* QP and CQ parameters */ +#define IB_MAD_QP_SEND_SIZE 128 +#define IB_MAD_QP_RECV_SIZE 512 +#define IB_MAD_SEND_REQ_MAX_SG 2 +#define IB_MAD_RECV_REQ_MAX_SG 1 + +#define IB_MAD_SEND_Q_PSN 0 + +/* Registration table sizes */ +#define MAX_MGMT_CLASS 80 +#define MAX_MGMT_VERSION 8 +#define MAX_MGMT_OUI 8 +#define MAX_MGMT_VENDOR_RANGE2 (IB_MGMT_CLASS_VENDOR_RANGE2_END - \ + IB_MGMT_CLASS_VENDOR_RANGE2_START + 1) + +struct ib_mad_list_head { + struct list_head list; + struct ib_mad_queue *mad_queue; +}; + +struct ib_mad_private_header { + struct ib_mad_list_head mad_list; + struct ib_mad_recv_wc recv_wc; + DECLARE_PCI_UNMAP_ADDR(mapping) +} __attribute__ ((packed)); + +struct ib_mad_private { + struct ib_mad_private_header header; + struct ib_grh grh; + union { + struct ib_mad mad; + struct ib_rmpp_mad rmpp_mad; + struct ib_smp smp; + } mad; +} __attribute__ ((packed)); + +struct ib_mad_agent_private { + struct list_head agent_list; + struct ib_mad_agent agent; + struct ib_mad_reg_req *reg_req; + struct ib_mad_qp_info *qp_info; + + spinlock_t lock; + struct list_head send_list; + struct list_head wait_list; + struct work_struct timed_work; + unsigned long timeout; + struct list_head local_list; + struct work_struct local_work; + + atomic_t refcount; + wait_queue_head_t wait; + u8 rmpp_version; +}; + +struct ib_mad_snoop_private { + struct ib_mad_agent agent; + struct ib_mad_qp_info *qp_info; + int snoop_index; + int mad_snoop_flags; + atomic_t refcount; + wait_queue_head_t wait; +}; + +struct ib_mad_send_wr_private { + struct ib_mad_list_head mad_list; + struct list_head agent_list; + struct ib_mad_agent *agent; + struct ib_send_wr send_wr; + struct ib_sge sg_list[IB_MAD_SEND_REQ_MAX_SG]; + u64 wr_id; /*
client WR ID */ + u64 tid; + unsigned long timeout; + int retry; + int refcount; + enum ib_wc_status status; +}; + +struct ib_mad_local_private { + struct list_head completion_list; + struct ib_mad_private *mad_priv; + struct ib_send_wr send_wr; + struct ib_sge sg_list[IB_MAD_SEND_REQ_MAX_SG]; + u64 wr_id; /* client WR ID */ + u64 tid; +}; + +struct ib_mad_mgmt_method_table { + struct ib_mad_agent_private *agent[IB_MGMT_MAX_METHODS]; +}; + +struct ib_mad_mgmt_class_table { + struct ib_mad_mgmt_method_table *method_table[MAX_MGMT_CLASS]; +}; + +struct ib_mad_mgmt_vendor_class { + u8 oui[MAX_MGMT_OUI][3]; + struct ib_mad_mgmt_method_table *method_table[MAX_MGMT_OUI]; +}; + +struct ib_mad_mgmt_vendor_class_table { + struct ib_mad_mgmt_vendor_class *vendor_class[MAX_MGMT_VENDOR_RANGE2]; +}; + +struct ib_mad_mgmt_version_table { + struct ib_mad_mgmt_class_table *class; + struct ib_mad_mgmt_vendor_class_table *vendor; +}; + +struct ib_mad_queue { + spinlock_t lock; + struct list_head list; + int count; + int max_active; + struct ib_mad_qp_info *qp_info; +}; + +struct ib_mad_qp_info { + struct ib_mad_port_private *port_priv; + struct ib_qp *qp; + struct ib_mad_queue send_queue; + struct ib_mad_queue recv_queue; + struct list_head overflow_list; + spinlock_t snoop_lock; + struct ib_mad_snoop_private **snoop_table; + int snoop_table_size; + atomic_t snoop_count; +}; + +struct ib_mad_port_private { + struct list_head port_list; + struct ib_device *device; + int port_num; + struct ib_cq *cq; + struct ib_pd *pd; + struct ib_mr *mr; + + spinlock_t reg_lock; + struct ib_mad_mgmt_version_table version[MAX_MGMT_VERSION]; + struct list_head agent_list; + struct workqueue_struct *wq; + struct work_struct work; + struct ib_mad_qp_info qp_info[IB_MAD_QPS_CORE]; +}; + +#endif /* __IB_MAD_PRIV_H__ */ From roland@topspin.com Mon Dec 27 21:49:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:48 -0800 (PST) Received: from umhlanga.STRATNET.NET 
(umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDW025948 for ; Mon, 27 Dec 2004 21:49:40 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:58 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:57 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG0-0000sz-Pb; Mon, 27 Dec 2004 21:50:57 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272150.BzlME8aULSGdgnS3@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:56 -0800 Message-Id: <200412272150.7LBVS92XE77zrCiS@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][4/24] Add InfiniBand MAD (management datagram) support (public headers) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:50:58.0001 (UTC) FILETIME=[34077410:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13108 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add public headers for handling InfiniBand MADs (management datagrams), including sending and receiving MADs as well as passing MADs on to local agents. 
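As a rough illustration of the wire formats these headers describe, the sketch below re-declares the packed MAD structures in user-space C and checks that each complete MAD occupies 256 bytes. The field lists are copied from ib_mad.h and ib_smi.h in this patch; the <stdint.h> typedefs and the helper mad_layout_ok() are assumptions made for this example and are not part of the patch. It relies on the GCC/Clang __attribute__ ((packed)) extension, as the headers themselves do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space stand-ins for the kernel fixed-width types (assumption). */
typedef uint8_t  u8;
typedef uint16_t u16;
typedef uint32_t u32;
typedef uint64_t u64;

/* Common MAD header: 24 bytes on the wire. */
struct ib_mad_hdr {
	u8  base_version;
	u8  mgmt_class;
	u8  class_version;
	u8  method;
	u16 status;
	u16 class_specific;
	u64 tid;
	u16 attr_id;
	u16 resv;
	u32 attr_mod;
} __attribute__ ((packed));

/* RMPP header: 12 bytes, follows the common header when RMPP is used. */
struct ib_rmpp_hdr {
	u8  rmpp_version;
	u8  rmpp_type;
	u8  rmpp_rtime_flags;
	u8  rmpp_status;
	u32 seg_num;
	u32 paylen_newwin;
} __attribute__ ((packed));

struct ib_mad {
	struct ib_mad_hdr mad_hdr;
	u8 data[232];
} __attribute__ ((packed));

struct ib_rmpp_mad {
	struct ib_mad_hdr  mad_hdr;
	struct ib_rmpp_hdr rmpp_hdr;
	u8 data[220];
} __attribute__ ((packed));

struct ib_vendor_mad {
	struct ib_mad_hdr  mad_hdr;
	struct ib_rmpp_hdr rmpp_hdr;
	u8 reserved;
	u8 oui[3];
	u8 data[216];
} __attribute__ ((packed));

/* Subnet management packet: same 256-byte envelope, different layout. */
struct ib_smp {
	u8  base_version;
	u8  mgmt_class;
	u8  class_version;
	u8  method;
	u16 status;
	u8  hop_ptr;
	u8  hop_cnt;
	u64 tid;
	u16 attr_id;
	u16 resv;
	u32 attr_mod;
	u64 mkey;
	u16 dr_slid;
	u16 dr_dlid;
	u8  reserved[28];
	u8  data[64];
	u8  initial_path[64];
	u8  return_path[64];
} __attribute__ ((packed));

/* Hypothetical helper: returns 1 when every layout has the expected size. */
int mad_layout_ok(void)
{
	return sizeof(struct ib_mad_hdr) == 24 &&
	       sizeof(struct ib_rmpp_hdr) == 12 &&
	       offsetof(struct ib_mad, data) == 24 &&
	       sizeof(struct ib_mad) == 256 &&
	       sizeof(struct ib_rmpp_mad) == 256 &&
	       sizeof(struct ib_vendor_mad) == 256 &&
	       sizeof(struct ib_smp) == 256;
}
```

Calling mad_layout_ok() from a small test program should return nonzero on a compiler that honours the packed attribute; a zero result would point to padding that breaks the on-the-wire layout.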
Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_mad.h 2004-12-27 21:48:19.513093969 -0800 @@ -0,0 +1,404 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ib_mad.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#if !defined( IB_MAD_H ) +#define IB_MAD_H + +#include + +/* Management base version */ +#define IB_MGMT_BASE_VERSION 1 + +/* Management classes */ +#define IB_MGMT_CLASS_SUBN_LID_ROUTED 0x01 +#define IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE 0x81 +#define IB_MGMT_CLASS_SUBN_ADM 0x03 +#define IB_MGMT_CLASS_PERF_MGMT 0x04 +#define IB_MGMT_CLASS_BM 0x05 +#define IB_MGMT_CLASS_DEVICE_MGMT 0x06 +#define IB_MGMT_CLASS_CM 0x07 +#define IB_MGMT_CLASS_SNMP 0x08 +#define IB_MGMT_CLASS_VENDOR_RANGE2_START 0x30 +#define IB_MGMT_CLASS_VENDOR_RANGE2_END 0x4F + +/* Management methods */ +#define IB_MGMT_METHOD_GET 0x01 +#define IB_MGMT_METHOD_SET 0x02 +#define IB_MGMT_METHOD_GET_RESP 0x81 +#define IB_MGMT_METHOD_SEND 0x03 +#define IB_MGMT_METHOD_TRAP 0x05 +#define IB_MGMT_METHOD_REPORT 0x06 +#define IB_MGMT_METHOD_REPORT_RESP 0x86 +#define IB_MGMT_METHOD_TRAP_REPRESS 0x07 + +#define IB_MGMT_METHOD_RESP 0x80 + +#define IB_MGMT_MAX_METHODS 128 + +#define IB_QP0 0 +#define IB_QP1 __constant_htonl(1) +#define IB_QP1_QKEY 0x80010000 + +struct ib_grh { + u32 version_tclass_flow; + u16 paylen; + u8 next_hdr; + u8 hop_limit; + union ib_gid sgid; + union ib_gid dgid; +} __attribute__ ((packed)); + +struct ib_mad_hdr { + u8 base_version; + u8 mgmt_class; + u8 class_version; + u8 method; + u16 status; + u16 class_specific; + u64 tid; + u16 attr_id; + u16 resv; + u32 attr_mod; +} __attribute__ ((packed)); + +struct ib_rmpp_hdr { + u8 rmpp_version; + u8 rmpp_type; + u8 rmpp_rtime_flags; + u8 rmpp_status; + u32 seg_num; + u32 paylen_newwin; +} __attribute__ ((packed)); + +struct ib_mad { + struct ib_mad_hdr mad_hdr; + u8 data[232]; +} __attribute__ ((packed)); + +struct ib_rmpp_mad { + struct ib_mad_hdr mad_hdr; + struct ib_rmpp_hdr rmpp_hdr; + u8 data[220]; +} __attribute__ ((packed)); + +struct ib_vendor_mad { + struct ib_mad_hdr mad_hdr; + struct ib_rmpp_hdr rmpp_hdr; + u8 reserved; + u8 oui[3]; + u8 data[216]; +} __attribute__ 
((packed)); + +struct ib_mad_agent; +struct ib_mad_send_wc; +struct ib_mad_recv_wc; + +/** + * ib_mad_send_handler - callback handler for a sent MAD. + * @mad_agent: MAD agent that sent the MAD. + * @mad_send_wc: Send work completion information on the sent MAD. + */ +typedef void (*ib_mad_send_handler)(struct ib_mad_agent *mad_agent, + struct ib_mad_send_wc *mad_send_wc); + +/** + * ib_mad_snoop_handler - Callback handler for snooping sent MADs. + * @mad_agent: MAD agent that snooped the MAD. + * @send_wr: Work request information on the sent MAD. + * @mad_send_wc: Work completion information on the sent MAD. Valid + * only for snooping that occurs on a send completion. + * + * Clients snooping MADs should not modify data referenced by the @send_wr + * or @mad_send_wc. + */ +typedef void (*ib_mad_snoop_handler)(struct ib_mad_agent *mad_agent, + struct ib_send_wr *send_wr, + struct ib_mad_send_wc *mad_send_wc); + +/** + * ib_mad_recv_handler - callback handler for a received MAD. + * @mad_agent: MAD agent requesting the received MAD. + * @mad_recv_wc: Received work completion information on the received MAD. + * + * MADs received in response to a send request operation will be handed to + * the user after the send operation completes. All data buffers given + * to registered agents through this routine are owned by the receiving + * client, except for snooping agents. Clients snooping MADs should not + * modify the data referenced by @mad_recv_wc. + */ +typedef void (*ib_mad_recv_handler)(struct ib_mad_agent *mad_agent, + struct ib_mad_recv_wc *mad_recv_wc); + +/** + * ib_mad_agent - Used to track MAD registration with the access layer. + * @device: Reference to device registration is on. + * @qp: Reference to QP used for sending and receiving MADs. + * @recv_handler: Callback handler for a received MAD. + * @send_handler: Callback handler for a sent MAD. + * @snoop_handler: Callback handler for snooped sent MADs. 
+ * @context: User-specified context associated with this registration. + * @hi_tid: Access layer assigned transaction ID for this client. + * Unsolicited MADs sent by this client will have the upper 32 bits + * of their TID set to this value. + * @port_num: Port number on which QP is registered + */ +struct ib_mad_agent { + struct ib_device *device; + struct ib_qp *qp; + ib_mad_recv_handler recv_handler; + ib_mad_send_handler send_handler; + ib_mad_snoop_handler snoop_handler; + void *context; + u32 hi_tid; + u8 port_num; +}; + +/** + * ib_mad_send_wc - MAD send completion information. + * @wr_id: Work request identifier associated with the send MAD request. + * @status: Completion status. + * @vendor_err: Optional vendor error information returned with a failed + * request. + */ +struct ib_mad_send_wc { + u64 wr_id; + enum ib_wc_status status; + u32 vendor_err; +}; + +/** + * ib_mad_recv_buf - received MAD buffer information. + * @list: Reference to next data buffer for a received RMPP MAD. + * @grh: References a data buffer containing the global route header. + * The data referenced by this buffer is only valid if the GRH is + * valid. + * @mad: References the start of the received MAD. + */ +struct ib_mad_recv_buf { + struct list_head list; + struct ib_grh *grh; + struct ib_mad *mad; +}; + +/** + * ib_mad_recv_wc - received MAD information. + * @wc: Completion information for the received data. + * @recv_buf: Specifies the location of the received data buffer(s). + * @mad_len: The length of the received MAD, without duplicated headers. + * + * For a received response, the wr_id field of the wc is set to the wr_id + * for the corresponding send request. + */ +struct ib_mad_recv_wc { + struct ib_wc *wc; + struct ib_mad_recv_buf recv_buf; + int mad_len; +}; + +/** + * ib_mad_reg_req - MAD registration request + * @mgmt_class: Indicates which management class of MADs should be received + * by the caller.
This field is only required if the user wishes to + * receive unsolicited MADs, otherwise it should be 0. + * @mgmt_class_version: Indicates which version of MADs for the given + * management class to receive. + * @oui: Indicates IEEE OUI when mgmt_class is a vendor class + * in the range from 0x30 to 0x4f. Otherwise not used. + * @method_mask: The caller will receive unsolicited MADs for any method + * whose bit in @method_mask is set. + */ +struct ib_mad_reg_req { + u8 mgmt_class; + u8 mgmt_class_version; + u8 oui[3]; + DECLARE_BITMAP(method_mask, IB_MGMT_MAX_METHODS); +}; + +/** + * ib_register_mad_agent - Register to send/receive MADs. + * @device: The device to register with. + * @port_num: The port on the specified device to use. + * @qp_type: Specifies which QP to access. Must be either + * IB_QPT_SMI or IB_QPT_GSI. + * @mad_reg_req: Specifies which unsolicited MADs should be received + * by the caller. This parameter may be NULL if the caller only + * wishes to receive solicited responses. + * @rmpp_version: If set, indicates that the client will send + * and receive MADs that contain the RMPP header for the given version. + * If set to 0, indicates that RMPP is not used by this client. + * @send_handler: The completion callback routine invoked after a send + * request has completed. + * @recv_handler: The completion callback routine invoked for a received + * MAD. + * @context: User-specified context associated with the registration.
+ */ +struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device, + u8 port_num, + enum ib_qp_type qp_type, + struct ib_mad_reg_req *mad_reg_req, + u8 rmpp_version, + ib_mad_send_handler send_handler, + ib_mad_recv_handler recv_handler, + void *context); + +enum ib_mad_snoop_flags { + /*IB_MAD_SNOOP_POSTED_SENDS = 1,*/ + /*IB_MAD_SNOOP_RMPP_SENDS = (1<<1),*/ + IB_MAD_SNOOP_SEND_COMPLETIONS = (1<<2), + /*IB_MAD_SNOOP_RMPP_SEND_COMPLETIONS = (1<<3),*/ + IB_MAD_SNOOP_RECVS = (1<<4) + /*IB_MAD_SNOOP_RMPP_RECVS = (1<<5),*/ + /*IB_MAD_SNOOP_REDIRECTED_QPS = (1<<6)*/ +}; + +/** + * ib_register_mad_snoop - Register to snoop sent and received MADs. + * @device: The device to register with. + * @port_num: The port on the specified device to use. + * @qp_type: Specifies which QP traffic to snoop. Must be either + * IB_QPT_SMI or IB_QPT_GSI. + * @mad_snoop_flags: Specifies where snooping occurs. + * @snoop_handler: The callback routine invoked for a snooped send. + * @recv_handler: The callback routine invoked for a snooped receive. + * @context: User-specified context associated with the registration. + */ +struct ib_mad_agent *ib_register_mad_snoop(struct ib_device *device, + u8 port_num, + enum ib_qp_type qp_type, + int mad_snoop_flags, + ib_mad_snoop_handler snoop_handler, + ib_mad_recv_handler recv_handler, + void *context); + +/** + * ib_unregister_mad_agent - Unregisters a client from using MAD services. + * @mad_agent: Corresponding MAD registration request to deregister. + * + * After invoking this routine, MAD services are no longer usable by the + * client on the associated QP. + */ +int ib_unregister_mad_agent(struct ib_mad_agent *mad_agent); + +/** + * ib_post_send_mad - Posts MAD(s) to the send queue of the QP associated + * with the registered client. + * @mad_agent: Specifies the associated registration to post the send to. + * @send_wr: Specifies the information needed to send the MAD(s).
+ * @bad_send_wr: Specifies the MAD on which an error was encountered. + * + * Sent MADs are not guaranteed to complete in the order that they were posted. + */ +int ib_post_send_mad(struct ib_mad_agent *mad_agent, + struct ib_send_wr *send_wr, + struct ib_send_wr **bad_send_wr); + +/** + * ib_coalesce_recv_mad - Coalesces received MAD data into a single buffer. + * @mad_recv_wc: Work completion information for a received MAD. + * @buf: User-provided data buffer to receive the coalesced buffers. The + * referenced buffer should be at least the size of the mad_len specified + * by @mad_recv_wc. + * + * This call copies a chain of received RMPP MADs into a single data buffer, + * removing duplicated headers. + */ +void ib_coalesce_recv_mad(struct ib_mad_recv_wc *mad_recv_wc, + void *buf); + +/** + * ib_free_recv_mad - Returns data buffers used to receive a MAD to the + * access layer. + * @mad_recv_wc: Work completion information for a received MAD. + * + * Clients receiving MADs through their ib_mad_recv_handler must call this + * routine to return the work completion buffers to the access layer. + */ +void ib_free_recv_mad(struct ib_mad_recv_wc *mad_recv_wc); + +/** + * ib_cancel_mad - Cancels an outstanding send MAD operation. + * @mad_agent: Specifies the registration associated with sent MAD. + * @wr_id: Indicates the work request identifier of the MAD to cancel. + * + * MADs will be returned to the user through the corresponding + * ib_mad_send_handler. + */ +void ib_cancel_mad(struct ib_mad_agent *mad_agent, + u64 wr_id); + +/** + * ib_redirect_mad_qp - Registers a QP for MAD services. + * @qp: Reference to a QP that requires MAD services. + * @rmpp_version: If set, indicates that the client will send + * and receive MADs that contain the RMPP header for the given version. + * If set to 0, indicates that RMPP is not used by this client. + * @send_handler: The completion callback routine invoked after a send + * request has completed. 
+ * @recv_handler: The completion callback routine invoked for a received + * MAD. + * @context: User-specified context associated with the registration. + * + * Use of this call allows clients to use MAD services, such as RMPP, + * on user-owned QPs. After calling this routine, users may send + * MADs on the specified QP by calling ib_post_send_mad. + */ +struct ib_mad_agent *ib_redirect_mad_qp(struct ib_qp *qp, + u8 rmpp_version, + ib_mad_send_handler send_handler, + ib_mad_recv_handler recv_handler, + void *context); + +/** + * ib_process_mad_wc - Processes a work completion associated with a + * MAD sent or received on a redirected QP. + * @mad_agent: Specifies the registered MAD service using the redirected QP. + * @wc: References a work completion associated with a sent or received + * MAD segment. + * + * This routine is used to complete or continue processing on a MAD request. + * If the work completion is associated with a send operation, calling + * this routine is required to continue an RMPP transfer or to wait for a + * corresponding response, if it is a request. If the work completion is + * associated with a receive operation, calling this routine is required to + * process an inbound or outbound RMPP transfer, or to match a response MAD + * with its corresponding request. + */ +int ib_process_mad_wc(struct ib_mad_agent *mad_agent, + struct ib_wc *wc); + +#endif /* IB_MAD_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_smi.h 2004-12-27 21:48:19.539090142 -0800 @@ -0,0 +1,96 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses.
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ib_smi.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#if !defined( IB_SMI_H ) +#define IB_SMI_H + +#include + +#define IB_LID_PERMISSIVE 0xFFFF + +#define IB_SMP_DATA_SIZE 64 +#define IB_SMP_MAX_PATH_HOPS 64 + +struct ib_smp { + u8 base_version; + u8 mgmt_class; + u8 class_version; + u8 method; + u16 status; + u8 hop_ptr; + u8 hop_cnt; + u64 tid; + u16 attr_id; + u16 resv; + u32 attr_mod; + u64 mkey; + u16 dr_slid; + u16 dr_dlid; + u8 reserved[28]; + u8 data[IB_SMP_DATA_SIZE]; + u8 initial_path[IB_SMP_MAX_PATH_HOPS]; + u8 return_path[IB_SMP_MAX_PATH_HOPS]; +} __attribute__ ((packed)); + +#define IB_SMP_DIRECTION __constant_htons(0x8000) + +/* Subnet management attributes */ +#define IB_SMP_ATTR_NOTICE __constant_htons(0x0002) +#define IB_SMP_ATTR_NODE_DESC __constant_htons(0x0010) +#define IB_SMP_ATTR_NODE_INFO __constant_htons(0x0011) +#define IB_SMP_ATTR_SWITCH_INFO __constant_htons(0x0012) +#define IB_SMP_ATTR_GUID_INFO __constant_htons(0x0014) +#define IB_SMP_ATTR_PORT_INFO __constant_htons(0x0015) +#define IB_SMP_ATTR_PKEY_TABLE __constant_htons(0x0016) +#define IB_SMP_ATTR_SL_TO_VL_TABLE __constant_htons(0x0017) +#define IB_SMP_ATTR_VL_ARB_TABLE __constant_htons(0x0018) +#define IB_SMP_ATTR_LINEAR_FORWARD_TABLE __constant_htons(0x0019) +#define IB_SMP_ATTR_RANDOM_FORWARD_TABLE __constant_htons(0x001A) +#define IB_SMP_ATTR_MCAST_FORWARD_TABLE __constant_htons(0x001B) +#define IB_SMP_ATTR_SM_INFO __constant_htons(0x0020) +#define IB_SMP_ATTR_VENDOR_DIAG __constant_htons(0x0030) +#define IB_SMP_ATTR_LED_INFO __constant_htons(0x0031) +#define IB_SMP_ATTR_VENDOR_MASK __constant_htons(0xFF00) + +static inline u8 +ib_get_smp_direction(struct ib_smp *smp) +{ + return ((smp->status & IB_SMP_DIRECTION) == IB_SMP_DIRECTION); +} + +#endif /* IB_SMI_H */ From roland@topspin.com Mon Dec 27 21:49:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:39 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) 
by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDi025948 for ; Mon, 27 Dec 2004 21:49:43 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:05 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:05 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG8-0000tr-72; Mon, 27 Dec 2004 21:51:05 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.HKpCNDFOMg5KO6kB@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:04 -0800 Message-Id: <200412272151.5KRaOFYNt0hYYQgh@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][10/24] Add Mellanox HCA low-level driver (midlayer interface) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:05.0782 (UTC) FILETIME=[38AABD60:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13105 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add midlayer interface code for Mellanox HCA driver. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_provider.c 2004-12-27 21:48:22.043721469 -0800 @@ -0,0 +1,627 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_provider.c 1397 2004-12-28 05:09:00Z roland $ + */ + +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +static int mthca_query_device(struct ib_device *ibdev, + struct ib_device_attr *props) +{ + struct ib_smp *in_mad = NULL; + struct ib_smp *out_mad = NULL; + int err = -ENOMEM; + u8 status; + + in_mad = kmalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) + goto out; + + props->fw_ver = to_mdev(ibdev)->fw_ver; + + memset(in_mad, 0, sizeof *in_mad); + in_mad->base_version = 1; + in_mad->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; + in_mad->class_version = 1; + in_mad->method = IB_MGMT_METHOD_GET; + in_mad->attr_id = IB_SMP_ATTR_NODE_INFO; + + err = mthca_MAD_IFC(to_mdev(ibdev), 1, + 1, in_mad, out_mad, + &status); + if (err) + goto out; + if (status) { + err = -EINVAL; + goto out; + } + + props->vendor_id = be32_to_cpup((u32 *) (out_mad->data + 36)) & + 0xffffff; + props->vendor_part_id = be16_to_cpup((u16 *) (out_mad->data + 30)); + props->hw_ver = be16_to_cpup((u16 *) (out_mad->data + 32)); + memcpy(&props->sys_image_guid, out_mad->data + 4, 8); + memcpy(&props->node_guid, out_mad->data + 12, 8); + + err = 0; + out: + kfree(in_mad); + kfree(out_mad); + return err; +} + +static int mthca_query_port(struct ib_device *ibdev, + u8 port, struct ib_port_attr *props) +{ + struct ib_smp *in_mad = NULL; + struct ib_smp *out_mad = NULL; + int err = -ENOMEM; + u8 status; + + in_mad = kmalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) + goto out; + + memset(in_mad, 0, sizeof *in_mad); + in_mad->base_version = 1; + in_mad->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; + in_mad->class_version = 1; + in_mad->method = IB_MGMT_METHOD_GET; + in_mad->attr_id = IB_SMP_ATTR_PORT_INFO; + in_mad->attr_mod = cpu_to_be32(port); + + err = mthca_MAD_IFC(to_mdev(ibdev), 1, + port, in_mad, out_mad, + &status); + if (err) + goto out; + if 
(status) { + err = -EINVAL; + goto out; + } + + props->lid = be16_to_cpup((u16 *) (out_mad->data + 16)); + props->lmc = out_mad->data[34] & 0x7; + props->sm_lid = be16_to_cpup((u16 *) (out_mad->data + 18)); + props->sm_sl = out_mad->data[36] & 0xf; + props->state = out_mad->data[32] & 0xf; + props->port_cap_flags = be32_to_cpup((u32 *) (out_mad->data + 20)); + props->gid_tbl_len = to_mdev(ibdev)->limits.gid_table_len; + props->pkey_tbl_len = to_mdev(ibdev)->limits.pkey_table_len; + props->qkey_viol_cntr = be16_to_cpup((u16 *) (out_mad->data + 48)); + props->active_width = out_mad->data[31] & 0xf; + props->active_speed = out_mad->data[35] >> 4; + + out: + kfree(in_mad); + kfree(out_mad); + return err; +} + +static int mthca_modify_port(struct ib_device *ibdev, + u8 port, int port_modify_mask, + struct ib_port_modify *props) +{ + return 0; +} + +static int mthca_query_pkey(struct ib_device *ibdev, + u8 port, u16 index, u16 *pkey) +{ + struct ib_smp *in_mad = NULL; + struct ib_smp *out_mad = NULL; + int err = -ENOMEM; + u8 status; + + in_mad = kmalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) + goto out; + + memset(in_mad, 0, sizeof *in_mad); + in_mad->base_version = 1; + in_mad->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; + in_mad->class_version = 1; + in_mad->method = IB_MGMT_METHOD_GET; + in_mad->attr_id = IB_SMP_ATTR_PKEY_TABLE; + in_mad->attr_mod = cpu_to_be32(index / 32); + + err = mthca_MAD_IFC(to_mdev(ibdev), 1, + port, in_mad, out_mad, + &status); + if (err) + goto out; + if (status) { + err = -EINVAL; + goto out; + } + + *pkey = be16_to_cpu(((u16 *) out_mad->data)[index % 32]); + + out: + kfree(in_mad); + kfree(out_mad); + return err; +} + +static int mthca_query_gid(struct ib_device *ibdev, u8 port, + int index, union ib_gid *gid) +{ + struct ib_smp *in_mad = NULL; + struct ib_smp *out_mad = NULL; + int err = -ENOMEM; + u8 status; + + in_mad = kmalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = 
kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) + goto out; + + memset(in_mad, 0, sizeof *in_mad); + in_mad->base_version = 1; + in_mad->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; + in_mad->class_version = 1; + in_mad->method = IB_MGMT_METHOD_GET; + in_mad->attr_id = IB_SMP_ATTR_PORT_INFO; + in_mad->attr_mod = cpu_to_be32(port); + + err = mthca_MAD_IFC(to_mdev(ibdev), 1, + port, in_mad, out_mad, + &status); + if (err) + goto out; + if (status) { + err = -EINVAL; + goto out; + } + + memcpy(gid->raw, out_mad->data + 8, 8); + + memset(in_mad, 0, sizeof *in_mad); + in_mad->base_version = 1; + in_mad->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; + in_mad->class_version = 1; + in_mad->method = IB_MGMT_METHOD_GET; + in_mad->attr_id = IB_SMP_ATTR_GUID_INFO; + in_mad->attr_mod = cpu_to_be32(index / 8); + + err = mthca_MAD_IFC(to_mdev(ibdev), 1, + port, in_mad, out_mad, + &status); + if (err) + goto out; + if (status) { + err = -EINVAL; + goto out; + } + + memcpy(gid->raw + 8, out_mad->data + (index % 8) * 16, 8); + + out: + kfree(in_mad); + kfree(out_mad); + return err; +} + +static struct ib_pd *mthca_alloc_pd(struct ib_device *ibdev) +{ + struct mthca_pd *pd; + int err; + + pd = kmalloc(sizeof *pd, GFP_KERNEL); + if (!pd) + return ERR_PTR(-ENOMEM); + + err = mthca_pd_alloc(to_mdev(ibdev), pd); + if (err) { + kfree(pd); + return ERR_PTR(err); + } + + return &pd->ibpd; +} + +static int mthca_dealloc_pd(struct ib_pd *pd) +{ + mthca_pd_free(to_mdev(pd->device), to_mpd(pd)); + kfree(pd); + + return 0; +} + +static struct ib_ah *mthca_ah_create(struct ib_pd *pd, + struct ib_ah_attr *ah_attr) +{ + int err; + struct mthca_ah *ah; + + ah = kmalloc(sizeof *ah, GFP_KERNEL); + if (!ah) + return ERR_PTR(-ENOMEM); + + err = mthca_create_ah(to_mdev(pd->device), to_mpd(pd), ah_attr, ah); + if (err) { + kfree(ah); + return ERR_PTR(err); + } + + return &ah->ibah; +} + +static int mthca_ah_destroy(struct ib_ah *ah) +{ + mthca_destroy_ah(to_mdev(ah->device), to_mah(ah)); + 
kfree(ah); + + return 0; +} + +static struct ib_qp *mthca_create_qp(struct ib_pd *pd, + struct ib_qp_init_attr *init_attr) +{ + struct mthca_qp *qp; + int err; + + switch (init_attr->qp_type) { + case IB_QPT_RC: + case IB_QPT_UC: + case IB_QPT_UD: + { + qp = kmalloc(sizeof *qp, GFP_KERNEL); + if (!qp) + return ERR_PTR(-ENOMEM); + + qp->sq.max = init_attr->cap.max_send_wr; + qp->rq.max = init_attr->cap.max_recv_wr; + qp->sq.max_gs = init_attr->cap.max_send_sge; + qp->rq.max_gs = init_attr->cap.max_recv_sge; + + err = mthca_alloc_qp(to_mdev(pd->device), to_mpd(pd), + to_mcq(init_attr->send_cq), + to_mcq(init_attr->recv_cq), + init_attr->qp_type, init_attr->sq_sig_type, + init_attr->rq_sig_type, qp); + qp->ibqp.qp_num = qp->qpn; + break; + } + case IB_QPT_SMI: + case IB_QPT_GSI: + { + qp = kmalloc(sizeof (struct mthca_sqp), GFP_KERNEL); + if (!qp) + return ERR_PTR(-ENOMEM); + + qp->sq.max = init_attr->cap.max_send_wr; + qp->rq.max = init_attr->cap.max_recv_wr; + qp->sq.max_gs = init_attr->cap.max_send_sge; + qp->rq.max_gs = init_attr->cap.max_recv_sge; + + qp->ibqp.qp_num = init_attr->qp_type == IB_QPT_SMI ? 
0 : 1; + + err = mthca_alloc_sqp(to_mdev(pd->device), to_mpd(pd), + to_mcq(init_attr->send_cq), + to_mcq(init_attr->recv_cq), + init_attr->sq_sig_type, init_attr->rq_sig_type, + qp->ibqp.qp_num, init_attr->port_num, + to_msqp(qp)); + break; + } + default: + /* Don't support raw QPs */ + return ERR_PTR(-ENOSYS); + } + + if (err) { + kfree(qp); + return ERR_PTR(err); + } + + init_attr->cap.max_inline_data = 0; + + return &qp->ibqp; +} + +static int mthca_destroy_qp(struct ib_qp *qp) +{ + mthca_free_qp(to_mdev(qp->device), to_mqp(qp)); + kfree(qp); + return 0; +} + +static struct ib_cq *mthca_create_cq(struct ib_device *ibdev, int entries) +{ + struct mthca_cq *cq; + int nent; + int err; + + cq = kmalloc(sizeof *cq, GFP_KERNEL); + if (!cq) + return ERR_PTR(-ENOMEM); + + for (nent = 1; nent <= entries; nent <<= 1) + ; /* nothing */ + + err = mthca_init_cq(to_mdev(ibdev), nent, cq); + if (err) { + kfree(cq); + cq = ERR_PTR(err); + } else + cq->ibcq.cqe = nent - 1; + + return &cq->ibcq; +} + +static int mthca_destroy_cq(struct ib_cq *cq) +{ + mthca_free_cq(to_mdev(cq->device), to_mcq(cq)); + kfree(cq); + + return 0; +} + +static int mthca_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify notify) +{ + mthca_arm_cq(to_mdev(cq->device), to_mcq(cq), + notify == IB_CQ_SOLICITED); + return 0; +} + +static inline u32 convert_access(int acc) +{ + return (acc & IB_ACCESS_REMOTE_ATOMIC ? MTHCA_MPT_FLAG_ATOMIC : 0) | + (acc & IB_ACCESS_REMOTE_WRITE ? MTHCA_MPT_FLAG_REMOTE_WRITE : 0) | + (acc & IB_ACCESS_REMOTE_READ ? MTHCA_MPT_FLAG_REMOTE_READ : 0) | + (acc & IB_ACCESS_LOCAL_WRITE ? 
MTHCA_MPT_FLAG_LOCAL_WRITE : 0) | + MTHCA_MPT_FLAG_LOCAL_READ; +} + +static struct ib_mr *mthca_get_dma_mr(struct ib_pd *pd, int acc) +{ + struct mthca_mr *mr; + int err; + + mr = kmalloc(sizeof *mr, GFP_KERNEL); + if (!mr) + return ERR_PTR(-ENOMEM); + + err = mthca_mr_alloc_notrans(to_mdev(pd->device), + to_mpd(pd)->pd_num, + convert_access(acc), mr); + + if (err) { + kfree(mr); + return ERR_PTR(err); + } + + return &mr->ibmr; +} + +static struct ib_mr *mthca_reg_phys_mr(struct ib_pd *pd, + struct ib_phys_buf *buffer_list, + int num_phys_buf, + int acc, + u64 *iova_start) +{ + struct mthca_mr *mr; + u64 *page_list; + u64 total_size; + u64 mask; + int shift; + int npages; + int err; + int i, j, n; + + /* First check that we have enough alignment */ + if ((*iova_start & ~PAGE_MASK) != (buffer_list[0].addr & ~PAGE_MASK)) + return ERR_PTR(-EINVAL); + + if (num_phys_buf > 1 && + ((buffer_list[0].addr + buffer_list[0].size) & ~PAGE_MASK)) + return ERR_PTR(-EINVAL); + + mask = 0; + total_size = 0; + for (i = 0; i < num_phys_buf; ++i) { + if (buffer_list[i].addr & ~PAGE_MASK) + return ERR_PTR(-EINVAL); + if (i != 0 && i != num_phys_buf - 1 && + (buffer_list[i].size & ~PAGE_MASK)) + return ERR_PTR(-EINVAL); + + total_size += buffer_list[i].size; + if (i > 0) + mask |= buffer_list[i].addr; + } + + /* Find largest page shift we can use to cover buffers */ + for (shift = PAGE_SHIFT; shift < 31; ++shift) + if (num_phys_buf > 1) { + if ((1ULL << shift) & mask) + break; + } else { + if (1ULL << shift >= + buffer_list[0].size + + (buffer_list[0].addr & ((1ULL << shift) - 1))) + break; + } + + buffer_list[0].size += buffer_list[0].addr & ((1ULL << shift) - 1); + buffer_list[0].addr &= ~0ull << shift; + + mr = kmalloc(sizeof *mr, GFP_KERNEL); + if (!mr) + return ERR_PTR(-ENOMEM); + + npages = 0; + for (i = 0; i < num_phys_buf; ++i) + npages += (buffer_list[i].size + (1ULL << shift) - 1) >> shift; + + if (!npages) + return &mr->ibmr; + + page_list = kmalloc(npages * sizeof 
*page_list, GFP_KERNEL); + if (!page_list) { + kfree(mr); + return ERR_PTR(-ENOMEM); + } + + n = 0; + for (i = 0; i < num_phys_buf; ++i) + for (j = 0; + j < (buffer_list[i].size + (1ULL << shift) - 1) >> shift; + ++j) + page_list[n++] = buffer_list[i].addr + ((u64) j << shift); + + mthca_dbg(to_mdev(pd->device), "Registering memory at %llx (iova %llx) " + "in PD %x; shift %d, npages %d.\n", + (unsigned long long) buffer_list[0].addr, + (unsigned long long) *iova_start, + to_mpd(pd)->pd_num, + shift, npages); + + err = mthca_mr_alloc_phys(to_mdev(pd->device), + to_mpd(pd)->pd_num, + page_list, shift, npages, + *iova_start, total_size, + convert_access(acc), mr); + + if (err) { + kfree(mr); + return ERR_PTR(err); + } + + kfree(page_list); + return &mr->ibmr; +} + +static int mthca_dereg_mr(struct ib_mr *mr) +{ + mthca_free_mr(to_mdev(mr->device), to_mmr(mr)); + kfree(mr); + return 0; +} + +static ssize_t show_rev(struct class_device *cdev, char *buf) +{ + struct mthca_dev *dev = container_of(cdev, struct mthca_dev, ib_dev.class_dev); + return sprintf(buf, "%x\n", dev->rev_id); +} + +static ssize_t show_fw_ver(struct class_device *cdev, char *buf) +{ + struct mthca_dev *dev = container_of(cdev, struct mthca_dev, ib_dev.class_dev); + return sprintf(buf, "%x.%x.%x\n", (int) (dev->fw_ver >> 32), + (int) (dev->fw_ver >> 16) & 0xffff, + (int) dev->fw_ver & 0xffff); +} + +static ssize_t show_hca(struct class_device *cdev, char *buf) +{ + struct mthca_dev *dev = container_of(cdev, struct mthca_dev, ib_dev.class_dev); + switch (dev->hca_type) { + case TAVOR: return sprintf(buf, "MT23108\n"); + case ARBEL_COMPAT: return sprintf(buf, "MT25208 (MT23108 compat mode)\n"); + case ARBEL_NATIVE: return sprintf(buf, "MT25208\n"); + default: return sprintf(buf, "unknown\n"); + } +} + +static CLASS_DEVICE_ATTR(hw_rev, S_IRUGO, show_rev, NULL); +static CLASS_DEVICE_ATTR(fw_ver, S_IRUGO, show_fw_ver, NULL); +static CLASS_DEVICE_ATTR(hca_type, S_IRUGO, show_hca, NULL); + +static struct 
class_device_attribute *mthca_class_attributes[] = { + &class_device_attr_hw_rev, + &class_device_attr_fw_ver, + &class_device_attr_hca_type +}; + +int mthca_register_device(struct mthca_dev *dev) +{ + int ret; + int i; + + strlcpy(dev->ib_dev.name, "mthca%d", IB_DEVICE_NAME_MAX); + dev->ib_dev.node_type = IB_NODE_CA; + dev->ib_dev.phys_port_cnt = dev->limits.num_ports; + dev->ib_dev.dma_device = &dev->pdev->dev; + dev->ib_dev.class_dev.dev = &dev->pdev->dev; + dev->ib_dev.query_device = mthca_query_device; + dev->ib_dev.query_port = mthca_query_port; + dev->ib_dev.modify_port = mthca_modify_port; + dev->ib_dev.query_pkey = mthca_query_pkey; + dev->ib_dev.query_gid = mthca_query_gid; + dev->ib_dev.alloc_pd = mthca_alloc_pd; + dev->ib_dev.dealloc_pd = mthca_dealloc_pd; + dev->ib_dev.create_ah = mthca_ah_create; + dev->ib_dev.destroy_ah = mthca_ah_destroy; + dev->ib_dev.create_qp = mthca_create_qp; + dev->ib_dev.modify_qp = mthca_modify_qp; + dev->ib_dev.destroy_qp = mthca_destroy_qp; + dev->ib_dev.post_send = mthca_post_send; + dev->ib_dev.post_recv = mthca_post_receive; + dev->ib_dev.create_cq = mthca_create_cq; + dev->ib_dev.destroy_cq = mthca_destroy_cq; + dev->ib_dev.poll_cq = mthca_poll_cq; + dev->ib_dev.req_notify_cq = mthca_req_notify_cq; + dev->ib_dev.get_dma_mr = mthca_get_dma_mr; + dev->ib_dev.reg_phys_mr = mthca_reg_phys_mr; + dev->ib_dev.dereg_mr = mthca_dereg_mr; + dev->ib_dev.attach_mcast = mthca_multicast_attach; + dev->ib_dev.detach_mcast = mthca_multicast_detach; + dev->ib_dev.process_mad = mthca_process_mad; + + ret = ib_register_device(&dev->ib_dev); + if (ret) + return ret; + + for (i = 0; i < ARRAY_SIZE(mthca_class_attributes); ++i) { + ret = class_device_create_file(&dev->ib_dev.class_dev, + mthca_class_attributes[i]); + if (ret) { + ib_unregister_device(&dev->ib_dev); + return ret; + } + } + + return 0; +} + +void mthca_unregister_device(struct mthca_dev *dev) +{ + ib_unregister_device(&dev->ib_dev); +} --- /dev/null 1970-01-01 
00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_provider.h 2004-12-27 21:48:22.091714405 -0800 @@ -0,0 +1,225 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_provider.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef MTHCA_PROVIDER_H +#define MTHCA_PROVIDER_H + +#include +#include + +#define MTHCA_MPT_FLAG_ATOMIC (1 << 14) +#define MTHCA_MPT_FLAG_REMOTE_WRITE (1 << 13) +#define MTHCA_MPT_FLAG_REMOTE_READ (1 << 12) +#define MTHCA_MPT_FLAG_LOCAL_WRITE (1 << 11) +#define MTHCA_MPT_FLAG_LOCAL_READ (1 << 10) + +struct mthca_buf_list { + void *buf; + DECLARE_PCI_UNMAP_ADDR(mapping) +}; + +struct mthca_mr { + struct ib_mr ibmr; + int order; + u32 first_seg; +}; + +struct mthca_pd { + struct ib_pd ibpd; + u32 pd_num; + atomic_t sqp_count; + struct mthca_mr ntmr; +}; + +struct mthca_eq { + struct mthca_dev *dev; + int eqn; + u32 ecr_mask; + u16 msi_x_vector; + u16 msi_x_entry; + int have_irq; + int nent; + int cons_index; + struct mthca_buf_list *page_list; + struct mthca_mr mr; +}; + +struct mthca_av; + +struct mthca_ah { + struct ib_ah ibah; + int on_hca; + u32 key; + struct mthca_av *av; + dma_addr_t avdma; +}; + +/* + * Quick description of our CQ/QP locking scheme: + * + * We have one global lock that protects dev->cq/qp_table. Each + * struct mthca_cq/qp also has its own lock. An individual qp lock + * may be taken inside of an individual cq lock. Both cqs attached to + * a qp may be locked, with the send cq locked first. No other + * nesting should be done. + * + * Each struct mthca_cq/qp also has an atomic_t ref count. The + * pointer from the cq/qp_table to the struct counts as one reference. + * This reference also is good for access through the consumer API, so + * modifying the CQ/QP etc doesn't need to take another reference. + * Access because of a completion being polled does need a reference. + * + * Finally, each struct mthca_cq/qp has a wait_queue_head_t for the + * destroy function to sleep on. + * + * This means that access from the consumer API requires nothing but + * taking the struct's lock. 
+ * + * Access because of a completion event should go as follows: + * - lock cq/qp_table and look up struct + * - increment ref count in struct + * - drop cq/qp_table lock + * - lock struct, do your thing, and unlock struct + * - decrement ref count; if zero, wake up waiters + * + * To destroy a CQ/QP, we can do the following: + * - lock cq/qp_table, remove pointer, unlock cq/qp_table lock + * - decrement ref count + * - wait_event until ref count is zero + * + * It is the consumer's responsibility to make sure that no QP + * operations (WQE posting or state modification) are pending when the + * QP is destroyed. Also, the consumer must make sure that calls to + * qp_modify are serialized. + * + * Possible optimizations (wait for profile data to see if/where we + * have locks bouncing between CPUs): + * - split cq/qp table lock into n separate (cache-aligned) locks, + * indexed (say) by the page in the table + * - split QP struct lock into three (one for common info, one for the + * send queue and one for the receive queue) + */ + +struct mthca_cq { + struct ib_cq ibcq; + spinlock_t lock; + atomic_t refcount; + int cqn; + int cons_index; + int is_direct; + union { + struct mthca_buf_list direct; + struct mthca_buf_list *page_list; + } queue; + struct mthca_mr mr; + wait_queue_head_t wait; +}; + +struct mthca_wq { + int max; + int cur; + int next; + int last_comp; + void *last; + int max_gs; + int wqe_shift; + enum ib_sig_type policy; +}; + +struct mthca_qp { + struct ib_qp ibqp; + spinlock_t lock; + atomic_t refcount; + u32 qpn; + int transport; + enum ib_qp_state state; + int is_direct; + struct mthca_mr mr; + + struct mthca_wq rq; + struct mthca_wq sq; + int send_wqe_offset; + + u64 *wrid; + union { + struct mthca_buf_list direct; + struct mthca_buf_list *page_list; + } queue; + + wait_queue_head_t wait; +}; + +struct mthca_sqp { + struct mthca_qp qp; + int port; + int pkey_index; + u32 qkey; + u32 send_psn; + struct ib_ud_header ud_header; + int header_buf_size; 
+ void *header_buf; + dma_addr_t header_dma; +}; + +static inline struct mthca_mr *to_mmr(struct ib_mr *ibmr) +{ + return container_of(ibmr, struct mthca_mr, ibmr); +} + +static inline struct mthca_pd *to_mpd(struct ib_pd *ibpd) +{ + return container_of(ibpd, struct mthca_pd, ibpd); +} + +static inline struct mthca_ah *to_mah(struct ib_ah *ibah) +{ + return container_of(ibah, struct mthca_ah, ibah); +} + +static inline struct mthca_cq *to_mcq(struct ib_cq *ibcq) +{ + return container_of(ibcq, struct mthca_cq, ibcq); +} + +static inline struct mthca_qp *to_mqp(struct ib_qp *ibqp) +{ + return container_of(ibqp, struct mthca_qp, ibqp); +} + +static inline struct mthca_sqp *to_msqp(struct mthca_qp *qp) +{ + return container_of(qp, struct mthca_sqp, qp); +} + +#endif /* MTHCA_PROVIDER_H */ From roland@topspin.com Mon Dec 27 21:49:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:46 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDs025948 for ; Mon, 27 Dec 2004 21:49:47 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:13 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:13 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGG-0000uo-8s; Mon, 27 Dec 2004 21:51:13 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.qdu1iD71iJs65qnW@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:12 -0800 Message-Id: <200412272151.KJdDvki21LuSsIo9@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][15/24] Add Mellanox HCA low-level driver (last bits) 
Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:13.0610 (UTC) FILETIME=[3D5532A0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13107 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add code for remaining InfiniBand objects (address vectors, multicast groups, memory regions and protection domains) Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_av.c 2004-12-27 21:48:23.889449784 -0800 @@ -0,0 +1,219 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_av.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include + +#include +#include + +#include "mthca_dev.h" + +struct mthca_av { + u32 port_pd; + u8 reserved1; + u8 g_slid; + u16 dlid; + u8 reserved2; + u8 gid_index; + u8 msg_sr; + u8 hop_limit; + u32 sl_tclass_flowlabel; + u32 dgid[4]; +}; + +int mthca_create_ah(struct mthca_dev *dev, + struct mthca_pd *pd, + struct ib_ah_attr *ah_attr, + struct mthca_ah *ah) +{ + u32 index = -1; + struct mthca_av *av = NULL; + + ah->on_hca = 0; + + if (!atomic_read(&pd->sqp_count) && + !(dev->mthca_flags & MTHCA_FLAG_DDR_HIDDEN)) { + index = mthca_alloc(&dev->av_table.alloc); + + /* fall back to allocate in host memory */ + if (index == -1) + goto host_alloc; + + av = kmalloc(sizeof *av, GFP_KERNEL); + if (!av) + goto host_alloc; + + ah->on_hca = 1; + ah->avdma = dev->av_table.ddr_av_base + + index * MTHCA_AV_SIZE; + } + + host_alloc: + if (!ah->on_hca) { + ah->av = pci_pool_alloc(dev->av_table.pool, + SLAB_KERNEL, &ah->avdma); + if (!ah->av) + return -ENOMEM; + + av = ah->av; + } + + ah->key = pd->ntmr.ibmr.lkey; + + memset(av, 0, MTHCA_AV_SIZE); + + av->port_pd = cpu_to_be32(pd->pd_num | (ah_attr->port_num << 24)); + av->g_slid = ah_attr->src_path_bits; + av->dlid = cpu_to_be16(ah_attr->dlid); + av->msg_sr = (3 << 4) | /* 2K message */ + ah_attr->static_rate; + av->sl_tclass_flowlabel = cpu_to_be32(ah_attr->sl << 28); + if (ah_attr->ah_flags & IB_AH_GRH) { + av->g_slid |= 0x80; + av->gid_index = (ah_attr->port_num - 1) * 
dev->limits.gid_table_len + + ah_attr->grh.sgid_index; + av->hop_limit = ah_attr->grh.hop_limit; + av->sl_tclass_flowlabel |= + cpu_to_be32((ah_attr->grh.traffic_class << 20) | + ah_attr->grh.flow_label); + memcpy(av->dgid, ah_attr->grh.dgid.raw, 16); + } else { + /* Arbel workaround -- low byte of GID must be 2 */ + av->dgid[3] = cpu_to_be32(2); + } + + if (0) { + int j; + + mthca_dbg(dev, "Created UDAV at %p/%08lx:\n", + av, (unsigned long) ah->avdma); + for (j = 0; j < 8; ++j) + printk(KERN_DEBUG " [%2x] %08x\n", + j * 4, be32_to_cpu(((u32 *) av)[j])); + } + + if (ah->on_hca) { + memcpy_toio(dev->av_table.av_map + index * MTHCA_AV_SIZE, + av, MTHCA_AV_SIZE); + kfree(av); + } + + return 0; +} + +int mthca_destroy_ah(struct mthca_dev *dev, struct mthca_ah *ah) +{ + if (ah->on_hca) + mthca_free(&dev->av_table.alloc, + (ah->avdma - dev->av_table.ddr_av_base) / + MTHCA_AV_SIZE); + else + pci_pool_free(dev->av_table.pool, ah->av, ah->avdma); + + return 0; +} + +int mthca_read_ah(struct mthca_dev *dev, struct mthca_ah *ah, + struct ib_ud_header *header) +{ + if (ah->on_hca) + return -EINVAL; + + header->lrh.service_level = be32_to_cpu(ah->av->sl_tclass_flowlabel) >> 28; + header->lrh.destination_lid = ah->av->dlid; + header->lrh.source_lid = ah->av->g_slid & 0x7f; + if (ah->av->g_slid & 0x80) { + header->grh_present = 1; + header->grh.traffic_class = + (be32_to_cpu(ah->av->sl_tclass_flowlabel) >> 20) & 0xff; + header->grh.flow_label = + ah->av->sl_tclass_flowlabel & cpu_to_be32(0xfffff); + ib_cached_gid_get(&dev->ib_dev, + be32_to_cpu(ah->av->port_pd) >> 24, + ah->av->gid_index, + &header->grh.source_gid); + memcpy(header->grh.destination_gid.raw, + ah->av->dgid, 16); + } else { + header->grh_present = 0; + } + + return 0; +} + +int __devinit mthca_init_av_table(struct mthca_dev *dev) +{ + int err; + + err = mthca_alloc_init(&dev->av_table.alloc, + dev->av_table.num_ddr_avs, + dev->av_table.num_ddr_avs - 1, + 0); + if (err) + return err; + + dev->av_table.pool = 
pci_pool_create("mthca_av", dev->pdev, + MTHCA_AV_SIZE, + MTHCA_AV_SIZE, 0); + if (!dev->av_table.pool) + goto out_free_alloc; + + if (!(dev->mthca_flags & MTHCA_FLAG_DDR_HIDDEN)) { + dev->av_table.av_map = ioremap(pci_resource_start(dev->pdev, 4) + + dev->av_table.ddr_av_base - + dev->ddr_start, + dev->av_table.num_ddr_avs * + MTHCA_AV_SIZE); + if (!dev->av_table.av_map) + goto out_free_pool; + } else + dev->av_table.av_map = NULL; + + return 0; + + out_free_pool: + pci_pool_destroy(dev->av_table.pool); + + out_free_alloc: + mthca_alloc_cleanup(&dev->av_table.alloc); + return -ENOMEM; +} + +void __devexit mthca_cleanup_av_table(struct mthca_dev *dev) +{ + if (dev->av_table.av_map) + iounmap(dev->av_table.av_map); + pci_pool_destroy(dev->av_table.pool); + mthca_alloc_cleanup(&dev->av_table.alloc); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_mcg.c 2004-12-27 21:48:23.936442867 -0800 @@ -0,0 +1,376 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_mcg.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +enum { + MTHCA_QP_PER_MGM = 4 * (MTHCA_MGM_ENTRY_SIZE / 16 - 2) +}; + +struct mthca_mgm { + u32 next_gid_index; + u32 reserved[3]; + u8 gid[16]; + u32 qp[MTHCA_QP_PER_MGM]; +}; + +static const u8 zero_gid[16]; /* automatically initialized to 0 */ + +/* + * Caller must hold MCG table semaphore. gid and mgm parameters must + * be properly aligned for command interface. + * + * Returns 0 unless a firmware command error occurs. + * + * If GID is found in MGM or MGM is empty, *index = *hash, *prev = -1 + * and *mgm holds MGM entry. + * + * If GID is found in AMGM, *index = index in AMGM, *prev = index of + * previous entry in hash chain and *mgm holds AMGM entry. + * + * If no AMGM exists for given gid, *index = -1, *prev = index of last + * entry in hash chain and *mgm holds end of hash chain. 
+ */ +static int find_mgm(struct mthca_dev *dev, + u8 *gid, struct mthca_mgm *mgm, + u16 *hash, int *prev, int *index) +{ + void *mailbox; + u8 *mgid; + int err; + u8 status; + + mailbox = kmalloc(16 + MTHCA_CMD_MAILBOX_EXTRA, GFP_KERNEL); + if (!mailbox) + return -ENOMEM; + mgid = MAILBOX_ALIGN(mailbox); + + memcpy(mgid, gid, 16); + + err = mthca_MGID_HASH(dev, mgid, hash, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "MGID_HASH returned status %02x\n", status); + err = -EINVAL; + goto out; + } + + if (0) + mthca_dbg(dev, "Hash for %04x:%04x:%04x:%04x:" + "%04x:%04x:%04x:%04x is %04x\n", + be16_to_cpu(((u16 *) gid)[0]), be16_to_cpu(((u16 *) gid)[1]), + be16_to_cpu(((u16 *) gid)[2]), be16_to_cpu(((u16 *) gid)[3]), + be16_to_cpu(((u16 *) gid)[4]), be16_to_cpu(((u16 *) gid)[5]), + be16_to_cpu(((u16 *) gid)[6]), be16_to_cpu(((u16 *) gid)[7]), + *hash); + + *index = *hash; + *prev = -1; + + do { + err = mthca_READ_MGM(dev, *index, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "READ_MGM returned status %02x\n", status); + return -EINVAL; + } + + if (!memcmp(mgm->gid, zero_gid, 16)) { + if (*index != *hash) { + mthca_err(dev, "Found zero MGID in AMGM.\n"); + err = -EINVAL; + } + goto out; + } + + if (!memcmp(mgm->gid, gid, 16)) + goto out; + + *prev = *index; + *index = be32_to_cpu(mgm->next_gid_index) >> 5; + } while (*index); + + *index = -1; + + out: + kfree(mailbox); + return err; +} + +int mthca_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + struct mthca_dev *dev = to_mdev(ibqp->device); + void *mailbox; + struct mthca_mgm *mgm; + u16 hash; + int index, prev; + int link = 0; + int i; + int err; + u8 status; + + mailbox = kmalloc(sizeof *mgm + MTHCA_CMD_MAILBOX_EXTRA, GFP_KERNEL); + if (!mailbox) + return -ENOMEM; + mgm = MAILBOX_ALIGN(mailbox); + + if (down_interruptible(&dev->mcg_table.sem)) + return -EINTR; + + err = find_mgm(dev, gid->raw, mgm, &hash, &prev, &index); + if (err) + goto out; + + if 
(index != -1) { + if (!memcmp(mgm->gid, zero_gid, 16)) + memcpy(mgm->gid, gid->raw, 16); + } else { + link = 1; + + index = mthca_alloc(&dev->mcg_table.alloc); + if (index == -1) { + mthca_err(dev, "No AMGM entries left\n"); + err = -ENOMEM; + goto out; + } + + err = mthca_READ_MGM(dev, index, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "READ_MGM returned status %02x\n", status); + err = -EINVAL; + goto out; + } + + memcpy(mgm->gid, gid->raw, 16); + mgm->next_gid_index = 0; + } + + for (i = 0; i < MTHCA_QP_PER_MGM; ++i) + if (!(mgm->qp[i] & cpu_to_be32(1 << 31))) { + mgm->qp[i] = cpu_to_be32(ibqp->qp_num | (1 << 31)); + break; + } + + if (i == MTHCA_QP_PER_MGM) { + mthca_err(dev, "MGM at index %x is full.\n", index); + err = -ENOMEM; + goto out; + } + + err = mthca_WRITE_MGM(dev, index, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "WRITE_MGM returned status %02x\n", status); + err = -EINVAL; + } + + if (!link) + goto out; + + err = mthca_READ_MGM(dev, prev, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "READ_MGM returned status %02x\n", status); + err = -EINVAL; + goto out; + } + + mgm->next_gid_index = cpu_to_be32(index << 5); + + err = mthca_WRITE_MGM(dev, prev, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "WRITE_MGM returned status %02x\n", status); + err = -EINVAL; + } + + out: + up(&dev->mcg_table.sem); + kfree(mailbox); + return err; +} + +int mthca_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + struct mthca_dev *dev = to_mdev(ibqp->device); + void *mailbox; + struct mthca_mgm *mgm; + u16 hash; + int prev, index; + int i, loc; + int err; + u8 status; + + mailbox = kmalloc(sizeof *mgm + MTHCA_CMD_MAILBOX_EXTRA, GFP_KERNEL); + if (!mailbox) + return -ENOMEM; + mgm = MAILBOX_ALIGN(mailbox); + + if (down_interruptible(&dev->mcg_table.sem)) + return -EINTR; + + err = find_mgm(dev, gid->raw, mgm, &hash, &prev, &index); + if (err) + goto 
out; + + if (index == -1) { + mthca_err(dev, "MGID %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x " + "not found\n", + be16_to_cpu(((u16 *) gid->raw)[0]), + be16_to_cpu(((u16 *) gid->raw)[1]), + be16_to_cpu(((u16 *) gid->raw)[2]), + be16_to_cpu(((u16 *) gid->raw)[3]), + be16_to_cpu(((u16 *) gid->raw)[4]), + be16_to_cpu(((u16 *) gid->raw)[5]), + be16_to_cpu(((u16 *) gid->raw)[6]), + be16_to_cpu(((u16 *) gid->raw)[7])); + err = -EINVAL; + goto out; + } + + for (loc = -1, i = 0; i < MTHCA_QP_PER_MGM; ++i) { + if (mgm->qp[i] == cpu_to_be32(ibqp->qp_num | (1 << 31))) + loc = i; + if (!(mgm->qp[i] & cpu_to_be32(1 << 31))) + break; + } + + if (loc == -1) { + mthca_err(dev, "QP %06x not found in MGM\n", ibqp->qp_num); + err = -EINVAL; + goto out; + } + + mgm->qp[loc] = mgm->qp[i - 1]; + mgm->qp[i - 1] = 0; + + err = mthca_WRITE_MGM(dev, index, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "WRITE_MGM returned status %02x\n", status); + err = -EINVAL; + goto out; + } + + if (i != 1) + goto out; + + goto out; + + if (prev == -1) { + /* Remove entry from MGM */ + if (be32_to_cpu(mgm->next_gid_index) >> 5) { + err = mthca_READ_MGM(dev, + be32_to_cpu(mgm->next_gid_index) >> 5, + mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "READ_MGM returned status %02x\n", + status); + err = -EINVAL; + goto out; + } + } else + memset(mgm->gid, 0, 16); + + err = mthca_WRITE_MGM(dev, index, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "WRITE_MGM returned status %02x\n", status); + err = -EINVAL; + goto out; + } + } else { + /* Remove entry from AMGM */ + index = be32_to_cpu(mgm->next_gid_index) >> 5; + err = mthca_READ_MGM(dev, prev, mgm, &status); + if (err) + goto out; + if (status) { + mthca_err(dev, "READ_MGM returned status %02x\n", status); + err = -EINVAL; + goto out; + } + + mgm->next_gid_index = cpu_to_be32(index << 5); + + err = mthca_WRITE_MGM(dev, prev, mgm, &status); + if (err) + goto out; + if (status) { + 
mthca_err(dev, "WRITE_MGM returned status %02x\n", status); + err = -EINVAL; + goto out; + } + } + + out: + up(&dev->mcg_table.sem); + kfree(mailbox); + return err; +} + +int __devinit mthca_init_mcg_table(struct mthca_dev *dev) +{ + int err; + + err = mthca_alloc_init(&dev->mcg_table.alloc, + dev->limits.num_amgms, + dev->limits.num_amgms - 1, + 0); + if (err) + return err; + + init_MUTEX(&dev->mcg_table.sem); + + return 0; +} + +void __devexit mthca_cleanup_mcg_table(struct mthca_dev *dev) +{ + mthca_alloc_cleanup(&dev->mcg_table.alloc); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_mr.c 2004-12-27 21:48:23.964438746 -0800 @@ -0,0 +1,396 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_mr.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +/* + * Must be packed because mtt_seg is 64 bits but only aligned to 32 bits. + */ +struct mthca_mpt_entry { + u32 flags; + u32 page_size; + u32 key; + u32 pd; + u64 start; + u64 length; + u32 lkey; + u32 window_count; + u32 window_count_limit; + u64 mtt_seg; + u32 reserved[3]; +} __attribute__((packed)); + +#define MTHCA_MPT_FLAG_SW_OWNS (0xfUL << 28) +#define MTHCA_MPT_FLAG_MIO (1 << 17) +#define MTHCA_MPT_FLAG_BIND_ENABLE (1 << 15) +#define MTHCA_MPT_FLAG_PHYSICAL (1 << 9) +#define MTHCA_MPT_FLAG_REGION (1 << 8) + +#define MTHCA_MTT_FLAG_PRESENT 1 + +/* + * Buddy allocator for MTT segments (currently not very efficient + * since it doesn't keep a free list and just searches linearly + * through the bitmaps) + */ + +static u32 mthca_alloc_mtt(struct mthca_dev *dev, int order) +{ + int o; + int m; + u32 seg; + + spin_lock(&dev->mr_table.mpt_alloc.lock); + + for (o = order; o <= dev->mr_table.max_mtt_order; ++o) { + m = 1 << (dev->mr_table.max_mtt_order - o); + seg = find_first_bit(dev->mr_table.mtt_buddy[o], m); + if (seg < m) + goto found; + } + + spin_unlock(&dev->mr_table.mpt_alloc.lock); + return -1; + + found: + clear_bit(seg, dev->mr_table.mtt_buddy[o]); + + while (o > order) { + --o; + seg <<= 1; + set_bit(seg ^ 1, dev->mr_table.mtt_buddy[o]); + } + + spin_unlock(&dev->mr_table.mpt_alloc.lock); + + seg <<= order; + + return seg; +} + +static void mthca_free_mtt(struct mthca_dev *dev, u32 seg, int order) +{ + seg >>= order; + + spin_lock(&dev->mr_table.mpt_alloc.lock); + + while (test_bit(seg ^ 1, dev->mr_table.mtt_buddy[order])) { + 
clear_bit(seg ^ 1, dev->mr_table.mtt_buddy[order]); + seg >>= 1; + ++order; + } + + set_bit(seg, dev->mr_table.mtt_buddy[order]); + + spin_unlock(&dev->mr_table.mpt_alloc.lock); +} + +int mthca_mr_alloc_notrans(struct mthca_dev *dev, u32 pd, + u32 access, struct mthca_mr *mr) +{ + void *mailbox; + struct mthca_mpt_entry *mpt_entry; + int err; + u8 status; + + might_sleep(); + + mr->order = -1; + mr->ibmr.lkey = mthca_alloc(&dev->mr_table.mpt_alloc); + if (mr->ibmr.lkey == -1) + return -ENOMEM; + mr->ibmr.rkey = mr->ibmr.lkey; + + mailbox = kmalloc(sizeof *mpt_entry + MTHCA_CMD_MAILBOX_EXTRA, + GFP_KERNEL); + if (!mailbox) { + mthca_free(&dev->mr_table.mpt_alloc, mr->ibmr.lkey); + return -ENOMEM; + } + mpt_entry = MAILBOX_ALIGN(mailbox); + + mpt_entry->flags = cpu_to_be32(MTHCA_MPT_FLAG_SW_OWNS | + MTHCA_MPT_FLAG_MIO | + MTHCA_MPT_FLAG_PHYSICAL | + MTHCA_MPT_FLAG_REGION | + access); + mpt_entry->page_size = 0; + mpt_entry->key = cpu_to_be32(mr->ibmr.lkey); + mpt_entry->pd = cpu_to_be32(pd); + mpt_entry->start = 0; + mpt_entry->length = ~0ULL; + + memset(&mpt_entry->lkey, 0, + sizeof *mpt_entry - offsetof(struct mthca_mpt_entry, lkey)); + + err = mthca_SW2HW_MPT(dev, mpt_entry, + mr->ibmr.lkey & (dev->limits.num_mpts - 1), + &status); + if (err) + mthca_warn(dev, "SW2HW_MPT failed (%d)\n", err); + else if (status) { + mthca_warn(dev, "SW2HW_MPT returned status 0x%02x\n", + status); + err = -EINVAL; + } + + kfree(mailbox); + return err; +} + +int mthca_mr_alloc_phys(struct mthca_dev *dev, u32 pd, + u64 *buffer_list, int buffer_size_shift, + int list_len, u64 iova, u64 total_size, + u32 access, struct mthca_mr *mr) +{ + void *mailbox; + u64 *mtt_entry; + struct mthca_mpt_entry *mpt_entry; + int err = -ENOMEM; + u8 status; + int i; + + might_sleep(); + WARN_ON(buffer_size_shift >= 32); + + mr->ibmr.lkey = mthca_alloc(&dev->mr_table.mpt_alloc); + if (mr->ibmr.lkey == -1) + return -ENOMEM; + mr->ibmr.rkey = mr->ibmr.lkey; + + for (i = dev->limits.mtt_seg_size / 8, 
mr->order = 0; + i < list_len; + i <<= 1, ++mr->order) + /* nothing */ ; + + mr->first_seg = mthca_alloc_mtt(dev, mr->order); + if (mr->first_seg == -1) + goto err_out_mpt_free; + + /* + * If list_len is odd, we add one more dummy entry for + * firmware efficiency. + */ + mailbox = kmalloc(max(sizeof *mpt_entry, + (size_t) 8 * (list_len + (list_len & 1) + 2)) + + MTHCA_CMD_MAILBOX_EXTRA, + GFP_KERNEL); + if (!mailbox) + goto err_out_free_mtt; + + mtt_entry = MAILBOX_ALIGN(mailbox); + + mtt_entry[0] = cpu_to_be64(dev->mr_table.mtt_base + + mr->first_seg * dev->limits.mtt_seg_size); + mtt_entry[1] = 0; + for (i = 0; i < list_len; ++i) + mtt_entry[i + 2] = cpu_to_be64(buffer_list[i] | + MTHCA_MTT_FLAG_PRESENT); + if (list_len & 1) { + mtt_entry[i + 2] = 0; + ++list_len; + } + + if (0) { + mthca_dbg(dev, "Dumping MPT entry\n"); + for (i = 0; i < list_len + 2; ++i) + printk(KERN_ERR "[%2d] %016llx\n", + i, (unsigned long long) be64_to_cpu(mtt_entry[i])); + } + + err = mthca_WRITE_MTT(dev, mtt_entry, list_len, &status); + if (err) { + mthca_warn(dev, "WRITE_MTT failed (%d)\n", err); + goto err_out_mailbox_free; + } + if (status) { + mthca_warn(dev, "WRITE_MTT returned status 0x%02x\n", + status); + err = -EINVAL; + goto err_out_mailbox_free; + } + + mpt_entry = MAILBOX_ALIGN(mailbox); + + mpt_entry->flags = cpu_to_be32(MTHCA_MPT_FLAG_SW_OWNS | + MTHCA_MPT_FLAG_MIO | + MTHCA_MPT_FLAG_REGION | + access); + + mpt_entry->page_size = cpu_to_be32(buffer_size_shift - 12); + mpt_entry->key = cpu_to_be32(mr->ibmr.lkey); + mpt_entry->pd = cpu_to_be32(pd); + mpt_entry->start = cpu_to_be64(iova); + mpt_entry->length = cpu_to_be64(total_size); + memset(&mpt_entry->lkey, 0, + sizeof *mpt_entry - offsetof(struct mthca_mpt_entry, lkey)); + mpt_entry->mtt_seg = cpu_to_be64(dev->mr_table.mtt_base + + mr->first_seg * dev->limits.mtt_seg_size); + + if (0) { + mthca_dbg(dev, "Dumping MPT entry %08x:\n", mr->ibmr.lkey); + for (i = 0; i < sizeof (struct mthca_mpt_entry) / 4; ++i) { + if (i % 4 
== 0) + printk("[%02x] ", i * 4); + printk(" %08x", be32_to_cpu(((u32 *) mpt_entry)[i])); + if ((i + 1) % 4 == 0) + printk("\n"); + } + } + + err = mthca_SW2HW_MPT(dev, mpt_entry, + mr->ibmr.lkey & (dev->limits.num_mpts - 1), + &status); + if (err) + mthca_warn(dev, "SW2HW_MPT failed (%d)\n", err); + else if (status) { + mthca_warn(dev, "SW2HW_MPT returned status 0x%02x\n", + status); + err = -EINVAL; + } + + kfree(mailbox); + return err; + + err_out_mailbox_free: + kfree(mailbox); + + err_out_free_mtt: + mthca_free_mtt(dev, mr->first_seg, mr->order); + + err_out_mpt_free: + mthca_free(&dev->mr_table.mpt_alloc, mr->ibmr.lkey); + return err; +} + +void mthca_free_mr(struct mthca_dev *dev, struct mthca_mr *mr) +{ + int err; + u8 status; + + might_sleep(); + + err = mthca_HW2SW_MPT(dev, NULL, + mr->ibmr.lkey & (dev->limits.num_mpts - 1), + &status); + if (err) + mthca_warn(dev, "HW2SW_MPT failed (%d)\n", err); + else if (status) + mthca_warn(dev, "HW2SW_MPT returned status 0x%02x\n", + status); + + if (mr->order >= 0) + mthca_free_mtt(dev, mr->first_seg, mr->order); + + mthca_free(&dev->mr_table.mpt_alloc, mr->ibmr.lkey); +} + +int __devinit mthca_init_mr_table(struct mthca_dev *dev) +{ + int err; + int i, s; + + err = mthca_alloc_init(&dev->mr_table.mpt_alloc, + dev->limits.num_mpts, + ~0, dev->limits.reserved_mrws); + if (err) + return err; + + err = -ENOMEM; + + for (i = 1, dev->mr_table.max_mtt_order = 0; + i < dev->limits.num_mtt_segs; + i <<= 1, ++dev->mr_table.max_mtt_order) + /* nothing */ ; + + dev->mr_table.mtt_buddy = kmalloc((dev->mr_table.max_mtt_order + 1) * + sizeof (long *), + GFP_KERNEL); + if (!dev->mr_table.mtt_buddy) + goto err_out; + + for (i = 0; i <= dev->mr_table.max_mtt_order; ++i) + dev->mr_table.mtt_buddy[i] = NULL; + + for (i = 0; i <= dev->mr_table.max_mtt_order; ++i) { + s = BITS_TO_LONGS(1 << (dev->mr_table.max_mtt_order - i)); + dev->mr_table.mtt_buddy[i] = kmalloc(s * sizeof (long), + GFP_KERNEL); + if (!dev->mr_table.mtt_buddy[i]) + 
goto err_out_free; + bitmap_zero(dev->mr_table.mtt_buddy[i], + 1 << (dev->mr_table.max_mtt_order - i)); + } + + set_bit(0, dev->mr_table.mtt_buddy[dev->mr_table.max_mtt_order]); + + for (i = 0; i < dev->mr_table.max_mtt_order; ++i) + if (1 << i >= dev->limits.reserved_mtts) + break; + + if (i == dev->mr_table.max_mtt_order) { + mthca_err(dev, "MTT table of order %d is " + "too small.\n", i); + goto err_out_free; + } + + (void) mthca_alloc_mtt(dev, i); + + return 0; + + err_out_free: + for (i = 0; i <= dev->mr_table.max_mtt_order; ++i) + kfree(dev->mr_table.mtt_buddy[i]); + + err_out: + mthca_alloc_cleanup(&dev->mr_table.mpt_alloc); + + return err; +} + +void __devexit mthca_cleanup_mr_table(struct mthca_dev *dev) +{ + int i; + + /* XXX check if any MRs are still allocated? */ + for (i = 0; i <= dev->mr_table.max_mtt_order; ++i) + kfree(dev->mr_table.mtt_buddy[i]); + kfree(dev->mr_table.mtt_buddy); + mthca_alloc_cleanup(&dev->mr_table.mpt_alloc); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_pd.c 2004-12-27 21:48:23.990434920 -0800 @@ -0,0 +1,80 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_pd.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include + +#include "mthca_dev.h" + +int mthca_pd_alloc(struct mthca_dev *dev, struct mthca_pd *pd) +{ + int err; + + might_sleep(); + + atomic_set(&pd->sqp_count, 0); + pd->pd_num = mthca_alloc(&dev->pd_table.alloc); + if (pd->pd_num == -1) + return -ENOMEM; + + err = mthca_mr_alloc_notrans(dev, pd->pd_num, + MTHCA_MPT_FLAG_LOCAL_READ | + MTHCA_MPT_FLAG_LOCAL_WRITE, + &pd->ntmr); + if (err) + mthca_free(&dev->pd_table.alloc, pd->pd_num); + + return err; +} + +void mthca_pd_free(struct mthca_dev *dev, struct mthca_pd *pd) +{ + might_sleep(); + mthca_free_mr(dev, &pd->ntmr); + mthca_free(&dev->pd_table.alloc, pd->pd_num); +} + +int __devinit mthca_init_pd_table(struct mthca_dev *dev) +{ + return mthca_alloc_init(&dev->pd_table.alloc, + dev->limits.num_pds, + (1 << 24) - 1, + dev->limits.reserved_pds); +} + +void __devexit mthca_cleanup_pd_table(struct mthca_dev *dev) +{ + /* XXX check if any PDs are still allocated? 
*/ + mthca_alloc_cleanup(&dev->pd_table.alloc); +} From roland@topspin.com Mon Dec 27 21:49:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:51:00 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDQ025948 for ; Mon, 27 Dec 2004 21:49:39 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:53 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:52 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAFs-0000sX-NA; Mon, 27 Dec 2004 21:50:52 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272150.IBRnA4AvjendsF8x@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:48 -0800 Message-Id: <200412272150.khYObxkGxtPP9Oju@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][1/24] Add core InfiniBand support (public headers) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:50:52.0876 (UTC) FILETIME=[30F970C0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13111 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add public headers for core InfiniBand support. 
This can be thought of as a midlayer that provides an abstraction between low-level hardware drivers and upper level protocols (such as IP-over-InfiniBand). Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_cache.h 2004-12-27 21:48:17.561381253 -0800 @@ -0,0 +1,53 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ib_cache.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef _IB_CACHE_H +#define _IB_CACHE_H + +#include + +int ib_cached_gid_get(struct ib_device *device, + u8 port, + int index, + union ib_gid *gid); +int ib_cached_pkey_get(struct ib_device *device_handle, + u8 port, + int index, + u16 *pkey); +int ib_cached_pkey_find(struct ib_device *device, + u8 port, + u16 pkey, + u16 *index); + +#endif /* _IB_CACHE_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_fmr_pool.h 2004-12-27 21:48:17.586377574 -0800 @@ -0,0 +1,92 @@ +/* + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ib_fmr_pool.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#if !defined(IB_FMR_POOL_H) +#define IB_FMR_POOL_H + +#include + +struct ib_fmr_pool; + +/** + * struct ib_fmr_pool_param - Parameters for creating FMR pool + * @max_pages_per_fmr:Maximum number of pages per map request. + * @access:Access flags for FMRs in pool. + * @pool_size:Number of FMRs to allocate for pool. + * @dirty_watermark:Flush is triggered when @dirty_watermark dirty + * FMRs are present. + * @flush_function:Callback called when unmapped FMRs are flushed and + * more FMRs are possibly available for mapping + * @flush_arg:Context passed to user's flush function. + * @cache:If set, FMRs may be reused after unmapping for identical map + * requests. + */ +struct ib_fmr_pool_param { + int max_pages_per_fmr; + enum ib_access_flags access; + int pool_size; + int dirty_watermark; + void (*flush_function)(struct ib_fmr_pool *pool, + void * arg); + void *flush_arg; + unsigned cache:1; +}; + +struct ib_pool_fmr { + struct ib_fmr *fmr; + struct ib_fmr_pool *pool; + struct list_head list; + struct hlist_node cache_node; + int ref_count; + int remap_count; + u64 io_virtual_address; + int page_list_len; + u64 page_list[0]; +}; + +struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, + struct ib_fmr_pool_param *params); + +int ib_destroy_fmr_pool(struct ib_fmr_pool *pool); + +int ib_flush_fmr_pool(struct ib_fmr_pool *pool); + +struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle, + u64 *page_list, + int list_len, + u64 *io_virtual_address); + +int ib_fmr_pool_unmap(struct ib_pool_fmr *fmr); + +#endif /* IB_FMR_POOL_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_pack.h 2004-12-27 21:48:17.640369627 -0800 @@ -0,0 +1,245 @@ +/* + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ib_pack.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef IB_PACK_H +#define IB_PACK_H + +#include + +enum { + IB_LRH_BYTES = 8, + IB_GRH_BYTES = 40, + IB_BTH_BYTES = 12, + IB_DETH_BYTES = 8 +}; + +struct ib_field { + size_t struct_offset_bytes; + size_t struct_size_bytes; + int offset_words; + int offset_bits; + int size_bits; + char *field_name; +}; + +#define RESERVED \ + .field_name = "reserved" + +/* + * This macro cleans up the definitions of constants for BTH opcodes. + * It is used to define constants such as IB_OPCODE_UD_SEND_ONLY, + * which becomes IB_OPCODE_UD + IB_OPCODE_SEND_ONLY, and this gives + * the correct value. 
+ * + * In short, user code should use the constants defined using the + * macro rather than worrying about adding together other constants. +*/ +#define IB_OPCODE(transport, op) \ + IB_OPCODE_ ## transport ## _ ## op = \ + IB_OPCODE_ ## transport + IB_OPCODE_ ## op + +enum { + /* transport types -- just used to define real constants */ + IB_OPCODE_RC = 0x00, + IB_OPCODE_UC = 0x20, + IB_OPCODE_RD = 0x40, + IB_OPCODE_UD = 0x60, + + /* operations -- just used to define real constants */ + IB_OPCODE_SEND_FIRST = 0x00, + IB_OPCODE_SEND_MIDDLE = 0x01, + IB_OPCODE_SEND_LAST = 0x02, + IB_OPCODE_SEND_LAST_WITH_IMMEDIATE = 0x03, + IB_OPCODE_SEND_ONLY = 0x04, + IB_OPCODE_SEND_ONLY_WITH_IMMEDIATE = 0x05, + IB_OPCODE_RDMA_WRITE_FIRST = 0x06, + IB_OPCODE_RDMA_WRITE_MIDDLE = 0x07, + IB_OPCODE_RDMA_WRITE_LAST = 0x08, + IB_OPCODE_RDMA_WRITE_LAST_WITH_IMMEDIATE = 0x09, + IB_OPCODE_RDMA_WRITE_ONLY = 0x0a, + IB_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE = 0x0b, + IB_OPCODE_RDMA_READ_REQUEST = 0x0c, + IB_OPCODE_RDMA_READ_RESPONSE_FIRST = 0x0d, + IB_OPCODE_RDMA_READ_RESPONSE_MIDDLE = 0x0e, + IB_OPCODE_RDMA_READ_RESPONSE_LAST = 0x0f, + IB_OPCODE_RDMA_READ_RESPONSE_ONLY = 0x10, + IB_OPCODE_ACKNOWLEDGE = 0x11, + IB_OPCODE_ATOMIC_ACKNOWLEDGE = 0x12, + IB_OPCODE_COMPARE_SWAP = 0x13, + IB_OPCODE_FETCH_ADD = 0x14, + + /* real constants follow -- see comment about above IB_OPCODE() + macro for more details */ + + /* RC */ + IB_OPCODE(RC, SEND_FIRST), + IB_OPCODE(RC, SEND_MIDDLE), + IB_OPCODE(RC, SEND_LAST), + IB_OPCODE(RC, SEND_LAST_WITH_IMMEDIATE), + IB_OPCODE(RC, SEND_ONLY), + IB_OPCODE(RC, SEND_ONLY_WITH_IMMEDIATE), + IB_OPCODE(RC, RDMA_WRITE_FIRST), + IB_OPCODE(RC, RDMA_WRITE_MIDDLE), + IB_OPCODE(RC, RDMA_WRITE_LAST), + IB_OPCODE(RC, RDMA_WRITE_LAST_WITH_IMMEDIATE), + IB_OPCODE(RC, RDMA_WRITE_ONLY), + IB_OPCODE(RC, RDMA_WRITE_ONLY_WITH_IMMEDIATE), + IB_OPCODE(RC, RDMA_READ_REQUEST), + IB_OPCODE(RC, RDMA_READ_RESPONSE_FIRST), + IB_OPCODE(RC, RDMA_READ_RESPONSE_MIDDLE), + IB_OPCODE(RC, 
RDMA_READ_RESPONSE_LAST), + IB_OPCODE(RC, RDMA_READ_RESPONSE_ONLY), + IB_OPCODE(RC, ACKNOWLEDGE), + IB_OPCODE(RC, ATOMIC_ACKNOWLEDGE), + IB_OPCODE(RC, COMPARE_SWAP), + IB_OPCODE(RC, FETCH_ADD), + + /* UC */ + IB_OPCODE(UC, SEND_FIRST), + IB_OPCODE(UC, SEND_MIDDLE), + IB_OPCODE(UC, SEND_LAST), + IB_OPCODE(UC, SEND_LAST_WITH_IMMEDIATE), + IB_OPCODE(UC, SEND_ONLY), + IB_OPCODE(UC, SEND_ONLY_WITH_IMMEDIATE), + IB_OPCODE(UC, RDMA_WRITE_FIRST), + IB_OPCODE(UC, RDMA_WRITE_MIDDLE), + IB_OPCODE(UC, RDMA_WRITE_LAST), + IB_OPCODE(UC, RDMA_WRITE_LAST_WITH_IMMEDIATE), + IB_OPCODE(UC, RDMA_WRITE_ONLY), + IB_OPCODE(UC, RDMA_WRITE_ONLY_WITH_IMMEDIATE), + + /* RD */ + IB_OPCODE(RD, SEND_FIRST), + IB_OPCODE(RD, SEND_MIDDLE), + IB_OPCODE(RD, SEND_LAST), + IB_OPCODE(RD, SEND_LAST_WITH_IMMEDIATE), + IB_OPCODE(RD, SEND_ONLY), + IB_OPCODE(RD, SEND_ONLY_WITH_IMMEDIATE), + IB_OPCODE(RD, RDMA_WRITE_FIRST), + IB_OPCODE(RD, RDMA_WRITE_MIDDLE), + IB_OPCODE(RD, RDMA_WRITE_LAST), + IB_OPCODE(RD, RDMA_WRITE_LAST_WITH_IMMEDIATE), + IB_OPCODE(RD, RDMA_WRITE_ONLY), + IB_OPCODE(RD, RDMA_WRITE_ONLY_WITH_IMMEDIATE), + IB_OPCODE(RD, RDMA_READ_REQUEST), + IB_OPCODE(RD, RDMA_READ_RESPONSE_FIRST), + IB_OPCODE(RD, RDMA_READ_RESPONSE_MIDDLE), + IB_OPCODE(RD, RDMA_READ_RESPONSE_LAST), + IB_OPCODE(RD, RDMA_READ_RESPONSE_ONLY), + IB_OPCODE(RD, ACKNOWLEDGE), + IB_OPCODE(RD, ATOMIC_ACKNOWLEDGE), + IB_OPCODE(RD, COMPARE_SWAP), + IB_OPCODE(RD, FETCH_ADD), + + /* UD */ + IB_OPCODE(UD, SEND_ONLY), + IB_OPCODE(UD, SEND_ONLY_WITH_IMMEDIATE) +}; + +enum { + IB_LNH_RAW = 0, + IB_LNH_IP = 1, + IB_LNH_IBA_LOCAL = 2, + IB_LNH_IBA_GLOBAL = 3 +}; + +struct ib_unpacked_lrh { + u8 virtual_lane; + u8 link_version; + u8 service_level; + u8 link_next_header; + __be16 destination_lid; + __be16 packet_length; + __be16 source_lid; +}; + +struct ib_unpacked_grh { + u8 ip_version; + u8 traffic_class; + __be32 flow_label; + __be16 payload_length; + u8 next_header; + u8 hop_limit; + union ib_gid source_gid; + union ib_gid 
destination_gid; +}; + +struct ib_unpacked_bth { + u8 opcode; + u8 solicited_event; + u8 mig_req; + u8 pad_count; + u8 transport_header_version; + __be16 pkey; + __be32 destination_qpn; + u8 ack_req; + __be32 psn; +}; + +struct ib_unpacked_deth { + __be32 qkey; + __be32 source_qpn; +}; + +struct ib_ud_header { + struct ib_unpacked_lrh lrh; + int grh_present; + struct ib_unpacked_grh grh; + struct ib_unpacked_bth bth; + struct ib_unpacked_deth deth; + int immediate_present; + __be32 immediate_data; +}; + +void ib_pack(const struct ib_field *desc, + int desc_len, + void *structure, + void *buf); + +void ib_unpack(const struct ib_field *desc, + int desc_len, + void *buf, + void *structure); + +void ib_ud_header_init(int payload_bytes, + int grh_present, + struct ib_ud_header *header); + +int ib_ud_header_pack(struct ib_ud_header *header, + void *buf); + +int ib_ud_header_unpack(void *buf, + struct ib_ud_header *header); + +#endif /* IB_PACK_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_verbs.h 2004-12-27 21:48:17.684363151 -0800 @@ -0,0 +1,1249 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ib_verbs.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#if !defined(IB_VERBS_H) +#define IB_VERBS_H + +#include +#include +#include + +union ib_gid { + u8 raw[16]; + struct { + u64 subnet_prefix; + u64 interface_id; + } global; +}; + +enum ib_node_type { + IB_NODE_CA = 1, + IB_NODE_SWITCH, + IB_NODE_ROUTER +}; + +enum ib_device_cap_flags { + IB_DEVICE_RESIZE_MAX_WR = 1, + IB_DEVICE_BAD_PKEY_CNTR = (1<<1), + IB_DEVICE_BAD_QKEY_CNTR = (1<<2), + IB_DEVICE_RAW_MULTI = (1<<3), + IB_DEVICE_AUTO_PATH_MIG = (1<<4), + IB_DEVICE_CHANGE_PHY_PORT = (1<<5), + IB_DEVICE_UD_AV_PORT_ENFORCE = (1<<6), + IB_DEVICE_CURR_QP_STATE_MOD = (1<<7), + IB_DEVICE_SHUTDOWN_PORT = (1<<8), + IB_DEVICE_INIT_TYPE = (1<<9), + IB_DEVICE_PORT_ACTIVE_EVENT = (1<<10), + IB_DEVICE_SYS_IMAGE_GUID = (1<<11), + IB_DEVICE_RC_RNR_NAK_GEN = (1<<12), + IB_DEVICE_SRQ_RESIZE = (1<<13), + IB_DEVICE_N_NOTIFY_CQ = (1<<14), + IB_DEVICE_RQ_SIG_TYPE = (1<<15) +}; + +enum ib_atomic_cap { + IB_ATOMIC_NONE, + IB_ATOMIC_HCA, + IB_ATOMIC_GLOB +}; + +struct ib_device_attr { + u64 fw_ver; + u64 node_guid; + u64 sys_image_guid; + u64 max_mr_size; + u64 page_size_cap; + u32 vendor_id; + u32 vendor_part_id; + u32 hw_ver; + int max_qp; + int max_qp_wr; + int device_cap_flags; + int max_sge; + int max_sge_rd; + int max_cq; + int max_cqe; + int max_mr; + int max_pd; + int max_qp_rd_atom; + int max_ee_rd_atom; + int max_res_rd_atom; + int max_qp_init_rd_atom; + int max_ee_init_rd_atom; + enum ib_atomic_cap atomic_cap; + int max_ee; + int max_rdd; + int max_mw; + int max_raw_ipv6_qp; + int max_raw_ethy_qp; + int max_mcast_grp; + int max_mcast_qp_attach; + int max_total_mcast_qp_attach; + int max_ah; + int max_fmr; + int max_map_per_fmr; + int max_srq; + int max_srq_wr; + int max_srq_sge; + u16 max_pkeys; + u8 local_ca_ack_delay; +}; + +enum ib_mtu { + IB_MTU_256 = 1, + IB_MTU_512 = 2, + IB_MTU_1024 = 3, + IB_MTU_2048 = 4, + IB_MTU_4096 = 5 +}; + +static inline int ib_mtu_enum_to_int(enum ib_mtu mtu) +{ + switch (mtu) { + 
case IB_MTU_256: return 256; + case IB_MTU_512: return 512; + case IB_MTU_1024: return 1024; + case IB_MTU_2048: return 2048; + case IB_MTU_4096: return 4096; + default: return -1; + } +} + +enum ib_port_state { + IB_PORT_NOP = 0, + IB_PORT_DOWN = 1, + IB_PORT_INIT = 2, + IB_PORT_ARMED = 3, + IB_PORT_ACTIVE = 4, + IB_PORT_ACTIVE_DEFER = 5 +}; + +enum ib_port_cap_flags { + IB_PORT_SM = (1<<31), + IB_PORT_NOTICE_SUP = (1<<30), + IB_PORT_TRAP_SUP = (1<<29), + IB_PORT_AUTO_MIGR_SUP = (1<<27), + IB_PORT_SL_MAP_SUP = (1<<26), + IB_PORT_MKEY_NVRAM = (1<<25), + IB_PORT_PKEY_NVRAM = (1<<24), + IB_PORT_LED_INFO_SUP = (1<<23), + IB_PORT_SM_DISABLED = (1<<22), + IB_PORT_SYS_IMAGE_GUID_SUP = (1<<21), + IB_PORT_PKEY_SW_EXT_PORT_TRAP_SUP = (1<<20), + IB_PORT_CM_SUP = (1<<16), + IB_PORT_SNMP_TUNNEL_SUP = (1<<15), + IB_PORT_REINIT_SUP = (1<<14), + IB_PORT_DEVICE_MGMT_SUP = (1<<13), + IB_PORT_VENDOR_CLASS_SUP = (1<<12), + IB_PORT_DR_NOTICE_SUP = (1<<11), + IB_PORT_PORT_NOTICE_SUP = (1<<10), + IB_PORT_BOOT_MGMT_SUP = (1<<9) +}; + +enum ib_port_width { + IB_WIDTH_1X = 1, + IB_WIDTH_4X = 2, + IB_WIDTH_8X = 4, + IB_WIDTH_12X = 8 +}; + +static inline int ib_width_enum_to_int(enum ib_port_width width) +{ + switch (width) { + case IB_WIDTH_1X: return 1; + case IB_WIDTH_4X: return 4; + case IB_WIDTH_8X: return 8; + case IB_WIDTH_12X: return 12; + default: return -1; + } +} + +struct ib_port_attr { + enum ib_port_state state; + enum ib_mtu max_mtu; + enum ib_mtu active_mtu; + int gid_tbl_len; + u32 port_cap_flags; + u32 max_msg_sz; + u32 bad_pkey_cntr; + u32 qkey_viol_cntr; + u16 pkey_tbl_len; + u16 lid; + u16 sm_lid; + u8 lmc; + u8 max_vl_num; + u8 sm_sl; + u8 subnet_timeout; + u8 init_type_reply; + u8 active_width; + u8 active_speed; +}; + +enum ib_device_modify_flags { + IB_DEVICE_MODIFY_SYS_IMAGE_GUID = 1 +}; + +struct ib_device_modify { + u64 sys_image_guid; +}; + +enum ib_port_modify_flags { + IB_PORT_SHUTDOWN = 1, + IB_PORT_INIT_TYPE = (1<<2), + IB_PORT_RESET_QKEY_CNTR = (1<<3) +}; + 
+struct ib_port_modify { + u32 set_port_cap_mask; + u32 clr_port_cap_mask; + u8 init_type; +}; + +enum ib_event_type { + IB_EVENT_CQ_ERR, + IB_EVENT_QP_FATAL, + IB_EVENT_QP_REQ_ERR, + IB_EVENT_QP_ACCESS_ERR, + IB_EVENT_COMM_EST, + IB_EVENT_SQ_DRAINED, + IB_EVENT_PATH_MIG, + IB_EVENT_PATH_MIG_ERR, + IB_EVENT_DEVICE_FATAL, + IB_EVENT_PORT_ACTIVE, + IB_EVENT_PORT_ERR, + IB_EVENT_LID_CHANGE, + IB_EVENT_PKEY_CHANGE, + IB_EVENT_SM_CHANGE +}; + +struct ib_event { + struct ib_device *device; + union { + struct ib_cq *cq; + struct ib_qp *qp; + u8 port_num; + } element; + enum ib_event_type event; +}; + +struct ib_event_handler { + struct ib_device *device; + void (*handler)(struct ib_event_handler *, struct ib_event *); + struct list_head list; +}; + +#define INIT_IB_EVENT_HANDLER(_ptr, _device, _handler) \ + do { \ + (_ptr)->device = _device; \ + (_ptr)->handler = _handler; \ + INIT_LIST_HEAD(&(_ptr)->list); \ + } while (0) + +struct ib_global_route { + union ib_gid dgid; + u32 flow_label; + u8 sgid_index; + u8 hop_limit; + u8 traffic_class; +}; + +enum { + IB_MULTICAST_QPN = 0xffffff +}; + +enum ib_ah_flags { + IB_AH_GRH = 1 +}; + +struct ib_ah_attr { + struct ib_global_route grh; + u16 dlid; + u8 sl; + u8 src_path_bits; + u8 static_rate; + u8 ah_flags; + u8 port_num; +}; + +enum ib_wc_status { + IB_WC_SUCCESS, + IB_WC_LOC_LEN_ERR, + IB_WC_LOC_QP_OP_ERR, + IB_WC_LOC_EEC_OP_ERR, + IB_WC_LOC_PROT_ERR, + IB_WC_WR_FLUSH_ERR, + IB_WC_MW_BIND_ERR, + IB_WC_BAD_RESP_ERR, + IB_WC_LOC_ACCESS_ERR, + IB_WC_REM_INV_REQ_ERR, + IB_WC_REM_ACCESS_ERR, + IB_WC_REM_OP_ERR, + IB_WC_RETRY_EXC_ERR, + IB_WC_RNR_RETRY_EXC_ERR, + IB_WC_LOC_RDD_VIOL_ERR, + IB_WC_REM_INV_RD_REQ_ERR, + IB_WC_REM_ABORT_ERR, + IB_WC_INV_EECN_ERR, + IB_WC_INV_EEC_STATE_ERR, + IB_WC_FATAL_ERR, + IB_WC_RESP_TIMEOUT_ERR, + IB_WC_GENERAL_ERR +}; + +enum ib_wc_opcode { + IB_WC_SEND, + IB_WC_RDMA_WRITE, + IB_WC_RDMA_READ, + IB_WC_COMP_SWAP, + IB_WC_FETCH_ADD, + IB_WC_BIND_MW, +/* + * Set value of IB_WC_RECV so consumers can 
test if a completion is a + * receive by testing (opcode & IB_WC_RECV). + */ + IB_WC_RECV = 1 << 7, + IB_WC_RECV_RDMA_WITH_IMM +}; + +enum ib_wc_flags { + IB_WC_GRH = 1, + IB_WC_WITH_IMM = (1<<1) +}; + +struct ib_wc { + u64 wr_id; + enum ib_wc_status status; + enum ib_wc_opcode opcode; + u32 vendor_err; + u32 byte_len; + __be32 imm_data; + u32 src_qp; + int wc_flags; + u16 pkey_index; + u16 slid; + u8 sl; + u8 dlid_path_bits; + u8 port_num; /* valid only for DR SMPs on switches */ +}; + +enum ib_cq_notify { + IB_CQ_SOLICITED, + IB_CQ_NEXT_COMP +}; + +struct ib_qp_cap { + u32 max_send_wr; + u32 max_recv_wr; + u32 max_send_sge; + u32 max_recv_sge; + u32 max_inline_data; +}; + +enum ib_sig_type { + IB_SIGNAL_ALL_WR, + IB_SIGNAL_REQ_WR +}; + +enum ib_qp_type { + /* + * IB_QPT_SMI and IB_QPT_GSI have to be the first two entries + * here (and in that order) since the MAD layer uses them as + * indices into a 2-entry table. + */ + IB_QPT_SMI, + IB_QPT_GSI, + + IB_QPT_RC, + IB_QPT_UC, + IB_QPT_UD, + IB_QPT_RAW_IPV6, + IB_QPT_RAW_ETY +}; + +struct ib_qp_init_attr { + void (*event_handler)(struct ib_event *, void *); + void *qp_context; + struct ib_cq *send_cq; + struct ib_cq *recv_cq; + struct ib_srq *srq; + struct ib_qp_cap cap; + enum ib_sig_type sq_sig_type; + enum ib_sig_type rq_sig_type; + enum ib_qp_type qp_type; + u8 port_num; /* special QP types only */ +}; + +enum ib_rnr_timeout { + IB_RNR_TIMER_655_36 = 0, + IB_RNR_TIMER_000_01 = 1, + IB_RNR_TIMER_000_02 = 2, + IB_RNR_TIMER_000_03 = 3, + IB_RNR_TIMER_000_04 = 4, + IB_RNR_TIMER_000_06 = 5, + IB_RNR_TIMER_000_08 = 6, + IB_RNR_TIMER_000_12 = 7, + IB_RNR_TIMER_000_16 = 8, + IB_RNR_TIMER_000_24 = 9, + IB_RNR_TIMER_000_32 = 10, + IB_RNR_TIMER_000_48 = 11, + IB_RNR_TIMER_000_64 = 12, + IB_RNR_TIMER_000_96 = 13, + IB_RNR_TIMER_001_28 = 14, + IB_RNR_TIMER_001_92 = 15, + IB_RNR_TIMER_002_56 = 16, + IB_RNR_TIMER_003_84 = 17, + IB_RNR_TIMER_005_12 = 18, + IB_RNR_TIMER_007_68 = 19, + IB_RNR_TIMER_010_24 = 20, + 
IB_RNR_TIMER_015_36 = 21, + IB_RNR_TIMER_020_48 = 22, + IB_RNR_TIMER_030_72 = 23, + IB_RNR_TIMER_040_96 = 24, + IB_RNR_TIMER_061_44 = 25, + IB_RNR_TIMER_081_92 = 26, + IB_RNR_TIMER_122_88 = 27, + IB_RNR_TIMER_163_84 = 28, + IB_RNR_TIMER_245_76 = 29, + IB_RNR_TIMER_327_68 = 30, + IB_RNR_TIMER_491_52 = 31 +}; + +enum ib_qp_attr_mask { + IB_QP_STATE = 1, + IB_QP_CUR_STATE = (1<<1), + IB_QP_EN_SQD_ASYNC_NOTIFY = (1<<2), + IB_QP_ACCESS_FLAGS = (1<<3), + IB_QP_PKEY_INDEX = (1<<4), + IB_QP_PORT = (1<<5), + IB_QP_QKEY = (1<<6), + IB_QP_AV = (1<<7), + IB_QP_PATH_MTU = (1<<8), + IB_QP_TIMEOUT = (1<<9), + IB_QP_RETRY_CNT = (1<<10), + IB_QP_RNR_RETRY = (1<<11), + IB_QP_RQ_PSN = (1<<12), + IB_QP_MAX_QP_RD_ATOMIC = (1<<13), + IB_QP_ALT_PATH = (1<<14), + IB_QP_MIN_RNR_TIMER = (1<<15), + IB_QP_SQ_PSN = (1<<16), + IB_QP_MAX_DEST_RD_ATOMIC = (1<<17), + IB_QP_PATH_MIG_STATE = (1<<18), + IB_QP_CAP = (1<<19), + IB_QP_DEST_QPN = (1<<20) +}; + +enum ib_qp_state { + IB_QPS_RESET, + IB_QPS_INIT, + IB_QPS_RTR, + IB_QPS_RTS, + IB_QPS_SQD, + IB_QPS_SQE, + IB_QPS_ERR +}; + +enum ib_mig_state { + IB_MIG_MIGRATED, + IB_MIG_REARM, + IB_MIG_ARMED +}; + +struct ib_qp_attr { + enum ib_qp_state qp_state; + enum ib_qp_state cur_qp_state; + enum ib_mtu path_mtu; + enum ib_mig_state path_mig_state; + u32 qkey; + u32 rq_psn; + u32 sq_psn; + u32 dest_qp_num; + int qp_access_flags; + struct ib_qp_cap cap; + struct ib_ah_attr ah_attr; + struct ib_ah_attr alt_ah_attr; + u16 pkey_index; + u16 alt_pkey_index; + u8 en_sqd_async_notify; + u8 sq_draining; + u8 max_rd_atomic; + u8 max_dest_rd_atomic; + u8 min_rnr_timer; + u8 port_num; + u8 timeout; + u8 retry_cnt; + u8 rnr_retry; + u8 alt_port_num; + u8 alt_timeout; +}; + +enum ib_wr_opcode { + IB_WR_RDMA_WRITE, + IB_WR_RDMA_WRITE_WITH_IMM, + IB_WR_SEND, + IB_WR_SEND_WITH_IMM, + IB_WR_RDMA_READ, + IB_WR_ATOMIC_CMP_AND_SWP, + IB_WR_ATOMIC_FETCH_AND_ADD +}; + +enum ib_send_flags { + IB_SEND_FENCE = 1, + IB_SEND_SIGNALED = (1<<1), + IB_SEND_SOLICITED = (1<<2), + 
IB_SEND_INLINE = (1<<3) +}; + +enum ib_recv_flags { + IB_RECV_SIGNALED = 1 +}; + +struct ib_sge { + u64 addr; + u32 length; + u32 lkey; +}; + +struct ib_send_wr { + struct ib_send_wr *next; + u64 wr_id; + struct ib_sge *sg_list; + int num_sge; + enum ib_wr_opcode opcode; + int send_flags; + u32 imm_data; + union { + struct { + u64 remote_addr; + u32 rkey; + } rdma; + struct { + u64 remote_addr; + u64 compare_add; + u64 swap; + u32 rkey; + } atomic; + struct { + struct ib_ah *ah; + struct ib_mad_hdr *mad_hdr; + u32 remote_qpn; + u32 remote_qkey; + int timeout_ms; /* valid for MADs only */ + u16 pkey_index; /* valid for GSI only */ + u8 port_num; /* valid for DR SMPs on switch only */ + } ud; + } wr; +}; + +struct ib_recv_wr { + struct ib_recv_wr *next; + u64 wr_id; + struct ib_sge *sg_list; + int num_sge; + int recv_flags; +}; + +enum ib_access_flags { + IB_ACCESS_LOCAL_WRITE = 1, + IB_ACCESS_REMOTE_WRITE = (1<<1), + IB_ACCESS_REMOTE_READ = (1<<2), + IB_ACCESS_REMOTE_ATOMIC = (1<<3), + IB_ACCESS_MW_BIND = (1<<4) +}; + +struct ib_phys_buf { + u64 addr; + u64 size; +}; + +struct ib_mr_attr { + struct ib_pd *pd; + u64 device_virt_addr; + u64 size; + int mr_access_flags; + u32 lkey; + u32 rkey; +}; + +enum ib_mr_rereg_flags { + IB_MR_REREG_TRANS = 1, + IB_MR_REREG_PD = (1<<1), + IB_MR_REREG_ACCESS = (1<<2) +}; + +struct ib_mw_bind { + struct ib_mr *mr; + u64 wr_id; + u64 addr; + u32 length; + int send_flags; + int mw_access_flags; +}; + +struct ib_fmr_attr { + int max_pages; + int max_maps; + u8 page_size; +}; + +struct ib_pd { + struct ib_device *device; + atomic_t usecnt; /* count all resources */ +}; + +struct ib_ah { + struct ib_device *device; + struct ib_pd *pd; +}; + +typedef void (*ib_comp_handler)(struct ib_cq *cq, void *cq_context); + +struct ib_cq { + struct ib_device *device; + ib_comp_handler comp_handler; + void (*event_handler)(struct ib_event *, void *); + void * cq_context; + int cqe; + atomic_t usecnt; /* count number of work queues */ +}; + +struct 
ib_srq { + struct ib_device *device; + struct ib_pd *pd; + void *srq_context; + atomic_t usecnt; +}; + +struct ib_qp { + struct ib_device *device; + struct ib_pd *pd; + struct ib_cq *send_cq; + struct ib_cq *recv_cq; + struct ib_srq *srq; + void (*event_handler)(struct ib_event *, void *); + void *qp_context; + u32 qp_num; +}; + +struct ib_mr { + struct ib_device *device; + struct ib_pd *pd; + u32 lkey; + u32 rkey; + atomic_t usecnt; /* count number of MWs */ +}; + +struct ib_mw { + struct ib_device *device; + struct ib_pd *pd; + u32 rkey; +}; + +struct ib_fmr { + struct ib_device *device; + struct ib_pd *pd; + struct list_head list; + u32 lkey; + u32 rkey; +}; + +struct ib_mad; + +enum ib_process_mad_flags { + IB_MAD_IGNORE_MKEY = 1 +}; + +enum ib_mad_result { + IB_MAD_RESULT_FAILURE = 0, /* (!SUCCESS is the important flag) */ + IB_MAD_RESULT_SUCCESS = 1 << 0, /* MAD was successfully processed */ + IB_MAD_RESULT_REPLY = 1 << 1, /* Reply packet needs to be sent */ + IB_MAD_RESULT_CONSUMED = 1 << 2 /* Packet consumed: stop processing */ +}; + +#define IB_DEVICE_NAME_MAX 64 + +struct ib_cache { + rwlock_t lock; + struct ib_event_handler event_handler; + struct ib_pkey_cache **pkey_cache; + struct ib_gid_cache **gid_cache; +}; + +struct ib_device { + struct device *dma_device; + + char name[IB_DEVICE_NAME_MAX]; + + struct list_head event_handler_list; + spinlock_t event_handler_lock; + + struct list_head core_list; + struct list_head client_data_list; + spinlock_t client_data_lock; + + struct ib_cache cache; + + u32 flags; + + int (*query_device)(struct ib_device *device, + struct ib_device_attr *device_attr); + int (*query_port)(struct ib_device *device, + u8 port_num, + struct ib_port_attr *port_attr); + int (*query_gid)(struct ib_device *device, + u8 port_num, int index, + union ib_gid *gid); + int (*query_pkey)(struct ib_device *device, + u8 port_num, u16 index, u16 *pkey); + int (*modify_device)(struct ib_device *device, + int device_modify_mask, + struct 
ib_device_modify *device_modify); + int (*modify_port)(struct ib_device *device, + u8 port_num, int port_modify_mask, + struct ib_port_modify *port_modify); + struct ib_pd * (*alloc_pd)(struct ib_device *device); + int (*dealloc_pd)(struct ib_pd *pd); + struct ib_ah * (*create_ah)(struct ib_pd *pd, + struct ib_ah_attr *ah_attr); + int (*modify_ah)(struct ib_ah *ah, + struct ib_ah_attr *ah_attr); + int (*query_ah)(struct ib_ah *ah, + struct ib_ah_attr *ah_attr); + int (*destroy_ah)(struct ib_ah *ah); + struct ib_qp * (*create_qp)(struct ib_pd *pd, + struct ib_qp_init_attr *qp_init_attr); + int (*modify_qp)(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask); + int (*query_qp)(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr); + int (*destroy_qp)(struct ib_qp *qp); + int (*post_send)(struct ib_qp *qp, + struct ib_send_wr *send_wr, + struct ib_send_wr **bad_send_wr); + int (*post_recv)(struct ib_qp *qp, + struct ib_recv_wr *recv_wr, + struct ib_recv_wr **bad_recv_wr); + struct ib_cq * (*create_cq)(struct ib_device *device, + int cqe); + int (*destroy_cq)(struct ib_cq *cq); + int (*resize_cq)(struct ib_cq *cq, int *cqe); + int (*poll_cq)(struct ib_cq *cq, int num_entries, + struct ib_wc *wc); + int (*peek_cq)(struct ib_cq *cq, int wc_cnt); + int (*req_notify_cq)(struct ib_cq *cq, + enum ib_cq_notify cq_notify); + int (*req_ncomp_notif)(struct ib_cq *cq, + int wc_cnt); + struct ib_mr * (*get_dma_mr)(struct ib_pd *pd, + int mr_access_flags); + struct ib_mr * (*reg_phys_mr)(struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start); + int (*query_mr)(struct ib_mr *mr, + struct ib_mr_attr *mr_attr); + int (*dereg_mr)(struct ib_mr *mr); + int (*rereg_phys_mr)(struct ib_mr *mr, + int mr_rereg_mask, + struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start); + struct ib_mw * 
(*alloc_mw)(struct ib_pd *pd); + int (*bind_mw)(struct ib_qp *qp, + struct ib_mw *mw, + struct ib_mw_bind *mw_bind); + int (*dealloc_mw)(struct ib_mw *mw); + struct ib_fmr * (*alloc_fmr)(struct ib_pd *pd, + int mr_access_flags, + struct ib_fmr_attr *fmr_attr); + int (*map_phys_fmr)(struct ib_fmr *fmr, + u64 *page_list, int list_len, + u64 iova); + int (*unmap_fmr)(struct list_head *fmr_list); + int (*dealloc_fmr)(struct ib_fmr *fmr); + int (*attach_mcast)(struct ib_qp *qp, + union ib_gid *gid, + u16 lid); + int (*detach_mcast)(struct ib_qp *qp, + union ib_gid *gid, + u16 lid); + int (*process_mad)(struct ib_device *device, + int process_mad_flags, + u8 port_num, + u16 source_lid, + struct ib_mad *in_mad, + struct ib_mad *out_mad); + + struct class_device class_dev; + struct kobject ports_parent; + struct list_head port_list; + + enum { + IB_DEV_UNINITIALIZED, + IB_DEV_REGISTERED, + IB_DEV_UNREGISTERED + } reg_state; + + u8 node_type; + u8 phys_port_cnt; +}; + +struct ib_client { + char *name; + void (*add) (struct ib_device *); + void (*remove)(struct ib_device *); + + struct list_head list; +}; + +struct ib_device *ib_alloc_device(size_t size); +void ib_dealloc_device(struct ib_device *device); + +int ib_register_device (struct ib_device *device); +void ib_unregister_device(struct ib_device *device); + +int ib_register_client (struct ib_client *client); +void ib_unregister_client(struct ib_client *client); + +void *ib_get_client_data(struct ib_device *device, struct ib_client *client); +void ib_set_client_data(struct ib_device *device, struct ib_client *client, + void *data); + +int ib_register_event_handler (struct ib_event_handler *event_handler); +int ib_unregister_event_handler(struct ib_event_handler *event_handler); +void ib_dispatch_event(struct ib_event *event); + +int ib_query_device(struct ib_device *device, + struct ib_device_attr *device_attr); + +int ib_query_port(struct ib_device *device, + u8 port_num, struct ib_port_attr *port_attr); + +int 
ib_query_gid(struct ib_device *device, + u8 port_num, int index, union ib_gid *gid); + +int ib_query_pkey(struct ib_device *device, + u8 port_num, u16 index, u16 *pkey); + +int ib_modify_device(struct ib_device *device, + int device_modify_mask, + struct ib_device_modify *device_modify); + +int ib_modify_port(struct ib_device *device, + u8 port_num, int port_modify_mask, + struct ib_port_modify *port_modify); + +/** + * ib_alloc_pd - Allocates an unused protection domain. + * @device: The device on which to allocate the protection domain. + * + * A protection domain object provides an association between QPs, shared + * receive queues, address handles, memory regions, and memory windows. + */ +struct ib_pd *ib_alloc_pd(struct ib_device *device); + +/** + * ib_dealloc_pd - Deallocates a protection domain. + * @pd: The protection domain to deallocate. + */ +int ib_dealloc_pd(struct ib_pd *pd); + +/** + * ib_create_ah - Creates an address handle for the given address vector. + * @pd: The protection domain associated with the address handle. + * @ah_attr: The attributes of the address vector. + * + * The address handle is used to reference a local or global destination + * in all UD QP post sends. + */ +struct ib_ah *ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr); + +/** + * ib_modify_ah - Modifies the address vector associated with an address + * handle. + * @ah: The address handle to modify. + * @ah_attr: The new address vector attributes to associate with the + * address handle. + */ +int ib_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr); + +/** + * ib_query_ah - Queries the address vector associated with an address + * handle. + * @ah: The address handle to query. + * @ah_attr: The address vector attributes associated with the address + * handle. + */ +int ib_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr); + +/** + * ib_destroy_ah - Destroys an address handle. + * @ah: The address handle to destroy. 
+ */ +int ib_destroy_ah(struct ib_ah *ah); + +/** + * ib_create_qp - Creates a QP associated with the specified protection + * domain. + * @pd: The protection domain associated with the QP. + * @qp_init_attr: A list of initial attributes required to create the QP. + */ +struct ib_qp *ib_create_qp(struct ib_pd *pd, + struct ib_qp_init_attr *qp_init_attr); + +/** + * ib_modify_qp - Modifies the attributes for the specified QP and then + * transitions the QP to the given state. + * @qp: The QP to modify. + * @qp_attr: On input, specifies the QP attributes to modify. On output, + * the current values of selected QP attributes are returned. + * @qp_attr_mask: A bit-mask used to specify which attributes of the QP + * are being modified. + */ +int ib_modify_qp(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask); + +/** + * ib_query_qp - Returns the attribute list and current values for the + * specified QP. + * @qp: The QP to query. + * @qp_attr: The attributes of the specified QP. + * @qp_attr_mask: A bit-mask used to select specific attributes to query. + * @qp_init_attr: Additional attributes of the selected QP. + * + * The qp_attr_mask may be used to limit the query to gathering only the + * selected attributes. + */ +int ib_query_qp(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr); + +/** + * ib_destroy_qp - Destroys the specified QP. + * @qp: The QP to destroy. + */ +int ib_destroy_qp(struct ib_qp *qp); + +/** + * ib_post_send - Posts a list of work requests to the send queue of + * the specified QP. + * @qp: The QP to post the work request on. + * @send_wr: A list of work requests to post on the send queue. + * @bad_send_wr: On an immediate failure, this parameter will reference + * the work request that failed to be posted on the QP. 
+ */ +static inline int ib_post_send(struct ib_qp *qp, + struct ib_send_wr *send_wr, + struct ib_send_wr **bad_send_wr) +{ + return qp->device->post_send(qp, send_wr, bad_send_wr); +} + +/** + * ib_post_recv - Posts a list of work requests to the receive queue of + * the specified QP. + * @qp: The QP to post the work request on. + * @recv_wr: A list of work requests to post on the receive queue. + * @bad_recv_wr: On an immediate failure, this parameter will reference + * the work request that failed to be posted on the QP. + */ +static inline int ib_post_recv(struct ib_qp *qp, + struct ib_recv_wr *recv_wr, + struct ib_recv_wr **bad_recv_wr) +{ + return qp->device->post_recv(qp, recv_wr, bad_recv_wr); +} + +/** + * ib_create_cq - Creates a CQ on the specified device. + * @device: The device on which to create the CQ. + * @comp_handler: A user-specified callback that is invoked when a + * completion event occurs on the CQ. + * @event_handler: A user-specified callback that is invoked when an + * asynchronous event not associated with a completion occurs on the CQ. + * @cq_context: Context associated with the CQ returned to the user via + * the associated completion and event handlers. + * @cqe: The minimum size of the CQ. + * + * Users can examine the cq structure to determine the actual CQ size. + */ +struct ib_cq *ib_create_cq(struct ib_device *device, + ib_comp_handler comp_handler, + void (*event_handler)(struct ib_event *, void *), + void *cq_context, int cqe); + +/** + * ib_resize_cq - Modifies the capacity of the CQ. + * @cq: The CQ to resize. + * @cqe: The minimum size of the CQ. + * + * Users can examine the cq structure to determine the actual CQ size. + */ +int ib_resize_cq(struct ib_cq *cq, int cqe); + +/** + * ib_destroy_cq - Destroys the specified CQ. + * @cq: The CQ to destroy. 
+ */ +int ib_destroy_cq(struct ib_cq *cq); + +/** + * ib_poll_cq - poll a CQ for completion(s) + * @cq:the CQ being polled + * @num_entries:maximum number of completions to return + * @wc:array of at least @num_entries &struct ib_wc where completions + * will be returned + * + * Poll a CQ for (possibly multiple) completions. If the return value + * is < 0, an error occurred. If the return value is >= 0, it is the + * number of completions returned. If the return value is + * non-negative and < num_entries, then the CQ was emptied. + */ +static inline int ib_poll_cq(struct ib_cq *cq, int num_entries, + struct ib_wc *wc) +{ + return cq->device->poll_cq(cq, num_entries, wc); +} + +/** + * ib_peek_cq - Returns the number of unreaped completions currently + * on the specified CQ. + * @cq: The CQ to peek. + * @wc_cnt: A minimum number of unreaped completions to check for. + * + * If the number of unreaped completions is greater than or equal to wc_cnt, + * this function returns wc_cnt, otherwise, it returns the actual number of + * unreaped completions. + */ +int ib_peek_cq(struct ib_cq *cq, int wc_cnt); + +/** + * ib_req_notify_cq - Request completion notification on a CQ. + * @cq: The CQ to generate an event for. + * @cq_notify: If set to %IB_CQ_SOLICITED, completion notification will + * occur on the next solicited event. If set to %IB_CQ_NEXT_COMP, + * notification will occur on the next completion. + */ +static inline int ib_req_notify_cq(struct ib_cq *cq, + enum ib_cq_notify cq_notify) +{ + return cq->device->req_notify_cq(cq, cq_notify); +} + +/** + * ib_req_ncomp_notif - Request completion notification when there are + * at least the specified number of unreaped completions on the CQ. + * @cq: The CQ to generate an event for. + * @wc_cnt: The number of unreaped completions that should be on the + * CQ before an event is generated. + */ +static inline int ib_req_ncomp_notif(struct ib_cq *cq, int wc_cnt) +{ + return cq->device->req_ncomp_notif ? 
+ cq->device->req_ncomp_notif(cq, wc_cnt) : + -ENOSYS; +} + +/** + * ib_get_dma_mr - Returns a memory region for system memory that is + * usable for DMA. + * @pd: The protection domain associated with the memory region. + * @mr_access_flags: Specifies the memory access rights. + */ +struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags); + +/** + * ib_reg_phys_mr - Prepares a virtually addressed memory region for use + * by an HCA. + * @pd: The protection domain assigned to the registered region. + * @phys_buf_array: Specifies a list of physical buffers to use in the + * memory region. + * @num_phys_buf: Specifies the size of the phys_buf_array. + * @mr_access_flags: Specifies the memory access rights. + * @iova_start: The offset of the region's starting I/O virtual address. + */ +struct ib_mr *ib_reg_phys_mr(struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start); + +/** + * ib_rereg_phys_mr - Modifies the attributes of an existing memory region. + * Conceptually, this call performs the functions deregister memory region + * followed by register physical memory region. Where possible, + * resources are reused instead of deallocated and reallocated. + * @mr: The memory region to modify. + * @mr_rereg_mask: A bit-mask used to indicate which of the following + * properties of the memory region are being modified. + * @pd: If %IB_MR_REREG_PD is set in mr_rereg_mask, this field specifies + * the new protection domain to associate with the memory region, + * otherwise, this parameter is ignored. + * @phys_buf_array: If %IB_MR_REREG_TRANS is set in mr_rereg_mask, this + * field specifies a list of physical buffers to use in the new + * translation, otherwise, this parameter is ignored. + * @num_phys_buf: If %IB_MR_REREG_TRANS is set in mr_rereg_mask, this + * field specifies the size of the phys_buf_array, otherwise, this + * parameter is ignored. 
+ * @mr_access_flags: If %IB_MR_REREG_ACCESS is set in mr_rereg_mask, this + * field specifies the new memory access rights, otherwise, this + * parameter is ignored. + * @iova_start: The offset of the region's starting I/O virtual address. + */ +int ib_rereg_phys_mr(struct ib_mr *mr, + int mr_rereg_mask, + struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start); + +/** + * ib_query_mr - Retrieves information about a specific memory region. + * @mr: The memory region to retrieve information about. + * @mr_attr: The attributes of the specified memory region. + */ +int ib_query_mr(struct ib_mr *mr, struct ib_mr_attr *mr_attr); + +/** + * ib_dereg_mr - Deregisters a memory region and removes it from the + * HCA translation table. + * @mr: The memory region to deregister. + */ +int ib_dereg_mr(struct ib_mr *mr); + +/** + * ib_alloc_mw - Allocates a memory window. + * @pd: The protection domain associated with the memory window. + */ +struct ib_mw *ib_alloc_mw(struct ib_pd *pd); + +/** + * ib_bind_mw - Posts a work request to the send queue of the specified + * QP, which binds the memory window to the given address range and + * remote access attributes. + * @qp: QP to post the bind work request on. + * @mw: The memory window to bind. + * @mw_bind: Specifies information about the memory window, including + * its address range, remote access rights, and associated memory region. + */ +static inline int ib_bind_mw(struct ib_qp *qp, + struct ib_mw *mw, + struct ib_mw_bind *mw_bind) +{ + /* XXX reference counting in corresponding MR? */ + return mw->device->bind_mw ? + mw->device->bind_mw(qp, mw, mw_bind) : + -ENOSYS; +} + +/** + * ib_dealloc_mw - Deallocates a memory window. + * @mw: The memory window to deallocate. + */ +int ib_dealloc_mw(struct ib_mw *mw); + +/** + * ib_alloc_fmr - Allocates an unmapped fast memory region. + * @pd: The protection domain associated with the unmapped region. 
+ * @mr_access_flags: Specifies the memory access rights. + * @fmr_attr: Attributes of the unmapped region. + * + * A fast memory region must be mapped before it can be used as part of + * a work request. + */ +struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd, + int mr_access_flags, + struct ib_fmr_attr *fmr_attr); + +/** + * ib_map_phys_fmr - Maps a list of physical pages to a fast memory region. + * @fmr: The fast memory region to associate with the pages. + * @page_list: An array of physical pages to map to the fast memory region. + * @list_len: The number of pages in page_list. + * @iova: The I/O virtual address to use with the mapped region. + */ +static inline int ib_map_phys_fmr(struct ib_fmr *fmr, + u64 *page_list, int list_len, + u64 iova) +{ + return fmr->device->map_phys_fmr(fmr, page_list, list_len, iova); +} + +/** + * ib_unmap_fmr - Removes the mapping from a list of fast memory regions. + * @fmr_list: A linked list of fast memory regions to unmap. + */ +int ib_unmap_fmr(struct list_head *fmr_list); + +/** + * ib_dealloc_fmr - Deallocates a fast memory region. + * @fmr: The fast memory region to deallocate. + */ +int ib_dealloc_fmr(struct ib_fmr *fmr); + +/** + * ib_attach_mcast - Attaches the specified QP to a multicast group. + * @qp: QP to attach to the multicast group. The QP must be type + * IB_QPT_UD. + * @gid: Multicast group GID. + * @lid: Multicast group LID in host byte order. + * + * In order to send and receive multicast packets, subnet + * administration must have created the multicast group and configured + * the fabric appropriately. The port associated with the specified + * QP must also be a member of the multicast group. + */ +int ib_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid); + +/** + * ib_detach_mcast - Detaches the specified QP from a multicast group. + * @qp: QP to detach from the multicast group. + * @gid: Multicast group GID. + * @lid: Multicast group LID in host byte order. 
+ */ +int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid); + +#endif /* IB_VERBS_H */ From roland@topspin.com Mon Dec 27 21:49:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:56 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDk025948 for ; Mon, 27 Dec 2004 21:49:44 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:08 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:08 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG9-0000u5-Q9; Mon, 27 Dec 2004 21:51:08 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.5KRaOFYNt0hYYQgh@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:05 -0800 Message-Id: <200412272151.6amOd1o39RpEe1KK@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][11/24] Add Mellanox HCA low-level driver (FW commands) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:08.0173 (UTC) FILETIME=[3A1793D0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13110 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add firmware command processing code for Mellanox HCA driver. 
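For orientation, here is a minimal user-space sketch of how the last HCR word appears to be composed and decoded in this patch. It assumes the bit layout from the enum in mthca_cmd.c (HCR_OPMOD_SHIFT, HCA_E_BIT, HCR_GO_BIT) and that the status byte sits in the top 8 bits of the same word once it has been converted to CPU byte order. hcr_command_word() and hcr_status() are hypothetical helper names used only for illustration; they are not part of the driver, and the sketch leaves out the byte swapping, the semaphores and the MMIO accessors.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions copied from the enum in mthca_cmd.c. */
enum {
	HCR_OPMOD_SHIFT = 12,
	HCA_E_BIT       = 22,
	HCR_GO_BIT      = 23
};

/*
 * Hypothetical helper: build the 32-bit value (in CPU byte order) that
 * the driver writes to HCR offset 6 * 4 to start a command.
 */
static uint32_t hcr_command_word(uint16_t op, uint8_t op_modifier, int event)
{
	/* The opcode has to fit in the bits below HCR_OPMOD_SHIFT. */
	assert(op <= 0xfff);

	return (UINT32_C(1) << HCR_GO_BIT) |
	       (event ? (UINT32_C(1) << HCA_E_BIT) : 0) |
	       ((uint32_t) op_modifier << HCR_OPMOD_SHIFT) |
	       op;
}

/*
 * Hypothetical helper: extract the completion status from the same
 * word, assuming it was already converted to CPU byte order.
 */
static uint8_t hcr_status(uint32_t status_word)
{
	return (uint8_t) (status_word >> 24);
}
```

With these assumptions, a polled QUERY_FW (opcode 0x4, no modifier, no event) gives hcr_command_word(0x4, 0, 0) == 0x00800004. The poll loop then waits for bit 23 to clear before reading the status byte.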
Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_cmd.c 2004-12-27 21:48:22.369673490 -0800 @@ -0,0 +1,1573 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_cmd.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include +#include + +#include "mthca_dev.h" +#include "mthca_config_reg.h" +#include "mthca_cmd.h" + +#define CMD_POLL_TOKEN 0xffff + +enum { + HCR_IN_PARAM_OFFSET = 0x00, + HCR_IN_MODIFIER_OFFSET = 0x08, + HCR_OUT_PARAM_OFFSET = 0x0c, + HCR_TOKEN_OFFSET = 0x14, + HCR_STATUS_OFFSET = 0x18, + + HCR_OPMOD_SHIFT = 12, + HCA_E_BIT = 22, + HCR_GO_BIT = 23 +}; + +enum { + /* initialization and general commands */ + CMD_SYS_EN = 0x1, + CMD_SYS_DIS = 0x2, + CMD_MAP_FA = 0xfff, + CMD_UNMAP_FA = 0xffe, + CMD_RUN_FW = 0xff6, + CMD_MOD_STAT_CFG = 0x34, + CMD_QUERY_DEV_LIM = 0x3, + CMD_QUERY_FW = 0x4, + CMD_ENABLE_LAM = 0xff8, + CMD_DISABLE_LAM = 0xff7, + CMD_QUERY_DDR = 0x5, + CMD_QUERY_ADAPTER = 0x6, + CMD_INIT_HCA = 0x7, + CMD_CLOSE_HCA = 0x8, + CMD_INIT_IB = 0x9, + CMD_CLOSE_IB = 0xa, + CMD_QUERY_HCA = 0xb, + CMD_SET_IB = 0xc, + CMD_ACCESS_DDR = 0x2e, + CMD_MAP_ICM = 0xffa, + CMD_UNMAP_ICM = 0xff9, + CMD_MAP_ICM_AUX = 0xffc, + CMD_UNMAP_ICM_AUX = 0xffb, + CMD_SET_ICM_SIZE = 0xffd, + + /* TPT commands */ + CMD_SW2HW_MPT = 0xd, + CMD_QUERY_MPT = 0xe, + CMD_HW2SW_MPT = 0xf, + CMD_READ_MTT = 0x10, + CMD_WRITE_MTT = 0x11, + CMD_SYNC_TPT = 0x2f, + + /* EQ commands */ + CMD_MAP_EQ = 0x12, + CMD_SW2HW_EQ = 0x13, + CMD_HW2SW_EQ = 0x14, + CMD_QUERY_EQ = 0x15, + + /* CQ commands */ + CMD_SW2HW_CQ = 0x16, + CMD_HW2SW_CQ = 0x17, + CMD_QUERY_CQ = 0x18, + CMD_RESIZE_CQ = 0x2c, + + /* SRQ commands */ + CMD_SW2HW_SRQ = 0x35, + CMD_HW2SW_SRQ = 0x36, + CMD_QUERY_SRQ = 0x37, + + /* QP/EE commands */ + CMD_RST2INIT_QPEE = 0x19, + CMD_INIT2RTR_QPEE = 0x1a, + CMD_RTR2RTS_QPEE = 0x1b, + CMD_RTS2RTS_QPEE = 0x1c, + CMD_SQERR2RTS_QPEE = 0x1d, + CMD_2ERR_QPEE = 0x1e, + CMD_RTS2SQD_QPEE = 0x1f, + CMD_SQD2SQD_QPEE = 0x38, + CMD_SQD2RTS_QPEE = 0x20, + CMD_ERR2RST_QPEE = 0x21, + CMD_QUERY_QPEE = 0x22, + CMD_INIT2INIT_QPEE = 0x2d, + CMD_SUSPEND_QPEE = 0x32, + CMD_UNSUSPEND_QPEE = 0x33, + /* special QPs and management 
commands */ + CMD_CONF_SPECIAL_QP = 0x23, + CMD_MAD_IFC = 0x24, + + /* multicast commands */ + CMD_READ_MGM = 0x25, + CMD_WRITE_MGM = 0x26, + CMD_MGID_HASH = 0x27, + + /* miscellaneous commands */ + CMD_DIAG_RPRT = 0x30, + CMD_NOP = 0x31, + + /* debug commands */ + CMD_QUERY_DEBUG_MSG = 0x2a, + CMD_SET_DEBUG_MSG = 0x2b, +}; + +/* + * According to Mellanox code, FW may be starved and never complete + * commands. So we can't use strict timeouts described in PRM -- we + * just arbitrarily select 60 seconds for now. + */ +#if 0 +/* + * Round up and add 1 to make sure we get the full wait time (since we + * will be starting in the middle of a jiffy) + */ +enum { + CMD_TIME_CLASS_A = (HZ + 999) / 1000 + 1, + CMD_TIME_CLASS_B = (HZ + 99) / 100 + 1, + CMD_TIME_CLASS_C = (HZ + 9) / 10 + 1 +}; +#else +enum { + CMD_TIME_CLASS_A = 60 * HZ, + CMD_TIME_CLASS_B = 60 * HZ, + CMD_TIME_CLASS_C = 60 * HZ +}; +#endif + +enum { + GO_BIT_TIMEOUT = HZ * 10 +}; + +struct mthca_cmd_context { + struct completion done; + struct timer_list timer; + int result; + int next; + u64 out_param; + u16 token; + u8 status; +}; + +static inline int go_bit(struct mthca_dev *dev) +{ + return readl(dev->hcr + HCR_STATUS_OFFSET) & + swab32(1 << HCR_GO_BIT); +} + +static int mthca_cmd_post(struct mthca_dev *dev, + u64 in_param, + u64 out_param, + u32 in_modifier, + u8 op_modifier, + u16 op, + u16 token, + int event) +{ + int err = 0; + + if (down_interruptible(&dev->cmd.hcr_sem)) + return -EINTR; + + if (event) { + unsigned long end = jiffies + GO_BIT_TIMEOUT; + + while (go_bit(dev) && time_before(jiffies, end)) { + set_current_state(TASK_RUNNING); + schedule(); + } + } + + if (go_bit(dev)) { + err = -EAGAIN; + goto out; + } + + /* + * We use writel (instead of something like memcpy_toio) + * because writes of less than 32 bits to the HCR don't work + * (and some architectures such as ia64 implement memcpy_toio + * in terms of writeb). 
+ */ + __raw_writel(cpu_to_be32(in_param >> 32), dev->hcr + 0 * 4); + __raw_writel(cpu_to_be32(in_param & 0xfffffffful), dev->hcr + 1 * 4); + __raw_writel(cpu_to_be32(in_modifier), dev->hcr + 2 * 4); + __raw_writel(cpu_to_be32(out_param >> 32), dev->hcr + 3 * 4); + __raw_writel(cpu_to_be32(out_param & 0xfffffffful), dev->hcr + 4 * 4); + __raw_writel(cpu_to_be32(token << 16), dev->hcr + 5 * 4); + + /* __raw_writel may not order writes. */ + wmb(); + + __raw_writel(cpu_to_be32((1 << HCR_GO_BIT) | + (event ? (1 << HCA_E_BIT) : 0) | + (op_modifier << HCR_OPMOD_SHIFT) | + op), dev->hcr + 6 * 4); + +out: + up(&dev->cmd.hcr_sem); + return err; +} + +static int mthca_cmd_poll(struct mthca_dev *dev, + u64 in_param, + u64 *out_param, + int out_is_imm, + u32 in_modifier, + u8 op_modifier, + u16 op, + unsigned long timeout, + u8 *status) +{ + int err = 0; + unsigned long end; + + if (down_interruptible(&dev->cmd.poll_sem)) + return -EINTR; + + err = mthca_cmd_post(dev, in_param, + out_param ? *out_param : 0, + in_modifier, op_modifier, + op, CMD_POLL_TOKEN, 0); + if (err) + goto out; + + end = timeout + jiffies; + while (go_bit(dev) && time_before(jiffies, end)) { + set_current_state(TASK_RUNNING); + schedule(); + } + + if (go_bit(dev)) { + err = -EBUSY; + goto out; + } + + if (out_is_imm) { + memcpy_fromio(out_param, dev->hcr + HCR_OUT_PARAM_OFFSET, sizeof (u64)); + be64_to_cpus(out_param); + } + + *status = be32_to_cpu(__raw_readl(dev->hcr + HCR_STATUS_OFFSET)) >> 24; + +out: + up(&dev->cmd.poll_sem); + return err; +} + +void mthca_cmd_event(struct mthca_dev *dev, + u16 token, + u8 status, + u64 out_param) +{ + struct mthca_cmd_context *context = + &dev->cmd.context[token & dev->cmd.token_mask]; + + /* previously timed out command completing at long last */ + if (token != context->token) + return; + + context->result = 0; + context->status = status; + context->out_param = out_param; + + context->token += dev->cmd.token_mask + 1; + + complete(&context->done); +} + +static 
void event_timeout(unsigned long context_ptr) +{ + struct mthca_cmd_context *context = + (struct mthca_cmd_context *) context_ptr; + + context->result = -EBUSY; + complete(&context->done); +} + +static int mthca_cmd_wait(struct mthca_dev *dev, + u64 in_param, + u64 *out_param, + int out_is_imm, + u32 in_modifier, + u8 op_modifier, + u16 op, + unsigned long timeout, + u8 *status) +{ + int err = 0; + struct mthca_cmd_context *context; + + if (down_interruptible(&dev->cmd.event_sem)) + return -EINTR; + + spin_lock(&dev->cmd.context_lock); + BUG_ON(dev->cmd.free_head < 0); + context = &dev->cmd.context[dev->cmd.free_head]; + dev->cmd.free_head = context->next; + spin_unlock(&dev->cmd.context_lock); + + init_completion(&context->done); + + err = mthca_cmd_post(dev, in_param, + out_param ? *out_param : 0, + in_modifier, op_modifier, + op, context->token, 1); + if (err) + goto out; + + context->timer.expires = jiffies + timeout; + add_timer(&context->timer); + + wait_for_completion(&context->done); + del_timer_sync(&context->timer); + + err = context->result; + if (err) + goto out; + + *status = context->status; + if (*status) + mthca_dbg(dev, "Command %02x completed with status %02x\n", + op, *status); + + if (out_is_imm) + *out_param = context->out_param; + +out: + spin_lock(&dev->cmd.context_lock); + context->next = dev->cmd.free_head; + dev->cmd.free_head = context - dev->cmd.context; + spin_unlock(&dev->cmd.context_lock); + + up(&dev->cmd.event_sem); + return err; +} + +/* Invoke a command with an output mailbox */ +static int mthca_cmd_box(struct mthca_dev *dev, + u64 in_param, + u64 out_param, + u32 in_modifier, + u8 op_modifier, + u16 op, + unsigned long timeout, + u8 *status) +{ + if (dev->cmd.use_events) + return mthca_cmd_wait(dev, in_param, &out_param, 0, + in_modifier, op_modifier, op, + timeout, status); + else + return mthca_cmd_poll(dev, in_param, &out_param, 0, + in_modifier, op_modifier, op, + timeout, status); +} + +/* Invoke a command with no output 
parameter */ +static int mthca_cmd(struct mthca_dev *dev, + u64 in_param, + u32 in_modifier, + u8 op_modifier, + u16 op, + unsigned long timeout, + u8 *status) +{ + return mthca_cmd_box(dev, in_param, 0, in_modifier, + op_modifier, op, timeout, status); +} + +/* + * Invoke a command with an immediate output parameter (and copy the + * output into the caller's out_param pointer after the command + * executes). + */ +static int mthca_cmd_imm(struct mthca_dev *dev, + u64 in_param, + u64 *out_param, + u32 in_modifier, + u8 op_modifier, + u16 op, + unsigned long timeout, + u8 *status) +{ + if (dev->cmd.use_events) + return mthca_cmd_wait(dev, in_param, out_param, 1, + in_modifier, op_modifier, op, + timeout, status); + else + return mthca_cmd_poll(dev, in_param, out_param, 1, + in_modifier, op_modifier, op, + timeout, status); +} + +/* + * Switch to using events to issue FW commands (should be called after + * event queue to command events has been initialized). + */ +int mthca_cmd_use_events(struct mthca_dev *dev) +{ + int i; + + dev->cmd.context = kmalloc(dev->cmd.max_cmds * + sizeof (struct mthca_cmd_context), + GFP_KERNEL); + if (!dev->cmd.context) + return -ENOMEM; + + for (i = 0; i < dev->cmd.max_cmds; ++i) { + dev->cmd.context[i].token = i; + dev->cmd.context[i].next = i + 1; + init_timer(&dev->cmd.context[i].timer); + dev->cmd.context[i].timer.data = + (unsigned long) &dev->cmd.context[i]; + dev->cmd.context[i].timer.function = event_timeout; + } + + dev->cmd.context[dev->cmd.max_cmds - 1].next = -1; + dev->cmd.free_head = 0; + + sema_init(&dev->cmd.event_sem, dev->cmd.max_cmds); + spin_lock_init(&dev->cmd.context_lock); + + for (dev->cmd.token_mask = 1; + dev->cmd.token_mask < dev->cmd.max_cmds; + dev->cmd.token_mask <<= 1) + ; /* nothing */ + --dev->cmd.token_mask; + + dev->cmd.use_events = 1; + down(&dev->cmd.poll_sem); + + return 0; +} + +/* + * Switch back to polling (used when shutting down the device) + */ +void mthca_cmd_use_polling(struct mthca_dev 
*dev) +{ + int i; + + dev->cmd.use_events = 0; + + for (i = 0; i < dev->cmd.max_cmds; ++i) + down(&dev->cmd.event_sem); + + kfree(dev->cmd.context); + + up(&dev->cmd.poll_sem); +} + +int mthca_SYS_EN(struct mthca_dev *dev, u8 *status) +{ + u64 out; + int ret; + + ret = mthca_cmd_imm(dev, 0, &out, 0, 0, CMD_SYS_EN, HZ, status); + + if (*status == MTHCA_CMD_STAT_DDR_MEM_ERR) + mthca_warn(dev, "SYS_EN DDR error: syn=%x, sock=%d, " + "sladdr=%d, SPD source=%s\n", + (int) (out >> 6) & 0xf, (int) (out >> 4) & 3, + (int) (out >> 1) & 7, (int) out & 1 ? "NVMEM" : "DIMM"); + + return ret; +} + +int mthca_SYS_DIS(struct mthca_dev *dev, u8 *status) +{ + return mthca_cmd(dev, 0, 0, 0, CMD_SYS_DIS, HZ, status); +} + +int mthca_MAP_FA(struct mthca_dev *dev, int count, + struct scatterlist *sglist, u8 *status) +{ + u32 *inbox; + dma_addr_t indma; + int lg; + int nent = 0; + int i, j; + int err = 0; + int ts = 0; + + inbox = pci_alloc_consistent(dev->pdev, PAGE_SIZE, &indma); + memset(inbox, 0, PAGE_SIZE); + + for (i = 0; i < count; ++i) { + /* + * We have to pass pages that are aligned to their + * size, so find the least significant 1 in the + * address or size and use that as our log2 size. 
+ */ + lg = ffs(sg_dma_address(sglist + i) | sg_dma_len(sglist + i)) - 1; + if (lg < 12) { + mthca_warn(dev, "Got FW area not aligned to 4K (%llx/%x).\n", + (unsigned long long) sg_dma_address(sglist + i), + sg_dma_len(sglist + i)); + err = -EINVAL; + goto out; + } + for (j = 0; j < sg_dma_len(sglist + i) / (1 << lg); ++j, ++nent) { + *((__be64 *) (inbox + nent * 4 + 2)) = + cpu_to_be64((sg_dma_address(sglist + i) + + (j << lg)) | + (lg - 12)); + ts += 1 << (lg - 10); + if (nent == PAGE_SIZE / 16) { + err = mthca_cmd(dev, indma, nent, 0, CMD_MAP_FA, + CMD_TIME_CLASS_B, status); + if (err || *status) + goto out; + nent = 0; + } + } + } + + if (nent) { + err = mthca_cmd(dev, indma, nent, 0, CMD_MAP_FA, + CMD_TIME_CLASS_B, status); + } + + mthca_dbg(dev, "Mapped %d KB of host memory for FW.\n", ts); + +out: + pci_free_consistent(dev->pdev, PAGE_SIZE, inbox, indma); + return err; +} + +int mthca_UNMAP_FA(struct mthca_dev *dev, u8 *status) +{ + return mthca_cmd(dev, 0, 0, 0, CMD_UNMAP_FA, CMD_TIME_CLASS_B, status); +} + +int mthca_RUN_FW(struct mthca_dev *dev, u8 *status) +{ + return mthca_cmd(dev, 0, 0, 0, CMD_RUN_FW, CMD_TIME_CLASS_A, status); +} + +int mthca_QUERY_FW(struct mthca_dev *dev, u8 *status) +{ + u32 *outbox; + dma_addr_t outdma; + int err = 0; + u8 lg; + +#define QUERY_FW_OUT_SIZE 0x100 +#define QUERY_FW_VER_OFFSET 0x00 +#define QUERY_FW_MAX_CMD_OFFSET 0x0f +#define QUERY_FW_ERR_START_OFFSET 0x30 +#define QUERY_FW_ERR_SIZE_OFFSET 0x38 + +#define QUERY_FW_START_OFFSET 0x20 +#define QUERY_FW_END_OFFSET 0x28 + +#define QUERY_FW_SIZE_OFFSET 0x00 +#define QUERY_FW_CLR_INT_BASE_OFFSET 0x20 +#define QUERY_FW_EQ_ARM_BASE_OFFSET 0x40 +#define QUERY_FW_EQ_SET_CI_BASE_OFFSET 0x48 + + outbox = pci_alloc_consistent(dev->pdev, QUERY_FW_OUT_SIZE, &outdma); + if (!outbox) { + return -ENOMEM; + } + + err = mthca_cmd_box(dev, 0, outdma, 0, 0, CMD_QUERY_FW, + CMD_TIME_CLASS_A, status); + + if (err) + goto out; + + MTHCA_GET(dev->fw_ver, outbox, QUERY_FW_VER_OFFSET); + /* + * 
FW subminor version is at more significant bits than minor + * version, so swap here. + */ + dev->fw_ver = (dev->fw_ver & 0xffff00000000ull) | + ((dev->fw_ver & 0xffff0000ull) >> 16) | + ((dev->fw_ver & 0x0000ffffull) << 16); + + MTHCA_GET(lg, outbox, QUERY_FW_MAX_CMD_OFFSET); + dev->cmd.max_cmds = 1 << lg; + + mthca_dbg(dev, "FW version %012llx, max commands %d\n", + (unsigned long long) dev->fw_ver, dev->cmd.max_cmds); + + if (dev->hca_type == ARBEL_NATIVE) { + MTHCA_GET(dev->fw.arbel.fw_pages, outbox, QUERY_FW_SIZE_OFFSET); + MTHCA_GET(dev->fw.arbel.clr_int_base, outbox, QUERY_FW_CLR_INT_BASE_OFFSET); + MTHCA_GET(dev->fw.arbel.eq_arm_base, outbox, QUERY_FW_EQ_ARM_BASE_OFFSET); + MTHCA_GET(dev->fw.arbel.eq_set_ci_base, outbox, QUERY_FW_EQ_SET_CI_BASE_OFFSET); + mthca_dbg(dev, "FW size %d KB\n", dev->fw.arbel.fw_pages << 2); + + /* + * Arbel page size is always 4 KB; round up number of + * system pages needed. + */ + dev->fw.arbel.fw_pages = + (dev->fw.arbel.fw_pages + (1 << (PAGE_SHIFT - 12)) - 1) >> + (PAGE_SHIFT - 12); + + mthca_dbg(dev, "Clear int @ %llx, EQ arm @ %llx, EQ set CI @ %llx\n", + (unsigned long long) dev->fw.arbel.clr_int_base, + (unsigned long long) dev->fw.arbel.eq_arm_base, + (unsigned long long) dev->fw.arbel.eq_set_ci_base); + } else { + MTHCA_GET(dev->fw.tavor.fw_start, outbox, QUERY_FW_START_OFFSET); + MTHCA_GET(dev->fw.tavor.fw_end, outbox, QUERY_FW_END_OFFSET); + + mthca_dbg(dev, "FW size %d KB (start %llx, end %llx)\n", + (int) ((dev->fw.tavor.fw_end - dev->fw.tavor.fw_start) >> 10), + (unsigned long long) dev->fw.tavor.fw_start, + (unsigned long long) dev->fw.tavor.fw_end); + } + +out: + pci_free_consistent(dev->pdev, QUERY_FW_OUT_SIZE, outbox, outdma); + return err; +} + +int mthca_ENABLE_LAM(struct mthca_dev *dev, u8 *status) +{ + u8 info; + u32 *outbox; + dma_addr_t outdma; + int err = 0; + +#define ENABLE_LAM_OUT_SIZE 0x100 +#define ENABLE_LAM_START_OFFSET 0x00 +#define ENABLE_LAM_END_OFFSET 0x08 +#define ENABLE_LAM_INFO_OFFSET 0x13 + 
+#define ENABLE_LAM_INFO_HIDDEN_FLAG (1 << 4) +#define ENABLE_LAM_INFO_ECC_MASK 0x3 + + outbox = pci_alloc_consistent(dev->pdev, ENABLE_LAM_OUT_SIZE, &outdma); + if (!outbox) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, 0, 0, CMD_ENABLE_LAM, + CMD_TIME_CLASS_C, status); + + if (err) + goto out; + + if (*status == MTHCA_CMD_STAT_LAM_NOT_PRE) + goto out; + + MTHCA_GET(dev->ddr_start, outbox, ENABLE_LAM_START_OFFSET); + MTHCA_GET(dev->ddr_end, outbox, ENABLE_LAM_END_OFFSET); + MTHCA_GET(info, outbox, ENABLE_LAM_INFO_OFFSET); + + if (!!(info & ENABLE_LAM_INFO_HIDDEN_FLAG) != + !!(dev->mthca_flags & MTHCA_FLAG_DDR_HIDDEN)) { + mthca_info(dev, "FW reports that HCA-attached memory " + "is %s hidden; does not match PCI config\n", + (info & ENABLE_LAM_INFO_HIDDEN_FLAG) ? + "" : "not"); + } + if (info & ENABLE_LAM_INFO_HIDDEN_FLAG) + mthca_dbg(dev, "HCA-attached memory is hidden.\n"); + + mthca_dbg(dev, "HCA memory size %d KB (start %llx, end %llx)\n", + (int) ((dev->ddr_end - dev->ddr_start) >> 10), + (unsigned long long) dev->ddr_start, + (unsigned long long) dev->ddr_end); + +out: + pci_free_consistent(dev->pdev, ENABLE_LAM_OUT_SIZE, outbox, outdma); + return err; +} + +int mthca_DISABLE_LAM(struct mthca_dev *dev, u8 *status) +{ + return mthca_cmd(dev, 0, 0, 0, CMD_DISABLE_LAM, CMD_TIME_CLASS_C, status); +} + +int mthca_QUERY_DDR(struct mthca_dev *dev, u8 *status) +{ + u8 info; + u32 *outbox; + dma_addr_t outdma; + int err = 0; + +#define QUERY_DDR_OUT_SIZE 0x100 +#define QUERY_DDR_START_OFFSET 0x00 +#define QUERY_DDR_END_OFFSET 0x08 +#define QUERY_DDR_INFO_OFFSET 0x13 + +#define QUERY_DDR_INFO_HIDDEN_FLAG (1 << 4) +#define QUERY_DDR_INFO_ECC_MASK 0x3 + + outbox = pci_alloc_consistent(dev->pdev, QUERY_DDR_OUT_SIZE, &outdma); + if (!outbox) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, 0, 0, CMD_QUERY_DDR, + CMD_TIME_CLASS_A, status); + + if (err) + goto out; + + MTHCA_GET(dev->ddr_start, outbox, QUERY_DDR_START_OFFSET); + MTHCA_GET(dev->ddr_end, 
outbox, QUERY_DDR_END_OFFSET); + MTHCA_GET(info, outbox, QUERY_DDR_INFO_OFFSET); + + if (!!(info & QUERY_DDR_INFO_HIDDEN_FLAG) != + !!(dev->mthca_flags & MTHCA_FLAG_DDR_HIDDEN)) { + mthca_info(dev, "FW reports that HCA-attached memory " + "is %s hidden; does not match PCI config\n", + (info & QUERY_DDR_INFO_HIDDEN_FLAG) ? + "" : "not"); + } + if (info & QUERY_DDR_INFO_HIDDEN_FLAG) + mthca_dbg(dev, "HCA-attached memory is hidden.\n"); + + mthca_dbg(dev, "HCA memory size %d KB (start %llx, end %llx)\n", + (int) ((dev->ddr_end - dev->ddr_start) >> 10), + (unsigned long long) dev->ddr_start, + (unsigned long long) dev->ddr_end); + +out: + pci_free_consistent(dev->pdev, QUERY_DDR_OUT_SIZE, outbox, outdma); + return err; +} + +int mthca_QUERY_DEV_LIM(struct mthca_dev *dev, + struct mthca_dev_lim *dev_lim, u8 *status) +{ + u32 *outbox; + dma_addr_t outdma; + u8 field; + u16 size; + int err; + +#define QUERY_DEV_LIM_OUT_SIZE 0x100 +#define QUERY_DEV_LIM_MAX_SRQ_SZ_OFFSET 0x10 +#define QUERY_DEV_LIM_MAX_QP_SZ_OFFSET 0x11 +#define QUERY_DEV_LIM_RSVD_QP_OFFSET 0x12 +#define QUERY_DEV_LIM_MAX_QP_OFFSET 0x13 +#define QUERY_DEV_LIM_RSVD_SRQ_OFFSET 0x14 +#define QUERY_DEV_LIM_MAX_SRQ_OFFSET 0x15 +#define QUERY_DEV_LIM_RSVD_EEC_OFFSET 0x16 +#define QUERY_DEV_LIM_MAX_EEC_OFFSET 0x17 +#define QUERY_DEV_LIM_MAX_CQ_SZ_OFFSET 0x19 +#define QUERY_DEV_LIM_RSVD_CQ_OFFSET 0x1a +#define QUERY_DEV_LIM_MAX_CQ_OFFSET 0x1b +#define QUERY_DEV_LIM_MAX_MPT_OFFSET 0x1d +#define QUERY_DEV_LIM_RSVD_EQ_OFFSET 0x1e +#define QUERY_DEV_LIM_MAX_EQ_OFFSET 0x1f +#define QUERY_DEV_LIM_RSVD_MTT_OFFSET 0x20 +#define QUERY_DEV_LIM_MAX_MRW_SZ_OFFSET 0x21 +#define QUERY_DEV_LIM_RSVD_MRW_OFFSET 0x22 +#define QUERY_DEV_LIM_MAX_MTT_SEG_OFFSET 0x23 +#define QUERY_DEV_LIM_MAX_AV_OFFSET 0x27 +#define QUERY_DEV_LIM_MAX_REQ_QP_OFFSET 0x29 +#define QUERY_DEV_LIM_MAX_RES_QP_OFFSET 0x2b +#define QUERY_DEV_LIM_MAX_RDMA_OFFSET 0x2f +#define QUERY_DEV_LIM_RSZ_SRQ_OFFSET 0x33 +#define QUERY_DEV_LIM_ACK_DELAY_OFFSET 0x35 
+#define QUERY_DEV_LIM_MTU_WIDTH_OFFSET 0x36 +#define QUERY_DEV_LIM_VL_PORT_OFFSET 0x37 +#define QUERY_DEV_LIM_MAX_GID_OFFSET 0x3b +#define QUERY_DEV_LIM_MAX_PKEY_OFFSET 0x3f +#define QUERY_DEV_LIM_FLAGS_OFFSET 0x44 +#define QUERY_DEV_LIM_RSVD_UAR_OFFSET 0x48 +#define QUERY_DEV_LIM_UAR_SZ_OFFSET 0x49 +#define QUERY_DEV_LIM_PAGE_SZ_OFFSET 0x4b +#define QUERY_DEV_LIM_MAX_SG_OFFSET 0x51 +#define QUERY_DEV_LIM_MAX_DESC_SZ_OFFSET 0x52 +#define QUERY_DEV_LIM_MAX_SG_RQ_OFFSET 0x55 +#define QUERY_DEV_LIM_MAX_DESC_SZ_RQ_OFFSET 0x56 +#define QUERY_DEV_LIM_MAX_QP_MCG_OFFSET 0x61 +#define QUERY_DEV_LIM_RSVD_MCG_OFFSET 0x62 +#define QUERY_DEV_LIM_MAX_MCG_OFFSET 0x63 +#define QUERY_DEV_LIM_RSVD_PD_OFFSET 0x64 +#define QUERY_DEV_LIM_MAX_PD_OFFSET 0x65 +#define QUERY_DEV_LIM_RSVD_RDD_OFFSET 0x66 +#define QUERY_DEV_LIM_MAX_RDD_OFFSET 0x67 +#define QUERY_DEV_LIM_EEC_ENTRY_SZ_OFFSET 0x80 +#define QUERY_DEV_LIM_QPC_ENTRY_SZ_OFFSET 0x82 +#define QUERY_DEV_LIM_EEEC_ENTRY_SZ_OFFSET 0x84 +#define QUERY_DEV_LIM_EQPC_ENTRY_SZ_OFFSET 0x86 +#define QUERY_DEV_LIM_EQC_ENTRY_SZ_OFFSET 0x88 +#define QUERY_DEV_LIM_CQC_ENTRY_SZ_OFFSET 0x8a +#define QUERY_DEV_LIM_SRQ_ENTRY_SZ_OFFSET 0x8c +#define QUERY_DEV_LIM_UAR_ENTRY_SZ_OFFSET 0x8e +#define QUERY_DEV_LIM_MTT_ENTRY_SZ_OFFSET 0x90 +#define QUERY_DEV_LIM_MPT_ENTRY_SZ_OFFSET 0x92 +#define QUERY_DEV_LIM_PBL_SZ_OFFSET 0x96 +#define QUERY_DEV_LIM_BMME_FLAGS_OFFSET 0x97 +#define QUERY_DEV_LIM_RSVD_LKEY_OFFSET 0x98 +#define QUERY_DEV_LIM_LAMR_OFFSET 0x9f +#define QUERY_DEV_LIM_MAX_ICM_SZ_OFFSET 0xa0 + + outbox = pci_alloc_consistent(dev->pdev, QUERY_DEV_LIM_OUT_SIZE, &outdma); + if (!outbox) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, 0, 0, CMD_QUERY_DEV_LIM, + CMD_TIME_CLASS_A, status); + + if (err) + goto out; + + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_SRQ_SZ_OFFSET); + dev_lim->max_srq_sz = 1 << field; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_QP_SZ_OFFSET); + dev_lim->max_qp_sz = 1 << field; + MTHCA_GET(field, outbox, 
QUERY_DEV_LIM_RSVD_QP_OFFSET); + dev_lim->reserved_qps = 1 << (field & 0xf); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_QP_OFFSET); + dev_lim->max_qps = 1 << (field & 0x1f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_SRQ_OFFSET); + dev_lim->reserved_srqs = 1 << (field >> 4); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_SRQ_OFFSET); + dev_lim->max_srqs = 1 << (field & 0x1f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_EEC_OFFSET); + dev_lim->reserved_eecs = 1 << (field & 0xf); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_EEC_OFFSET); + dev_lim->max_eecs = 1 << (field & 0x1f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_CQ_SZ_OFFSET); + dev_lim->max_cq_sz = 1 << field; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_CQ_OFFSET); + dev_lim->reserved_cqs = 1 << (field & 0xf); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_CQ_OFFSET); + dev_lim->max_cqs = 1 << (field & 0x1f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_MPT_OFFSET); + dev_lim->max_mpts = 1 << (field & 0x3f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_EQ_OFFSET); + dev_lim->reserved_eqs = 1 << (field & 0xf); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_EQ_OFFSET); + dev_lim->max_eqs = 1 << (field & 0x7); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_MTT_OFFSET); + dev_lim->reserved_mtts = 1 << (field >> 4); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_MRW_SZ_OFFSET); + dev_lim->max_mrw_sz = 1 << field; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_MRW_OFFSET); + dev_lim->reserved_mrws = 1 << (field & 0xf); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_MTT_SEG_OFFSET); + dev_lim->max_mtt_seg = 1 << (field & 0x3f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_REQ_QP_OFFSET); + dev_lim->max_requester_per_qp = 1 << (field & 0x3f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_RES_QP_OFFSET); + dev_lim->max_responder_per_qp = 1 << (field & 0x3f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_RDMA_OFFSET); + dev_lim->max_rdma_global = 1 << (field & 0x3f); + MTHCA_GET(field, outbox, 
QUERY_DEV_LIM_ACK_DELAY_OFFSET); + dev_lim->local_ca_ack_delay = field & 0x1f; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MTU_WIDTH_OFFSET); + dev_lim->max_mtu = field >> 4; + dev_lim->max_port_width = field & 0xf; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_VL_PORT_OFFSET); + dev_lim->max_vl = field >> 4; + dev_lim->num_ports = field & 0xf; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_GID_OFFSET); + dev_lim->max_gids = 1 << (field & 0xf); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_PKEY_OFFSET); + dev_lim->max_pkeys = 1 << (field & 0xf); + MTHCA_GET(dev_lim->flags, outbox, QUERY_DEV_LIM_FLAGS_OFFSET); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_UAR_OFFSET); + dev_lim->reserved_uars = field >> 4; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_UAR_SZ_OFFSET); + dev_lim->uar_size = 1 << ((field & 0x3f) + 20); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_PAGE_SZ_OFFSET); + dev_lim->min_page_sz = 1 << field; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_SG_OFFSET); + dev_lim->max_sg = field; + + MTHCA_GET(size, outbox, QUERY_DEV_LIM_MAX_DESC_SZ_OFFSET); + dev_lim->max_desc_sz = size; + + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_QP_MCG_OFFSET); + dev_lim->max_qp_per_mcg = 1 << field; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_MCG_OFFSET); + dev_lim->reserved_mgms = field & 0xf; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_MCG_OFFSET); + dev_lim->max_mcgs = 1 << field; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_PD_OFFSET); + dev_lim->reserved_pds = field >> 4; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_PD_OFFSET); + dev_lim->max_pds = 1 << (field & 0x3f); + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSVD_RDD_OFFSET); + dev_lim->reserved_rdds = field >> 4; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_RDD_OFFSET); + dev_lim->max_rdds = 1 << (field & 0x3f); + + MTHCA_GET(size, outbox, QUERY_DEV_LIM_EEC_ENTRY_SZ_OFFSET); + dev_lim->eec_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_QPC_ENTRY_SZ_OFFSET); + dev_lim->qpc_entry_sz = size; + MTHCA_GET(size, outbox, 
QUERY_DEV_LIM_EEEC_ENTRY_SZ_OFFSET); + dev_lim->eeec_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_EQPC_ENTRY_SZ_OFFSET); + dev_lim->eqpc_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_EQC_ENTRY_SZ_OFFSET); + dev_lim->eqc_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_CQC_ENTRY_SZ_OFFSET); + dev_lim->cqc_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_SRQ_ENTRY_SZ_OFFSET); + dev_lim->srq_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_UAR_ENTRY_SZ_OFFSET); + dev_lim->uar_scratch_entry_sz = size; + + mthca_dbg(dev, "Max QPs: %d, reserved QPs: %d, entry size: %d\n", + dev_lim->max_qps, dev_lim->reserved_qps, dev_lim->qpc_entry_sz); + mthca_dbg(dev, "Max CQs: %d, reserved CQs: %d, entry size: %d\n", + dev_lim->max_cqs, dev_lim->reserved_cqs, dev_lim->cqc_entry_sz); + mthca_dbg(dev, "Max EQs: %d, reserved EQs: %d, entry size: %d\n", + dev_lim->max_eqs, dev_lim->reserved_eqs, dev_lim->eqc_entry_sz); + mthca_dbg(dev, "reserved MPTs: %d, reserved MTTs: %d\n", + dev_lim->reserved_mrws, dev_lim->reserved_mtts); + mthca_dbg(dev, "Max PDs: %d, reserved PDs: %d, reserved UARs: %d\n", + dev_lim->max_pds, dev_lim->reserved_pds, dev_lim->reserved_uars); + mthca_dbg(dev, "Max QP/MCG: %d, reserved MGMs: %d\n", + dev_lim->max_qp_per_mcg, dev_lim->reserved_mgms); + + mthca_dbg(dev, "Flags: %08x\n", dev_lim->flags); + + if (dev->hca_type == ARBEL_NATIVE) { + MTHCA_GET(field, outbox, QUERY_DEV_LIM_RSZ_SRQ_OFFSET); + dev_lim->hca.arbel.resize_srq = field & 1; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_MTT_ENTRY_SZ_OFFSET); + dev_lim->hca.arbel.mtt_entry_sz = size; + MTHCA_GET(size, outbox, QUERY_DEV_LIM_MPT_ENTRY_SZ_OFFSET); + dev_lim->hca.arbel.mpt_entry_sz = size; + MTHCA_GET(field, outbox, QUERY_DEV_LIM_PBL_SZ_OFFSET); + dev_lim->hca.arbel.max_pbl_sz = 1 << (field & 0x3f); + MTHCA_GET(dev_lim->hca.arbel.bmme_flags, outbox, + QUERY_DEV_LIM_BMME_FLAGS_OFFSET); + MTHCA_GET(dev_lim->hca.arbel.reserved_lkey, outbox, + QUERY_DEV_LIM_RSVD_LKEY_OFFSET); 
+ MTHCA_GET(field, outbox, QUERY_DEV_LIM_LAMR_OFFSET); + dev_lim->hca.arbel.lam_required = field & 1; + MTHCA_GET(dev_lim->hca.arbel.max_icm_sz, outbox, + QUERY_DEV_LIM_MAX_ICM_SZ_OFFSET); + + if (dev_lim->hca.arbel.bmme_flags & 1) + mthca_dbg(dev, "Base MM extensions: yes " + "(flags %d, max PBL %d, rsvd L_Key %08x)\n", + dev_lim->hca.arbel.bmme_flags, + dev_lim->hca.arbel.max_pbl_sz, + dev_lim->hca.arbel.reserved_lkey); + else + mthca_dbg(dev, "Base MM extensions: no\n"); + + mthca_dbg(dev, "Max ICM size %lld MB\n", + (unsigned long long) dev_lim->hca.arbel.max_icm_sz >> 20); + } else { + MTHCA_GET(field, outbox, QUERY_DEV_LIM_MAX_AV_OFFSET); + dev_lim->hca.tavor.max_avs = 1 << (field & 0x3f); + } + +out: + pci_free_consistent(dev->pdev, QUERY_DEV_LIM_OUT_SIZE, outbox, outdma); + return err; +} + +int mthca_QUERY_ADAPTER(struct mthca_dev *dev, + struct mthca_adapter *adapter, u8 *status) +{ + u32 *outbox; + dma_addr_t outdma; + int err; + +#define QUERY_ADAPTER_OUT_SIZE 0x100 +#define QUERY_ADAPTER_VENDOR_ID_OFFSET 0x00 +#define QUERY_ADAPTER_DEVICE_ID_OFFSET 0x04 +#define QUERY_ADAPTER_REVISION_ID_OFFSET 0x08 +#define QUERY_ADAPTER_INTA_PIN_OFFSET 0x10 + + outbox = pci_alloc_consistent(dev->pdev, QUERY_ADAPTER_OUT_SIZE, &outdma); + if (!outbox) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, 0, 0, CMD_QUERY_ADAPTER, + CMD_TIME_CLASS_A, status); + + if (err) + goto out; + + MTHCA_GET(adapter->vendor_id, outbox, QUERY_ADAPTER_VENDOR_ID_OFFSET); + MTHCA_GET(adapter->device_id, outbox, QUERY_ADAPTER_DEVICE_ID_OFFSET); + MTHCA_GET(adapter->revision_id, outbox, QUERY_ADAPTER_REVISION_ID_OFFSET); + MTHCA_GET(adapter->inta_pin, outbox, QUERY_ADAPTER_INTA_PIN_OFFSET); + +out: + pci_free_consistent(dev->pdev, QUERY_ADAPTER_OUT_SIZE, outbox, outdma); + return err; +} + +int mthca_INIT_HCA(struct mthca_dev *dev, + struct mthca_init_hca_param *param, + u8 *status) +{ + u32 *inbox; + dma_addr_t indma; + int err; + +#define INIT_HCA_IN_SIZE 0x200 +#define 
INIT_HCA_FLAGS_OFFSET 0x014 +#define INIT_HCA_QPC_OFFSET 0x020 +#define INIT_HCA_QPC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x10) +#define INIT_HCA_LOG_QP_OFFSET (INIT_HCA_QPC_OFFSET + 0x17) +#define INIT_HCA_EEC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x20) +#define INIT_HCA_LOG_EEC_OFFSET (INIT_HCA_QPC_OFFSET + 0x27) +#define INIT_HCA_SRQC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x28) +#define INIT_HCA_LOG_SRQ_OFFSET (INIT_HCA_QPC_OFFSET + 0x2f) +#define INIT_HCA_CQC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x30) +#define INIT_HCA_LOG_CQ_OFFSET (INIT_HCA_QPC_OFFSET + 0x37) +#define INIT_HCA_EQPC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x40) +#define INIT_HCA_EEEC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x50) +#define INIT_HCA_EQC_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x60) +#define INIT_HCA_LOG_EQ_OFFSET (INIT_HCA_QPC_OFFSET + 0x67) +#define INIT_HCA_RDB_BASE_OFFSET (INIT_HCA_QPC_OFFSET + 0x70) +#define INIT_HCA_UDAV_OFFSET 0x0b0 +#define INIT_HCA_UDAV_LKEY_OFFSET (INIT_HCA_UDAV_OFFSET + 0x0) +#define INIT_HCA_UDAV_PD_OFFSET (INIT_HCA_UDAV_OFFSET + 0x4) +#define INIT_HCA_MCAST_OFFSET 0x0c0 +#define INIT_HCA_MC_BASE_OFFSET (INIT_HCA_MCAST_OFFSET + 0x00) +#define INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET (INIT_HCA_MCAST_OFFSET + 0x12) +#define INIT_HCA_MC_HASH_SZ_OFFSET (INIT_HCA_MCAST_OFFSET + 0x16) +#define INIT_HCA_LOG_MC_TABLE_SZ_OFFSET (INIT_HCA_MCAST_OFFSET + 0x1b) +#define INIT_HCA_TPT_OFFSET 0x0f0 +#define INIT_HCA_MPT_BASE_OFFSET (INIT_HCA_TPT_OFFSET + 0x00) +#define INIT_HCA_MTT_SEG_SZ_OFFSET (INIT_HCA_TPT_OFFSET + 0x09) +#define INIT_HCA_LOG_MPT_SZ_OFFSET (INIT_HCA_TPT_OFFSET + 0x0b) +#define INIT_HCA_MTT_BASE_OFFSET (INIT_HCA_TPT_OFFSET + 0x10) +#define INIT_HCA_UAR_OFFSET 0x120 +#define INIT_HCA_UAR_BASE_OFFSET (INIT_HCA_UAR_OFFSET + 0x00) +#define INIT_HCA_UAR_PAGE_SZ_OFFSET (INIT_HCA_UAR_OFFSET + 0x0b) +#define INIT_HCA_UAR_SCATCH_BASE_OFFSET (INIT_HCA_UAR_OFFSET + 0x10) + + inbox = pci_alloc_consistent(dev->pdev, INIT_HCA_IN_SIZE, &indma); + if (!inbox) + return -ENOMEM; + + memset(inbox, 0, 
INIT_HCA_IN_SIZE); + +#if defined(__LITTLE_ENDIAN) + *(inbox + INIT_HCA_FLAGS_OFFSET / 4) &= ~cpu_to_be32(1 << 1); +#elif defined(__BIG_ENDIAN) + *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1 << 1); +#else +#error Host endianness not defined +#endif + /* Check port for UD address vector: */ + *(inbox + INIT_HCA_FLAGS_OFFSET / 4) |= cpu_to_be32(1); + + /* We leave wqe_quota, responder_exu, etc as 0 (default) */ + + /* QPC/EEC/CQC/EQC/RDB attributes */ + + MTHCA_PUT(inbox, param->qpc_base, INIT_HCA_QPC_BASE_OFFSET); + MTHCA_PUT(inbox, param->log_num_qps, INIT_HCA_LOG_QP_OFFSET); + MTHCA_PUT(inbox, param->eec_base, INIT_HCA_EEC_BASE_OFFSET); + MTHCA_PUT(inbox, param->log_num_eecs, INIT_HCA_LOG_EEC_OFFSET); + MTHCA_PUT(inbox, param->srqc_base, INIT_HCA_SRQC_BASE_OFFSET); + MTHCA_PUT(inbox, param->log_num_srqs, INIT_HCA_LOG_SRQ_OFFSET); + MTHCA_PUT(inbox, param->cqc_base, INIT_HCA_CQC_BASE_OFFSET); + MTHCA_PUT(inbox, param->log_num_cqs, INIT_HCA_LOG_CQ_OFFSET); + MTHCA_PUT(inbox, param->eqpc_base, INIT_HCA_EQPC_BASE_OFFSET); + MTHCA_PUT(inbox, param->eeec_base, INIT_HCA_EEEC_BASE_OFFSET); + MTHCA_PUT(inbox, param->eqc_base, INIT_HCA_EQC_BASE_OFFSET); + MTHCA_PUT(inbox, param->log_num_eqs, INIT_HCA_LOG_EQ_OFFSET); + MTHCA_PUT(inbox, param->rdb_base, INIT_HCA_RDB_BASE_OFFSET); + + /* UD AV attributes */ + + /* multicast attributes */ + + MTHCA_PUT(inbox, param->mc_base, INIT_HCA_MC_BASE_OFFSET); + MTHCA_PUT(inbox, param->log_mc_entry_sz, INIT_HCA_LOG_MC_ENTRY_SZ_OFFSET); + MTHCA_PUT(inbox, param->mc_hash_sz, INIT_HCA_MC_HASH_SZ_OFFSET); + MTHCA_PUT(inbox, param->log_mc_table_sz, INIT_HCA_LOG_MC_TABLE_SZ_OFFSET); + + /* TPT attributes */ + + MTHCA_PUT(inbox, param->mpt_base, INIT_HCA_MPT_BASE_OFFSET); + MTHCA_PUT(inbox, param->mtt_seg_sz, INIT_HCA_MTT_SEG_SZ_OFFSET); + MTHCA_PUT(inbox, param->log_mpt_sz, INIT_HCA_LOG_MPT_SZ_OFFSET); + MTHCA_PUT(inbox, param->mtt_base, INIT_HCA_MTT_BASE_OFFSET); + + /* UAR attributes */ + { + u8 uar_page_sz = PAGE_SHIFT - 12; + 
MTHCA_PUT(inbox, uar_page_sz, INIT_HCA_UAR_PAGE_SZ_OFFSET); + MTHCA_PUT(inbox, param->uar_scratch_base, INIT_HCA_UAR_SCATCH_BASE_OFFSET); + } + + err = mthca_cmd(dev, indma, 0, 0, CMD_INIT_HCA, + HZ, status); + + pci_free_consistent(dev->pdev, INIT_HCA_IN_SIZE, inbox, indma); + return err; +} + +int mthca_INIT_IB(struct mthca_dev *dev, + struct mthca_init_ib_param *param, + int port, u8 *status) +{ + u32 *inbox; + dma_addr_t indma; + int err; + u32 flags; + +#define INIT_IB_IN_SIZE 56 +#define INIT_IB_FLAGS_OFFSET 0x00 +#define INIT_IB_FLAG_SIG (1 << 18) +#define INIT_IB_FLAG_NG (1 << 17) +#define INIT_IB_FLAG_G0 (1 << 16) +#define INIT_IB_FLAG_1X (1 << 8) +#define INIT_IB_FLAG_4X (1 << 9) +#define INIT_IB_FLAG_12X (1 << 11) +#define INIT_IB_VL_SHIFT 4 +#define INIT_IB_MTU_SHIFT 12 +#define INIT_IB_MAX_GID_OFFSET 0x06 +#define INIT_IB_MAX_PKEY_OFFSET 0x0a +#define INIT_IB_GUID0_OFFSET 0x10 +#define INIT_IB_NODE_GUID_OFFSET 0x18 +#define INIT_IB_SI_GUID_OFFSET 0x20 + + inbox = pci_alloc_consistent(dev->pdev, INIT_IB_IN_SIZE, &indma); + if (!inbox) + return -ENOMEM; + + memset(inbox, 0, INIT_IB_IN_SIZE); + + flags = 0; + flags |= param->enable_1x ? INIT_IB_FLAG_1X : 0; + flags |= param->enable_4x ? INIT_IB_FLAG_4X : 0; + flags |= param->set_guid0 ? INIT_IB_FLAG_G0 : 0; + flags |= param->set_node_guid ? INIT_IB_FLAG_NG : 0; + flags |= param->set_si_guid ? 
INIT_IB_FLAG_SIG : 0; + flags |= param->vl_cap << INIT_IB_VL_SHIFT; + flags |= param->mtu_cap << INIT_IB_MTU_SHIFT; + MTHCA_PUT(inbox, flags, INIT_IB_FLAGS_OFFSET); + + MTHCA_PUT(inbox, param->gid_cap, INIT_IB_MAX_GID_OFFSET); + MTHCA_PUT(inbox, param->pkey_cap, INIT_IB_MAX_PKEY_OFFSET); + MTHCA_PUT(inbox, param->guid0, INIT_IB_GUID0_OFFSET); + MTHCA_PUT(inbox, param->node_guid, INIT_IB_NODE_GUID_OFFSET); + MTHCA_PUT(inbox, param->si_guid, INIT_IB_SI_GUID_OFFSET); + + err = mthca_cmd(dev, indma, port, 0, CMD_INIT_IB, + CMD_TIME_CLASS_A, status); + + pci_free_consistent(dev->pdev, INIT_IB_IN_SIZE, inbox, indma); + return err; +} + +int mthca_CLOSE_IB(struct mthca_dev *dev, int port, u8 *status) +{ + return mthca_cmd(dev, 0, port, 0, CMD_CLOSE_IB, HZ, status); +} + +int mthca_CLOSE_HCA(struct mthca_dev *dev, int panic, u8 *status) +{ + return mthca_cmd(dev, 0, 0, panic, CMD_CLOSE_HCA, HZ, status); +} + +int mthca_SW2HW_MPT(struct mthca_dev *dev, void *mpt_entry, + int mpt_index, u8 *status) +{ + dma_addr_t indma; + int err; + + indma = pci_map_single(dev->pdev, mpt_entry, + MTHCA_MPT_ENTRY_SIZE, + PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + err = mthca_cmd(dev, indma, mpt_index, 0, CMD_SW2HW_MPT, + CMD_TIME_CLASS_B, status); + + pci_unmap_single(dev->pdev, indma, + MTHCA_MPT_ENTRY_SIZE, PCI_DMA_TODEVICE); + return err; +} + +int mthca_HW2SW_MPT(struct mthca_dev *dev, void *mpt_entry, + int mpt_index, u8 *status) +{ + dma_addr_t outdma = 0; + int err; + + if (mpt_entry) { + outdma = pci_map_single(dev->pdev, mpt_entry, + MTHCA_MPT_ENTRY_SIZE, + PCI_DMA_FROMDEVICE); + if (pci_dma_mapping_error(outdma)) + return -ENOMEM; + } + + err = mthca_cmd_box(dev, 0, outdma, mpt_index, !mpt_entry, + CMD_HW2SW_MPT, + CMD_TIME_CLASS_B, status); + + if (mpt_entry) + pci_unmap_single(dev->pdev, outdma, + MTHCA_MPT_ENTRY_SIZE, + PCI_DMA_FROMDEVICE); + return err; +} + +int mthca_WRITE_MTT(struct mthca_dev *dev, u64 *mtt_entry, + int num_mtt, u8 
*status) +{ + dma_addr_t indma; + int err; + + indma = pci_map_single(dev->pdev, mtt_entry, + (num_mtt + 2) * 8, + PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + err = mthca_cmd(dev, indma, num_mtt, 0, CMD_WRITE_MTT, + CMD_TIME_CLASS_B, status); + + pci_unmap_single(dev->pdev, indma, + (num_mtt + 2) * 8, PCI_DMA_TODEVICE); + return err; +} + +int mthca_MAP_EQ(struct mthca_dev *dev, u64 event_mask, int unmap, + int eq_num, u8 *status) +{ + mthca_dbg(dev, "%s mask %016llx for eqn %d\n", + unmap ? "Clearing" : "Setting", + (unsigned long long) event_mask, eq_num); + return mthca_cmd(dev, event_mask, (unmap << 31) | eq_num, + 0, CMD_MAP_EQ, CMD_TIME_CLASS_B, status); +} + +int mthca_SW2HW_EQ(struct mthca_dev *dev, void *eq_context, + int eq_num, u8 *status) +{ + dma_addr_t indma; + int err; + + indma = pci_map_single(dev->pdev, eq_context, + MTHCA_EQ_CONTEXT_SIZE, + PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + err = mthca_cmd(dev, indma, eq_num, 0, CMD_SW2HW_EQ, + CMD_TIME_CLASS_A, status); + + pci_unmap_single(dev->pdev, indma, + MTHCA_EQ_CONTEXT_SIZE, PCI_DMA_TODEVICE); + return err; +} + +int mthca_HW2SW_EQ(struct mthca_dev *dev, void *eq_context, + int eq_num, u8 *status) +{ + dma_addr_t outdma = 0; + int err; + + outdma = pci_map_single(dev->pdev, eq_context, + MTHCA_EQ_CONTEXT_SIZE, + PCI_DMA_FROMDEVICE); + if (pci_dma_mapping_error(outdma)) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, eq_num, 0, + CMD_HW2SW_EQ, + CMD_TIME_CLASS_A, status); + + pci_unmap_single(dev->pdev, outdma, + MTHCA_EQ_CONTEXT_SIZE, + PCI_DMA_FROMDEVICE); + return err; +} + +int mthca_SW2HW_CQ(struct mthca_dev *dev, void *cq_context, + int cq_num, u8 *status) +{ + dma_addr_t indma; + int err; + + indma = pci_map_single(dev->pdev, cq_context, + MTHCA_CQ_CONTEXT_SIZE, + PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + err = mthca_cmd(dev, indma, cq_num, 0, CMD_SW2HW_CQ, + CMD_TIME_CLASS_A, 
status); + + pci_unmap_single(dev->pdev, indma, + MTHCA_CQ_CONTEXT_SIZE, PCI_DMA_TODEVICE); + return err; +} + +int mthca_HW2SW_CQ(struct mthca_dev *dev, void *cq_context, + int cq_num, u8 *status) +{ + dma_addr_t outdma = 0; + int err; + + outdma = pci_map_single(dev->pdev, cq_context, + MTHCA_CQ_CONTEXT_SIZE, + PCI_DMA_FROMDEVICE); + if (pci_dma_mapping_error(outdma)) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, cq_num, 0, + CMD_HW2SW_CQ, + CMD_TIME_CLASS_A, status); + + pci_unmap_single(dev->pdev, outdma, + MTHCA_CQ_CONTEXT_SIZE, + PCI_DMA_FROMDEVICE); + return err; +} + +int mthca_MODIFY_QP(struct mthca_dev *dev, int trans, u32 num, + int is_ee, void *qp_context, u32 optmask, + u8 *status) +{ + static const u16 op[] = { + [MTHCA_TRANS_RST2INIT] = CMD_RST2INIT_QPEE, + [MTHCA_TRANS_INIT2INIT] = CMD_INIT2INIT_QPEE, + [MTHCA_TRANS_INIT2RTR] = CMD_INIT2RTR_QPEE, + [MTHCA_TRANS_RTR2RTS] = CMD_RTR2RTS_QPEE, + [MTHCA_TRANS_RTS2RTS] = CMD_RTS2RTS_QPEE, + [MTHCA_TRANS_SQERR2RTS] = CMD_SQERR2RTS_QPEE, + [MTHCA_TRANS_ANY2ERR] = CMD_2ERR_QPEE, + [MTHCA_TRANS_RTS2SQD] = CMD_RTS2SQD_QPEE, + [MTHCA_TRANS_SQD2SQD] = CMD_SQD2SQD_QPEE, + [MTHCA_TRANS_SQD2RTS] = CMD_SQD2RTS_QPEE, + [MTHCA_TRANS_ANY2RST] = CMD_ERR2RST_QPEE + }; + u8 op_mod = 0; + + dma_addr_t indma; + int err; + + if (trans < 0 || trans >= ARRAY_SIZE(op)) + return -EINVAL; + + if (trans == MTHCA_TRANS_ANY2RST) { + indma = 0; + op_mod = 3; /* don't write outbox, any->reset */ + + /* For debugging */ + qp_context = pci_alloc_consistent(dev->pdev, MTHCA_QP_CONTEXT_SIZE, + &indma); + op_mod = 2; /* write outbox, any->reset */ + } else { + indma = pci_map_single(dev->pdev, qp_context, + MTHCA_QP_CONTEXT_SIZE, + PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + if (0) { + int i; + mthca_dbg(dev, "Dumping QP context:\n"); + printk(" %08x\n", be32_to_cpup(qp_context)); + for (i = 0; i < 0x100 / 4; ++i) { + if (i % 8 == 0) + printk("[%02x] ", i * 4); + printk(" %08x", 
be32_to_cpu(((u32 *) qp_context)[i + 2])); + if ((i + 1) % 8 == 0) + printk("\n"); + } + } + } + + if (trans == MTHCA_TRANS_ANY2RST) { + err = mthca_cmd_box(dev, 0, indma, (!!is_ee << 24) | num, + op_mod, op[trans], CMD_TIME_CLASS_C, status); + + if (0) { + int i; + mthca_dbg(dev, "Dumping QP context:\n"); + printk(" %08x\n", be32_to_cpup(qp_context)); + for (i = 0; i < 0x100 / 4; ++i) { + if (i % 8 == 0) + printk("[%02x] ", i * 4); + printk(" %08x", be32_to_cpu(((u32 *) qp_context)[i + 2])); + if ((i + 1) % 8 == 0) + printk("\n"); + } + } + + } else + err = mthca_cmd(dev, indma, (!!is_ee << 24) | num, + op_mod, op[trans], CMD_TIME_CLASS_C, status); + + if (trans != MTHCA_TRANS_ANY2RST) + pci_unmap_single(dev->pdev, indma, + MTHCA_QP_CONTEXT_SIZE, PCI_DMA_TODEVICE); + else + pci_free_consistent(dev->pdev, MTHCA_QP_CONTEXT_SIZE, + qp_context, indma); + return err; +} + +int mthca_QUERY_QP(struct mthca_dev *dev, u32 num, int is_ee, + void *qp_context, u8 *status) +{ + dma_addr_t outdma = 0; + int err; + + outdma = pci_map_single(dev->pdev, qp_context, + MTHCA_QP_CONTEXT_SIZE, + PCI_DMA_FROMDEVICE); + if (pci_dma_mapping_error(outdma)) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, (!!is_ee << 24) | num, 0, + CMD_QUERY_QPEE, + CMD_TIME_CLASS_A, status); + + pci_unmap_single(dev->pdev, outdma, + MTHCA_QP_CONTEXT_SIZE, + PCI_DMA_FROMDEVICE); + return err; +} + +int mthca_CONF_SPECIAL_QP(struct mthca_dev *dev, int type, u32 qpn, + u8 *status) +{ + u8 op_mod; + + switch (type) { + case IB_QPT_SMI: + op_mod = 0; + break; + case IB_QPT_GSI: + op_mod = 1; + break; + case IB_QPT_RAW_IPV6: + op_mod = 2; + break; + case IB_QPT_RAW_ETY: + op_mod = 3; + break; + default: + return -EINVAL; + } + + return mthca_cmd(dev, 0, qpn, op_mod, CMD_CONF_SPECIAL_QP, + CMD_TIME_CLASS_B, status); +} + +int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int port, + void *in_mad, void *response_mad, u8 *status) { + void *box; + dma_addr_t dma; + int err; + +#define 
MAD_IFC_BOX_SIZE 512 + + box = pci_alloc_consistent(dev->pdev, MAD_IFC_BOX_SIZE, &dma); + if (!box) + return -ENOMEM; + + memcpy(box, in_mad, 256); + + err = mthca_cmd_box(dev, dma, dma + 256, port, !!ignore_mkey, + CMD_MAD_IFC, CMD_TIME_CLASS_C, status); + + if (!err && !*status) + memcpy(response_mad, box + 256, 256); + + pci_free_consistent(dev->pdev, MAD_IFC_BOX_SIZE, box, dma); + return err; +} + +int mthca_READ_MGM(struct mthca_dev *dev, int index, void *mgm, + u8 *status) +{ + dma_addr_t outdma = 0; + int err; + + outdma = pci_map_single(dev->pdev, mgm, + MTHCA_MGM_ENTRY_SIZE, + PCI_DMA_FROMDEVICE); + if (pci_dma_mapping_error(outdma)) + return -ENOMEM; + + err = mthca_cmd_box(dev, 0, outdma, index, 0, + CMD_READ_MGM, + CMD_TIME_CLASS_A, status); + + pci_unmap_single(dev->pdev, outdma, + MTHCA_MGM_ENTRY_SIZE, + PCI_DMA_FROMDEVICE); + return err; +} + +int mthca_WRITE_MGM(struct mthca_dev *dev, int index, void *mgm, + u8 *status) +{ + dma_addr_t indma; + int err; + + indma = pci_map_single(dev->pdev, mgm, + MTHCA_MGM_ENTRY_SIZE, + PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + err = mthca_cmd(dev, indma, index, 0, CMD_WRITE_MGM, + CMD_TIME_CLASS_A, status); + + pci_unmap_single(dev->pdev, indma, + MTHCA_MGM_ENTRY_SIZE, PCI_DMA_TODEVICE); + return err; +} + +int mthca_MGID_HASH(struct mthca_dev *dev, void *gid, u16 *hash, + u8 *status) +{ + dma_addr_t indma; + u64 imm; + int err; + + indma = pci_map_single(dev->pdev, gid, 16, PCI_DMA_TODEVICE); + if (pci_dma_mapping_error(indma)) + return -ENOMEM; + + err = mthca_cmd_imm(dev, indma, &imm, 0, 0, CMD_MGID_HASH, + CMD_TIME_CLASS_A, status); + *hash = imm; + + pci_unmap_single(dev->pdev, indma, 16, PCI_DMA_TODEVICE); + return err; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_cmd.h 2004-12-27 21:48:22.408667751 -0800 @@ -0,0 +1,276 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. 
+ * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_cmd.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef MTHCA_CMD_H +#define MTHCA_CMD_H + +#include + +#define MTHCA_CMD_MAILBOX_ALIGN 16UL +#define MTHCA_CMD_MAILBOX_EXTRA (MTHCA_CMD_MAILBOX_ALIGN - 1) + +enum { + /* command completed successfully: */ + MTHCA_CMD_STAT_OK = 0x00, + /* Internal error (such as a bus error) occurred while processing command: */ + MTHCA_CMD_STAT_INTERNAL_ERR = 0x01, + /* Operation/command not supported or opcode modifier not supported: */ + MTHCA_CMD_STAT_BAD_OP = 0x02, + /* Parameter not supported or parameter out of range: */ + MTHCA_CMD_STAT_BAD_PARAM = 0x03, + /* System not enabled or bad system state: */ + MTHCA_CMD_STAT_BAD_SYS_STATE = 0x04, + /* Attempt to access reserved or unallocated resource: */ + MTHCA_CMD_STAT_BAD_RESOURCE = 0x05, + /* Requested resource is currently executing a command, or is otherwise busy: */ + MTHCA_CMD_STAT_RESOURCE_BUSY = 0x06, + /* memory error: */ + MTHCA_CMD_STAT_DDR_MEM_ERR = 0x07, + /* Required capability exceeds device limits: */ + MTHCA_CMD_STAT_EXCEED_LIM = 0x08, + /* Resource is not in the appropriate state or ownership: */ + MTHCA_CMD_STAT_BAD_RES_STATE = 0x09, + /* Index out of range: */ + MTHCA_CMD_STAT_BAD_INDEX = 0x0a, + /* FW image corrupted: */ + MTHCA_CMD_STAT_BAD_NVMEM = 0x0b, + /* Attempt to modify a QP/EE which is not in the presumed state: */ + MTHCA_CMD_STAT_BAD_QPEE_STATE = 0x10, + /* Bad segment parameters (Address/Size): */ + MTHCA_CMD_STAT_BAD_SEG_PARAM = 0x20, + /* Memory Region has Memory Windows bound to: */ + MTHCA_CMD_STAT_REG_BOUND = 0x21, + /* HCA local attached memory not present: */ + MTHCA_CMD_STAT_LAM_NOT_PRE = 0x22, + /* Bad management packet (silently discarded): */ + MTHCA_CMD_STAT_BAD_PKT = 0x30, + /* More outstanding CQEs in CQ than new CQ size: */ + MTHCA_CMD_STAT_BAD_SIZE = 0x40 +}; + +enum { + MTHCA_TRANS_INVALID = 0, + MTHCA_TRANS_RST2INIT, + MTHCA_TRANS_INIT2INIT, + MTHCA_TRANS_INIT2RTR, + MTHCA_TRANS_RTR2RTS, + MTHCA_TRANS_RTS2RTS, 
+ MTHCA_TRANS_SQERR2RTS, + MTHCA_TRANS_ANY2ERR, + MTHCA_TRANS_RTS2SQD, + MTHCA_TRANS_SQD2SQD, + MTHCA_TRANS_SQD2RTS, + MTHCA_TRANS_ANY2RST, +}; + +enum { + DEV_LIM_FLAG_SRQ = 1 << 6 +}; + +struct mthca_dev_lim { + int max_srq_sz; + int max_qp_sz; + int reserved_qps; + int max_qps; + int reserved_srqs; + int max_srqs; + int reserved_eecs; + int max_eecs; + int max_cq_sz; + int reserved_cqs; + int max_cqs; + int max_mpts; + int reserved_eqs; + int max_eqs; + int reserved_mtts; + int max_mrw_sz; + int reserved_mrws; + int max_mtt_seg; + int max_requester_per_qp; + int max_responder_per_qp; + int max_rdma_global; + int local_ca_ack_delay; + int max_mtu; + int max_port_width; + int max_vl; + int num_ports; + int max_gids; + int max_pkeys; + u32 flags; + int reserved_uars; + int uar_size; + int min_page_sz; + int max_sg; + int max_desc_sz; + int max_qp_per_mcg; + int reserved_mgms; + int max_mcgs; + int reserved_pds; + int max_pds; + int reserved_rdds; + int max_rdds; + int eec_entry_sz; + int qpc_entry_sz; + int eeec_entry_sz; + int eqpc_entry_sz; + int eqc_entry_sz; + int cqc_entry_sz; + int srq_entry_sz; + int uar_scratch_entry_sz; + union { + struct { + int max_avs; + } tavor; + struct { + int resize_srq; + int mtt_entry_sz; + int mpt_entry_sz; + int max_pbl_sz; + u8 bmme_flags; + u32 reserved_lkey; + int lam_required; + u64 max_icm_sz; + } arbel; + } hca; +}; + +struct mthca_adapter { + u32 vendor_id; + u32 device_id; + u32 revision_id; + u8 inta_pin; +}; + +struct mthca_init_hca_param { + u64 qpc_base; + u8 log_num_qps; + u64 eec_base; + u8 log_num_eecs; + u64 srqc_base; + u8 log_num_srqs; + u64 cqc_base; + u8 log_num_cqs; + u64 eqpc_base; + u64 eeec_base; + u64 eqc_base; + u8 log_num_eqs; + u64 rdb_base; + u64 mc_base; + u16 log_mc_entry_sz; + u16 mc_hash_sz; + u8 log_mc_table_sz; + u64 mpt_base; + u8 mtt_seg_sz; + u8 log_mpt_sz; + u64 mtt_base; + u64 uar_scratch_base; +}; + +struct mthca_init_ib_param { + int enable_1x; + int enable_4x; + int vl_cap; + int 
mtu_cap; + u16 gid_cap; + u16 pkey_cap; + int set_guid0; + u64 guid0; + int set_node_guid; + u64 node_guid; + int set_si_guid; + u64 si_guid; +}; + +int mthca_cmd_use_events(struct mthca_dev *dev); +void mthca_cmd_use_polling(struct mthca_dev *dev); +void mthca_cmd_event(struct mthca_dev *dev, u16 token, + u8 status, u64 out_param); + +int mthca_SYS_EN(struct mthca_dev *dev, u8 *status); +int mthca_SYS_DIS(struct mthca_dev *dev, u8 *status); +int mthca_MAP_FA(struct mthca_dev *dev, int count, + struct scatterlist *sglist, u8 *status); +int mthca_UNMAP_FA(struct mthca_dev *dev, u8 *status); +int mthca_RUN_FW(struct mthca_dev *dev, u8 *status); +int mthca_QUERY_FW(struct mthca_dev *dev, u8 *status); +int mthca_ENABLE_LAM(struct mthca_dev *dev, u8 *status); +int mthca_DISABLE_LAM(struct mthca_dev *dev, u8 *status); +int mthca_QUERY_DDR(struct mthca_dev *dev, u8 *status); +int mthca_QUERY_DEV_LIM(struct mthca_dev *dev, + struct mthca_dev_lim *dev_lim, u8 *status); +int mthca_QUERY_ADAPTER(struct mthca_dev *dev, + struct mthca_adapter *adapter, u8 *status); +int mthca_INIT_HCA(struct mthca_dev *dev, + struct mthca_init_hca_param *param, + u8 *status); +int mthca_INIT_IB(struct mthca_dev *dev, + struct mthca_init_ib_param *param, + int port, u8 *status); +int mthca_CLOSE_IB(struct mthca_dev *dev, int port, u8 *status); +int mthca_CLOSE_HCA(struct mthca_dev *dev, int panic, u8 *status); +int mthca_SW2HW_MPT(struct mthca_dev *dev, void *mpt_entry, + int mpt_index, u8 *status); +int mthca_HW2SW_MPT(struct mthca_dev *dev, void *mpt_entry, + int mpt_index, u8 *status); +int mthca_WRITE_MTT(struct mthca_dev *dev, u64 *mtt_entry, + int num_mtt, u8 *status); +int mthca_MAP_EQ(struct mthca_dev *dev, u64 event_mask, int unmap, + int eq_num, u8 *status); +int mthca_SW2HW_EQ(struct mthca_dev *dev, void *eq_context, + int eq_num, u8 *status); +int mthca_HW2SW_EQ(struct mthca_dev *dev, void *eq_context, + int eq_num, u8 *status); +int mthca_SW2HW_CQ(struct mthca_dev *dev, void 
*cq_context, + int cq_num, u8 *status); +int mthca_HW2SW_CQ(struct mthca_dev *dev, void *cq_context, + int cq_num, u8 *status); +int mthca_MODIFY_QP(struct mthca_dev *dev, int trans, u32 num, + int is_ee, void *qp_context, u32 optmask, + u8 *status); +int mthca_QUERY_QP(struct mthca_dev *dev, u32 num, int is_ee, + void *qp_context, u8 *status); +int mthca_CONF_SPECIAL_QP(struct mthca_dev *dev, int type, u32 qpn, + u8 *status); +int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int port, + void *in_mad, void *response_mad, u8 *status); +int mthca_READ_MGM(struct mthca_dev *dev, int index, void *mgm, + u8 *status); +int mthca_WRITE_MGM(struct mthca_dev *dev, int index, void *mgm, + u8 *status); +int mthca_MGID_HASH(struct mthca_dev *dev, void *gid, u16 *hash, + u8 *status); + +#define MAILBOX_ALIGN(x) ((void *) ALIGN((unsigned long) (x), MTHCA_CMD_MAILBOX_ALIGN)) + +#endif /* MTHCA_CMD_H */ From roland@topspin.com Mon Dec 27 21:49:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:50:52 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDg025948 for ; Mon, 27 Dec 2004 21:49:42 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:04 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:04 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG6-0000tj-JT; Mon, 27 Dec 2004 21:51:04 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.h64HpFQTg9SpyhAM@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:02 -0800 Message-Id: <200412272151.HKpCNDFOMg5KO6kB@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 
X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][9/24] Add Mellanox HCA low-level driver Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:04.0189 (UTC) FILETIME=[37B7AAD0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13109 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add a low-level driver for Mellanox MT23108 and MT25208 HCAs. The MT25208 is only fully supported when in MT23108 compatibility mode; only the very beginnings of support for native MT25208 mode (required for HCAs without local memory) is present. (As a side note, I believe this driver would be the first in-tree consumer of the PCI MSI/MSI-X API) Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/infiniband/Kconfig 2004-12-27 21:48:18.185289416 -0800 +++ linux-bk/drivers/infiniband/Kconfig 2004-12-27 21:48:21.258837002 -0800 @@ -7,4 +7,6 @@ any protocols you wish to use as well as drivers for your InfiniBand hardware. 
+source "drivers/infiniband/hw/mthca/Kconfig" + endmenu --- linux-bk.orig/drivers/infiniband/Makefile 2004-12-27 21:48:18.216284854 -0800 +++ linux-bk/drivers/infiniband/Makefile 2004-12-27 21:48:21.219842741 -0800 @@ -1 +1,2 @@ obj-$(CONFIG_INFINIBAND) += core/ +obj-$(CONFIG_INFINIBAND_MTHCA) += hw/mthca/ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/Kconfig 2004-12-27 21:48:21.318828171 -0800 @@ -0,0 +1,26 @@ +config INFINIBAND_MTHCA + tristate "Mellanox HCA support" + depends on PCI && INFINIBAND + ---help--- + This is a low-level driver for Mellanox InfiniHost host + channel adapters (HCAs), including the MT23108 PCI-X HCA + ("Tavor") and the MT25208 PCI Express HCA ("Arbel"). + +config INFINIBAND_MTHCA_DEBUG + bool "Verbose debugging output" + depends on INFINIBAND_MTHCA + default n + ---help--- + This option causes the mthca driver to produce a bunch of debug + messages. Select this if you are developing the driver or + trying to diagnose a problem. + +config INFINIBAND_MTHCA_SSE_DOORBELL + bool "SSE doorbell code" + depends on INFINIBAND_MTHCA && X86 && !X86_64 + default n + ---help--- + This option will have the mthca driver use SSE instructions + to ring hardware doorbell registers. This may improve + performance for some workloads, but the driver will not run + on processors without SSE instructions. 
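A note on the INFINIBAND_MTHCA_DEBUG option above: the Makefile in this patch turns it into -DDEBUG, and mthca_dbg() is built on dev_dbg(), which (as far as I understand it) only emits output when DEBUG is defined. The userspace sketch below shows that style of compile-time gating. It is an illustration only, not driver code; demo_dbg, demo_dbg_count, demo_debug_enabled and demo_probe are hypothetical names that do not appear in the patch.

```c
#include <stdio.h>

/* Hypothetical counter: lets a caller observe whether a message was emitted. */
static int demo_dbg_count;

#ifdef DEBUG
/* Debug build (-DDEBUG): count the message and print it to stderr. */
#define demo_dbg(...) \
	do { \
		++demo_dbg_count; \
		fprintf(stderr, __VA_ARGS__); \
	} while (0)
#else
/* Normal build: the call expands to an empty statement. */
#define demo_dbg(...) do { } while (0)
#endif

/* Returns 1 when this file was compiled with -DDEBUG, otherwise 0. */
static int demo_debug_enabled(void)
{
#ifdef DEBUG
	return 1;
#else
	return 0;
#endif
}

/* Issues one debug message and returns how many have been emitted so far. */
static int demo_probe(int eq_num)
{
	(void) eq_num;	/* unused when the macro expands to nothing */
	demo_dbg("demo: setting mask for eqn %d\n", eq_num);
	return demo_dbg_count;
}
```

Compiled with -DDEBUG, each demo_probe() call prints and bumps the counter; compiled without it, the call disappears at preprocessing time, which is presumably why the option can safely default to n.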
--- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/Makefile 2004-12-27 21:48:21.366821107 -0800 @@ -0,0 +1,12 @@ +EXTRA_CFLAGS += -Idrivers/infiniband/include + +ifdef CONFIG_INFINIBAND_MTHCA_DEBUG +EXTRA_CFLAGS += -DDEBUG +endif + +obj-$(CONFIG_INFINIBAND_MTHCA) += ib_mthca.o + +ib_mthca-y := mthca_main.o mthca_cmd.o mthca_profile.o mthca_reset.o \ + mthca_allocator.o mthca_eq.o mthca_pd.o mthca_cq.o \ + mthca_mr.o mthca_qp.o mthca_av.o mthca_mcg.o mthca_mad.o \ + mthca_provider.o --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_allocator.c 2004-12-27 21:48:21.428811982 -0800 @@ -0,0 +1,179 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_allocator.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include + +#include "mthca_dev.h" + +/* Trivial bitmap-based allocator */ +u32 mthca_alloc(struct mthca_alloc *alloc) +{ + u32 obj; + + spin_lock(&alloc->lock); + obj = find_next_zero_bit(alloc->table, alloc->max, alloc->last); + if (obj >= alloc->max) { + alloc->top = (alloc->top + alloc->max) & alloc->mask; + obj = find_first_zero_bit(alloc->table, alloc->max); + } + + if (obj < alloc->max) { + set_bit(obj, alloc->table); + obj |= alloc->top; + } else + obj = -1; + + spin_unlock(&alloc->lock); + + return obj; +} + +void mthca_free(struct mthca_alloc *alloc, u32 obj) +{ + obj &= alloc->max - 1; + spin_lock(&alloc->lock); + clear_bit(obj, alloc->table); + alloc->last = min(alloc->last, obj); + alloc->top = (alloc->top + alloc->max) & alloc->mask; + spin_unlock(&alloc->lock); +} + +int mthca_alloc_init(struct mthca_alloc *alloc, u32 num, u32 mask, + u32 reserved) +{ + int i; + + /* num must be a power of 2 */ + if (num != 1 << (ffs(num) - 1)) + return -EINVAL; + + alloc->last = 0; + alloc->top = 0; + alloc->max = num; + alloc->mask = mask; + spin_lock_init(&alloc->lock); + alloc->table = kmalloc(BITS_TO_LONGS(num) * sizeof (long), + GFP_KERNEL); + if (!alloc->table) + return -ENOMEM; + + bitmap_zero(alloc->table, num); + for (i = 0; i < reserved; ++i) + set_bit(i, alloc->table); + + return 0; +} + +void mthca_alloc_cleanup(struct mthca_alloc *alloc) +{ + kfree(alloc->table); +} + +/* + * Array of pointers with lazy allocation of leaf pages. Callers of + * _get, _set and _clear methods must use a lock or otherwise + * serialize access to the array. 
+ */ + +void *mthca_array_get(struct mthca_array *array, int index) +{ + int p = (index * sizeof (void *)) >> PAGE_SHIFT; + + if (array->page_list[p].page) { + int i = index & (PAGE_SIZE / sizeof (void *) - 1); + return array->page_list[p].page[i]; + } else + return NULL; +} + +int mthca_array_set(struct mthca_array *array, int index, void *value) +{ + int p = (index * sizeof (void *)) >> PAGE_SHIFT; + + /* Allocate with GFP_ATOMIC because we'll be called with locks held. */ + if (!array->page_list[p].page) + array->page_list[p].page = (void **) get_zeroed_page(GFP_ATOMIC); + + if (!array->page_list[p].page) + return -ENOMEM; + + array->page_list[p].page[index & (PAGE_SIZE / sizeof (void *) - 1)] = + value; + ++array->page_list[p].used; + + return 0; +} + +void mthca_array_clear(struct mthca_array *array, int index) +{ + int p = (index * sizeof (void *)) >> PAGE_SHIFT; + + if (--array->page_list[p].used == 0) { + free_page((unsigned long) array->page_list[p].page); + array->page_list[p].page = NULL; + } + + if (array->page_list[p].used < 0) + pr_debug("Array %p index %d page %d with ref count %d < 0\n", + array, index, p, array->page_list[p].used); +} + +int mthca_array_init(struct mthca_array *array, int nent) +{ + int npage = (nent * sizeof (void *) + PAGE_SIZE - 1) / PAGE_SIZE; + int i; + + array->page_list = kmalloc(npage * sizeof *array->page_list, GFP_KERNEL); + if (!array->page_list) + return -ENOMEM; + + for (i = 0; i < npage; ++i) { + array->page_list[i].page = NULL; + array->page_list[i].used = 0; + } + + return 0; +} + +void mthca_array_cleanup(struct mthca_array *array, int nent) +{ + int i; + + for (i = 0; i < (nent * sizeof (void *) + PAGE_SIZE - 1) / PAGE_SIZE; ++i) + free_page((unsigned long) array->page_list[i].page); + + kfree(array->page_list); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_config_reg.h 2004-12-27 21:48:21.473805359 -0800 @@ -0,0 +1,55 @@ +/* + * Copyright (c) 2004 Topspin 
Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_config_reg.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef MTHCA_CONFIG_REG_H +#define MTHCA_CONFIG_REG_H + +#include + +#define MTHCA_HCR_BASE 0x80680 +#define MTHCA_HCR_SIZE 0x0001c +#define MTHCA_ECR_BASE 0x80700 +#define MTHCA_ECR_SIZE 0x00008 +#define MTHCA_ECR_CLR_BASE 0x80708 +#define MTHCA_ECR_CLR_SIZE 0x00008 +#define MTHCA_ECR_OFFSET (MTHCA_ECR_BASE - MTHCA_HCR_BASE) +#define MTHCA_ECR_CLR_OFFSET (MTHCA_ECR_CLR_BASE - MTHCA_HCR_BASE) +#define MTHCA_CLR_INT_BASE 0xf00d8 +#define MTHCA_CLR_INT_SIZE 0x00008 + +#define MTHCA_MAP_HCR_SIZE (MTHCA_ECR_CLR_BASE + \ + MTHCA_ECR_CLR_SIZE - \ + MTHCA_HCR_BASE) + +#endif /* MTHCA_CONFIG_REG_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_dev.h 2004-12-27 21:48:21.522798147 -0800 @@ -0,0 +1,391 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_dev.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef MTHCA_DEV_H +#define MTHCA_DEV_H + +#include +#include +#include +#include +#include +#include + +#include "mthca_provider.h" +#include "mthca_doorbell.h" + +#define DRV_NAME "ib_mthca" +#define PFX DRV_NAME ": " +#define DRV_VERSION "0.06-pre" +#define DRV_RELDATE "November 8, 2004" + +/* Types of supported HCA */ +enum { + TAVOR, /* MT23108 */ + ARBEL_COMPAT, /* MT25208 in Tavor compat mode */ + ARBEL_NATIVE /* MT25208 with extended features */ +}; + +enum { + MTHCA_FLAG_DDR_HIDDEN = 1 << 1, + MTHCA_FLAG_SRQ = 1 << 2, + MTHCA_FLAG_MSI = 1 << 3, + MTHCA_FLAG_MSI_X = 1 << 4, + MTHCA_FLAG_NO_LAM = 1 << 5 +}; + +enum { + MTHCA_KAR_PAGE = 1, + MTHCA_MAX_PORTS = 2 +}; + +enum { + MTHCA_MPT_ENTRY_SIZE = 0x40, + MTHCA_EQ_CONTEXT_SIZE = 0x40, + MTHCA_CQ_CONTEXT_SIZE = 0x40, + MTHCA_QP_CONTEXT_SIZE = 0x200, + MTHCA_AV_SIZE = 0x20, + MTHCA_MGM_ENTRY_SIZE = 0x40 +}; + +enum { + MTHCA_EQ_CMD, + MTHCA_EQ_ASYNC, + MTHCA_EQ_COMP, + MTHCA_NUM_EQ +}; + +struct mthca_cmd { + int use_events; + struct semaphore hcr_sem; + struct semaphore poll_sem; + struct semaphore event_sem; + int max_cmds; + spinlock_t context_lock; + int free_head; + struct mthca_cmd_context *context; + u16 token_mask; +}; + +struct mthca_limits { + int num_ports; + int vl_cap; + int mtu_cap; + int gid_table_len; + int pkey_table_len; + int local_ca_ack_delay; + int max_sg; + int num_qps; + int reserved_qps; + int num_srqs; + int reserved_srqs; + int num_eecs; + int reserved_eecs; + int num_cqs; + int reserved_cqs; + int num_eqs; + int reserved_eqs; + int num_mpts; + int num_mtt_segs; + int mtt_seg_size; + int reserved_mtts; + int reserved_mrws; + int num_rdbs; + int 
reserved_uars; + int num_mgms; + int num_amgms; + int reserved_mcgs; + int num_pds; + int reserved_pds; +}; + +struct mthca_alloc { + u32 last; + u32 top; + u32 max; + u32 mask; + spinlock_t lock; + unsigned long *table; +}; + +struct mthca_array { + struct { + void **page; + int used; + } *page_list; +}; + +struct mthca_pd_table { + struct mthca_alloc alloc; +}; + +struct mthca_mr_table { + struct mthca_alloc mpt_alloc; + int max_mtt_order; + unsigned long **mtt_buddy; + u64 mtt_base; +}; + +struct mthca_eq_table { + struct mthca_alloc alloc; + void __iomem *clr_int; + u32 clr_mask; + struct mthca_eq eq[MTHCA_NUM_EQ]; + int have_irq; + u8 inta_pin; +}; + +struct mthca_cq_table { + struct mthca_alloc alloc; + spinlock_t lock; + struct mthca_array cq; +}; + +struct mthca_qp_table { + struct mthca_alloc alloc; + int sqp_start; + spinlock_t lock; + struct mthca_array qp; +}; + +struct mthca_av_table { + struct pci_pool *pool; + int num_ddr_avs; + u64 ddr_av_base; + void __iomem *av_map; + struct mthca_alloc alloc; +}; + +struct mthca_mcg_table { + struct semaphore sem; + struct mthca_alloc alloc; +}; + +struct mthca_dev { + struct ib_device ib_dev; + struct pci_dev *pdev; + + int hca_type; + unsigned long mthca_flags; + + u32 rev_id; + + /* firmware info */ + u64 fw_ver; + union { + struct { + u64 fw_start; + u64 fw_end; + } tavor; + struct { + u64 clr_int_base; + u64 eq_arm_base; + u64 eq_set_ci_base; + struct scatterlist *mem; + u16 fw_pages; + } arbel; + } fw; + + u64 ddr_start; + u64 ddr_end; + + MTHCA_DECLARE_DOORBELL_LOCK(doorbell_lock) + + void __iomem *hcr; + void __iomem *clr_base; + void __iomem *kar; + + struct mthca_cmd cmd; + struct mthca_limits limits; + + struct mthca_pd_table pd_table; + struct mthca_mr_table mr_table; + struct mthca_eq_table eq_table; + struct mthca_cq_table cq_table; + struct mthca_qp_table qp_table; + struct mthca_av_table av_table; + struct mthca_mcg_table mcg_table; + + struct mthca_pd driver_pd; + struct mthca_mr driver_mr; + + 
struct ib_mad_agent *send_agent[MTHCA_MAX_PORTS][2]; + struct ib_ah *sm_ah[MTHCA_MAX_PORTS]; + spinlock_t sm_lock; +}; + +#define mthca_dbg(mdev, format, arg...) \ + dev_dbg(&mdev->pdev->dev, format, ## arg) +#define mthca_err(mdev, format, arg...) \ + dev_err(&mdev->pdev->dev, format, ## arg) +#define mthca_info(mdev, format, arg...) \ + dev_info(&mdev->pdev->dev, format, ## arg) +#define mthca_warn(mdev, format, arg...) \ + dev_warn(&mdev->pdev->dev, format, ## arg) + +extern void __buggy_use_of_MTHCA_GET(void); +extern void __buggy_use_of_MTHCA_PUT(void); + +#define MTHCA_GET(dest, source, offset) \ + do { \ + void *__p = (char *) (source) + (offset); \ + switch (sizeof (dest)) { \ + case 1: (dest) = *(u8 *) __p; break; \ + case 2: (dest) = be16_to_cpup(__p); break; \ + case 4: (dest) = be32_to_cpup(__p); break; \ + case 8: (dest) = be64_to_cpup(__p); break; \ + default: __buggy_use_of_MTHCA_GET(); \ + } \ + } while (0) + +#define MTHCA_PUT(dest, source, offset) \ + do { \ + __typeof__(source) *__p = \ + (__typeof__(source) *) ((char *) (dest) + (offset)); \ + switch (sizeof(source)) { \ + case 1: *__p = (source); break; \ + case 2: *__p = cpu_to_be16(source); break; \ + case 4: *__p = cpu_to_be32(source); break; \ + case 8: *__p = cpu_to_be64(source); break; \ + default: __buggy_use_of_MTHCA_PUT(); \ + } \ + } while (0) + +int mthca_reset(struct mthca_dev *mdev); + +u32 mthca_alloc(struct mthca_alloc *alloc); +void mthca_free(struct mthca_alloc *alloc, u32 obj); +int mthca_alloc_init(struct mthca_alloc *alloc, u32 num, u32 mask, + u32 reserved); +void mthca_alloc_cleanup(struct mthca_alloc *alloc); +void *mthca_array_get(struct mthca_array *array, int index); +int mthca_array_set(struct mthca_array *array, int index, void *value); +void mthca_array_clear(struct mthca_array *array, int index); +int mthca_array_init(struct mthca_array *array, int nent); +void mthca_array_cleanup(struct mthca_array *array, int nent); + +int mthca_init_pd_table(struct mthca_dev 
*dev); +int mthca_init_mr_table(struct mthca_dev *dev); +int mthca_init_eq_table(struct mthca_dev *dev); +int mthca_init_cq_table(struct mthca_dev *dev); +int mthca_init_qp_table(struct mthca_dev *dev); +int mthca_init_av_table(struct mthca_dev *dev); +int mthca_init_mcg_table(struct mthca_dev *dev); + +void mthca_cleanup_pd_table(struct mthca_dev *dev); +void mthca_cleanup_mr_table(struct mthca_dev *dev); +void mthca_cleanup_eq_table(struct mthca_dev *dev); +void mthca_cleanup_cq_table(struct mthca_dev *dev); +void mthca_cleanup_qp_table(struct mthca_dev *dev); +void mthca_cleanup_av_table(struct mthca_dev *dev); +void mthca_cleanup_mcg_table(struct mthca_dev *dev); + +int mthca_register_device(struct mthca_dev *dev); +void mthca_unregister_device(struct mthca_dev *dev); + +int mthca_pd_alloc(struct mthca_dev *dev, struct mthca_pd *pd); +void mthca_pd_free(struct mthca_dev *dev, struct mthca_pd *pd); + +int mthca_mr_alloc_notrans(struct mthca_dev *dev, u32 pd, + u32 access, struct mthca_mr *mr); +int mthca_mr_alloc_phys(struct mthca_dev *dev, u32 pd, + u64 *buffer_list, int buffer_size_shift, + int list_len, u64 iova, u64 total_size, + u32 access, struct mthca_mr *mr); +void mthca_free_mr(struct mthca_dev *dev, struct mthca_mr *mr); + +int mthca_poll_cq(struct ib_cq *ibcq, int num_entries, + struct ib_wc *entry); +void mthca_arm_cq(struct mthca_dev *dev, struct mthca_cq *cq, + int solicited); +int mthca_init_cq(struct mthca_dev *dev, int nent, + struct mthca_cq *cq); +void mthca_free_cq(struct mthca_dev *dev, + struct mthca_cq *cq); +void mthca_cq_event(struct mthca_dev *dev, u32 cqn); +void mthca_cq_clean(struct mthca_dev *dev, u32 cqn, u32 qpn); + +void mthca_qp_event(struct mthca_dev *dev, u32 qpn, + enum ib_event_type event_type); +int mthca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask); +int mthca_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, + struct ib_send_wr **bad_wr); +int mthca_post_receive(struct ib_qp *ibqp, struct 
ib_recv_wr *wr, + struct ib_recv_wr **bad_wr); +int mthca_free_err_wqe(struct mthca_qp *qp, int is_send, + int index, int *dbd, u32 *new_wqe); +int mthca_alloc_qp(struct mthca_dev *dev, + struct mthca_pd *pd, + struct mthca_cq *send_cq, + struct mthca_cq *recv_cq, + enum ib_qp_type type, + enum ib_sig_type send_policy, + enum ib_sig_type recv_policy, + struct mthca_qp *qp); +int mthca_alloc_sqp(struct mthca_dev *dev, + struct mthca_pd *pd, + struct mthca_cq *send_cq, + struct mthca_cq *recv_cq, + enum ib_sig_type send_policy, + enum ib_sig_type recv_policy, + int qpn, + int port, + struct mthca_sqp *sqp); +void mthca_free_qp(struct mthca_dev *dev, struct mthca_qp *qp); +int mthca_create_ah(struct mthca_dev *dev, + struct mthca_pd *pd, + struct ib_ah_attr *ah_attr, + struct mthca_ah *ah); +int mthca_destroy_ah(struct mthca_dev *dev, struct mthca_ah *ah); +int mthca_read_ah(struct mthca_dev *dev, struct mthca_ah *ah, + struct ib_ud_header *header); + +int mthca_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid); +int mthca_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid); + +int mthca_process_mad(struct ib_device *ibdev, + int mad_flags, + u8 port_num, + u16 slid, + struct ib_mad *in_mad, + struct ib_mad *out_mad); +int mthca_create_agents(struct mthca_dev *dev); +void mthca_free_agents(struct mthca_dev *dev); + +static inline struct mthca_dev *to_mdev(struct ib_device *ibdev) +{ + return container_of(ibdev, struct mthca_dev, ib_dev); +} + +#endif /* MTHCA_DEV_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_doorbell.h 2004-12-27 21:48:21.567791525 -0800 @@ -0,0 +1,123 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_doorbell.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include + +#define MTHCA_RD_DOORBELL 0x00 +#define MTHCA_SEND_DOORBELL 0x10 +#define MTHCA_RECEIVE_DOORBELL 0x18 +#define MTHCA_CQ_DOORBELL 0x20 +#define MTHCA_EQ_DOORBELL 0x28 + +#if BITS_PER_LONG == 64 +/* + * Assume that we can just write a 64-bit doorbell atomically. s390 + * actually doesn't have writeq() but S/390 systems don't even have + * PCI so we won't worry about it. 
+ */ + +#define MTHCA_DECLARE_DOORBELL_LOCK(name) +#define MTHCA_INIT_DOORBELL_LOCK(ptr) do { } while (0) +#define MTHCA_GET_DOORBELL_LOCK(ptr) (NULL) + +static inline void mthca_write64(u32 val[2], void __iomem *dest, + spinlock_t *doorbell_lock) +{ + __raw_writeq(*(u64 *) val, dest); +} + +#elif defined(CONFIG_INFINIBAND_MTHCA_SSE_DOORBELL) +/* Use SSE to write 64 bits atomically without a lock. */ + +#define MTHCA_DECLARE_DOORBELL_LOCK(name) +#define MTHCA_INIT_DOORBELL_LOCK(ptr) do { } while (0) +#define MTHCA_GET_DOORBELL_LOCK(ptr) (NULL) + +static inline unsigned long mthca_get_fpu(void) +{ + unsigned long cr0; + + preempt_disable(); + asm volatile("mov %%cr0,%0; clts" : "=r" (cr0)); + return cr0; +} + +static inline void mthca_put_fpu(unsigned long cr0) +{ + asm volatile("mov %0,%%cr0" : : "r" (cr0)); + preempt_enable(); +} + +static inline void mthca_write64(u32 val[2], void __iomem *dest, + spinlock_t *doorbell_lock) +{ + /* i386 stack is aligned to 8 bytes, so this should be OK: */ + u8 xmmsave[8] __attribute__((aligned(8))); + unsigned long cr0; + + cr0 = mthca_get_fpu(); + + asm volatile ( + "movlps %%xmm0,(%0); \n\t" + "movlps (%1),%%xmm0; \n\t" + "movlps %%xmm0,(%2); \n\t" + "movlps (%0),%%xmm0; \n\t" + : + : "r" (xmmsave), "r" (val), "r" (dest) + : "memory" ); + + mthca_put_fpu(cr0); +} + +#else +/* Just fall back to a spinlock to protect the doorbell */ + +#define MTHCA_DECLARE_DOORBELL_LOCK(name) spinlock_t name; +#define MTHCA_INIT_DOORBELL_LOCK(ptr) spin_lock_init(ptr) +#define MTHCA_GET_DOORBELL_LOCK(ptr) (ptr) + +static inline void mthca_write64(u32 val[2], void __iomem *dest, + spinlock_t *doorbell_lock) +{ + unsigned long flags; + + spin_lock_irqsave(doorbell_lock, flags); + __raw_writel(val[0], dest); + __raw_writel(val[1], dest + 4); + spin_unlock_irqrestore(doorbell_lock, flags); +} + +#endif --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_main.c 2004-12-27 21:48:21.623783283 -0800 @@ -0,0 
+1,936 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: mthca_main.c 1396 2004-12-28 04:10:27Z roland $ + */ + +#include +#include +#include +#include +#include +#include +#include + +#ifdef CONFIG_INFINIBAND_MTHCA_SSE_DOORBELL +#include +#endif + +#include "mthca_dev.h" +#include "mthca_config_reg.h" +#include "mthca_cmd.h" +#include "mthca_profile.h" + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("Mellanox InfiniBand HCA low-level driver"); +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_VERSION(DRV_VERSION); + +#ifdef CONFIG_PCI_MSI + +static int msi_x = 0; +module_param(msi_x, int, 0444); +MODULE_PARM_DESC(msi_x, "attempt to use MSI-X if nonzero"); + +static int msi = 0; +module_param(msi, int, 0444); +MODULE_PARM_DESC(msi, "attempt to use MSI if nonzero"); + +#else /* CONFIG_PCI_MSI */ + +#define msi_x (0) +#define msi (0) + +#endif /* CONFIG_PCI_MSI */ + +static const char mthca_version[] __devinitdata = + "ib_mthca: Mellanox InfiniBand HCA driver v" + DRV_VERSION " (" DRV_RELDATE ")\n"; + +static int __devinit mthca_tune_pci(struct mthca_dev *mdev) +{ + int cap; + u16 val; + + /* First try to max out Read Byte Count */ + cap = pci_find_capability(mdev->pdev, PCI_CAP_ID_PCIX); + if (cap) { + if (pci_read_config_word(mdev->pdev, cap + PCI_X_CMD, &val)) { + mthca_err(mdev, "Couldn't read PCI-X command register, " + "aborting.\n"); + return -ENODEV; + } + val = (val & ~PCI_X_CMD_MAX_READ) | (3 << 2); + if (pci_write_config_word(mdev->pdev, cap + PCI_X_CMD, val)) { + mthca_err(mdev, "Couldn't write PCI-X command register, " + "aborting.\n"); + return -ENODEV; + } + } else if (mdev->hca_type == TAVOR) + mthca_info(mdev, "No PCI-X capability, not setting RBC.\n"); + + cap = pci_find_capability(mdev->pdev, PCI_CAP_ID_EXP); + if (cap) { + if (pci_read_config_word(mdev->pdev, cap + PCI_EXP_DEVCTL, &val)) { + mthca_err(mdev, "Couldn't read PCI Express device control " + "register, aborting.\n"); + return -ENODEV; + } + val = (val & ~PCI_EXP_DEVCTL_READRQ) | (5 << 12); + if (pci_write_config_word(mdev->pdev, 
cap + PCI_EXP_DEVCTL, val)) { + mthca_err(mdev, "Couldn't write PCI Express device control " + "register, aborting.\n"); + return -ENODEV; + } + } else if (mdev->hca_type == ARBEL_NATIVE || + mdev->hca_type == ARBEL_COMPAT) + mthca_info(mdev, "No PCI Express capability, " + "not setting Max Read Request Size.\n"); + + return 0; +} + +static int __devinit mthca_dev_lim(struct mthca_dev *mdev, struct mthca_dev_lim *dev_lim) +{ + int err; + u8 status; + + err = mthca_QUERY_DEV_LIM(mdev, dev_lim, &status); + if (err) { + mthca_err(mdev, "QUERY_DEV_LIM command failed, aborting.\n"); + return err; + } + if (status) { + mthca_err(mdev, "QUERY_DEV_LIM returned status 0x%02x, " + "aborting.\n", status); + return -EINVAL; + } + if (dev_lim->min_page_sz > PAGE_SIZE) { + mthca_err(mdev, "HCA minimum page size of %d bigger than " + "kernel PAGE_SIZE of %ld, aborting.\n", + dev_lim->min_page_sz, PAGE_SIZE); + return -ENODEV; + } + if (dev_lim->num_ports > MTHCA_MAX_PORTS) { + mthca_err(mdev, "HCA has %d ports, but we only support %d, " + "aborting.\n", + dev_lim->num_ports, MTHCA_MAX_PORTS); + return -ENODEV; + } + + mdev->limits.num_ports = dev_lim->num_ports; + mdev->limits.vl_cap = dev_lim->max_vl; + mdev->limits.mtu_cap = dev_lim->max_mtu; + mdev->limits.gid_table_len = dev_lim->max_gids; + mdev->limits.pkey_table_len = dev_lim->max_pkeys; + mdev->limits.local_ca_ack_delay = dev_lim->local_ca_ack_delay; + mdev->limits.max_sg = dev_lim->max_sg; + mdev->limits.reserved_qps = dev_lim->reserved_qps; + mdev->limits.reserved_srqs = dev_lim->reserved_srqs; + mdev->limits.reserved_eecs = dev_lim->reserved_eecs; + mdev->limits.reserved_cqs = dev_lim->reserved_cqs; + mdev->limits.reserved_eqs = dev_lim->reserved_eqs; + mdev->limits.reserved_mtts = dev_lim->reserved_mtts; + mdev->limits.reserved_mrws = dev_lim->reserved_mrws; + mdev->limits.reserved_uars = dev_lim->reserved_uars; + mdev->limits.reserved_pds = dev_lim->reserved_pds; + + if (dev_lim->flags & DEV_LIM_FLAG_SRQ) + 
mdev->mthca_flags |= MTHCA_FLAG_SRQ; + + return 0; +} + +static int __devinit mthca_init_tavor(struct mthca_dev *mdev) +{ + u8 status; + int err; + struct mthca_dev_lim dev_lim; + struct mthca_init_hca_param init_hca; + struct mthca_adapter adapter; + + err = mthca_SYS_EN(mdev, &status); + if (err) { + mthca_err(mdev, "SYS_EN command failed, aborting.\n"); + return err; + } + if (status) { + mthca_err(mdev, "SYS_EN returned status 0x%02x, " + "aborting.\n", status); + return -EINVAL; + } + + err = mthca_QUERY_FW(mdev, &status); + if (err) { + mthca_err(mdev, "QUERY_FW command failed, aborting.\n"); + goto err_out_disable; + } + if (status) { + mthca_err(mdev, "QUERY_FW returned status 0x%02x, " + "aborting.\n", status); + err = -EINVAL; + goto err_out_disable; + } + err = mthca_QUERY_DDR(mdev, &status); + if (err) { + mthca_err(mdev, "QUERY_DDR command failed, aborting.\n"); + goto err_out_disable; + } + if (status) { + mthca_err(mdev, "QUERY_DDR returned status 0x%02x, " + "aborting.\n", status); + err = -EINVAL; + goto err_out_disable; + } + + err = mthca_dev_lim(mdev, &dev_lim); + + err = mthca_make_profile(mdev, &dev_lim, &init_hca); + if (err) + goto err_out_disable; + + err = mthca_INIT_HCA(mdev, &init_hca, &status); + if (err) { + mthca_err(mdev, "INIT_HCA command failed, aborting.\n"); + goto err_out_disable; + } + if (status) { + mthca_err(mdev, "INIT_HCA returned status 0x%02x, " + "aborting.\n", status); + err = -EINVAL; + goto err_out_disable; + } + + err = mthca_QUERY_ADAPTER(mdev, &adapter, &status); + if (err) { + mthca_err(mdev, "QUERY_ADAPTER command failed, aborting.\n"); + goto err_out_disable; + } + if (status) { + mthca_err(mdev, "QUERY_ADAPTER returned status 0x%02x, " + "aborting.\n", status); + err = -EINVAL; + goto err_out_close; + } + + mdev->eq_table.inta_pin = adapter.inta_pin; + mdev->rev_id = adapter.revision_id; + + return 0; + +err_out_close: + mthca_CLOSE_HCA(mdev, 0, &status); + +err_out_disable: + mthca_SYS_DIS(mdev, &status); + + 
return err; +} + +static int __devinit mthca_load_fw(struct mthca_dev *mdev) +{ + u8 status; + int err; + int num_ent, num_sg, fw_pages, cur_order; + int i; + + /* FIXME: use HCA-attached memory for FW if present */ + + mdev->fw.arbel.mem = kmalloc(sizeof *mdev->fw.arbel.mem * + mdev->fw.arbel.fw_pages, + GFP_KERNEL); + if (!mdev->fw.arbel.mem) { + mthca_err(mdev, "Couldn't allocate FW area, aborting.\n"); + return -ENOMEM; + } + + memset(mdev->fw.arbel.mem, 0, + sizeof *mdev->fw.arbel.mem * mdev->fw.arbel.fw_pages); + + fw_pages = mdev->fw.arbel.fw_pages; + num_ent = 0; + + /* + * We allocate in as big chunks as we can, up to a maximum of + * 256 KB per chunk. + */ + cur_order = get_order(1 << 18); + + while (fw_pages > 0) { + while (1 << cur_order > fw_pages) + --cur_order; + + /* + * We allocate with GFP_HIGHUSER because only the + * firmware is going to touch these pages, so there's + * no need for a kernel virtual address. We use + * __GFP_NOWARN because we'll deal with any allocation + * failures ourselves. 
+ */ + mdev->fw.arbel.mem[num_ent].page = alloc_pages(GFP_HIGHUSER | __GFP_NOWARN, + cur_order); + mdev->fw.arbel.mem[num_ent].length = PAGE_SIZE << cur_order; + if (!mdev->fw.arbel.mem[num_ent].page) { + --cur_order; + if (cur_order < 0) { + mthca_err(mdev, "Couldn't allocate FW area, aborting.\n"); + err = -ENOMEM; + goto err_free; + } + } else { + ++num_ent; + fw_pages -= 1 << cur_order; + } + } + + num_sg = pci_map_sg(mdev->pdev, mdev->fw.arbel.mem, num_ent, + PCI_DMA_BIDIRECTIONAL); + if (num_sg <= 0) { + mthca_err(mdev, "Couldn't allocate FW area, aborting.\n"); + err = -ENOMEM; + goto err_free; + } + + err = mthca_MAP_FA(mdev, num_sg, mdev->fw.arbel.mem, &status); + if (err) { + mthca_err(mdev, "MAP_FA command failed, aborting.\n"); + goto err_unmap; + } + if (status) { + mthca_err(mdev, "MAP_FA returned status 0x%02x, aborting.\n", status); + err = -EINVAL; + goto err_unmap; + } + err = mthca_RUN_FW(mdev, &status); + if (err) { + mthca_err(mdev, "RUN_FW command failed, aborting.\n"); + goto err_unmap_fa; + } + if (status) { + mthca_err(mdev, "RUN_FW returned status 0x%02x, aborting.\n", status); + err = -EINVAL; + goto err_unmap_fa; + } + + return 0; + +err_unmap_fa: + mthca_UNMAP_FA(mdev, &status); + +err_unmap: + pci_unmap_sg(mdev->pdev, mdev->fw.arbel.mem, + mdev->fw.arbel.fw_pages, PCI_DMA_BIDIRECTIONAL); +err_free: + for (i = 0; i < mdev->fw.arbel.fw_pages; ++i) + if (mdev->fw.arbel.mem[i].page) + __free_pages(mdev->fw.arbel.mem[i].page, + get_order(mdev->fw.arbel.mem[i].length)); + kfree(mdev->fw.arbel.mem); + return err; +} + +static int __devinit mthca_init_arbel(struct mthca_dev *mdev) +{ + struct mthca_dev_lim dev_lim; + u8 status; + int err; + + err = mthca_QUERY_FW(mdev, &status); + if (err) { + mthca_err(mdev, "QUERY_FW command failed, aborting.\n"); + return err; + } + if (status) { + mthca_err(mdev, "QUERY_FW returned status 0x%02x, " + "aborting.\n", status); + return -EINVAL; + } + + err = mthca_ENABLE_LAM(mdev, &status); + if (err) { + 
mthca_err(mdev, "ENABLE_LAM command failed, aborting.\n"); + return err; + } + if (status == MTHCA_CMD_STAT_LAM_NOT_PRE) { + mthca_dbg(mdev, "No HCA-attached memory (running in MemFree mode)\n"); + mdev->mthca_flags |= MTHCA_FLAG_NO_LAM; + } else if (status) { + mthca_err(mdev, "ENABLE_LAM returned status 0x%02x, " + "aborting.\n", status); + return -EINVAL; + } + + err = mthca_load_fw(mdev); + if (err) { + mthca_err(mdev, "Failed to start FW, aborting.\n"); + goto err_out_disable; + } + + err = mthca_dev_lim(mdev, &dev_lim); + if (err) { + mthca_err(mdev, "QUERY_DEV_LIM command failed, aborting.\n"); + goto err_out_disable; + } + + mthca_warn(mdev, "Sorry, native MT25208 mode support is not done, " + "aborting.\n"); + err = -ENODEV; + +err_out_disable: + if (!(mdev->mthca_flags & MTHCA_FLAG_NO_LAM)) + mthca_DISABLE_LAM(mdev, &status); + return err; +} + +static int __devinit mthca_init_hca(struct mthca_dev *mdev) +{ + if (mdev->hca_type == ARBEL_NATIVE) + return mthca_init_arbel(mdev); + else + return mthca_init_tavor(mdev); +} + +static int __devinit mthca_setup_hca(struct mthca_dev *dev) +{ + int err; + + MTHCA_INIT_DOORBELL_LOCK(&dev->doorbell_lock); + + err = mthca_init_pd_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "protection domain table, aborting.\n"); + return err; + } + + err = mthca_init_mr_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "memory region table, aborting.\n"); + goto err_out_pd_table_free; + } + + err = mthca_pd_alloc(dev, &dev->driver_pd); + if (err) { + mthca_err(dev, "Failed to create driver PD, " + "aborting.\n"); + goto err_out_mr_table_free; + } + + err = mthca_init_eq_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "event queue table, aborting.\n"); + goto err_out_pd_free; + } + + err = mthca_cmd_use_events(dev); + if (err) { + mthca_err(dev, "Failed to switch to event-driven " + "firmware commands, aborting.\n"); + goto err_out_eq_table_free; + } + + err = 
mthca_init_cq_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "completion queue table, aborting.\n"); + goto err_out_cmd_poll; + } + + err = mthca_init_qp_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "queue pair table, aborting.\n"); + goto err_out_cq_table_free; + } + + err = mthca_init_av_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "address vector table, aborting.\n"); + goto err_out_qp_table_free; + } + + err = mthca_init_mcg_table(dev); + if (err) { + mthca_err(dev, "Failed to initialize " + "multicast group table, aborting.\n"); + goto err_out_av_table_free; + } + + return 0; + +err_out_av_table_free: + mthca_cleanup_av_table(dev); + +err_out_qp_table_free: + mthca_cleanup_qp_table(dev); + +err_out_cq_table_free: + mthca_cleanup_cq_table(dev); + +err_out_cmd_poll: + mthca_cmd_use_polling(dev); + +err_out_eq_table_free: + mthca_cleanup_eq_table(dev); + +err_out_pd_free: + mthca_pd_free(dev, &dev->driver_pd); + +err_out_mr_table_free: + mthca_cleanup_mr_table(dev); + +err_out_pd_table_free: + mthca_cleanup_pd_table(dev); + return err; +} + +static int __devinit mthca_request_regions(struct pci_dev *pdev, + int ddr_hidden) +{ + int err; + + /* + * We request our first BAR in two chunks, since the MSI-X + * vector table is right in the middle. + * + * This is why we can't just use pci_request_regions() -- if + * we did then setting up MSI-X would fail, since the PCI core + * wants to do request_mem_region on the MSI-X vector table. 
+ */ + if (!request_mem_region(pci_resource_start(pdev, 0) + + MTHCA_HCR_BASE, + MTHCA_MAP_HCR_SIZE, + DRV_NAME)) + return -EBUSY; + + if (!request_mem_region(pci_resource_start(pdev, 0) + + MTHCA_CLR_INT_BASE, + MTHCA_CLR_INT_SIZE, + DRV_NAME)) { + err = -EBUSY; + goto err_out_bar0_beg; + } + + err = pci_request_region(pdev, 2, DRV_NAME); + if (err) + goto err_out_bar0_end; + + if (!ddr_hidden) { + err = pci_request_region(pdev, 4, DRV_NAME); + if (err) + goto err_out_bar2; + } + + return 0; + +err_out_bar2: + pci_release_region(pdev, 2); + +err_out_bar0_end: + release_mem_region(pci_resource_start(pdev, 0) + + MTHCA_CLR_INT_BASE, + MTHCA_CLR_INT_SIZE); + +err_out_bar0_beg: + release_mem_region(pci_resource_start(pdev, 0) + + MTHCA_HCR_BASE, + MTHCA_MAP_HCR_SIZE); + return err; +} + +static void mthca_release_regions(struct pci_dev *pdev, + int ddr_hidden) +{ + release_mem_region(pci_resource_start(pdev, 0) + + MTHCA_HCR_BASE, + MTHCA_MAP_HCR_SIZE); + release_mem_region(pci_resource_start(pdev, 0) + + MTHCA_CLR_INT_BASE, + MTHCA_CLR_INT_SIZE); + pci_release_region(pdev, 2); + if (!ddr_hidden) + pci_release_region(pdev, 4); +} + +static int __devinit mthca_enable_msi_x(struct mthca_dev *mdev) +{ + struct msix_entry entries[3]; + int err; + + entries[0].entry = 0; + entries[1].entry = 1; + entries[2].entry = 2; + + err = pci_enable_msix(mdev->pdev, entries, ARRAY_SIZE(entries)); + if (err) { + if (err > 0) + mthca_info(mdev, "Only %d MSI-X vectors available, " + "not using MSI-X\n", err); + return err; + } + + mdev->eq_table.eq[MTHCA_EQ_COMP ].msi_x_vector = entries[0].vector; + mdev->eq_table.eq[MTHCA_EQ_ASYNC].msi_x_vector = entries[1].vector; + mdev->eq_table.eq[MTHCA_EQ_CMD ].msi_x_vector = entries[2].vector; + + return 0; +} + +static void mthca_close_hca(struct mthca_dev *mdev) +{ + u8 status; + int i; + + mthca_CLOSE_HCA(mdev, 0, &status); + + if (mdev->hca_type == ARBEL_NATIVE) { + mthca_UNMAP_FA(mdev, &status); + + pci_unmap_sg(mdev->pdev, 
mdev->fw.arbel.mem, + mdev->fw.arbel.fw_pages, PCI_DMA_BIDIRECTIONAL); + + for (i = 0; i < mdev->fw.arbel.fw_pages; ++i) + if (mdev->fw.arbel.mem[i].page) + __free_pages(mdev->fw.arbel.mem[i].page, + get_order(mdev->fw.arbel.mem[i].length)); + kfree(mdev->fw.arbel.mem); + + if (!(mdev->mthca_flags & MTHCA_FLAG_NO_LAM)) + mthca_DISABLE_LAM(mdev, &status); + } else + mthca_SYS_DIS(mdev, &status); +} + +static int __devinit mthca_init_one(struct pci_dev *pdev, + const struct pci_device_id *id) +{ + static int mthca_version_printed = 0; + int ddr_hidden = 0; + int err; + unsigned long mthca_base; + struct mthca_dev *mdev; + + if (!mthca_version_printed) { + printk(KERN_INFO "%s", mthca_version); + ++mthca_version_printed; + } + + printk(KERN_INFO PFX "Initializing %s (%s)\n", + pci_pretty_name(pdev), pci_name(pdev)); + + err = pci_enable_device(pdev); + if (err) { + dev_err(&pdev->dev, "Cannot enable PCI device, " + "aborting.\n"); + return err; + } + + /* + * Check for BARs. We expect 0: 1MB, 2: 8MB, 4: DDR (may not + * be present) + */ + if (!(pci_resource_flags(pdev, 0) & IORESOURCE_MEM) || + pci_resource_len(pdev, 0) != 1 << 20) { + dev_err(&pdev->dev, "Missing DCS, aborting.\n"); + err = -ENODEV; + goto err_out_disable_pdev; + } + if (!(pci_resource_flags(pdev, 2) & IORESOURCE_MEM) || + pci_resource_len(pdev, 2) != 1 << 23) { + dev_err(&pdev->dev, "Missing UAR, aborting.\n"); + err = -ENODEV; + goto err_out_disable_pdev; + } + if (!(pci_resource_flags(pdev, 4) & IORESOURCE_MEM)) + ddr_hidden = 1; + + err = mthca_request_regions(pdev, ddr_hidden); + if (err) { + dev_err(&pdev->dev, "Cannot obtain PCI resources, " + "aborting.\n"); + goto err_out_disable_pdev; + } + + pci_set_master(pdev); + + err = pci_set_dma_mask(pdev, DMA_64BIT_MASK); + if (err) { + dev_warn(&pdev->dev, "Warning: couldn't set 64-bit PCI DMA mask.\n"); + err = pci_set_dma_mask(pdev, DMA_32BIT_MASK); + if (err) { + dev_err(&pdev->dev, "Can't set PCI DMA mask, aborting.\n"); + goto err_out_free_res; + 
} + } + err = pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK); + if (err) { + dev_warn(&pdev->dev, "Warning: couldn't set 64-bit " + "consistent PCI DMA mask.\n"); + err = pci_set_consistent_dma_mask(pdev, DMA_32BIT_MASK); + if (err) { + dev_err(&pdev->dev, "Can't set consistent PCI DMA mask, " + "aborting.\n"); + goto err_out_free_res; + } + } + + mdev = (struct mthca_dev *) ib_alloc_device(sizeof *mdev); + if (!mdev) { + dev_err(&pdev->dev, "Device struct alloc failed, " + "aborting.\n"); + err = -ENOMEM; + goto err_out_free_res; + } + + mdev->pdev = pdev; + mdev->hca_type = id->driver_data; + + if (ddr_hidden) + mdev->mthca_flags |= MTHCA_FLAG_DDR_HIDDEN; + + /* + * Now reset the HCA before we touch the PCI capabilities or + * attempt a firmware command, since a boot ROM may have left + * the HCA in an undefined state. + */ + err = mthca_reset(mdev); + if (err) { + mthca_err(mdev, "Failed to reset HCA, aborting.\n"); + goto err_out_free_dev; + } + + if (msi_x && !mthca_enable_msi_x(mdev)) + mdev->mthca_flags |= MTHCA_FLAG_MSI_X; + if (msi && !(mdev->mthca_flags & MTHCA_FLAG_MSI_X) && + !pci_enable_msi(pdev)) + mdev->mthca_flags |= MTHCA_FLAG_MSI; + + sema_init(&mdev->cmd.hcr_sem, 1); + sema_init(&mdev->cmd.poll_sem, 1); + mdev->cmd.use_events = 0; + + mthca_base = pci_resource_start(pdev, 0); + mdev->hcr = ioremap(mthca_base + MTHCA_HCR_BASE, MTHCA_MAP_HCR_SIZE); + if (!mdev->hcr) { + mthca_err(mdev, "Couldn't map command register, " + "aborting.\n"); + err = -ENOMEM; + goto err_out_free_dev; + } + mdev->clr_base = ioremap(mthca_base + MTHCA_CLR_INT_BASE, + MTHCA_CLR_INT_SIZE); + if (!mdev->clr_base) { + mthca_err(mdev, "Couldn't map interrupt clear register, " + "aborting.\n"); + err = -ENOMEM; + goto err_out_iounmap; + } + + mthca_base = pci_resource_start(pdev, 2); + mdev->kar = ioremap(mthca_base + PAGE_SIZE * MTHCA_KAR_PAGE, PAGE_SIZE); + if (!mdev->kar) { + mthca_err(mdev, "Couldn't map kernel access region, " + "aborting.\n"); + err = -ENOMEM; + goto 
err_out_iounmap_clr; + } + + err = mthca_tune_pci(mdev); + if (err) + goto err_out_iounmap_kar; + + err = mthca_init_hca(mdev); + if (err) + goto err_out_iounmap_kar; + + err = mthca_setup_hca(mdev); + if (err) + goto err_out_close; + + err = mthca_register_device(mdev); + if (err) + goto err_out_cleanup; + + err = mthca_create_agents(mdev); + if (err) + goto err_out_unregister; + + pci_set_drvdata(pdev, mdev); + + return 0; + +err_out_unregister: + mthca_unregister_device(mdev); + +err_out_cleanup: + mthca_cleanup_mcg_table(mdev); + mthca_cleanup_av_table(mdev); + mthca_cleanup_qp_table(mdev); + mthca_cleanup_cq_table(mdev); + mthca_cmd_use_polling(mdev); + mthca_cleanup_eq_table(mdev); + + mthca_pd_free(mdev, &mdev->driver_pd); + + mthca_cleanup_mr_table(mdev); + mthca_cleanup_pd_table(mdev); + +err_out_close: + mthca_close_hca(mdev); + +err_out_iounmap_kar: + iounmap(mdev->kar); + +err_out_iounmap_clr: + iounmap(mdev->clr_base); + +err_out_iounmap: + iounmap(mdev->hcr); + +err_out_free_dev: + if (mdev->mthca_flags & MTHCA_FLAG_MSI_X) + pci_disable_msix(pdev); + if (mdev->mthca_flags & MTHCA_FLAG_MSI) + pci_disable_msi(pdev); + + ib_dealloc_device(&mdev->ib_dev); + +err_out_free_res: + mthca_release_regions(pdev, ddr_hidden); + +err_out_disable_pdev: + pci_disable_device(pdev); + pci_set_drvdata(pdev, NULL); + return err; +} + +static void __devexit mthca_remove_one(struct pci_dev *pdev) +{ + struct mthca_dev *mdev = pci_get_drvdata(pdev); + u8 status; + int p; + + if (mdev) { + mthca_free_agents(mdev); + mthca_unregister_device(mdev); + + for (p = 1; p <= mdev->limits.num_ports; ++p) + mthca_CLOSE_IB(mdev, p, &status); + + mthca_cleanup_mcg_table(mdev); + mthca_cleanup_av_table(mdev); + mthca_cleanup_qp_table(mdev); + mthca_cleanup_cq_table(mdev); + mthca_cmd_use_polling(mdev); + mthca_cleanup_eq_table(mdev); + + mthca_pd_free(mdev, &mdev->driver_pd); + + mthca_cleanup_mr_table(mdev); + mthca_cleanup_pd_table(mdev); + + mthca_close_hca(mdev); + + 
iounmap(mdev->hcr); + iounmap(mdev->clr_base); + + if (mdev->mthca_flags & MTHCA_FLAG_MSI_X) + pci_disable_msix(pdev); + if (mdev->mthca_flags & MTHCA_FLAG_MSI) + pci_disable_msi(pdev); + + ib_dealloc_device(&mdev->ib_dev); + mthca_release_regions(pdev, mdev->mthca_flags & + MTHCA_FLAG_DDR_HIDDEN); + pci_disable_device(pdev); + pci_set_drvdata(pdev, NULL); + } +} + +static struct pci_device_id mthca_pci_table[] = { + { PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_TAVOR), + .driver_data = TAVOR }, + { PCI_DEVICE(PCI_VENDOR_ID_TOPSPIN, PCI_DEVICE_ID_MELLANOX_TAVOR), + .driver_data = TAVOR }, + { PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_ARBEL_COMPAT), + .driver_data = ARBEL_COMPAT }, + { PCI_DEVICE(PCI_VENDOR_ID_TOPSPIN, PCI_DEVICE_ID_MELLANOX_ARBEL_COMPAT), + .driver_data = ARBEL_COMPAT }, + { PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_ARBEL), + .driver_data = ARBEL_NATIVE }, + { PCI_DEVICE(PCI_VENDOR_ID_TOPSPIN, PCI_DEVICE_ID_MELLANOX_ARBEL), + .driver_data = ARBEL_NATIVE }, + { 0, } +}; + +MODULE_DEVICE_TABLE(pci, mthca_pci_table); + +static struct pci_driver mthca_driver = { + .name = "ib_mthca", + .id_table = mthca_pci_table, + .probe = mthca_init_one, + .remove = __devexit_p(mthca_remove_one) +}; + +static int __init mthca_init(void) +{ + int ret; + + /* + * TODO: measure whether dynamically choosing doorbell code at + * runtime affects our performance. Is there a "magic" way to + * choose without having to follow a function pointer every + * time we ring a doorbell? + */ +#ifdef CONFIG_INFINIBAND_MTHCA_SSE_DOORBELL + if (!cpu_has_xmm) { + printk(KERN_ERR PFX "mthca was compiled with SSE doorbell code, but\n"); + printk(KERN_ERR PFX "the current CPU does not support SSE.\n"); + printk(KERN_ERR PFX "Turn off CONFIG_INFINIBAND_MTHCA_SSE_DOORBELL " + "and recompile.\n"); + return -ENODEV; + } +#endif + + ret = pci_register_driver(&mthca_driver); + return ret < 0 ? 
ret : 0; +} + +static void __exit mthca_cleanup(void) +{ + pci_unregister_driver(&mthca_driver); +} + +module_init(mthca_init); +module_exit(mthca_cleanup); From roland@topspin.com Mon Dec 27 21:49:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:51:05 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJE2025948 for ; Mon, 27 Dec 2004 21:49:49 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:17 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:17 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGJ-0000vO-K7; Mon, 27 Dec 2004 21:51:17 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.6vXZnyn99yKN3sSD@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:15 -0800 Message-Id: <200412272151.XKAQkznglOHW39xJ@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][19/24] Add IPoIB (IP-over-InfiniBand) driver Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:17.0220 (UTC) FILETIME=[3F7C0A40:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13115 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add a driver that implements the (IPoIB) IP-over-InfiniBand protocol. 
This is a network device driver of type ARPHRD_INFINIBAND (and addr_len INFINIBAND_ALEN bytes). The ARP/ND implementation for this driver is not completely straightforward, because InfiniBand requires an additional path lookup be performed (through an IB-specific mechanism) after a remote hardware address has been resolved. We are very open to suggestions of a better way to handle this than the current implementation. Although IB has a special multicast group join mode intended to support IP multicast routing (non member join), no means to identify different multicast styles has yet been determined, so all joins by the driver are currently full member joins. We are looking for guidance in how to solve this. The IPoIB protocol/encapsulation is described in the Internet-Drafts http://www.ietf.org/internet-drafts/draft-ietf-ipoib-architecture-04.txt http://www.ietf.org/internet-drafts/draft-ietf-ipoib-ip-over-infiniband-08.txt Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/infiniband/Kconfig 2004-12-27 21:48:21.258837002 -0800 +++ linux-bk/drivers/infiniband/Kconfig 2004-12-27 21:48:25.377230788 -0800 @@ -9,4 +9,6 @@ source "drivers/infiniband/hw/mthca/Kconfig" +source "drivers/infiniband/ulp/ipoib/Kconfig" + endmenu --- linux-bk.orig/drivers/infiniband/Makefile 2004-12-27 21:48:21.219842741 -0800 +++ linux-bk/drivers/infiniband/Makefile 2004-12-27 21:48:25.347235203 -0800 @@ -1,2 +1,3 @@ obj-$(CONFIG_INFINIBAND) += core/ obj-$(CONFIG_INFINIBAND_MTHCA) += hw/mthca/ +obj-$(CONFIG_INFINIBAND_IPOIB) += ulp/ipoib/ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/Kconfig 2004-12-27 21:48:25.454219455 -0800 @@ -0,0 +1,33 @@ +config INFINIBAND_IPOIB + tristate "IP-over-InfiniBand" + depends on INFINIBAND && NETDEVICES && INET + ---help--- + Support for the IP-over-InfiniBand protocol (IPoIB). This + transports IP packets over InfiniBand so you can use your IB + device as a fancy NIC. 
+ + The IPoIB protocol is defined by the IETF ipoib working + group: . + +config INFINIBAND_IPOIB_DEBUG + bool "IP-over-InfiniBand debugging" + depends on INFINIBAND_IPOIB + ---help--- + This option causes debugging code to be compiled into the + IPoIB driver. The output can be turned on via the + debug_level and mcast_debug_level module parameters (which + can also be set after the driver is loaded through sysfs). + + This option also creates an "ipoib_debugfs," which can be + mounted to expose debugging information about IB multicast + groups used by the IPoIB driver. + +config INFINIBAND_IPOIB_DEBUG_DATA + bool "IP-over-InfiniBand data path debugging" + depends on INFINIBAND_IPOIB_DEBUG + ---help--- + This option compiles debugging code into the data path + of the IPoIB driver. The output can be turned on via the + data_debug_level module parameter; however, even with output + turned off, this debugging code will have some performance + impact. --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/Makefile 2004-12-27 21:48:25.420224459 -0800 @@ -0,0 +1,11 @@ +EXTRA_CFLAGS += -Idrivers/infiniband/include + +obj-$(CONFIG_INFINIBAND_IPOIB) += ib_ipoib.o + +ib_ipoib-y := ipoib_main.o \ + ipoib_ib.o \ + ipoib_multicast.o \ + ipoib_verbs.o \ + ipoib_vlan.o +ib_ipoib-$(CONFIG_INFINIBAND_IPOIB_DEBUG) += ipoib_fs.o + --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib.h 2004-12-27 21:48:25.497213127 -0800 @@ -0,0 +1,350 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib.h 1358 2004-12-17 22:00:11Z roland $ + */ + +#ifndef _IPOIB_H +#define _IPOIB_H + +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include +#include + +#include +#include +#include + +/* constants */ + +enum { + IPOIB_PACKET_SIZE = 2048, + IPOIB_BUF_SIZE = IPOIB_PACKET_SIZE + IB_GRH_BYTES, + + IPOIB_ENCAP_LEN = 4, + + IPOIB_RX_RING_SIZE = 128, + IPOIB_TX_RING_SIZE = 64, + + IPOIB_NUM_WC = 4, + + IPOIB_MAX_PATH_REC_QUEUE = 3, + IPOIB_MAX_MCAST_QUEUE = 3, + + IPOIB_FLAG_OPER_UP = 0, + IPOIB_FLAG_ADMIN_UP = 1, + IPOIB_PKEY_ASSIGNED = 2, + IPOIB_PKEY_STOP = 3, + IPOIB_FLAG_SUBINTERFACE = 4, + IPOIB_MCAST_RUN = 5, + IPOIB_STOP_REAPER = 6, + + IPOIB_MAX_BACKOFF_SECONDS = 16, + + IPOIB_MCAST_FLAG_FOUND = 0, /* used in set_multicast_list */ + IPOIB_MCAST_FLAG_SENDONLY = 1, + IPOIB_MCAST_FLAG_BUSY = 2, /* joining or already joined */ + IPOIB_MCAST_FLAG_ATTACHED = 3, +}; + +/* structs */ + +struct ipoib_header { + u16 proto; + u16 reserved; +}; + +struct ipoib_pseudoheader { + u8 hwaddr[INFINIBAND_ALEN]; +}; + +struct ipoib_mcast; + +struct ipoib_buf { + struct sk_buff *skb; + DECLARE_PCI_UNMAP_ADDR(mapping) +}; + +/* + * Device private locking: tx_lock protects members used in TX fast + * path (and we use LLTX so upper layers don't do extra locking). + * lock protects everything else. lock nests inside of tx_lock (ie + * tx_lock must be acquired first if needed). 
+ */ +struct ipoib_dev_priv { + spinlock_t lock; + + struct net_device *dev; + + unsigned long flags; + + struct semaphore mcast_mutex; + struct semaphore vlan_mutex; + + struct rb_root path_tree; + struct list_head path_list; + + struct ipoib_mcast *broadcast; + struct list_head multicast_list; + struct rb_root multicast_tree; + + struct work_struct pkey_task; + struct work_struct mcast_task; + struct work_struct flush_task; + struct work_struct restart_task; + struct work_struct ah_reap_task; + + struct ib_device *ca; + u8 port; + u16 pkey; + struct ib_pd *pd; + struct ib_mr *mr; + struct ib_cq *cq; + struct ib_qp *qp; + u32 qkey; + + union ib_gid local_gid; + u16 local_lid; + + unsigned int admin_mtu; + unsigned int mcast_mtu; + + struct ipoib_buf *rx_ring; + + spinlock_t tx_lock; + struct ipoib_buf *tx_ring; + unsigned tx_head; + unsigned tx_tail; + + struct ib_wc ibwc[IPOIB_NUM_WC]; + + struct list_head dead_ahs; + + struct ib_event_handler event_handler; + + struct net_device_stats stats; + + struct net_device *parent; + struct list_head child_intfs; + struct list_head list; + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG + struct list_head fs_list; + struct dentry *mcg_dentry; +#endif +}; + +struct ipoib_ah { + struct net_device *dev; + struct ib_ah *ah; + struct list_head list; + struct kref ref; + unsigned last_send; +}; + +struct ipoib_path { + struct net_device *dev; + struct ib_sa_path_rec pathrec; + struct ipoib_ah *ah; + struct sk_buff_head queue; + + struct list_head neigh_list; + + int query_id; + struct ib_sa_query *query; + struct completion done; + + struct rb_node rb_node; + struct list_head list; +}; + +struct ipoib_neigh { + struct ipoib_ah *ah; + struct sk_buff_head queue; + + struct neighbour *neighbour; + + struct list_head list; +}; + +static inline struct ipoib_neigh **to_ipoib_neigh(struct neighbour *neigh) +{ + return (struct ipoib_neigh **) (neigh->ha + 24 - + (offsetof(struct neighbour, ha) & 4)); +} + +extern struct workqueue_struct 
*ipoib_workqueue; + +/* functions */ + +void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr); + +struct ipoib_ah *ipoib_create_ah(struct net_device *dev, + struct ib_pd *pd, struct ib_ah_attr *attr); +void ipoib_free_ah(struct kref *kref); +static inline void ipoib_put_ah(struct ipoib_ah *ah) +{ + kref_put(&ah->ref, ipoib_free_ah); +} + +int ipoib_add_pkey_attr(struct net_device *dev); + +void ipoib_send(struct net_device *dev, struct sk_buff *skb, + struct ipoib_ah *address, u32 qpn); +void ipoib_reap_ah(void *dev_ptr); + +void ipoib_flush_paths(struct net_device *dev); +struct ipoib_dev_priv *ipoib_intf_alloc(const char *format); + +int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port); +void ipoib_ib_dev_flush(void *dev); +void ipoib_ib_dev_cleanup(struct net_device *dev); + +int ipoib_ib_dev_open(struct net_device *dev); +int ipoib_ib_dev_up(struct net_device *dev); +int ipoib_ib_dev_down(struct net_device *dev); +int ipoib_ib_dev_stop(struct net_device *dev); + +int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port); +void ipoib_dev_cleanup(struct net_device *dev); + +void ipoib_mcast_join_task(void *dev_ptr); +void ipoib_mcast_send(struct net_device *dev, union ib_gid *mgid, + struct sk_buff *skb); + +void ipoib_mcast_restart_task(void *dev_ptr); +int ipoib_mcast_start_thread(struct net_device *dev); +int ipoib_mcast_stop_thread(struct net_device *dev); + +void ipoib_mcast_dev_down(struct net_device *dev); +void ipoib_mcast_dev_flush(struct net_device *dev); + +struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev); +void ipoib_mcast_iter_free(struct ipoib_mcast_iter *iter); +int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter); +void ipoib_mcast_iter_read(struct ipoib_mcast_iter *iter, + union ib_gid *gid, + unsigned long *created, + unsigned int *queuelen, + unsigned int *complete, + unsigned int *send_only); + +int ipoib_mcast_attach(struct net_device *dev, u16 mlid, + union ib_gid 
*mgid); +int ipoib_mcast_detach(struct net_device *dev, u16 mlid, + union ib_gid *mgid); + +int ipoib_qp_create(struct net_device *dev); +int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca); +void ipoib_transport_dev_cleanup(struct net_device *dev); + +void ipoib_event(struct ib_event_handler *handler, + struct ib_event *record); + +int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey); +int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey); + +void ipoib_pkey_poll(void *dev); +int ipoib_pkey_dev_delay_open(struct net_device *dev); + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int ipoib_create_debug_file(struct net_device *dev); +void ipoib_delete_debug_file(struct net_device *dev); +int ipoib_register_debugfs(void); +void ipoib_unregister_debugfs(void); +#else +static inline int ipoib_create_debug_file(struct net_device *dev) { return 0; } +static inline void ipoib_delete_debug_file(struct net_device *dev) { } +static inline int ipoib_register_debugfs(void) { return 0; } +static inline void ipoib_unregister_debugfs(void) { } +#endif + + +#define ipoib_printk(level, priv, format, arg...) \ + printk(level "%s: " format, ((struct ipoib_dev_priv *) priv)->dev->name , ## arg) +#define ipoib_warn(priv, format, arg...) \ + ipoib_printk(KERN_WARNING, priv, format , ## arg) + + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +extern int debug_level; + +#define ipoib_dbg(priv, format, arg...) \ + do { \ + if (debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#define ipoib_dbg_mcast(priv, format, arg...) \ + do { \ + if (mcast_debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#else /* CONFIG_INFINIBAND_IPOIB_DEBUG */ +#define ipoib_dbg(priv, format, arg...) \ + do { (void) (priv); } while (0) +#define ipoib_dbg_mcast(priv, format, arg...) 
\ + do { (void) (priv); } while (0) +#endif /* CONFIG_INFINIBAND_IPOIB_DEBUG */ + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA +#define ipoib_dbg_data(priv, format, arg...) \ + do { \ + if (data_debug_level > 0) \ + ipoib_printk(KERN_DEBUG, priv, format , ## arg); \ + } while (0) +#else /* CONFIG_INFINIBAND_IPOIB_DEBUG_DATA */ +#define ipoib_dbg_data(priv, format, arg...) \ + do { (void) (priv); } while (0) +#endif /* CONFIG_INFINIBAND_IPOIB_DEBUG_DATA */ + + +#define IPOIB_GID_FMT "%x:%x:%x:%x:%x:%x:%x:%x" + +#define IPOIB_GID_ARG(gid) be16_to_cpup((__be16 *) ((gid).raw + 0)), \ + be16_to_cpup((__be16 *) ((gid).raw + 2)), \ + be16_to_cpup((__be16 *) ((gid).raw + 4)), \ + be16_to_cpup((__be16 *) ((gid).raw + 6)), \ + be16_to_cpup((__be16 *) ((gid).raw + 8)), \ + be16_to_cpup((__be16 *) ((gid).raw + 10)), \ + be16_to_cpup((__be16 *) ((gid).raw + 12)), \ + be16_to_cpup((__be16 *) ((gid).raw + 14)) + +#endif /* _IPOIB_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_fs.c 2004-12-27 21:48:25.549205474 -0800 @@ -0,0 +1,287 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ipoib_fs.c 1389 2004-12-27 22:56:47Z roland $ + */ + +#include +#include + +#include "ipoib.h" + +enum { + IPOIB_MAGIC = 0x49504942 /* "IPIB" */ +}; + +static DECLARE_MUTEX(ipoib_fs_mutex); +static struct dentry *ipoib_root; +static struct super_block *ipoib_sb; +static LIST_HEAD(ipoib_device_list); + +static void *ipoib_mcg_seq_start(struct seq_file *file, loff_t *pos) +{ + struct ipoib_mcast_iter *iter; + loff_t n = *pos; + + iter = ipoib_mcast_iter_init(file->private); + if (!iter) + return NULL; + + while (n--) { + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + } + + return iter; +} + +static void *ipoib_mcg_seq_next(struct seq_file *file, void *iter_ptr, + loff_t *pos) +{ + struct ipoib_mcast_iter *iter = iter_ptr; + + (*pos)++; + + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + + return iter; +} + +static void ipoib_mcg_seq_stop(struct seq_file *file, void *iter_ptr) +{ + /* nothing for now */ +} + +static int ipoib_mcg_seq_show(struct seq_file *file, void *iter_ptr) +{ + struct ipoib_mcast_iter *iter = iter_ptr; + char gid_buf[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff"]; + union ib_gid mgid; + int i, n; + unsigned long created; + unsigned int queuelen, complete, send_only; + + if (iter) { + ipoib_mcast_iter_read(iter, &mgid, &created, &queuelen, + &complete, &send_only); + + for (n = 0, i = 0; i < sizeof mgid / 2; ++i) { + n += sprintf(gid_buf + n, "%x", + be16_to_cpu(((u16 
*)mgid.raw)[i])); + if (i < sizeof mgid / 2 - 1) + gid_buf[n++] = ':'; + } + } + + seq_printf(file, "GID: %*s", -(1 + (int) sizeof gid_buf), gid_buf); + + seq_printf(file, + " created: %10ld queuelen: %4d complete: %d send_only: %d\n", + created, queuelen, complete, send_only); + + return 0; +} + +static struct seq_operations ipoib_seq_ops = { + .start = ipoib_mcg_seq_start, + .next = ipoib_mcg_seq_next, + .stop = ipoib_mcg_seq_stop, + .show = ipoib_mcg_seq_show, +}; + +static int ipoib_mcg_open(struct inode *inode, struct file *file) +{ + struct seq_file *seq; + int ret; + + ret = seq_open(file, &ipoib_seq_ops); + if (ret) + return ret; + + seq = file->private_data; + seq->private = inode->u.generic_ip; + + return 0; +} + +static struct file_operations ipoib_fops = { + .owner = THIS_MODULE, + .open = ipoib_mcg_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release +}; + +static struct inode *ipoib_get_inode(void) +{ + struct inode *inode = new_inode(ipoib_sb); + + if (inode) { + inode->i_mode = S_IFREG | S_IRUGO; + inode->i_uid = 0; + inode->i_gid = 0; + inode->i_blksize = PAGE_CACHE_SIZE; + inode->i_blocks = 0; + inode->i_atime = inode->i_mtime = inode->i_ctime = CURRENT_TIME; + inode->i_fop = &ipoib_fops; + } + + return inode; +} + +static int __ipoib_create_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct dentry *dentry; + struct inode *inode; + char name[IFNAMSIZ + sizeof "_mcg"]; + + snprintf(name, sizeof name, "%s_mcg", dev->name); + + dentry = d_alloc_name(ipoib_root, name); + if (!dentry) + return -ENOMEM; + + inode = ipoib_get_inode(); + if (!inode) { + dput(dentry); + return -ENOMEM; + } + + inode->u.generic_ip = dev; + priv->mcg_dentry = dentry; + + d_add(dentry, inode); + + return 0; +} + +int ipoib_create_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + down(&ipoib_fs_mutex); + + list_add_tail(&priv->fs_list, &ipoib_device_list); + + if 
(!ipoib_sb) { + up(&ipoib_fs_mutex); + return 0; + } + + up(&ipoib_fs_mutex); + + return __ipoib_create_debug_file(dev); +} + +void ipoib_delete_debug_file(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + down(&ipoib_fs_mutex); + list_del(&priv->fs_list); + if (!ipoib_sb) { + up(&ipoib_fs_mutex); + return; + } + up(&ipoib_fs_mutex); + + if (priv->mcg_dentry) { + d_drop(priv->mcg_dentry); + simple_unlink(ipoib_root->d_inode, priv->mcg_dentry); + } +} + +static int ipoib_fill_super(struct super_block *sb, void *data, int silent) +{ + static struct tree_descr ipoib_files[] = { + { "" } + }; + struct ipoib_dev_priv *priv; + int ret; + + ret = simple_fill_super(sb, IPOIB_MAGIC, ipoib_files); + if (ret) + return ret; + + ipoib_root = sb->s_root; + + down(&ipoib_fs_mutex); + + ipoib_sb = sb; + + list_for_each_entry(priv, &ipoib_device_list, fs_list) { + ret = __ipoib_create_debug_file(priv->dev); + if (ret) + break; + } + + up(&ipoib_fs_mutex); + + return ret; +} + +static struct super_block *ipoib_get_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *data) +{ + return get_sb_single(fs_type, flags, data, ipoib_fill_super); +} + +static void ipoib_kill_sb(struct super_block *sb) +{ + down(&ipoib_fs_mutex); + ipoib_sb = NULL; + up(&ipoib_fs_mutex); + + kill_litter_super(sb); +} + +static struct file_system_type ipoib_fs_type = { + .owner = THIS_MODULE, + .name = "ipoib_debugfs", + .get_sb = ipoib_get_sb, + .kill_sb = ipoib_kill_sb, +}; + +int ipoib_register_debugfs(void) +{ + return register_filesystem(&ipoib_fs_type); +} + +void ipoib_unregister_debugfs(void) +{ + unregister_filesystem(&ipoib_fs_type); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_ib.c 2004-12-27 21:48:25.597198409 -0800 @@ -0,0 +1,678 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_ib.c 1386 2004-12-27 16:23:17Z roland $ + */ + +#include +#include + +#include + +#include "ipoib.h" + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG_DATA +int data_debug_level; + +module_param(data_debug_level, int, 0644); +MODULE_PARM_DESC(data_debug_level, + "Enable data path debug tracing if > 0"); +#endif + +#define IPOIB_OP_RECV (1ul << 31) + +static DECLARE_MUTEX(pkey_sem); + +struct ipoib_ah *ipoib_create_ah(struct net_device *dev, + struct ib_pd *pd, struct ib_ah_attr *attr) +{ + struct ipoib_ah *ah; + + ah = kmalloc(sizeof *ah, GFP_KERNEL); + if (!ah) + return NULL; + + ah->dev = dev; + ah->last_send = 0; + kref_init(&ah->ref); + + ah->ah = ib_create_ah(pd, attr); + if (IS_ERR(ah->ah)) { + kfree(ah); + ah = NULL; + } else + ipoib_dbg(netdev_priv(dev), "Created ah %p\n", ah->ah); + + return ah; +} + +void ipoib_free_ah(struct kref *kref) +{ + struct ipoib_ah *ah = container_of(kref, struct ipoib_ah, ref); + struct ipoib_dev_priv *priv = netdev_priv(ah->dev); + + unsigned long flags; + + if (ah->last_send <= priv->tx_tail) { + ipoib_dbg(priv, "Freeing ah %p\n", ah->ah); + ib_destroy_ah(ah->ah); + kfree(ah); + } else { + spin_lock_irqsave(&priv->lock, flags); + list_add_tail(&ah->list, &priv->dead_ahs); + spin_unlock_irqrestore(&priv->lock, flags); + } +} + +static inline int ipoib_ib_receive(struct ipoib_dev_priv *priv, + unsigned int wr_id, + dma_addr_t addr) +{ + struct ib_sge list = { + .addr = addr, + .length = IPOIB_BUF_SIZE, + .lkey = priv->mr->lkey, + }; + struct ib_recv_wr param = { + .wr_id = wr_id | IPOIB_OP_RECV, + .sg_list = &list, + .num_sge = 1, + .recv_flags = IB_RECV_SIGNALED + }; + struct ib_recv_wr *bad_wr; + + return ib_post_recv(priv->qp, &param, &bad_wr); +} + +static int ipoib_ib_post_receive(struct net_device *dev, int id) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct sk_buff *skb; + dma_addr_t addr; + int ret; + + skb = dev_alloc_skb(IPOIB_BUF_SIZE + 4); + if (!skb) { + ipoib_warn(priv, "failed to allocate 
receive buffer\n"); + + priv->rx_ring[id].skb = NULL; + return -ENOMEM; + } + skb_reserve(skb, 4); /* 16 byte align IP header */ + priv->rx_ring[id].skb = skb; + addr = dma_map_single(priv->ca->dma_device, + skb->data, IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + pci_unmap_addr_set(&priv->rx_ring[id], mapping, addr); + + ret = ipoib_ib_receive(priv, id, addr); + if (ret) { + ipoib_warn(priv, "ipoib_ib_receive failed for buf %d (%d)\n", + id, ret); + priv->rx_ring[id].skb = NULL; + } + + return ret; +} + +static int ipoib_ib_post_receives(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int i; + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) { + if (ipoib_ib_post_receive(dev, i)) { + ipoib_warn(priv, "ipoib_ib_post_receive failed for buf %d\n", i); + return -EIO; + } + } + + return 0; +} + +static void ipoib_ib_handle_wc(struct net_device *dev, + struct ib_wc *wc) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + unsigned int wr_id = wc->wr_id; + + ipoib_dbg_data(priv, "called: id %d, op %d, status: %d\n", + wr_id, wc->opcode, wc->status); + + if (wr_id & IPOIB_OP_RECV) { + wr_id &= ~IPOIB_OP_RECV; + + if (wr_id < IPOIB_RX_RING_SIZE) { + struct sk_buff *skb = priv->rx_ring[wr_id].skb; + + priv->rx_ring[wr_id].skb = NULL; + + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(&priv->rx_ring[wr_id], + mapping), + IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + + if (wc->status != IB_WC_SUCCESS) { + if (wc->status != IB_WC_WR_FLUSH_ERR) + ipoib_warn(priv, "failed recv event " + "(status=%d, wrid=%d vend_err %x)\n", + wc->status, wr_id, wc->vendor_err); + dev_kfree_skb_any(skb); + return; + } + + ipoib_dbg_data(priv, "received %d bytes, SLID 0x%04x\n", + wc->byte_len, wc->slid); + + skb_put(skb, wc->byte_len); + skb_pull(skb, IB_GRH_BYTES); + + if (wc->slid != priv->local_lid || + wc->src_qp != priv->qp->qp_num) { + skb->protocol = ((struct ipoib_header *) skb->data)->proto; + + skb_pull(skb, IPOIB_ENCAP_LEN); + + dev->last_rx = jiffies; + 
++priv->stats.rx_packets; + priv->stats.rx_bytes += skb->len; + + skb->dev = dev; + /* XXX get correct PACKET_ type here */ + skb->pkt_type = PACKET_HOST; + netif_rx_ni(skb); + } else { + ipoib_dbg_data(priv, "dropping loopback packet\n"); + dev_kfree_skb_any(skb); + } + + /* repost receive */ + if (ipoib_ib_post_receive(dev, wr_id)) + ipoib_warn(priv, "ipoib_ib_post_receive failed " + "for buf %d\n", wr_id); + } else + ipoib_warn(priv, "completion event with wrid %d\n", + wr_id); + + } else { + struct ipoib_buf *tx_req; + unsigned long flags; + + if (wr_id >= IPOIB_TX_RING_SIZE) { + ipoib_warn(priv, "completion event with wrid %d (> %d)\n", + wr_id, IPOIB_TX_RING_SIZE); + return; + } + + ipoib_dbg_data(priv, "send complete, wrid %d\n", wr_id); + + tx_req = &priv->tx_ring[wr_id]; + + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(tx_req, mapping), + tx_req->skb->len, + DMA_TO_DEVICE); + + ++priv->stats.tx_packets; + priv->stats.tx_bytes += tx_req->skb->len; + + dev_kfree_skb_any(tx_req->skb); + + spin_lock_irqsave(&priv->tx_lock, flags); + ++priv->tx_tail; + if (netif_queue_stopped(dev) && + priv->tx_head - priv->tx_tail <= IPOIB_TX_RING_SIZE / 2) + netif_wake_queue(dev); + spin_unlock_irqrestore(&priv->tx_lock, flags); + + if (wc->status != IB_WC_SUCCESS && + wc->status != IB_WC_WR_FLUSH_ERR) + ipoib_warn(priv, "failed send event " + "(status=%d, wrid=%d vend_err %x)\n", + wc->status, wr_id, wc->vendor_err); + } +} + +void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr) +{ + struct net_device *dev = (struct net_device *) dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + int n, i; + + ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); + do { + n = ib_poll_cq(cq, IPOIB_NUM_WC, priv->ibwc); + for (i = 0; i < n; ++i) + ipoib_ib_handle_wc(dev, priv->ibwc + i); + } while (n == IPOIB_NUM_WC); +} + +static inline int post_send(struct ipoib_dev_priv *priv, + unsigned int wr_id, + struct ib_ah *address, u32 qpn, + dma_addr_t addr, int len) +{ + struct ib_sge 
list = { + .addr = addr, + .length = len, + .lkey = priv->mr->lkey, + }; + struct ib_send_wr param = { + .wr_id = wr_id, + .opcode = IB_WR_SEND, + .sg_list = &list, + .num_sge = 1, + .wr = { + .ud = { + .remote_qpn = qpn, + .remote_qkey = priv->qkey, + .ah = address + }, + }, + .send_flags = IB_SEND_SIGNALED, + }; + struct ib_send_wr *bad_wr; + + return ib_post_send(priv->qp, &param, &bad_wr); +} + +void ipoib_send(struct net_device *dev, struct sk_buff *skb, + struct ipoib_ah *address, u32 qpn) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_buf *tx_req; + dma_addr_t addr; + + if (skb->len > dev->mtu + INFINIBAND_ALEN) { + ipoib_warn(priv, "packet len %d (> %d) too long to send, dropping\n", + skb->len, dev->mtu + INFINIBAND_ALEN); + ++priv->stats.tx_dropped; + ++priv->stats.tx_errors; + dev_kfree_skb_any(skb); + return; + } + + ipoib_dbg_data(priv, "sending packet, length=%d address=%p qpn=0x%06x\n", + skb->len, address, qpn); + + /* + * We put the skb into the tx_ring _before_ we call post_send() + * because it's entirely possible that the completion handler will + * run before we execute anything after the post_send(). That + * means we have to make sure everything is properly recorded and + * our state is consistent before we call post_send(). 
+ */ + tx_req = &priv->tx_ring[priv->tx_head & (IPOIB_TX_RING_SIZE - 1)]; + tx_req->skb = skb; + addr = dma_map_single(priv->ca->dma_device, skb->data, skb->len, + DMA_TO_DEVICE); + pci_unmap_addr_set(tx_req, mapping, addr); + + if (unlikely(post_send(priv, priv->tx_head & (IPOIB_TX_RING_SIZE - 1), + address->ah, qpn, addr, skb->len))) { + ipoib_warn(priv, "post_send failed\n"); + ++priv->stats.tx_errors; + dma_unmap_single(priv->ca->dma_device, addr, skb->len, + DMA_TO_DEVICE); + dev_kfree_skb_any(skb); + } else { + dev->trans_start = jiffies; + + address->last_send = priv->tx_head; + ++priv->tx_head; + + if (priv->tx_head - priv->tx_tail == IPOIB_TX_RING_SIZE) { + ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n"); + netif_stop_queue(dev); + } + } +} + +void __ipoib_reap_ah(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_ah *ah, *tah; + LIST_HEAD(remove_list); + + spin_lock_irq(&priv->lock); + list_for_each_entry_safe(ah, tah, &priv->dead_ahs, list) + if (ah->last_send <= priv->tx_tail) { + list_del(&ah->list); + list_add_tail(&ah->list, &remove_list); + } + spin_unlock_irq(&priv->lock); + + list_for_each_entry_safe(ah, tah, &remove_list, list) { + ipoib_dbg(priv, "Reaping ah %p\n", ah->ah); + ib_destroy_ah(ah->ah); + kfree(ah); + } +} + +void ipoib_reap_ah(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + __ipoib_reap_ah(dev); + + if (!test_bit(IPOIB_STOP_REAPER, &priv->flags)) + queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ); +} + +int ipoib_ib_dev_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + ret = ipoib_qp_create(dev); + if (ret) { + ipoib_warn(priv, "ipoib_qp_create returned %d\n", ret); + return -1; + } + + ret = ipoib_ib_post_receives(dev); + if (ret) { + ipoib_warn(priv, "ipoib_ib_post_receives returned %d\n", ret); + return -1; + } + + clear_bit(IPOIB_STOP_REAPER, &priv->flags); + 
queue_delayed_work(ipoib_workqueue, &priv->ah_reap_task, HZ); + + return 0; +} + +int ipoib_ib_dev_up(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + set_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + + return ipoib_mcast_start_thread(dev); +} + +int ipoib_ib_dev_down(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "downing ib_dev\n"); + + clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags); + netif_carrier_off(dev); + + /* Shutdown the P_Key thread if still active */ + if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) { + down(&pkey_sem); + set_bit(IPOIB_PKEY_STOP, &priv->flags); + cancel_delayed_work(&priv->pkey_task); + up(&pkey_sem); + flush_workqueue(ipoib_workqueue); + } + + ipoib_mcast_stop_thread(dev); + + /* + * Flush the multicast groups first so we stop any multicast joins. The + * completion thread may have already died and we may deadlock waiting + * for the completion thread to finish some multicast joins. 
+ */ + ipoib_mcast_dev_flush(dev); + + /* Delete broadcast and local addresses since they will be recreated */ + ipoib_mcast_dev_down(dev); + + ipoib_flush_paths(dev); + + return 0; +} + +static int recvs_pending(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int pending = 0; + int i; + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) + if (priv->rx_ring[i].skb) + ++pending; + + return pending; +} + +int ipoib_ib_dev_stop(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_attr qp_attr; + int attr_mask; + unsigned long begin; + struct ipoib_buf *tx_req; + int i; + + /* Kill the existing QP and allocate a new one */ + qp_attr.qp_state = IB_QPS_ERR; + attr_mask = IB_QP_STATE; + if (ib_modify_qp(priv->qp, &qp_attr, attr_mask)) + ipoib_warn(priv, "Failed to modify QP to ERROR state\n"); + + /* Wait for all sends and receives to complete */ + begin = jiffies; + + while (priv->tx_head != priv->tx_tail || recvs_pending(dev)) { + if (time_after(jiffies, begin + 5 * HZ)) { + ipoib_warn(priv, "timing out; %d sends %d receives not completed\n", + priv->tx_head - priv->tx_tail, recvs_pending(dev)); + + /* + * assume the HW is wedged and just free up + * all our pending work requests. 
+ */ + while (priv->tx_tail < priv->tx_head) { + tx_req = &priv->tx_ring[priv->tx_tail & + (IPOIB_TX_RING_SIZE - 1)]; + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(tx_req, mapping), + tx_req->skb->len, + DMA_TO_DEVICE); + dev_kfree_skb_any(tx_req->skb); + ++priv->tx_tail; + } + + for (i = 0; i < IPOIB_RX_RING_SIZE; ++i) + if (priv->rx_ring[i].skb) { + dma_unmap_single(priv->ca->dma_device, + pci_unmap_addr(&priv->rx_ring[i], + mapping), + IPOIB_BUF_SIZE, + DMA_FROM_DEVICE); + dev_kfree_skb_any(priv->rx_ring[i].skb); + priv->rx_ring[i].skb = NULL; + } + + goto timeout; + } + + yield(); + } + + ipoib_dbg(priv, "All sends and receives done.\n"); + +timeout: + qp_attr.qp_state = IB_QPS_RESET; + attr_mask = IB_QP_STATE; + if (ib_modify_qp(priv->qp, &qp_attr, attr_mask)) + ipoib_warn(priv, "Failed to modify QP to RESET state\n"); + + /* Wait for all AHs to be reaped */ + set_bit(IPOIB_STOP_REAPER, &priv->flags); + cancel_delayed_work(&priv->ah_reap_task); + flush_workqueue(ipoib_workqueue); + + begin = jiffies; + + while (!list_empty(&priv->dead_ahs)) { + __ipoib_reap_ah(dev); + + if (time_after(jiffies, begin + HZ)) { + ipoib_warn(priv, "timing out; will leak address handles\n"); + break; + } + + yield(); + } + + return 0; +} + +int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + priv->ca = ca; + priv->port = port; + priv->qp = NULL; + + if (ipoib_transport_dev_init(dev, ca)) { + printk(KERN_WARNING "%s: ipoib_transport_dev_init failed\n", ca->name); + return -ENODEV; + } + + if (dev->flags & IFF_UP) { + if (ipoib_ib_dev_open(dev)) { + ipoib_transport_dev_cleanup(dev); + return -ENODEV; + } + } + + return 0; +} + +void ipoib_ib_dev_flush(void *_dev) +{ + struct net_device *dev = (struct net_device *)_dev; + struct ipoib_dev_priv *priv = netdev_priv(dev), *cpriv; + + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + return; + + ipoib_dbg(priv, "flushing\n"); + + 
ipoib_ib_dev_down(dev); + + /* + * The device could have been brought down between the start and when + * we get here; don't bring it back up if it's not configured up + */ + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + ipoib_ib_dev_up(dev); + + /* Flush any child interfaces too */ + list_for_each_entry(cpriv, &priv->child_intfs, list) + ipoib_ib_dev_flush(cpriv->dev); +} + +void ipoib_ib_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "cleaning up ib_dev\n"); + + ipoib_mcast_stop_thread(dev); + + /* Delete the broadcast address and the local address */ + ipoib_mcast_dev_down(dev); + + ipoib_transport_dev_cleanup(dev); +} + +/* + * Delayed P_Key Assignment Interim Support + * + * The following is an initial implementation of the delayed P_Key assignment + * mechanism. It uses the same approach implemented for the multicast + * group join. The single goal of this implementation is to quickly address + * Bug #2507. This implementation will probably be removed when the P_Key + * change async notification is available. 
+ */ +int ipoib_open(struct net_device *dev); + +static void ipoib_pkey_dev_check_presence(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + u16 pkey_index = 0; + + if (ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index)) + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + else + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); +} + +void ipoib_pkey_poll(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_pkey_dev_check_presence(dev); + + if (test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) + ipoib_open(dev); + else { + down(&pkey_sem); + if (!test_bit(IPOIB_PKEY_STOP, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->pkey_task, + HZ); + up(&pkey_sem); + } +} + +int ipoib_pkey_dev_delay_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + /* Look for the interface pkey value in the IB Port P_Key table and */ + /* set the interface pkey assignment flag */ + ipoib_pkey_dev_check_presence(dev); + + /* P_Key value not assigned yet - start polling */ + if (!test_bit(IPOIB_PKEY_ASSIGNED, &priv->flags)) { + down(&pkey_sem); + clear_bit(IPOIB_PKEY_STOP, &priv->flags); + queue_delayed_work(ipoib_workqueue, + &priv->pkey_task, + HZ); + up(&pkey_sem); + return 1; + } + + return 0; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_main.c 2004-12-27 21:48:25.628193847 -0800 @@ -0,0 +1,1079 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_main.c 1377 2004-12-23 19:57:12Z roland $ + */ + +#include "ipoib.h" + +#include +#include + +#include +#include +#include + +#include /* For ARPHRD_xxx */ + +#include +#include + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("IP-over-InfiniBand net driver"); +MODULE_LICENSE("Dual BSD/GPL"); + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int debug_level; + +module_param(debug_level, int, 0644); +MODULE_PARM_DESC(debug_level, "Enable debug tracing if > 0"); +#endif + +static const u8 ipv4_bcast_addr[] = { + 0x00, 0xff, 0xff, 0xff, + 0xff, 0x12, 0x40, 0x1b, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff +}; + +struct workqueue_struct *ipoib_workqueue; + +static void ipoib_add_one(struct ib_device *device); +static void ipoib_remove_one(struct ib_device *device); + +static struct ib_client ipoib_client = { + .name = "ipoib", + .add = ipoib_add_one, + .remove = ipoib_remove_one +}; + +int ipoib_open(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "bringing up interface\n"); + + set_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); + + if (ipoib_pkey_dev_delay_open(dev)) + return 0; + + if (ipoib_ib_dev_open(dev)) + return -EINVAL; + + if (ipoib_ib_dev_up(dev)) + return -EINVAL; + + if (!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { + struct ipoib_dev_priv *cpriv; + + /* Bring up any child interfaces too */ + down(&priv->vlan_mutex); + list_for_each_entry(cpriv, &priv->child_intfs, list) { + int flags; + + flags = cpriv->dev->flags; + if (flags & IFF_UP) + continue; + + dev_change_flags(cpriv->dev, flags | IFF_UP); + } + up(&priv->vlan_mutex); + } + + netif_start_queue(dev); + + return 0; +} + +static int ipoib_stop(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg(priv, "stopping interface\n"); + + clear_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags); + + netif_stop_queue(dev); + + ipoib_ib_dev_down(dev); + ipoib_ib_dev_stop(dev); + + if 
(!test_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags)) { + struct ipoib_dev_priv *cpriv; + + /* Bring down any child interfaces too */ + down(&priv->vlan_mutex); + list_for_each_entry(cpriv, &priv->child_intfs, list) { + int flags; + + flags = cpriv->dev->flags; + if (!(flags & IFF_UP)) + continue; + + dev_change_flags(cpriv->dev, flags & ~IFF_UP); + } + up(&priv->vlan_mutex); + } + + return 0; +} + +static int ipoib_change_mtu(struct net_device *dev, int new_mtu) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (new_mtu > IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN) + return -EINVAL; + + priv->admin_mtu = new_mtu; + + dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); + + return 0; +} + +static struct ipoib_path *__path_find(struct net_device *dev, + union ib_gid *gid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node *n = priv->path_tree.rb_node; + struct ipoib_path *path; + int ret; + + while (n) { + path = rb_entry(n, struct ipoib_path, rb_node); + + ret = memcmp(gid->raw, path->pathrec.dgid.raw, + sizeof (union ib_gid)); + + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return path; + } + + return NULL; +} + +static int __path_add(struct net_device *dev, struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node **n = &priv->path_tree.rb_node; + struct rb_node *pn = NULL; + struct ipoib_path *tpath; + int ret; + + while (*n) { + pn = *n; + tpath = rb_entry(pn, struct ipoib_path, rb_node); + + ret = memcmp(path->pathrec.dgid.raw, tpath->pathrec.dgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = &pn->rb_left; + else if (ret > 0) + n = &pn->rb_right; + else + return -EEXIST; + } + + rb_link_node(&path->rb_node, pn, n); + rb_insert_color(&path->rb_node, &priv->path_tree); + + list_add_tail(&path->list, &priv->path_list); + + return 0; +} + +static void __path_free(struct net_device *dev, struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct 
ipoib_neigh *neigh, *tn; + struct sk_buff *skb; + + while ((skb = __skb_dequeue(&path->queue))) + dev_kfree_skb_irq(skb); + + list_for_each_entry_safe(neigh, tn, &path->neigh_list, list) { + if (neigh->ah) + ipoib_put_ah(neigh->ah); + *to_ipoib_neigh(neigh->neighbour) = NULL; + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + } + + if (path->ah) + ipoib_put_ah(path->ah); + + rb_erase(&path->rb_node, &priv->path_tree); + list_del(&path->list); + kfree(path); +} + +void ipoib_flush_paths(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path, *tp; + LIST_HEAD(remove_list); + unsigned long flags; + + spin_lock_irqsave(&priv->lock, flags); + list_splice(&priv->path_list, &remove_list); + INIT_LIST_HEAD(&priv->path_list); + spin_unlock_irqrestore(&priv->lock, flags); + + list_for_each_entry_safe(path, tp, &remove_list, list) { + if (path->query) + ib_sa_cancel_query(path->query_id, path->query); + wait_for_completion(&path->done); + __path_free(dev, path); + } +} + +static void path_rec_completion(int status, + struct ib_sa_path_rec *pathrec, + void *path_ptr) +{ + struct ipoib_path *path = path_ptr; + struct net_device *dev = path->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_ah *ah = NULL; + struct ipoib_neigh *neigh; + struct sk_buff_head skqueue; + struct sk_buff *skb; + unsigned long flags; + + if (pathrec) + ipoib_dbg(priv, "PathRec LID 0x%04x for GID " IPOIB_GID_FMT "\n", + be16_to_cpu(pathrec->dlid), IPOIB_GID_ARG(pathrec->dgid)); + else + ipoib_dbg(priv, "PathRec status %d for GID " IPOIB_GID_FMT "\n", + status, IPOIB_GID_ARG(path->pathrec.dgid)); + + skb_queue_head_init(&skqueue); + + if (!status) { + /* + * For now we set static_rate to 0. 
This is not + * really correct: we should look at the rate + * component of the path member record, compare it + * with the rate of our local port (calculated from + * the active link speed and link width) and set an + * inter-packet delay appropriately. + */ + struct ib_ah_attr av = { + .dlid = be16_to_cpu(pathrec->dlid), + .sl = pathrec->sl, + .static_rate = 0, + .port_num = priv->port + }; + + ah = ipoib_create_ah(dev, priv->pd, &av); + } + + spin_lock_irqsave(&priv->lock, flags); + + path->ah = ah; + + if (ah) { + path->pathrec = *pathrec; + + ipoib_dbg(priv, "created address handle %p for LID 0x%04x, SL %d\n", + ah, be16_to_cpu(pathrec->dlid), pathrec->sl); + + while ((skb = __skb_dequeue(&path->queue))) + __skb_queue_tail(&skqueue, skb); + + list_for_each_entry(neigh, &path->neigh_list, list) { + kref_get(&path->ah->ref); + neigh->ah = path->ah; + + while ((skb = __skb_dequeue(&neigh->queue))) + __skb_queue_tail(&skqueue, skb); + } + } else + path->query = NULL; + + complete(&path->done); + + spin_unlock_irqrestore(&priv->lock, flags); + + while ((skb = __skb_dequeue(&skqueue))) { + skb->dev = dev; + if (dev_queue_xmit(skb)) + ipoib_warn(priv, "dev_queue_xmit failed " + "to requeue packet\n"); + } +} + +static struct ipoib_path *path_rec_create(struct net_device *dev, + union ib_gid *gid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + + path = kmalloc(sizeof *path, GFP_ATOMIC); + if (!path) + return NULL; + + path->dev = dev; + path->pathrec.dlid = 0; + + skb_queue_head_init(&path->queue); + + INIT_LIST_HEAD(&path->neigh_list); + path->query = NULL; + init_completion(&path->done); + + memcpy(path->pathrec.dgid.raw, gid->raw, sizeof (union ib_gid)); + path->pathrec.sgid = priv->local_gid; + path->pathrec.pkey = cpu_to_be16(priv->pkey); + path->pathrec.numb_path = 1; + + __path_add(dev, path); + + return path; +} + +static int path_rec_start(struct net_device *dev, + struct ipoib_path *path) +{ + struct ipoib_dev_priv *priv = 
netdev_priv(dev); + + ipoib_dbg(priv, "Start path record lookup for " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(path->pathrec.dgid)); + + path->query_id = + ib_sa_path_rec_get(priv->ca, priv->port, + &path->pathrec, + IB_SA_PATH_REC_DGID | + IB_SA_PATH_REC_SGID | + IB_SA_PATH_REC_NUMB_PATH | + IB_SA_PATH_REC_PKEY, + 1000, GFP_ATOMIC, + path_rec_completion, + path, &path->query); + if (path->query_id < 0) { + ipoib_warn(priv, "ib_sa_path_rec_get failed\n"); + path->query = NULL; + return path->query_id; + } + + return 0; +} + +static void neigh_add_path(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + struct ipoib_neigh *neigh; + + neigh = kmalloc(sizeof *neigh, GFP_ATOMIC); + if (!neigh) { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + return; + } + + skb_queue_head_init(&neigh->queue); + neigh->neighbour = skb->dst->neighbour; + *to_ipoib_neigh(skb->dst->neighbour) = neigh; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. 
+ */ + spin_lock(&priv->lock); + + path = __path_find(dev, (union ib_gid *) (skb->dst->neighbour->ha + 4)); + if (!path) { + path = path_rec_create(dev, + (union ib_gid *) (skb->dst->neighbour->ha + 4)); + if (!path) + goto err; + } + + list_add_tail(&neigh->list, &path->neigh_list); + + if (path->pathrec.dlid) { + kref_get(&path->ah->ref); + neigh->ah = path->ah; + + ipoib_send(dev, skb, path->ah, + be32_to_cpup((__be32 *) skb->dst->neighbour->ha)); + } else { + neigh->ah = NULL; + if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) { + __skb_queue_tail(&neigh->queue, skb); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + if (!path->query && path_rec_start(dev, path)) + goto err; + } + + spin_unlock(&priv->lock); + return; + +err: + *to_ipoib_neigh(skb->dst->neighbour) = NULL; + list_del(&neigh->list); + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + + spin_unlock(&priv->lock); +} + +static void path_lookup(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(skb->dev); + + /* Look up path record for unicasts */ + if (skb->dst->neighbour->ha[4] != 0xff) { + neigh_add_path(skb, dev); + return; + } + + /* Add in the P_Key for multicasts */ + skb->dst->neighbour->ha[8] = (priv->pkey >> 8) & 0xff; + skb->dst->neighbour->ha[9] = priv->pkey & 0xff; + ipoib_mcast_send(dev, (union ib_gid *) (skb->dst->neighbour->ha + 4), skb); +} + +static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev, + struct ipoib_pseudoheader *phdr) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_path *path; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. 
+ */ + spin_lock(&priv->lock); + + path = __path_find(dev, (union ib_gid *) (phdr->hwaddr + 4)); + if (!path) { + path = path_rec_create(dev, + (union ib_gid *) (phdr->hwaddr + 4)); + if (path) { + /* put pseudoheader back on for next time */ + skb_push(skb, sizeof *phdr); + __skb_queue_tail(&path->queue, skb); + + if (path_rec_start(dev, path)) + __path_free(dev, path); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + spin_unlock(&priv->lock); + return; + } + + if (path->pathrec.dlid) { + ipoib_dbg(priv, "Send unicast ARP to %04x\n", + be16_to_cpu(path->pathrec.dlid)); + + ipoib_send(dev, skb, path->ah, + be32_to_cpup((__be32 *) phdr->hwaddr)); + } else if ((path->query || !path_rec_start(dev, path)) && + skb_queue_len(&path->queue) < IPOIB_MAX_PATH_REC_QUEUE) { + /* put pseudoheader back on for next time */ + skb_push(skb, sizeof *phdr); + __skb_queue_tail(&path->queue, skb); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + + spin_unlock(&priv->lock); +} + +static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh; + unsigned long flags; + + local_irq_save(flags); + if (!spin_trylock(&priv->tx_lock)) { + local_irq_restore(flags); + return NETDEV_TX_LOCKED; + } + + /* + * Check if our queue is stopped. Since we have the LLTX bit + * set, we can't rely on netif_stop_queue() preventing our + * xmit function from being called with a full queue. 
+ */ + if (unlikely(netif_queue_stopped(dev))) { + spin_unlock_irqrestore(&priv->tx_lock, flags); + return NETDEV_TX_BUSY; + } + + if (skb->dst && skb->dst->neighbour) { + if (unlikely(!*to_ipoib_neigh(skb->dst->neighbour))) { + path_lookup(skb, dev); + goto out; + } + + neigh = *to_ipoib_neigh(skb->dst->neighbour); + + if (likely(neigh->ah)) { + ipoib_send(dev, skb, neigh->ah, + be32_to_cpup((__be32 *) skb->dst->neighbour->ha)); + goto out; + } + + if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) { + spin_lock(&priv->lock); + __skb_queue_tail(&neigh->queue, skb); + spin_unlock(&priv->lock); + } else { + ++priv->stats.tx_dropped; + dev_kfree_skb_any(skb); + } + } else { + struct ipoib_pseudoheader *phdr = + (struct ipoib_pseudoheader *) skb->data; + skb_pull(skb, sizeof *phdr); + + if (phdr->hwaddr[4] == 0xff) { + /* Add in the P_Key for multicast*/ + phdr->hwaddr[8] = (priv->pkey >> 8) & 0xff; + phdr->hwaddr[9] = priv->pkey & 0xff; + + ipoib_mcast_send(dev, (union ib_gid *) (phdr->hwaddr + 4), skb); + } else { + /* unicast GID -- should be ARP reply */ + + if (be16_to_cpup((u16 *) skb->data) != ETH_P_ARP) { + ipoib_warn(priv, "Unicast, no %s: type %04x, QPN %06x " + IPOIB_GID_FMT "\n", + skb->dst ? "neigh" : "dst", + be16_to_cpup((u16 *) skb->data), + be32_to_cpup((u32 *) phdr->hwaddr), + IPOIB_GID_ARG(*(union ib_gid *) (phdr->hwaddr + 4))); + dev_kfree_skb_any(skb); + ++priv->stats.tx_dropped; + goto out; + } + + unicast_arp_send(skb, dev, phdr); + } + } + +out: + spin_unlock_irqrestore(&priv->tx_lock, flags); + + return NETDEV_TX_OK; +} + +struct net_device_stats *ipoib_get_stats(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + return &priv->stats; +} + +static void ipoib_timeout(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_warn(priv, "transmit timeout: latency %ld\n", + jiffies - dev->trans_start); + /* XXX reset QP, etc. 
*/ +} + +static int ipoib_hard_header(struct sk_buff *skb, + struct net_device *dev, + unsigned short type, + void *daddr, void *saddr, unsigned len) +{ + struct ipoib_header *header; + + header = (struct ipoib_header *) skb_push(skb, sizeof *header); + + header->proto = htons(type); + header->reserved = 0; + + /* + * If we don't have a neighbour structure, stuff the + * destination address onto the front of the skb so we can + * figure out where to send the packet later. + */ + if (!skb->dst || !skb->dst->neighbour) { + struct ipoib_pseudoheader *phdr = + (struct ipoib_pseudoheader *) skb_push(skb, sizeof *phdr); + memcpy(phdr->hwaddr, daddr, INFINIBAND_ALEN); + } + + return 0; +} + +static void ipoib_set_mcast_list(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + schedule_work(&priv->restart_task); +} + +static void ipoib_neigh_destructor(struct neighbour *n) +{ + struct ipoib_neigh *neigh = *to_ipoib_neigh(n); + struct ipoib_dev_priv *priv = netdev_priv(n->dev); + unsigned long flags; + + ipoib_dbg(priv, + "neigh_destructor for %06x " IPOIB_GID_FMT "\n", + be32_to_cpup((__be32 *) n->ha), + IPOIB_GID_ARG(*((union ib_gid *) (n->ha + 4)))); + + spin_lock_irqsave(&priv->lock, flags); + + if (neigh) { + if (neigh->ah) + ipoib_put_ah(neigh->ah); + list_del(&neigh->list); + *to_ipoib_neigh(n) = NULL; + kfree(neigh); + } + + spin_unlock_irqrestore(&priv->lock, flags); +} + +static int ipoib_neigh_setup(struct neighbour *neigh) +{ + /* + * Is this kosher? I can't find anybody in the kernel that + * sets neigh->destructor, so we should be able to set it here + * without trouble. 
+ */ + neigh->ops->destructor = ipoib_neigh_destructor; + + return 0; +} + +static int ipoib_neigh_setup_dev(struct net_device *dev, struct neigh_parms *parms) +{ + parms->neigh_setup = ipoib_neigh_setup; + + return 0; +} + +int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + /* Allocate RX/TX "rings" to hold queued skbs */ + + priv->rx_ring = kmalloc(IPOIB_RX_RING_SIZE * sizeof (struct ipoib_buf), + GFP_KERNEL); + if (!priv->rx_ring) { + printk(KERN_WARNING "%s: failed to allocate RX ring (%d entries)\n", + ca->name, IPOIB_RX_RING_SIZE); + goto out; + } + memset(priv->rx_ring, 0, + IPOIB_RX_RING_SIZE * sizeof (struct ipoib_buf)); + + priv->tx_ring = kmalloc(IPOIB_TX_RING_SIZE * sizeof (struct ipoib_buf), + GFP_KERNEL); + if (!priv->tx_ring) { + printk(KERN_WARNING "%s: failed to allocate TX ring (%d entries)\n", + ca->name, IPOIB_TX_RING_SIZE); + goto out_rx_ring_cleanup; + } + memset(priv->tx_ring, 0, + IPOIB_TX_RING_SIZE * sizeof (struct ipoib_buf)); + + /* priv->tx_head & tx_tail are already 0 */ + + if (ipoib_ib_dev_init(dev, ca, port)) + goto out_tx_ring_cleanup; + + return 0; + +out_tx_ring_cleanup: + kfree(priv->tx_ring); + +out_rx_ring_cleanup: + kfree(priv->rx_ring); + +out: + return -ENOMEM; +} + +void ipoib_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev), *cpriv, *tcpriv; + + ipoib_delete_debug_file(dev); + + /* Delete any child interfaces first */ + list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list) { + unregister_netdev(cpriv->dev); + ipoib_dev_cleanup(cpriv->dev); + free_netdev(cpriv->dev); + } + + ipoib_ib_dev_cleanup(dev); + + if (priv->rx_ring) { + kfree(priv->rx_ring); + priv->rx_ring = NULL; + } + + if (priv->tx_ring) { + kfree(priv->tx_ring); + priv->tx_ring = NULL; + } +} + +static void ipoib_setup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + dev->open = ipoib_open; + 
dev->stop = ipoib_stop; + dev->change_mtu = ipoib_change_mtu; + dev->hard_start_xmit = ipoib_start_xmit; + dev->get_stats = ipoib_get_stats; + dev->tx_timeout = ipoib_timeout; + dev->hard_header = ipoib_hard_header; + dev->set_multicast_list = ipoib_set_mcast_list; + dev->neigh_setup = ipoib_neigh_setup_dev; + + dev->watchdog_timeo = HZ; + + dev->rebuild_header = NULL; + dev->set_mac_address = NULL; + dev->header_cache_update = NULL; + + dev->flags |= IFF_BROADCAST | IFF_MULTICAST; + + /* + * We add in INFINIBAND_ALEN to allow for the destination + * address "pseudoheader" for skbs without neighbour struct. + */ + dev->hard_header_len = IPOIB_ENCAP_LEN + INFINIBAND_ALEN; + dev->addr_len = INFINIBAND_ALEN; + dev->type = ARPHRD_INFINIBAND; + dev->tx_queue_len = IPOIB_TX_RING_SIZE * 2; + dev->features = NETIF_F_VLAN_CHALLENGED | NETIF_F_LLTX; + + /* MTU will be reset when mcast join happens */ + dev->mtu = IPOIB_PACKET_SIZE - IPOIB_ENCAP_LEN; + priv->mcast_mtu = priv->admin_mtu = dev->mtu; + + memcpy(dev->broadcast, ipv4_bcast_addr, INFINIBAND_ALEN); + + netif_carrier_off(dev); + + SET_MODULE_OWNER(dev); + + priv->dev = dev; + + spin_lock_init(&priv->lock); + spin_lock_init(&priv->tx_lock); + + init_MUTEX(&priv->mcast_mutex); + init_MUTEX(&priv->vlan_mutex); + + INIT_LIST_HEAD(&priv->path_list); + INIT_LIST_HEAD(&priv->child_intfs); + INIT_LIST_HEAD(&priv->dead_ahs); + INIT_LIST_HEAD(&priv->multicast_list); + + INIT_WORK(&priv->pkey_task, ipoib_pkey_poll, priv->dev); + INIT_WORK(&priv->mcast_task, ipoib_mcast_join_task, priv->dev); + INIT_WORK(&priv->flush_task, ipoib_ib_dev_flush, priv->dev); + INIT_WORK(&priv->restart_task, ipoib_mcast_restart_task, priv->dev); + INIT_WORK(&priv->ah_reap_task, ipoib_reap_ah, priv->dev); +} + +struct ipoib_dev_priv *ipoib_intf_alloc(const char *name) +{ + struct net_device *dev; + + dev = alloc_netdev((int) sizeof (struct ipoib_dev_priv), name, + ipoib_setup); + if (!dev) + return NULL; + + return netdev_priv(dev); +} + +static 
ssize_t show_pkey(struct class_device *cdev, char *buf) +{ + struct ipoib_dev_priv *priv = + netdev_priv(container_of(cdev, struct net_device, class_dev)); + + return sprintf(buf, "0x%04x\n", priv->pkey); +} +static CLASS_DEVICE_ATTR(pkey, S_IRUGO, show_pkey, NULL); + +static ssize_t create_child(struct class_device *cdev, + const char *buf, size_t count) +{ + int pkey; + int ret; + + if (sscanf(buf, "%i", &pkey) != 1) + return -EINVAL; + + if (pkey < 0 || pkey > 0xffff) + return -EINVAL; + + ret = ipoib_vlan_add(container_of(cdev, struct net_device, class_dev), + pkey); + + return ret ? ret : count; +} +static CLASS_DEVICE_ATTR(create_child, S_IWUGO, NULL, create_child); + +static ssize_t delete_child(struct class_device *cdev, + const char *buf, size_t count) +{ + int pkey; + int ret; + + if (sscanf(buf, "%i", &pkey) != 1) + return -EINVAL; + + if (pkey < 0 || pkey > 0xffff) + return -EINVAL; + + ret = ipoib_vlan_delete(container_of(cdev, struct net_device, class_dev), + pkey); + + return ret ? 
ret : count; + +} +static CLASS_DEVICE_ATTR(delete_child, S_IWUGO, NULL, delete_child); + +int ipoib_add_pkey_attr(struct net_device *dev) +{ + return class_device_create_file(&dev->class_dev, + &class_device_attr_pkey); +} + +static struct net_device *ipoib_add_port(const char *format, + struct ib_device *hca, u8 port) +{ + struct ipoib_dev_priv *priv; + int result = -ENOMEM; + + priv = ipoib_intf_alloc(format); + if (!priv) + goto alloc_mem_failed; + + SET_NETDEV_DEV(priv->dev, hca->dma_device); + + result = ib_query_pkey(hca, port, 0, &priv->pkey); + if (result) { + printk(KERN_WARNING "%s: ib_query_pkey port %d failed (ret = %d)\n", + hca->name, port, result); + goto alloc_mem_failed; + } + + priv->dev->broadcast[8] = priv->pkey >> 8; + priv->dev->broadcast[9] = priv->pkey & 0xff; + + result = ib_query_gid(hca, port, 0, &priv->local_gid); + if (result) { + printk(KERN_WARNING "%s: ib_query_gid port %d failed (ret = %d)\n", + hca->name, port, result); + goto alloc_mem_failed; + } else + memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + + + result = ipoib_dev_init(priv->dev, hca, port); + if (result < 0) { + printk(KERN_WARNING "%s: failed to initialize port %d (ret = %d)\n", + hca->name, port, result); + goto device_init_failed; + } + + INIT_IB_EVENT_HANDLER(&priv->event_handler, + priv->ca, ipoib_event); + result = ib_register_event_handler(&priv->event_handler); + if (result < 0) { + printk(KERN_WARNING "%s: ib_register_event_handler failed for " + "port %d (ret = %d)\n", + hca->name, port, result); + goto event_failed; + } + + result = register_netdev(priv->dev); + if (result) { + printk(KERN_WARNING "%s: couldn't register ipoib port %d; error %d\n", + hca->name, port, result); + goto register_failed; + } + + if (ipoib_create_debug_file(priv->dev)) + goto debug_failed; + + if (ipoib_add_pkey_attr(priv->dev)) + goto sysfs_failed; + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_create_child)) + goto 
sysfs_failed; + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_delete_child)) + goto sysfs_failed; + + return priv->dev; + +sysfs_failed: + ipoib_delete_debug_file(priv->dev); + +debug_failed: + unregister_netdev(priv->dev); + +register_failed: + ib_unregister_event_handler(&priv->event_handler); + +event_failed: + ipoib_dev_cleanup(priv->dev); + +device_init_failed: + free_netdev(priv->dev); + +alloc_mem_failed: + return ERR_PTR(result); +} + +static void ipoib_add_one(struct ib_device *device) +{ + struct list_head *dev_list; + struct net_device *dev; + struct ipoib_dev_priv *priv; + int s, e, p; + + dev_list = kmalloc(sizeof *dev_list, GFP_KERNEL); + if (!dev_list) + return; + + INIT_LIST_HEAD(dev_list); + + if (device->node_type == IB_NODE_SWITCH) { + s = 0; + e = 0; + } else { + s = 1; + e = device->phys_port_cnt; + } + + for (p = s; p <= e; ++p) { + dev = ipoib_add_port("ib%d", device, p); + if (!IS_ERR(dev)) { + priv = netdev_priv(dev); + list_add_tail(&priv->list, dev_list); + } + } + + ib_set_client_data(device, &ipoib_client, dev_list); +} + +static void ipoib_remove_one(struct ib_device *device) +{ + struct ipoib_dev_priv *priv, *tmp; + struct list_head *dev_list; + + dev_list = ib_get_client_data(device, &ipoib_client); + + list_for_each_entry_safe(priv, tmp, dev_list, list) { + ib_unregister_event_handler(&priv->event_handler); + + unregister_netdev(priv->dev); + ipoib_dev_cleanup(priv->dev); + free_netdev(priv->dev); + } +} + +static int __init ipoib_init_module(void) +{ + int ret; + + ret = ipoib_register_debugfs(); + if (ret) + return ret; + + /* + * We create our own workqueue mainly because we want to be + * able to flush it when devices are being removed. We can't + * use schedule_work()/flush_scheduled_work() because both + * unregister_netdev() and linkwatch_event take the rtnl lock, + * so flush_scheduled_work() can deadlock during device + * removal. 
+ */ + ipoib_workqueue = create_singlethread_workqueue("ipoib"); + if (!ipoib_workqueue) { + ret = -ENOMEM; + goto err_fs; + } + + ret = ib_register_client(&ipoib_client); + if (ret) + goto err_wq; + + return 0; + +err_wq: + destroy_workqueue(ipoib_workqueue); + +err_fs: + ipoib_unregister_debugfs(); + + return ret; +} + +static void __exit ipoib_cleanup_module(void) +{ + ipoib_unregister_debugfs(); + ib_unregister_client(&ipoib_client); + destroy_workqueue(ipoib_workqueue); +} + +module_init(ipoib_init_module); +module_exit(ipoib_cleanup_module); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_verbs.c 2004-12-27 21:48:25.681186047 -0800 @@ -0,0 +1,254 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ipoib_verbs.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include + +#include "ipoib.h" + +int ipoib_mcast_attach(struct net_device *dev, u16 mlid, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_attr *qp_attr; + int attr_mask; + int ret; + u16 pkey_index; + + ret = -ENOMEM; + qp_attr = kmalloc(sizeof *qp_attr, GFP_KERNEL); + if (!qp_attr) + goto out; + + if (ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index)) { + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + ret = -ENXIO; + goto out; + } + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + + /* set correct QKey for QP */ + qp_attr->qkey = priv->qkey; + attr_mask = IB_QP_QKEY; + ret = ib_modify_qp(priv->qp, qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP, ret = %d\n", ret); + goto out; + } + + /* attach QP to multicast group */ + down(&priv->mcast_mutex); + ret = ib_attach_mcast(priv->qp, mgid, mlid); + up(&priv->mcast_mutex); + if (ret) + ipoib_warn(priv, "failed to attach to multicast group, ret = %d\n", ret); + +out: + kfree(qp_attr); + return ret; +} + +int ipoib_mcast_detach(struct net_device *dev, u16 mlid, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + down(&priv->mcast_mutex); + ret = ib_detach_mcast(priv->qp, mgid, mlid); + up(&priv->mcast_mutex); + if (ret) + ipoib_warn(priv, "ib_detach_mcast failed (result = %d)\n", ret); + + return ret; +} + +int ipoib_qp_create(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + u16 pkey_index; + struct ib_qp_attr qp_attr; + int attr_mask; + + /* + * Search through the port P_Key table for the requested pkey value. 
+ * The port has to be assigned to the respective IB partition in + * advance. + */ + ret = ib_cached_pkey_find(priv->ca, priv->port, priv->pkey, &pkey_index); + if (ret) { + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + return ret; + } + set_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + + qp_attr.qp_state = IB_QPS_INIT; + qp_attr.qkey = 0; + qp_attr.port_num = priv->port; + qp_attr.pkey_index = pkey_index; + attr_mask = + IB_QP_QKEY | + IB_QP_PORT | + IB_QP_PKEY_INDEX | + IB_QP_STATE; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to init, ret = %d\n", ret); + goto out_fail; + } + + qp_attr.qp_state = IB_QPS_RTR; + /* Can't set this in a INIT->RTR transition */ + attr_mask &= ~IB_QP_PORT; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to RTR, ret = %d\n", ret); + goto out_fail; + } + + qp_attr.qp_state = IB_QPS_RTS; + qp_attr.sq_psn = 0; + attr_mask |= IB_QP_SQ_PSN; + attr_mask &= ~IB_QP_PKEY_INDEX; + ret = ib_modify_qp(priv->qp, &qp_attr, attr_mask); + if (ret) { + ipoib_warn(priv, "failed to modify QP to RTS, ret = %d\n", ret); + goto out_fail; + } + + return 0; + +out_fail: + ib_destroy_qp(priv->qp); + priv->qp = NULL; + + return -EINVAL; +} + +int ipoib_transport_dev_init(struct net_device *dev, struct ib_device *ca) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_qp_init_attr init_attr = { + .cap = { + .max_send_wr = IPOIB_TX_RING_SIZE, + .max_recv_wr = IPOIB_RX_RING_SIZE, + .max_send_sge = 1, + .max_recv_sge = 1 + }, + .sq_sig_type = IB_SIGNAL_ALL_WR, + .rq_sig_type = IB_SIGNAL_ALL_WR, + .qp_type = IB_QPT_UD + }; + + priv->pd = ib_alloc_pd(priv->ca); + if (IS_ERR(priv->pd)) { + printk(KERN_WARNING "%s: failed to allocate PD\n", ca->name); + return -ENODEV; + } + + priv->cq = ib_create_cq(priv->ca, ipoib_ib_completion, NULL, dev, + IPOIB_TX_RING_SIZE + IPOIB_RX_RING_SIZE + 1); + if (IS_ERR(priv->cq)) { + printk(KERN_WARNING "%s: 
failed to create CQ\n", ca->name); + goto out_free_pd; + } + + if (ib_req_notify_cq(priv->cq, IB_CQ_NEXT_COMP)) + goto out_free_cq; + + priv->mr = ib_get_dma_mr(priv->pd, IB_ACCESS_LOCAL_WRITE); + if (IS_ERR(priv->mr)) { + printk(KERN_WARNING "%s: ib_get_dma_mr failed\n", ca->name); + goto out_free_cq; + } + + init_attr.send_cq = priv->cq; + init_attr.recv_cq = priv->cq; + + priv->qp = ib_create_qp(priv->pd, &init_attr); + if (IS_ERR(priv->qp)) { + printk(KERN_WARNING "%s: failed to create QP\n", ca->name); + goto out_free_mr; + } + + priv->dev->dev_addr[1] = (priv->qp->qp_num >> 16) & 0xff; + priv->dev->dev_addr[2] = (priv->qp->qp_num >> 8) & 0xff; + priv->dev->dev_addr[3] = (priv->qp->qp_num ) & 0xff; + + return 0; + +out_free_mr: + ib_dereg_mr(priv->mr); + +out_free_cq: + ib_destroy_cq(priv->cq); + +out_free_pd: + ib_dealloc_pd(priv->pd); + return -ENODEV; +} + +void ipoib_transport_dev_cleanup(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (priv->qp) { + if (ib_destroy_qp(priv->qp)) + ipoib_warn(priv, "ib_qp_destroy failed\n"); + + priv->qp = NULL; + clear_bit(IPOIB_PKEY_ASSIGNED, &priv->flags); + } + + if (ib_dereg_mr(priv->mr)) + ipoib_warn(priv, "ib_dereg_mr failed\n"); + + if (ib_destroy_cq(priv->cq)) + ipoib_warn(priv, "ib_cq_destroy failed\n"); + + if (ib_dealloc_pd(priv->pd)) + ipoib_warn(priv, "ib_dealloc_pd failed\n"); +} + +void ipoib_event(struct ib_event_handler *handler, + struct ib_event *record) +{ + struct ipoib_dev_priv *priv = + container_of(handler, struct ipoib_dev_priv, event_handler); + + if (record->event == IB_EVENT_PORT_ACTIVE || + record->event == IB_EVENT_LID_CHANGE || + record->event == IB_EVENT_SM_CHANGE) { + ipoib_dbg(priv, "Port active event\n"); + schedule_work(&priv->flush_task); + } +} From roland@topspin.com Mon Dec 27 21:49:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:51:02 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) 
by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDq025948 for ; Mon, 27 Dec 2004 21:49:46 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:12 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:12 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGE-0000uf-1A; Mon, 27 Dec 2004 21:51:12 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.MaHN86hIK0Tt84ws@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:10 -0800 Message-Id: <200412272151.qdu1iD71iJs65qnW@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][14/24] Add Mellanox HCA low-level driver (QP/CQ) Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:12.0298 (UTC) FILETIME=[3C8D00A0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13112 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add CQ (completion queue) and QP (queue pair) code for Mellanox HCA driver. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_cq.c 2004-12-27 21:48:23.509505711 -0800 @@ -0,0 +1,836 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_cq.c 1369 2004-12-20 16:17:07Z roland $ + */ + +#include + +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +enum { + MTHCA_MAX_DIRECT_CQ_SIZE = 4 * PAGE_SIZE +}; + +enum { + MTHCA_CQ_ENTRY_SIZE = 0x20 +}; + +/* + * Must be packed because start is 64 bits but only aligned to 32 bits. 
+ */ +struct mthca_cq_context { + u32 flags; + u64 start; + u32 logsize_usrpage; + u32 error_eqn; + u32 comp_eqn; + u32 pd; + u32 lkey; + u32 last_notified_index; + u32 solicit_producer_index; + u32 consumer_index; + u32 producer_index; + u32 cqn; + u32 reserved[3]; +} __attribute__((packed)); + +#define MTHCA_CQ_STATUS_OK ( 0 << 28) +#define MTHCA_CQ_STATUS_OVERFLOW ( 9 << 28) +#define MTHCA_CQ_STATUS_WRITE_FAIL (10 << 28) +#define MTHCA_CQ_FLAG_TR ( 1 << 18) +#define MTHCA_CQ_FLAG_OI ( 1 << 17) +#define MTHCA_CQ_STATE_DISARMED ( 0 << 8) +#define MTHCA_CQ_STATE_ARMED ( 1 << 8) +#define MTHCA_CQ_STATE_ARMED_SOL ( 4 << 8) +#define MTHCA_EQ_STATE_FIRED (10 << 8) + +enum { + MTHCA_ERROR_CQE_OPCODE_MASK = 0xfe +}; + +enum { + SYNDROME_LOCAL_LENGTH_ERR = 0x01, + SYNDROME_LOCAL_QP_OP_ERR = 0x02, + SYNDROME_LOCAL_EEC_OP_ERR = 0x03, + SYNDROME_LOCAL_PROT_ERR = 0x04, + SYNDROME_WR_FLUSH_ERR = 0x05, + SYNDROME_MW_BIND_ERR = 0x06, + SYNDROME_BAD_RESP_ERR = 0x10, + SYNDROME_LOCAL_ACCESS_ERR = 0x11, + SYNDROME_REMOTE_INVAL_REQ_ERR = 0x12, + SYNDROME_REMOTE_ACCESS_ERR = 0x13, + SYNDROME_REMOTE_OP_ERR = 0x14, + SYNDROME_RETRY_EXC_ERR = 0x15, + SYNDROME_RNR_RETRY_EXC_ERR = 0x16, + SYNDROME_LOCAL_RDD_VIOL_ERR = 0x20, + SYNDROME_REMOTE_INVAL_RD_REQ_ERR = 0x21, + SYNDROME_REMOTE_ABORTED_ERR = 0x22, + SYNDROME_INVAL_EECN_ERR = 0x23, + SYNDROME_INVAL_EEC_STATE_ERR = 0x24 +}; + +struct mthca_cqe { + u32 my_qpn; + u32 my_ee; + u32 rqpn; + u16 sl_g_mlpath; + u16 rlid; + u32 imm_etype_pkey_eec; + u32 byte_cnt; + u32 wqe; + u8 opcode; + u8 is_send; + u8 reserved; + u8 owner; +}; + +struct mthca_err_cqe { + u32 my_qpn; + u32 reserved1[3]; + u8 syndrome; + u8 reserved2; + u16 db_cnt; + u32 reserved3; + u32 wqe; + u8 opcode; + u8 reserved4[2]; + u8 owner; +}; + +#define MTHCA_CQ_ENTRY_OWNER_SW (0 << 7) +#define MTHCA_CQ_ENTRY_OWNER_HW (1 << 7) + +#define MTHCA_CQ_DB_INC_CI (1 << 24) +#define MTHCA_CQ_DB_REQ_NOT (2 << 24) +#define MTHCA_CQ_DB_REQ_NOT_SOL (3 << 24) +#define MTHCA_CQ_DB_SET_CI (4 
<< 24) +#define MTHCA_CQ_DB_REQ_NOT_MULT (5 << 24) + +static inline struct mthca_cqe *get_cqe(struct mthca_cq *cq, int entry) +{ + if (cq->is_direct) + return cq->queue.direct.buf + (entry * MTHCA_CQ_ENTRY_SIZE); + else + return cq->queue.page_list[entry * MTHCA_CQ_ENTRY_SIZE / PAGE_SIZE].buf + + (entry * MTHCA_CQ_ENTRY_SIZE) % PAGE_SIZE; +} + +static inline int cqe_sw(struct mthca_cq *cq, int i) +{ + return !(MTHCA_CQ_ENTRY_OWNER_HW & + get_cqe(cq, i)->owner); +} + +static inline int next_cqe_sw(struct mthca_cq *cq) +{ + return cqe_sw(cq, cq->cons_index); +} + +static inline void set_cqe_hw(struct mthca_cq *cq, int entry) +{ + get_cqe(cq, entry)->owner = MTHCA_CQ_ENTRY_OWNER_HW; +} + +static inline void inc_cons_index(struct mthca_dev *dev, struct mthca_cq *cq, + int nent) +{ + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32(MTHCA_CQ_DB_INC_CI | cq->cqn); + doorbell[1] = cpu_to_be32(nent - 1); + + mthca_write64(doorbell, + dev->kar + MTHCA_CQ_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); +} + +void mthca_cq_event(struct mthca_dev *dev, u32 cqn) +{ + struct mthca_cq *cq; + + spin_lock(&dev->cq_table.lock); + cq = mthca_array_get(&dev->cq_table.cq, cqn & (dev->limits.num_cqs - 1)); + if (cq) + atomic_inc(&cq->refcount); + spin_unlock(&dev->cq_table.lock); + + if (!cq) { + mthca_warn(dev, "Completion event for bogus CQ %08x\n", cqn); + return; + } + + cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context); + + if (atomic_dec_and_test(&cq->refcount)) + wake_up(&cq->wait); +} + +void mthca_cq_clean(struct mthca_dev *dev, u32 cqn, u32 qpn) +{ + struct mthca_cq *cq; + struct mthca_cqe *cqe; + int prod_index; + int nfreed = 0; + + spin_lock_irq(&dev->cq_table.lock); + cq = mthca_array_get(&dev->cq_table.cq, cqn & (dev->limits.num_cqs - 1)); + if (cq) + atomic_inc(&cq->refcount); + spin_unlock_irq(&dev->cq_table.lock); + + if (!cq) + return; + + spin_lock_irq(&cq->lock); + + /* + * First we need to find the current producer index, so we + * know where to start 
cleaning from. It doesn't matter if HW + * adds new entries after this loop -- the QP we're worried + * about is already in RESET, so the new entries won't come + * from our QP and therefore don't need to be checked. + */ + for (prod_index = cq->cons_index; + cqe_sw(cq, prod_index & cq->ibcq.cqe); + ++prod_index) + if (prod_index == cq->cons_index + cq->ibcq.cqe) + break; + + if (0) + mthca_dbg(dev, "Cleaning QPN %06x from CQN %06x; ci %d, pi %d\n", + qpn, cqn, cq->cons_index, prod_index); + + /* + * Now sweep backwards through the CQ, removing CQ entries + * that match our QP by copying older entries on top of them. + */ + while (prod_index > cq->cons_index) { + cqe = get_cqe(cq, (prod_index - 1) & cq->ibcq.cqe); + if (cqe->my_qpn == cpu_to_be32(qpn)) + ++nfreed; + else if (nfreed) + memcpy(get_cqe(cq, (prod_index - 1 + nfreed) & + cq->ibcq.cqe), + cqe, + MTHCA_CQ_ENTRY_SIZE); + --prod_index; + } + + if (nfreed) { + wmb(); + inc_cons_index(dev, cq, nfreed); + cq->cons_index = (cq->cons_index + nfreed) & cq->ibcq.cqe; + } + + spin_unlock_irq(&cq->lock); + if (atomic_dec_and_test(&cq->refcount)) + wake_up(&cq->wait); +} + +static int handle_error_cqe(struct mthca_dev *dev, struct mthca_cq *cq, + struct mthca_qp *qp, int wqe_index, int is_send, + struct mthca_err_cqe *cqe, + struct ib_wc *entry, int *free_cqe) +{ + int err; + int dbd; + u32 new_wqe; + + if (1 && cqe->syndrome != SYNDROME_WR_FLUSH_ERR) { + int j; + + mthca_dbg(dev, "%x/%d: error CQE -> QPN %06x, WQE @ %08x\n", + cq->cqn, cq->cons_index, be32_to_cpu(cqe->my_qpn), + be32_to_cpu(cqe->wqe)); + + for (j = 0; j < 8; ++j) + printk(KERN_DEBUG " [%2x] %08x\n", + j * 4, be32_to_cpu(((u32 *) cqe)[j])); + } + + /* + * For completions in error, only work request ID, status (and + * freed resource count for RD) have to be set. 
+ */ + switch (cqe->syndrome) { + case SYNDROME_LOCAL_LENGTH_ERR: + entry->status = IB_WC_LOC_LEN_ERR; + break; + case SYNDROME_LOCAL_QP_OP_ERR: + entry->status = IB_WC_LOC_QP_OP_ERR; + break; + case SYNDROME_LOCAL_EEC_OP_ERR: + entry->status = IB_WC_LOC_EEC_OP_ERR; + break; + case SYNDROME_LOCAL_PROT_ERR: + entry->status = IB_WC_LOC_PROT_ERR; + break; + case SYNDROME_WR_FLUSH_ERR: + entry->status = IB_WC_WR_FLUSH_ERR; + break; + case SYNDROME_MW_BIND_ERR: + entry->status = IB_WC_MW_BIND_ERR; + break; + case SYNDROME_BAD_RESP_ERR: + entry->status = IB_WC_BAD_RESP_ERR; + break; + case SYNDROME_LOCAL_ACCESS_ERR: + entry->status = IB_WC_LOC_ACCESS_ERR; + break; + case SYNDROME_REMOTE_INVAL_REQ_ERR: + entry->status = IB_WC_REM_INV_REQ_ERR; + break; + case SYNDROME_REMOTE_ACCESS_ERR: + entry->status = IB_WC_REM_ACCESS_ERR; + break; + case SYNDROME_REMOTE_OP_ERR: + entry->status = IB_WC_REM_OP_ERR; + break; + case SYNDROME_RETRY_EXC_ERR: + entry->status = IB_WC_RETRY_EXC_ERR; + break; + case SYNDROME_RNR_RETRY_EXC_ERR: + entry->status = IB_WC_RNR_RETRY_EXC_ERR; + break; + case SYNDROME_LOCAL_RDD_VIOL_ERR: + entry->status = IB_WC_LOC_RDD_VIOL_ERR; + break; + case SYNDROME_REMOTE_INVAL_RD_REQ_ERR: + entry->status = IB_WC_REM_INV_RD_REQ_ERR; + break; + case SYNDROME_REMOTE_ABORTED_ERR: + entry->status = IB_WC_REM_ABORT_ERR; + break; + case SYNDROME_INVAL_EECN_ERR: + entry->status = IB_WC_INV_EECN_ERR; + break; + case SYNDROME_INVAL_EEC_STATE_ERR: + entry->status = IB_WC_INV_EEC_STATE_ERR; + break; + default: + entry->status = IB_WC_GENERAL_ERR; + break; + } + + err = mthca_free_err_wqe(qp, is_send, wqe_index, &dbd, &new_wqe); + if (err) + return err; + + /* + * If we're at the end of the WQE chain, or we've used up our + * doorbell count, free the CQE. Otherwise just update it for + * the next poll operation. 
+ */ + if (!(new_wqe & cpu_to_be32(0x3f)) || (!cqe->db_cnt && dbd)) + return 0; + + cqe->db_cnt = cpu_to_be16(be16_to_cpu(cqe->db_cnt) - dbd); + cqe->wqe = new_wqe; + cqe->syndrome = SYNDROME_WR_FLUSH_ERR; + + *free_cqe = 0; + + return 0; +} + +static void dump_cqe(struct mthca_cqe *cqe) +{ + int j; + + for (j = 0; j < 8; ++j) + printk(KERN_DEBUG " [%2x] %08x\n", + j * 4, be32_to_cpu(((u32 *) cqe)[j])); +} + +static inline int mthca_poll_one(struct mthca_dev *dev, + struct mthca_cq *cq, + struct mthca_qp **cur_qp, + int *freed, + struct ib_wc *entry) +{ + struct mthca_wq *wq; + struct mthca_cqe *cqe; + int wqe_index; + int is_error = 0; + int is_send; + int free_cqe = 1; + int err = 0; + + if (!next_cqe_sw(cq)) + return -EAGAIN; + + rmb(); + + cqe = get_cqe(cq, cq->cons_index); + + if (0) { + mthca_dbg(dev, "%x/%d: CQE -> QPN %06x, WQE @ %08x\n", + cq->cqn, cq->cons_index, be32_to_cpu(cqe->my_qpn), + be32_to_cpu(cqe->wqe)); + + dump_cqe(cqe); + } + + if ((cqe->opcode & MTHCA_ERROR_CQE_OPCODE_MASK) == + MTHCA_ERROR_CQE_OPCODE_MASK) { + is_error = 1; + is_send = cqe->opcode & 1; + } else + is_send = cqe->is_send & 0x80; + + if (!*cur_qp || be32_to_cpu(cqe->my_qpn) != (*cur_qp)->qpn) { + if (*cur_qp) { + if (*freed) { + wmb(); + inc_cons_index(dev, cq, *freed); + *freed = 0; + } + spin_unlock(&(*cur_qp)->lock); + if (atomic_dec_and_test(&(*cur_qp)->refcount)) + wake_up(&(*cur_qp)->wait); + } + + spin_lock(&dev->qp_table.lock); + *cur_qp = mthca_array_get(&dev->qp_table.qp, + be32_to_cpu(cqe->my_qpn) & + (dev->limits.num_qps - 1)); + if (*cur_qp) + atomic_inc(&(*cur_qp)->refcount); + spin_unlock(&dev->qp_table.lock); + + if (!*cur_qp) { + mthca_warn(dev, "CQ entry for unknown QP %06x\n", + be32_to_cpu(cqe->my_qpn) & 0xffffff); + err = -EINVAL; + goto out; + } + + spin_lock(&(*cur_qp)->lock); + } + + if (is_send) { + wq = &(*cur_qp)->sq; + wqe_index = ((be32_to_cpu(cqe->wqe) - (*cur_qp)->send_wqe_offset) + >> wq->wqe_shift); + entry->wr_id = (*cur_qp)->wrid[wqe_index + 
+ (*cur_qp)->rq.max]; + } else { + wq = &(*cur_qp)->rq; + wqe_index = be32_to_cpu(cqe->wqe) >> wq->wqe_shift; + entry->wr_id = (*cur_qp)->wrid[wqe_index]; + } + + if (wq->last_comp < wqe_index) + wq->cur -= wqe_index - wq->last_comp; + else + wq->cur -= wq->max - wq->last_comp + wqe_index; + + wq->last_comp = wqe_index; + + if (0) + mthca_dbg(dev, "%s completion for QP %06x, index %d (nr %d)\n", + is_send ? "Send" : "Receive", + (*cur_qp)->qpn, wqe_index, wq->max); + + if (is_error) { + err = handle_error_cqe(dev, cq, *cur_qp, wqe_index, is_send, + (struct mthca_err_cqe *) cqe, + entry, &free_cqe); + goto out; + } + + if (is_send) { + entry->opcode = IB_WC_SEND; /* XXX */ + } else { + entry->byte_len = be32_to_cpu(cqe->byte_cnt); + switch (cqe->opcode & 0x1f) { + case IB_OPCODE_SEND_LAST_WITH_IMMEDIATE: + case IB_OPCODE_SEND_ONLY_WITH_IMMEDIATE: + entry->wc_flags = IB_WC_WITH_IMM; + entry->imm_data = cqe->imm_etype_pkey_eec; + entry->opcode = IB_WC_RECV; + break; + case IB_OPCODE_RDMA_WRITE_LAST_WITH_IMMEDIATE: + case IB_OPCODE_RDMA_WRITE_ONLY_WITH_IMMEDIATE: + entry->wc_flags = IB_WC_WITH_IMM; + entry->imm_data = cqe->imm_etype_pkey_eec; + entry->opcode = IB_WC_RECV_RDMA_WITH_IMM; + break; + default: + entry->wc_flags = 0; + entry->opcode = IB_WC_RECV; + break; + } + entry->slid = be16_to_cpu(cqe->rlid); + entry->sl = be16_to_cpu(cqe->sl_g_mlpath) >> 12; + entry->src_qp = be32_to_cpu(cqe->rqpn) & 0xffffff; + entry->dlid_path_bits = be16_to_cpu(cqe->sl_g_mlpath) & 0x7f; + entry->pkey_index = be32_to_cpu(cqe->imm_etype_pkey_eec) >> 16; + entry->wc_flags |= be16_to_cpu(cqe->sl_g_mlpath) & 0x80 ? 
+ IB_WC_GRH : 0; + } + + entry->status = IB_WC_SUCCESS; + + out: + if (free_cqe) { + set_cqe_hw(cq, cq->cons_index); + ++(*freed); + cq->cons_index = (cq->cons_index + 1) & cq->ibcq.cqe; + } + + return err; +} + +int mthca_poll_cq(struct ib_cq *ibcq, int num_entries, + struct ib_wc *entry) +{ + struct mthca_dev *dev = to_mdev(ibcq->device); + struct mthca_cq *cq = to_mcq(ibcq); + struct mthca_qp *qp = NULL; + unsigned long flags; + int err = 0; + int freed = 0; + int npolled; + + spin_lock_irqsave(&cq->lock, flags); + + for (npolled = 0; npolled < num_entries; ++npolled) { + err = mthca_poll_one(dev, cq, &qp, + &freed, entry + npolled); + if (err) + break; + } + + if (freed) { + wmb(); + inc_cons_index(dev, cq, freed); + } + + if (qp) { + spin_unlock(&qp->lock); + if (atomic_dec_and_test(&qp->refcount)) + wake_up(&qp->wait); + } + + + spin_unlock_irqrestore(&cq->lock, flags); + + return err == 0 || err == -EAGAIN ? npolled : err; +} + +void mthca_arm_cq(struct mthca_dev *dev, struct mthca_cq *cq, + int solicited) +{ + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32((solicited ? 
+ MTHCA_CQ_DB_REQ_NOT_SOL : + MTHCA_CQ_DB_REQ_NOT) | + cq->cqn); + doorbell[1] = 0xffffffff; + + mthca_write64(doorbell, + dev->kar + MTHCA_CQ_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); +} + +int mthca_init_cq(struct mthca_dev *dev, int nent, + struct mthca_cq *cq) +{ + int size = nent * MTHCA_CQ_ENTRY_SIZE; + dma_addr_t t; + void *mailbox = NULL; + int npages, shift; + u64 *dma_list = NULL; + struct mthca_cq_context *cq_context; + int err = -ENOMEM; + u8 status; + int i; + + might_sleep(); + + mailbox = kmalloc(sizeof (struct mthca_cq_context) + MTHCA_CMD_MAILBOX_EXTRA, + GFP_KERNEL); + if (!mailbox) + goto err_out; + + cq_context = MAILBOX_ALIGN(mailbox); + + if (size <= MTHCA_MAX_DIRECT_CQ_SIZE) { + if (0) + mthca_dbg(dev, "Creating direct CQ of size %d\n", size); + + cq->is_direct = 1; + npages = 1; + shift = get_order(size) + PAGE_SHIFT; + + cq->queue.direct.buf = pci_alloc_consistent(dev->pdev, + size, &t); + if (!cq->queue.direct.buf) + goto err_out; + + pci_unmap_addr_set(&cq->queue.direct, mapping, t); + + memset(cq->queue.direct.buf, 0, size); + + while (t & ((1 << shift) - 1)) { + --shift; + npages *= 2; + } + + dma_list = kmalloc(npages * sizeof *dma_list, GFP_KERNEL); + if (!dma_list) + goto err_out_free; + + for (i = 0; i < npages; ++i) + dma_list[i] = t + i * (1 << shift); + } else { + cq->is_direct = 0; + npages = (size + PAGE_SIZE - 1) / PAGE_SIZE; + shift = PAGE_SHIFT; + + if (0) + mthca_dbg(dev, "Creating indirect CQ with %d pages\n", npages); + + dma_list = kmalloc(npages * sizeof *dma_list, GFP_KERNEL); + if (!dma_list) + goto err_out; + + cq->queue.page_list = kmalloc(npages * sizeof *cq->queue.page_list, + GFP_KERNEL); + if (!cq->queue.page_list) + goto err_out; + + for (i = 0; i < npages; ++i) + cq->queue.page_list[i].buf = NULL; + + for (i = 0; i < npages; ++i) { + cq->queue.page_list[i].buf = + pci_alloc_consistent(dev->pdev, PAGE_SIZE, &t); + if (!cq->queue.page_list[i].buf) + goto err_out_free; + + dma_list[i] = t; + 
pci_unmap_addr_set(&cq->queue.page_list[i], mapping, t); + + memset(cq->queue.page_list[i].buf, 0, PAGE_SIZE); + } + } + + for (i = 0; i < nent; ++i) + set_cqe_hw(cq, i); + + cq->cqn = mthca_alloc(&dev->cq_table.alloc); + if (cq->cqn == -1) + goto err_out_free; + + err = mthca_mr_alloc_phys(dev, dev->driver_pd.pd_num, + dma_list, shift, npages, + 0, size, + MTHCA_MPT_FLAG_LOCAL_WRITE | + MTHCA_MPT_FLAG_LOCAL_READ, + &cq->mr); + if (err) + goto err_out_free_cq; + + spin_lock_init(&cq->lock); + atomic_set(&cq->refcount, 1); + init_waitqueue_head(&cq->wait); + + memset(cq_context, 0, sizeof *cq_context); + cq_context->flags = cpu_to_be32(MTHCA_CQ_STATUS_OK | + MTHCA_CQ_STATE_DISARMED | + MTHCA_CQ_FLAG_TR); + cq_context->start = cpu_to_be64(0); + cq_context->logsize_usrpage = cpu_to_be32((ffs(nent) - 1) << 24 | + MTHCA_KAR_PAGE); + cq_context->error_eqn = cpu_to_be32(dev->eq_table.eq[MTHCA_EQ_ASYNC].eqn); + cq_context->comp_eqn = cpu_to_be32(dev->eq_table.eq[MTHCA_EQ_COMP].eqn); + cq_context->pd = cpu_to_be32(dev->driver_pd.pd_num); + cq_context->lkey = cpu_to_be32(cq->mr.ibmr.lkey); + cq_context->cqn = cpu_to_be32(cq->cqn); + + err = mthca_SW2HW_CQ(dev, cq_context, cq->cqn, &status); + if (err) { + mthca_warn(dev, "SW2HW_CQ failed (%d)\n", err); + goto err_out_free_mr; + } + + if (status) { + mthca_warn(dev, "SW2HW_CQ returned status 0x%02x\n", + status); + err = -EINVAL; + goto err_out_free_mr; + } + + spin_lock_irq(&dev->cq_table.lock); + if (mthca_array_set(&dev->cq_table.cq, + cq->cqn & (dev->limits.num_cqs - 1), + cq)) { + spin_unlock_irq(&dev->cq_table.lock); + goto err_out_free_mr; + } + spin_unlock_irq(&dev->cq_table.lock); + + cq->cons_index = 0; + + kfree(dma_list); + kfree(mailbox); + + return 0; + + err_out_free_mr: + mthca_free_mr(dev, &cq->mr); + + err_out_free_cq: + mthca_free(&dev->cq_table.alloc, cq->cqn); + + err_out_free: + if (cq->is_direct) + pci_free_consistent(dev->pdev, size, + cq->queue.direct.buf, + pci_unmap_addr(&cq->queue.direct, 
mapping)); + else { + for (i = 0; i < npages; ++i) + if (cq->queue.page_list[i].buf) + pci_free_consistent(dev->pdev, PAGE_SIZE, + cq->queue.page_list[i].buf, + pci_unmap_addr(&cq->queue.page_list[i], + mapping)); + + kfree(cq->queue.page_list); + } + + err_out: + kfree(dma_list); + kfree(mailbox); + + return err; +} + +void mthca_free_cq(struct mthca_dev *dev, + struct mthca_cq *cq) +{ + void *mailbox; + int err; + u8 status; + + might_sleep(); + + mailbox = kmalloc(sizeof (struct mthca_cq_context) + MTHCA_CMD_MAILBOX_EXTRA, + GFP_KERNEL); + if (!mailbox) { + mthca_warn(dev, "No memory for mailbox to free CQ.\n"); + return; + } + + err = mthca_HW2SW_CQ(dev, MAILBOX_ALIGN(mailbox), cq->cqn, &status); + if (err) + mthca_warn(dev, "HW2SW_CQ failed (%d)\n", err); + else if (status) + mthca_warn(dev, "HW2SW_CQ returned status 0x%02x\n", + status); + + if (0) { + u32 *ctx = MAILBOX_ALIGN(mailbox); + int j; + + printk(KERN_ERR "context for CQN %x\n", cq->cqn); + for (j = 0; j < 16; ++j) + printk(KERN_ERR "[%2x] %08x\n", j * 4, be32_to_cpu(ctx[j])); + } + + spin_lock_irq(&dev->cq_table.lock); + mthca_array_clear(&dev->cq_table.cq, + cq->cqn & (dev->limits.num_cqs - 1)); + spin_unlock_irq(&dev->cq_table.lock); + + atomic_dec(&cq->refcount); + wait_event(cq->wait, !atomic_read(&cq->refcount)); + + mthca_free_mr(dev, &cq->mr); + + if (cq->is_direct) + pci_free_consistent(dev->pdev, + (cq->ibcq.cqe + 1) * MTHCA_CQ_ENTRY_SIZE, + cq->queue.direct.buf, + pci_unmap_addr(&cq->queue.direct, + mapping)); + else { + int i; + + for (i = 0; + i < ((cq->ibcq.cqe + 1) * MTHCA_CQ_ENTRY_SIZE + PAGE_SIZE - 1) / + PAGE_SIZE; + ++i) + pci_free_consistent(dev->pdev, PAGE_SIZE, + cq->queue.page_list[i].buf, + pci_unmap_addr(&cq->queue.page_list[i], + mapping)); + + kfree(cq->queue.page_list); + } + + mthca_free(&dev->cq_table.alloc, cq->cqn); + kfree(mailbox); +} + +int __devinit mthca_init_cq_table(struct mthca_dev *dev) +{ + int err; + + spin_lock_init(&dev->cq_table.lock); + + err = 
mthca_alloc_init(&dev->cq_table.alloc, + dev->limits.num_cqs, + (1 << 24) - 1, + dev->limits.reserved_cqs); + if (err) + return err; + + err = mthca_array_init(&dev->cq_table.cq, + dev->limits.num_cqs); + if (err) + mthca_alloc_cleanup(&dev->cq_table.alloc); + + return err; +} + +void __devexit mthca_cleanup_cq_table(struct mthca_dev *dev) +{ + mthca_array_cleanup(&dev->cq_table.cq, dev->limits.num_cqs); + mthca_alloc_cleanup(&dev->cq_table.alloc); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/hw/mthca/mthca_qp.c 2004-12-27 21:48:23.540501149 -0800 @@ -0,0 +1,1536 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mthca_qp.c 1355 2004-12-17 15:23:43Z roland $ + */ + +#include + +#include +#include +#include + +#include "mthca_dev.h" +#include "mthca_cmd.h" + +enum { + MTHCA_MAX_DIRECT_QP_SIZE = 4 * PAGE_SIZE, + MTHCA_ACK_REQ_FREQ = 10, + MTHCA_FLIGHT_LIMIT = 9, + MTHCA_UD_HEADER_SIZE = 72 /* largest UD header possible */ +}; + +enum { + MTHCA_QP_STATE_RST = 0, + MTHCA_QP_STATE_INIT = 1, + MTHCA_QP_STATE_RTR = 2, + MTHCA_QP_STATE_RTS = 3, + MTHCA_QP_STATE_SQE = 4, + MTHCA_QP_STATE_SQD = 5, + MTHCA_QP_STATE_ERR = 6, + MTHCA_QP_STATE_DRAINING = 7 +}; + +enum { + MTHCA_QP_ST_RC = 0x0, + MTHCA_QP_ST_UC = 0x1, + MTHCA_QP_ST_RD = 0x2, + MTHCA_QP_ST_UD = 0x3, + MTHCA_QP_ST_MLX = 0x7 +}; + +enum { + MTHCA_QP_PM_MIGRATED = 0x3, + MTHCA_QP_PM_ARMED = 0x0, + MTHCA_QP_PM_REARM = 0x1 +}; + +enum { + /* qp_context flags */ + MTHCA_QP_BIT_DE = 1 << 8, + /* params1 */ + MTHCA_QP_BIT_SRE = 1 << 15, + MTHCA_QP_BIT_SWE = 1 << 14, + MTHCA_QP_BIT_SAE = 1 << 13, + MTHCA_QP_BIT_SIC = 1 << 4, + MTHCA_QP_BIT_SSC = 1 << 3, + /* params2 */ + MTHCA_QP_BIT_RRE = 1 << 15, + MTHCA_QP_BIT_RWE = 1 << 14, + MTHCA_QP_BIT_RAE = 1 << 13, + MTHCA_QP_BIT_RIC = 1 << 4, + MTHCA_QP_BIT_RSC = 1 << 3 +}; + +struct mthca_qp_path { + u32 port_pkey; + u8 rnr_retry; + u8 g_mylmc; + u16 rlid; + u8 ackto; + u8 mgid_index; + u8 static_rate; + u8 hop_limit; + u32 sl_tclass_flowlabel; + u8 rgid[16]; +} __attribute__((packed)); + +struct mthca_qp_context { + u32 flags; + u32 sched_queue; + u32 mtu_msgmax; + u32 usr_page; + u32 local_qpn; + u32 remote_qpn; + u32 reserved1[2]; + struct mthca_qp_path pri_path; + struct mthca_qp_path alt_path; + u32 rdd; + u32 pd; + u32 wqe_base; + u32 wqe_lkey; + u32 params1; + u32 reserved2; + u32 next_send_psn; + 
u32 cqn_snd; + u32 next_snd_wqe[2]; + u32 last_acked_psn; + u32 ssn; + u32 params2; + u32 rnr_nextrecvpsn; + u32 ra_buff_indx; + u32 cqn_rcv; + u32 next_rcv_wqe[2]; + u32 qkey; + u32 srqn; + u32 rmsn; + u32 reserved3[19]; +} __attribute__((packed)); + +struct mthca_qp_param { + u32 opt_param_mask; + u32 reserved1; + struct mthca_qp_context context; + u32 reserved2[62]; +} __attribute__((packed)); + +enum { + MTHCA_QP_OPTPAR_ALT_ADDR_PATH = 1 << 0, + MTHCA_QP_OPTPAR_RRE = 1 << 1, + MTHCA_QP_OPTPAR_RAE = 1 << 2, + MTHCA_QP_OPTPAR_REW = 1 << 3, + MTHCA_QP_OPTPAR_PKEY_INDEX = 1 << 4, + MTHCA_QP_OPTPAR_Q_KEY = 1 << 5, + MTHCA_QP_OPTPAR_RNR_TIMEOUT = 1 << 6, + MTHCA_QP_OPTPAR_PRIMARY_ADDR_PATH = 1 << 7, + MTHCA_QP_OPTPAR_SRA_MAX = 1 << 8, + MTHCA_QP_OPTPAR_RRA_MAX = 1 << 9, + MTHCA_QP_OPTPAR_PM_STATE = 1 << 10, + MTHCA_QP_OPTPAR_PORT_NUM = 1 << 11, + MTHCA_QP_OPTPAR_RETRY_COUNT = 1 << 12, + MTHCA_QP_OPTPAR_ALT_RNR_RETRY = 1 << 13, + MTHCA_QP_OPTPAR_ACK_TIMEOUT = 1 << 14, + MTHCA_QP_OPTPAR_RNR_RETRY = 1 << 15, + MTHCA_QP_OPTPAR_SCHED_QUEUE = 1 << 16 +}; + +enum { + MTHCA_OPCODE_NOP = 0x00, + MTHCA_OPCODE_RDMA_WRITE = 0x08, + MTHCA_OPCODE_RDMA_WRITE_IMM = 0x09, + MTHCA_OPCODE_SEND = 0x0a, + MTHCA_OPCODE_SEND_IMM = 0x0b, + MTHCA_OPCODE_RDMA_READ = 0x10, + MTHCA_OPCODE_ATOMIC_CS = 0x11, + MTHCA_OPCODE_ATOMIC_FA = 0x12, + MTHCA_OPCODE_BIND_MW = 0x18, + MTHCA_OPCODE_INVALID = 0xff +}; + +enum { + MTHCA_NEXT_DBD = 1 << 7, + MTHCA_NEXT_FENCE = 1 << 6, + MTHCA_NEXT_CQ_UPDATE = 1 << 3, + MTHCA_NEXT_EVENT_GEN = 1 << 2, + MTHCA_NEXT_SOLICIT = 1 << 1, + + MTHCA_MLX_VL15 = 1 << 17, + MTHCA_MLX_SLR = 1 << 16 +}; + +struct mthca_next_seg { + u32 nda_op; /* [31:6] next WQE [4:0] next opcode */ + u32 ee_nds; /* [31:8] next EE [7] DBD [6] F [5:0] next WQE size */ + u32 flags; /* [3] CQ [2] Event [1] Solicit */ + u32 imm; /* immediate data */ +}; + +struct mthca_ud_seg { + u32 reserved1; + u32 lkey; + u64 av_addr; + u32 reserved2[4]; + u32 dqpn; + u32 qkey; + u32 reserved3[2]; +}; + +struct 
mthca_bind_seg { + u32 flags; /* [31] Atomic [30] rem write [29] rem read */ + u32 reserved; + u32 new_rkey; + u32 lkey; + u64 addr; + u64 length; +}; + +struct mthca_raddr_seg { + u64 raddr; + u32 rkey; + u32 reserved; +}; + +struct mthca_atomic_seg { + u64 swap_add; + u64 compare; +}; + +struct mthca_data_seg { + u32 byte_count; + u32 lkey; + u64 addr; +}; + +struct mthca_mlx_seg { + u32 nda_op; + u32 nds; + u32 flags; /* [17] VL15 [16] SLR [14:12] static rate + [11:8] SL [3] C [2] E */ + u16 rlid; + u16 vcrc; +}; + +static int is_sqp(struct mthca_dev *dev, struct mthca_qp *qp) +{ + return qp->qpn >= dev->qp_table.sqp_start && + qp->qpn <= dev->qp_table.sqp_start + 3; +} + +static int is_qp0(struct mthca_dev *dev, struct mthca_qp *qp) +{ + return qp->qpn >= dev->qp_table.sqp_start && + qp->qpn <= dev->qp_table.sqp_start + 1; +} + +static void *get_recv_wqe(struct mthca_qp *qp, int n) +{ + if (qp->is_direct) + return qp->queue.direct.buf + (n << qp->rq.wqe_shift); + else + return qp->queue.page_list[(n << qp->rq.wqe_shift) >> PAGE_SHIFT].buf + + ((n << qp->rq.wqe_shift) & (PAGE_SIZE - 1)); +} + +static void *get_send_wqe(struct mthca_qp *qp, int n) +{ + if (qp->is_direct) + return qp->queue.direct.buf + qp->send_wqe_offset + + (n << qp->sq.wqe_shift); + else + return qp->queue.page_list[(qp->send_wqe_offset + + (n << qp->sq.wqe_shift)) >> + PAGE_SHIFT].buf + + ((qp->send_wqe_offset + (n << qp->sq.wqe_shift)) & + (PAGE_SIZE - 1)); +} + +void mthca_qp_event(struct mthca_dev *dev, u32 qpn, + enum ib_event_type event_type) +{ + struct mthca_qp *qp; + struct ib_event event; + + spin_lock(&dev->qp_table.lock); + qp = mthca_array_get(&dev->qp_table.qp, qpn & (dev->limits.num_qps - 1)); + if (qp) + atomic_inc(&qp->refcount); + spin_unlock(&dev->qp_table.lock); + + if (!qp) { + mthca_warn(dev, "Async event for bogus QP %08x\n", qpn); + return; + } + + event.device = &dev->ib_dev; + event.event = event_type; + event.element.qp = &qp->ibqp; + if (qp->ibqp.event_handler) + 
qp->ibqp.event_handler(&event, qp->ibqp.qp_context); + + if (atomic_dec_and_test(&qp->refcount)) + wake_up(&qp->wait); +} + +static int to_mthca_state(enum ib_qp_state ib_state) +{ + switch (ib_state) { + case IB_QPS_RESET: return MTHCA_QP_STATE_RST; + case IB_QPS_INIT: return MTHCA_QP_STATE_INIT; + case IB_QPS_RTR: return MTHCA_QP_STATE_RTR; + case IB_QPS_RTS: return MTHCA_QP_STATE_RTS; + case IB_QPS_SQD: return MTHCA_QP_STATE_SQD; + case IB_QPS_SQE: return MTHCA_QP_STATE_SQE; + case IB_QPS_ERR: return MTHCA_QP_STATE_ERR; + default: return -1; + } +} + +enum { RC, UC, UD, RD, RDEE, MLX, NUM_TRANS }; + +static int to_mthca_st(int transport) +{ + switch (transport) { + case RC: return MTHCA_QP_ST_RC; + case UC: return MTHCA_QP_ST_UC; + case UD: return MTHCA_QP_ST_UD; + case RD: return MTHCA_QP_ST_RD; + case MLX: return MTHCA_QP_ST_MLX; + default: return -1; + } +} + +static const struct { + int trans; + u32 req_param[NUM_TRANS]; + u32 opt_param[NUM_TRANS]; +} state_table[IB_QPS_ERR + 1][IB_QPS_ERR + 1] = { + [IB_QPS_RESET] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR }, + [IB_QPS_INIT] = { + .trans = MTHCA_TRANS_RST2INIT, + .req_param = { + [UD] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_QKEY), + [RC] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_ACCESS_FLAGS), + [MLX] = (IB_QP_PKEY_INDEX | + IB_QP_QKEY), + }, + /* bug-for-bug compatibility with VAPI: */ + .opt_param = { + [MLX] = IB_QP_PORT + } + }, + }, + [IB_QPS_INIT] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR }, + [IB_QPS_INIT] = { + .trans = MTHCA_TRANS_INIT2INIT, + .opt_param = { + [UD] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_QKEY), + [RC] = (IB_QP_PKEY_INDEX | + IB_QP_PORT | + IB_QP_ACCESS_FLAGS), + [MLX] = (IB_QP_PKEY_INDEX | + IB_QP_QKEY), + } + }, + [IB_QPS_RTR] = { + .trans = MTHCA_TRANS_INIT2RTR, + .req_param = { + [RC] = (IB_QP_AV | + IB_QP_PATH_MTU | + IB_QP_DEST_QPN | + 
IB_QP_RQ_PSN | + IB_QP_MAX_DEST_RD_ATOMIC | + IB_QP_MIN_RNR_TIMER), + }, + .opt_param = { + [UD] = (IB_QP_PKEY_INDEX | + IB_QP_QKEY), + [RC] = (IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_PKEY_INDEX), + [MLX] = (IB_QP_PKEY_INDEX | + IB_QP_QKEY), + } + } + }, + [IB_QPS_RTR] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR }, + [IB_QPS_RTS] = { + .trans = MTHCA_TRANS_RTR2RTS, + .req_param = { + [UD] = IB_QP_SQ_PSN, + [RC] = (IB_QP_TIMEOUT | + IB_QP_RETRY_CNT | + IB_QP_RNR_RETRY | + IB_QP_SQ_PSN | + IB_QP_MAX_QP_RD_ATOMIC), + [MLX] = IB_QP_SQ_PSN, + }, + .opt_param = { + [UD] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + [RC] = (IB_QP_CUR_STATE | + IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_PKEY_INDEX | + IB_QP_MIN_RNR_TIMER | + IB_QP_PATH_MIG_STATE), + [MLX] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + } + } + }, + [IB_QPS_RTS] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR }, + [IB_QPS_RTS] = { + .trans = MTHCA_TRANS_RTS2RTS, + .opt_param = { + [UD] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + [RC] = (IB_QP_ACCESS_FLAGS | + IB_QP_ALT_PATH | + IB_QP_PATH_MIG_STATE | + IB_QP_MIN_RNR_TIMER), + [MLX] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + } + }, + [IB_QPS_SQD] = { + .trans = MTHCA_TRANS_RTS2SQD, + }, + }, + [IB_QPS_SQD] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR }, + [IB_QPS_RTS] = { + .trans = MTHCA_TRANS_SQD2RTS, + .opt_param = { + [UD] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + [RC] = (IB_QP_CUR_STATE | + IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_MIN_RNR_TIMER | + IB_QP_PATH_MIG_STATE), + [MLX] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + } + }, + [IB_QPS_SQD] = { + .trans = MTHCA_TRANS_SQD2SQD, + .opt_param = { + [UD] = (IB_QP_PKEY_INDEX | + IB_QP_QKEY), + [RC] = (IB_QP_AV | + IB_QP_TIMEOUT | + IB_QP_RETRY_CNT | + IB_QP_RNR_RETRY | + IB_QP_MAX_QP_RD_ATOMIC | + IB_QP_MAX_DEST_RD_ATOMIC | + IB_QP_CUR_STATE | + 
IB_QP_ALT_PATH | + IB_QP_ACCESS_FLAGS | + IB_QP_PKEY_INDEX | + IB_QP_MIN_RNR_TIMER | + IB_QP_PATH_MIG_STATE), + [MLX] = (IB_QP_PKEY_INDEX | + IB_QP_QKEY), + } + } + }, + [IB_QPS_SQE] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR }, + [IB_QPS_RTS] = { + .trans = MTHCA_TRANS_SQERR2RTS, + .opt_param = { + [UD] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + [RC] = (IB_QP_CUR_STATE | + IB_QP_MIN_RNR_TIMER), + [MLX] = (IB_QP_CUR_STATE | + IB_QP_QKEY), + } + } + }, + [IB_QPS_ERR] = { + [IB_QPS_RESET] = { .trans = MTHCA_TRANS_ANY2RST }, + [IB_QPS_ERR] = { .trans = MTHCA_TRANS_ANY2ERR } + } +}; + +static void store_attrs(struct mthca_sqp *sqp, struct ib_qp_attr *attr, + int attr_mask) +{ + if (attr_mask & IB_QP_PKEY_INDEX) + sqp->pkey_index = attr->pkey_index; + if (attr_mask & IB_QP_QKEY) + sqp->qkey = attr->qkey; + if (attr_mask & IB_QP_SQ_PSN) + sqp->send_psn = attr->sq_psn; +} + +static void init_port(struct mthca_dev *dev, int port) +{ + int err; + u8 status; + struct mthca_init_ib_param param; + + memset(&param, 0, sizeof param); + + param.enable_1x = 1; + param.enable_4x = 1; + param.vl_cap = dev->limits.vl_cap; + param.mtu_cap = dev->limits.mtu_cap; + param.gid_cap = dev->limits.gid_table_len; + param.pkey_cap = dev->limits.pkey_table_len; + + err = mthca_INIT_IB(dev, &param, port, &status); + if (err) + mthca_warn(dev, "INIT_IB failed, return code %d.\n", err); + if (status) + mthca_warn(dev, "INIT_IB returned status %02x.\n", status); +} + +int mthca_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask) +{ + struct mthca_dev *dev = to_mdev(ibqp->device); + struct mthca_qp *qp = to_mqp(ibqp); + enum ib_qp_state cur_state, new_state; + void *mailbox = NULL; + struct mthca_qp_param *qp_param; + struct mthca_qp_context *qp_context; + u32 req_param, opt_param; + u8 status; + int err; + + if (attr_mask & IB_QP_CUR_STATE) { + if (attr->cur_qp_state != IB_QPS_RTR && + attr->cur_qp_state != IB_QPS_RTS && + 
attr->cur_qp_state != IB_QPS_SQD && + attr->cur_qp_state != IB_QPS_SQE) + return -EINVAL; + else + cur_state = attr->cur_qp_state; + } else { + spin_lock_irq(&qp->lock); + cur_state = qp->state; + spin_unlock_irq(&qp->lock); + } + + if (attr_mask & IB_QP_STATE) { + if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) + return -EINVAL; + new_state = attr->qp_state; + } else + new_state = cur_state; + + if (state_table[cur_state][new_state].trans == MTHCA_TRANS_INVALID) { + mthca_dbg(dev, "Illegal QP transition " + "%d->%d\n", cur_state, new_state); + return -EINVAL; + } + + req_param = state_table[cur_state][new_state].req_param[qp->transport]; + opt_param = state_table[cur_state][new_state].opt_param[qp->transport]; + + if ((req_param & attr_mask) != req_param) { + mthca_dbg(dev, "QP transition " + "%d->%d missing req attr 0x%08x\n", + cur_state, new_state, + req_param & ~attr_mask); + return -EINVAL; + } + + if (attr_mask & ~(req_param | opt_param | IB_QP_STATE)) { + mthca_dbg(dev, "QP transition (transport %d) " + "%d->%d has extra attr 0x%08x\n", + qp->transport, + cur_state, new_state, + attr_mask & ~(req_param | opt_param | + IB_QP_STATE)); + return -EINVAL; + } + + mailbox = kmalloc(sizeof (*qp_param) + MTHCA_CMD_MAILBOX_EXTRA, GFP_KERNEL); + if (!mailbox) + return -ENOMEM; + qp_param = MAILBOX_ALIGN(mailbox); + qp_context = &qp_param->context; + memset(qp_param, 0, sizeof *qp_param); + + qp_context->flags = cpu_to_be32((to_mthca_state(new_state) << 28) | + (to_mthca_st(qp->transport) << 16)); + qp_context->flags |= cpu_to_be32(MTHCA_QP_BIT_DE); + if (!(attr_mask & IB_QP_PATH_MIG_STATE)) + qp_context->flags |= cpu_to_be32(MTHCA_QP_PM_MIGRATED << 11); + else { + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_PM_STATE); + switch (attr->path_mig_state) { + case IB_MIG_MIGRATED: + qp_context->flags |= cpu_to_be32(MTHCA_QP_PM_MIGRATED << 11); + break; + case IB_MIG_REARM: + qp_context->flags |= cpu_to_be32(MTHCA_QP_PM_REARM << 11); + break; + case 
IB_MIG_ARMED: + qp_context->flags |= cpu_to_be32(MTHCA_QP_PM_ARMED << 11); + break; + } + } + /* leave sched_queue as 0 */ + if (qp->transport == MLX || qp->transport == UD) + qp_context->mtu_msgmax = cpu_to_be32((IB_MTU_2048 << 29) | + (11 << 24)); + else if (attr_mask & IB_QP_PATH_MTU) { + qp_context->mtu_msgmax = cpu_to_be32((attr->path_mtu << 29) | + (31 << 24)); + } + qp_context->usr_page = cpu_to_be32(MTHCA_KAR_PAGE); + qp_context->local_qpn = cpu_to_be32(qp->qpn); + if (attr_mask & IB_QP_DEST_QPN) { + qp_context->remote_qpn = cpu_to_be32(attr->dest_qp_num); + } + + if (qp->transport == MLX) + qp_context->pri_path.port_pkey |= + cpu_to_be32(to_msqp(qp)->port << 24); + else { + if (attr_mask & IB_QP_PORT) { + qp_context->pri_path.port_pkey |= + cpu_to_be32(attr->port_num << 24); + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_PORT_NUM); + } + } + + if (attr_mask & IB_QP_PKEY_INDEX) { + qp_context->pri_path.port_pkey |= + cpu_to_be32(attr->pkey_index); + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_PKEY_INDEX); + } + + if (attr_mask & IB_QP_RNR_RETRY) { + qp_context->pri_path.rnr_retry = attr->rnr_retry << 5; + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_RNR_RETRY); + } + + if (attr_mask & IB_QP_AV) { + qp_context->pri_path.g_mylmc = attr->ah_attr.src_path_bits & 0x7f; + qp_context->pri_path.rlid = cpu_to_be16(attr->ah_attr.dlid); + qp_context->pri_path.static_rate = (!!attr->ah_attr.static_rate) << 3; + if (attr->ah_attr.ah_flags & IB_AH_GRH) { + qp_context->pri_path.g_mylmc |= 1 << 7; + qp_context->pri_path.mgid_index = attr->ah_attr.grh.sgid_index; + qp_context->pri_path.hop_limit = attr->ah_attr.grh.hop_limit; + qp_context->pri_path.sl_tclass_flowlabel = + cpu_to_be32((attr->ah_attr.sl << 28) | + (attr->ah_attr.grh.traffic_class << 20) | + (attr->ah_attr.grh.flow_label)); + memcpy(qp_context->pri_path.rgid, + attr->ah_attr.grh.dgid.raw, 16); + } else { + qp_context->pri_path.sl_tclass_flowlabel = + 
cpu_to_be32(attr->ah_attr.sl << 28); + } + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_PRIMARY_ADDR_PATH); + } + + if (attr_mask & IB_QP_TIMEOUT) { + qp_context->pri_path.ackto = attr->timeout; + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_ACK_TIMEOUT); + } + + /* XXX alt_path */ + + /* leave rdd as 0 */ + qp_context->pd = cpu_to_be32(to_mpd(ibqp->pd)->pd_num); + /* leave wqe_base as 0 (we always create an MR based at 0 for WQs) */ + qp_context->wqe_lkey = cpu_to_be32(qp->mr.ibmr.lkey); + qp_context->params1 = cpu_to_be32((MTHCA_ACK_REQ_FREQ << 28) | + (MTHCA_FLIGHT_LIMIT << 24) | + MTHCA_QP_BIT_SRE | + MTHCA_QP_BIT_SWE | + MTHCA_QP_BIT_SAE); + if (qp->sq.policy == IB_SIGNAL_ALL_WR) + qp_context->params1 |= cpu_to_be32(MTHCA_QP_BIT_SSC); + if (attr_mask & IB_QP_RETRY_CNT) { + qp_context->params1 |= cpu_to_be32(attr->retry_cnt << 16); + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_RETRY_COUNT); + } + + /* XXX initiator resources */ + + if (attr_mask & IB_QP_SQ_PSN) + qp_context->next_send_psn = cpu_to_be32(attr->sq_psn); + qp_context->cqn_snd = cpu_to_be32(to_mcq(ibqp->send_cq)->cqn); + + /* XXX RDMA/atomic enable, responder resources */ + + if (qp->rq.policy == IB_SIGNAL_ALL_WR) + qp_context->params2 |= cpu_to_be32(MTHCA_QP_BIT_RSC); + if (attr_mask & IB_QP_MIN_RNR_TIMER) { + qp_context->rnr_nextrecvpsn |= cpu_to_be32(attr->min_rnr_timer << 24); + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_RNR_TIMEOUT); + } + if (attr_mask & IB_QP_RQ_PSN) + qp_context->rnr_nextrecvpsn |= cpu_to_be32(attr->rq_psn); + + /* XXX ra_buff_indx */ + + qp_context->cqn_rcv = cpu_to_be32(to_mcq(ibqp->recv_cq)->cqn); + + if (attr_mask & IB_QP_QKEY) { + qp_context->qkey = cpu_to_be32(attr->qkey); + qp_param->opt_param_mask |= cpu_to_be32(MTHCA_QP_OPTPAR_Q_KEY); + } + + err = mthca_MODIFY_QP(dev, state_table[cur_state][new_state].trans, + qp->qpn, 0, qp_param, 0, &status); + if (status) { + mthca_warn(dev, "modify QP %d returned status %02x.\n", + 
state_table[cur_state][new_state].trans, status); + err = -EINVAL; + } + + if (!err) + qp->state = new_state; + + kfree(mailbox); + + if (is_sqp(dev, qp)) + store_attrs(to_msqp(qp), attr, attr_mask); + + /* + * If we are moving QP0 to RTR, bring the IB link up; if we + * are moving QP0 to RESET or ERROR, bring the link back down. + */ + if (is_qp0(dev, qp)) { + if (cur_state != IB_QPS_RTR && + new_state == IB_QPS_RTR) + init_port(dev, to_msqp(qp)->port); + + if (cur_state != IB_QPS_RESET && + cur_state != IB_QPS_ERR && + (new_state == IB_QPS_RESET || + new_state == IB_QPS_ERR)) + mthca_CLOSE_IB(dev, to_msqp(qp)->port, &status); + } + + return err; +} + +/* + * Allocate and register buffer for WQEs. qp->rq.max, sq.max, + * rq.max_gs and sq.max_gs must all be assigned. + * mthca_alloc_wqe_buf will calculate rq.wqe_shift and + * sq.wqe_shift (as well as send_wqe_offset, is_direct, and + * queue) + */ +static int mthca_alloc_wqe_buf(struct mthca_dev *dev, + struct mthca_pd *pd, + struct mthca_qp *qp) +{ + int size; + int i; + int npages, shift; + dma_addr_t t; + u64 *dma_list = NULL; + int err = -ENOMEM; + + size = sizeof (struct mthca_next_seg) + + qp->rq.max_gs * sizeof (struct mthca_data_seg); + + for (qp->rq.wqe_shift = 6; 1 << qp->rq.wqe_shift < size; + qp->rq.wqe_shift++) + ; /* nothing */ + + size = sizeof (struct mthca_next_seg) + + qp->sq.max_gs * sizeof (struct mthca_data_seg); + if (qp->transport == MLX) + size += 2 * sizeof (struct mthca_data_seg); + else if (qp->transport == UD) + size += sizeof (struct mthca_ud_seg); + else /* bind seg is as big as atomic + raddr segs */ + size += sizeof (struct mthca_bind_seg); + + for (qp->sq.wqe_shift = 6; 1 << qp->sq.wqe_shift < size; + qp->sq.wqe_shift++) + ; /* nothing */ + + qp->send_wqe_offset = ALIGN(qp->rq.max << qp->rq.wqe_shift, + 1 << qp->sq.wqe_shift); + size = PAGE_ALIGN(qp->send_wqe_offset + + (qp->sq.max << qp->sq.wqe_shift)); + + qp->wrid = kmalloc((qp->rq.max + qp->sq.max) * sizeof (u64), + GFP_KERNEL); 
+ if (!qp->wrid) + goto err_out; + + if (size <= MTHCA_MAX_DIRECT_QP_SIZE) { + qp->is_direct = 1; + npages = 1; + shift = get_order(size) + PAGE_SHIFT; + + if (0) + mthca_dbg(dev, "Creating direct QP of size %d (shift %d)\n", + size, shift); + + qp->queue.direct.buf = pci_alloc_consistent(dev->pdev, size, &t); + if (!qp->queue.direct.buf) + goto err_out; + + pci_unmap_addr_set(&qp->queue.direct, mapping, t); + + memset(qp->queue.direct.buf, 0, size); + + while (t & ((1 << shift) - 1)) { + --shift; + npages *= 2; + } + + dma_list = kmalloc(npages * sizeof *dma_list, GFP_KERNEL); + if (!dma_list) + goto err_out_free; + + for (i = 0; i < npages; ++i) + dma_list[i] = t + i * (1 << shift); + } else { + qp->is_direct = 0; + npages = size / PAGE_SIZE; + shift = PAGE_SHIFT; + + if (0) + mthca_dbg(dev, "Creating indirect QP with %d pages\n", npages); + + dma_list = kmalloc(npages * sizeof *dma_list, GFP_KERNEL); + if (!dma_list) + goto err_out; + + qp->queue.page_list = kmalloc(npages * + sizeof *qp->queue.page_list, + GFP_KERNEL); + if (!qp->queue.page_list) + goto err_out; + + for (i = 0; i < npages; ++i) { + qp->queue.page_list[i].buf = + pci_alloc_consistent(dev->pdev, PAGE_SIZE, &t); + if (!qp->queue.page_list[i].buf) + goto err_out_free; + + memset(qp->queue.page_list[i].buf, 0, PAGE_SIZE); + + pci_unmap_addr_set(&qp->queue.page_list[i], mapping, t); + dma_list[i] = t; + } + } + + err = mthca_mr_alloc_phys(dev, pd->pd_num, dma_list, shift, + npages, 0, size, + MTHCA_MPT_FLAG_LOCAL_WRITE | + MTHCA_MPT_FLAG_LOCAL_READ, + &qp->mr); + if (err) + goto err_out_free; + + kfree(dma_list); + return 0; + + err_out_free: + if (qp->is_direct) { + pci_free_consistent(dev->pdev, size, + qp->queue.direct.buf, + pci_unmap_addr(&qp->queue.direct, mapping)); + } else + for (i = 0; i < npages; ++i) { + if (qp->queue.page_list[i].buf) + pci_free_consistent(dev->pdev, PAGE_SIZE, + qp->queue.page_list[i].buf, + pci_unmap_addr(&qp->queue.page_list[i], + mapping)); + + } + + err_out: + 
kfree(qp->wrid); + kfree(dma_list); + return err; +} + +static int mthca_alloc_qp_common(struct mthca_dev *dev, + struct mthca_pd *pd, + struct mthca_cq *send_cq, + struct mthca_cq *recv_cq, + enum ib_sig_type send_policy, + enum ib_sig_type recv_policy, + struct mthca_qp *qp) +{ + int err; + + spin_lock_init(&qp->lock); + atomic_set(&qp->refcount, 1); + qp->state = IB_QPS_RESET; + qp->sq.policy = send_policy; + qp->rq.policy = recv_policy; + qp->rq.cur = 0; + qp->sq.cur = 0; + qp->rq.next = 0; + qp->sq.next = 0; + qp->rq.last_comp = qp->rq.max - 1; + qp->sq.last_comp = qp->sq.max - 1; + qp->rq.last = NULL; + qp->sq.last = NULL; + + err = mthca_alloc_wqe_buf(dev, pd, qp); + return err; +} + +int mthca_alloc_qp(struct mthca_dev *dev, + struct mthca_pd *pd, + struct mthca_cq *send_cq, + struct mthca_cq *recv_cq, + enum ib_qp_type type, + enum ib_sig_type send_policy, + enum ib_sig_type recv_policy, + struct mthca_qp *qp) +{ + int err; + + switch (type) { + case IB_QPT_RC: qp->transport = RC; break; + case IB_QPT_UC: qp->transport = UC; break; + case IB_QPT_UD: qp->transport = UD; break; + default: return -EINVAL; + } + + qp->qpn = mthca_alloc(&dev->qp_table.alloc); + if (qp->qpn == -1) + return -ENOMEM; + + err = mthca_alloc_qp_common(dev, pd, send_cq, recv_cq, + send_policy, recv_policy, qp); + if (err) { + mthca_free(&dev->qp_table.alloc, qp->qpn); + return err; + } + + spin_lock_irq(&dev->qp_table.lock); + mthca_array_set(&dev->qp_table.qp, + qp->qpn & (dev->limits.num_qps - 1), qp); + spin_unlock_irq(&dev->qp_table.lock); + + return 0; +} + +int mthca_alloc_sqp(struct mthca_dev *dev, + struct mthca_pd *pd, + struct mthca_cq *send_cq, + struct mthca_cq *recv_cq, + enum ib_sig_type send_policy, + enum ib_sig_type recv_policy, + int qpn, + int port, + struct mthca_sqp *sqp) +{ + int err = 0; + u32 mqpn = qpn * 2 + dev->qp_table.sqp_start + port - 1; + + sqp->header_buf_size = sqp->qp.sq.max * MTHCA_UD_HEADER_SIZE; + sqp->header_buf = 
dma_alloc_coherent(&dev->pdev->dev, sqp->header_buf_size, + &sqp->header_dma, GFP_KERNEL); + if (!sqp->header_buf) + return -ENOMEM; + + spin_lock_irq(&dev->qp_table.lock); + if (mthca_array_get(&dev->qp_table.qp, mqpn)) + err = -EBUSY; + else + mthca_array_set(&dev->qp_table.qp, mqpn, sqp); + spin_unlock_irq(&dev->qp_table.lock); + + if (err) + goto err_out; + + sqp->port = port; + sqp->qp.qpn = mqpn; + sqp->qp.transport = MLX; + + err = mthca_alloc_qp_common(dev, pd, send_cq, recv_cq, + send_policy, recv_policy, + &sqp->qp); + if (err) + goto err_out_free; + + atomic_inc(&pd->sqp_count); + + return 0; + + err_out_free: + spin_lock_irq(&dev->qp_table.lock); + mthca_array_clear(&dev->qp_table.qp, mqpn); + spin_unlock_irq(&dev->qp_table.lock); + + err_out: + dma_free_coherent(&dev->pdev->dev, sqp->header_buf_size, + sqp->header_buf, sqp->header_dma); + + return err; +} + +void mthca_free_qp(struct mthca_dev *dev, + struct mthca_qp *qp) +{ + u8 status; + int size; + int i; + + spin_lock_irq(&dev->qp_table.lock); + mthca_array_clear(&dev->qp_table.qp, + qp->qpn & (dev->limits.num_qps - 1)); + spin_unlock_irq(&dev->qp_table.lock); + + atomic_dec(&qp->refcount); + wait_event(qp->wait, !atomic_read(&qp->refcount)); + + if (qp->state != IB_QPS_RESET) + mthca_MODIFY_QP(dev, MTHCA_TRANS_ANY2RST, qp->qpn, 0, NULL, 0, &status); + + mthca_cq_clean(dev, to_mcq(qp->ibqp.send_cq)->cqn, qp->qpn); + if (qp->ibqp.send_cq != qp->ibqp.recv_cq) + mthca_cq_clean(dev, to_mcq(qp->ibqp.recv_cq)->cqn, qp->qpn); + + mthca_free_mr(dev, &qp->mr); + + size = PAGE_ALIGN(qp->send_wqe_offset + + (qp->sq.max << qp->sq.wqe_shift)); + + if (qp->is_direct) { + pci_free_consistent(dev->pdev, size, + qp->queue.direct.buf, + pci_unmap_addr(&qp->queue.direct, mapping)); + } else { + for (i = 0; i < size / PAGE_SIZE; ++i) { + pci_free_consistent(dev->pdev, PAGE_SIZE, + qp->queue.page_list[i].buf, + pci_unmap_addr(&qp->queue.page_list[i], + mapping)); + } + } + + kfree(qp->wrid); + + if (is_sqp(dev, qp)) { 
+ atomic_dec(&(to_mpd(qp->ibqp.pd)->sqp_count)); + dma_free_coherent(&dev->pdev->dev, + to_msqp(qp)->header_buf_size, + to_msqp(qp)->header_buf, + to_msqp(qp)->header_dma); + } + else + mthca_free(&dev->qp_table.alloc, qp->qpn); +} + +/* Create UD header for an MLX send and build a data segment for it */ +static int build_mlx_header(struct mthca_dev *dev, struct mthca_sqp *sqp, + int ind, struct ib_send_wr *wr, + struct mthca_mlx_seg *mlx, + struct mthca_data_seg *data) +{ + int header_size; + int err; + + ib_ud_header_init(256, /* assume a MAD */ + sqp->ud_header.grh_present, + &sqp->ud_header); + + err = mthca_read_ah(dev, to_mah(wr->wr.ud.ah), &sqp->ud_header); + if (err) + return err; + mlx->flags &= ~cpu_to_be32(MTHCA_NEXT_SOLICIT | 1); + mlx->flags |= cpu_to_be32((!sqp->qp.ibqp.qp_num ? MTHCA_MLX_VL15 : 0) | + (sqp->ud_header.lrh.destination_lid == 0xffff ? + MTHCA_MLX_SLR : 0) | + (sqp->ud_header.lrh.service_level << 8)); + mlx->rlid = sqp->ud_header.lrh.destination_lid; + mlx->vcrc = 0; + + switch (wr->opcode) { + case IB_WR_SEND: + sqp->ud_header.bth.opcode = IB_OPCODE_UD_SEND_ONLY; + sqp->ud_header.immediate_present = 0; + break; + case IB_WR_SEND_WITH_IMM: + sqp->ud_header.bth.opcode = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE; + sqp->ud_header.immediate_present = 1; + sqp->ud_header.immediate_data = wr->imm_data; + break; + default: + return -EINVAL; + } + + sqp->ud_header.lrh.virtual_lane = !sqp->qp.ibqp.qp_num ? 
15 : 0; + if (sqp->ud_header.lrh.destination_lid == 0xffff) + sqp->ud_header.lrh.source_lid = 0xffff; + sqp->ud_header.bth.solicited_event = !!(wr->send_flags & IB_SEND_SOLICITED); + if (!sqp->qp.ibqp.qp_num) + ib_cached_pkey_get(&dev->ib_dev, sqp->port, + sqp->pkey_index, + &sqp->ud_header.bth.pkey); + else + ib_cached_pkey_get(&dev->ib_dev, sqp->port, + wr->wr.ud.pkey_index, + &sqp->ud_header.bth.pkey); + cpu_to_be16s(&sqp->ud_header.bth.pkey); + sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->wr.ud.remote_qpn); + sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1)); + sqp->ud_header.deth.qkey = cpu_to_be32(wr->wr.ud.remote_qkey & 0x80000000 ? + sqp->qkey : wr->wr.ud.remote_qkey); + sqp->ud_header.deth.source_qpn = cpu_to_be32(sqp->qp.ibqp.qp_num); + + header_size = ib_ud_header_pack(&sqp->ud_header, + sqp->header_buf + + ind * MTHCA_UD_HEADER_SIZE); + + data->byte_count = cpu_to_be32(header_size); + data->lkey = cpu_to_be32(to_mpd(sqp->qp.ibqp.pd)->ntmr.ibmr.lkey); + data->addr = cpu_to_be64(sqp->header_dma + + ind * MTHCA_UD_HEADER_SIZE); + + return 0; +} + +int mthca_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, + struct ib_send_wr **bad_wr) +{ + struct mthca_dev *dev = to_mdev(ibqp->device); + struct mthca_qp *qp = to_mqp(ibqp); + void *wqe; + void *prev_wqe; + unsigned long flags; + int err = 0; + int nreq; + int i; + int size; + int size0 = 0; + u32 f0 = 0; + int ind; + u8 op0 = 0; + + static const u8 opcode[] = { + [IB_WR_SEND] = MTHCA_OPCODE_SEND, + [IB_WR_SEND_WITH_IMM] = MTHCA_OPCODE_SEND_IMM, + [IB_WR_RDMA_WRITE] = MTHCA_OPCODE_RDMA_WRITE, + [IB_WR_RDMA_WRITE_WITH_IMM] = MTHCA_OPCODE_RDMA_WRITE_IMM, + [IB_WR_RDMA_READ] = MTHCA_OPCODE_RDMA_READ, + [IB_WR_ATOMIC_CMP_AND_SWP] = MTHCA_OPCODE_ATOMIC_CS, + [IB_WR_ATOMIC_FETCH_AND_ADD] = MTHCA_OPCODE_ATOMIC_FA, + }; + + spin_lock_irqsave(&qp->lock, flags); + + /* XXX check that state is OK to post send */ + + ind = qp->sq.next; + + for (nreq = 0; wr; ++nreq, wr = wr->next) { + 
if (qp->sq.cur + nreq >= qp->sq.max) { + mthca_err(dev, "SQ full (%d posted, %d max, %d nreq)\n", + qp->sq.cur, qp->sq.max, nreq); + err = -ENOMEM; + *bad_wr = wr; + goto out; + } + + wqe = get_send_wqe(qp, ind); + prev_wqe = qp->sq.last; + qp->sq.last = wqe; + + ((struct mthca_next_seg *) wqe)->nda_op = 0; + ((struct mthca_next_seg *) wqe)->ee_nds = 0; + ((struct mthca_next_seg *) wqe)->flags = + ((wr->send_flags & IB_SEND_SIGNALED) ? + cpu_to_be32(MTHCA_NEXT_CQ_UPDATE) : 0) | + ((wr->send_flags & IB_SEND_SOLICITED) ? + cpu_to_be32(MTHCA_NEXT_SOLICIT) : 0) | + cpu_to_be32(1); + if (wr->opcode == IB_WR_SEND_WITH_IMM || + wr->opcode == IB_WR_RDMA_WRITE_WITH_IMM) + ((struct mthca_next_seg *) wqe)->flags = wr->imm_data; + + wqe += sizeof (struct mthca_next_seg); + size = sizeof (struct mthca_next_seg) / 16; + + switch (qp->transport) { + case RC: + switch (wr->opcode) { + case IB_WR_ATOMIC_CMP_AND_SWP: + case IB_WR_ATOMIC_FETCH_AND_ADD: + ((struct mthca_raddr_seg *) wqe)->raddr = + cpu_to_be64(wr->wr.atomic.remote_addr); + ((struct mthca_raddr_seg *) wqe)->rkey = + cpu_to_be32(wr->wr.atomic.rkey); + ((struct mthca_raddr_seg *) wqe)->reserved = 0; + + wqe += sizeof (struct mthca_raddr_seg); + + if (wr->opcode == IB_WR_ATOMIC_CMP_AND_SWP) { + ((struct mthca_atomic_seg *) wqe)->swap_add = + cpu_to_be64(wr->wr.atomic.swap); + ((struct mthca_atomic_seg *) wqe)->compare = + cpu_to_be64(wr->wr.atomic.compare_add); + } else { + ((struct mthca_atomic_seg *) wqe)->swap_add = + cpu_to_be64(wr->wr.atomic.compare_add); + ((struct mthca_atomic_seg *) wqe)->compare = 0; + } + + wqe += sizeof (struct mthca_atomic_seg); + size += sizeof (struct mthca_raddr_seg) / 16 + + sizeof (struct mthca_atomic_seg); + break; + + case IB_WR_RDMA_WRITE: + case IB_WR_RDMA_WRITE_WITH_IMM: + case IB_WR_RDMA_READ: + ((struct mthca_raddr_seg *) wqe)->raddr = + cpu_to_be64(wr->wr.rdma.remote_addr); + ((struct mthca_raddr_seg *) wqe)->rkey = + cpu_to_be32(wr->wr.rdma.rkey); + ((struct mthca_raddr_seg *) 
wqe)->reserved = 0; + wqe += sizeof (struct mthca_raddr_seg); + size += sizeof (struct mthca_raddr_seg) / 16; + break; + + default: + /* No extra segments required for sends */ + break; + } + + case UD: + ((struct mthca_ud_seg *) wqe)->lkey = + cpu_to_be32(to_mah(wr->wr.ud.ah)->key); + ((struct mthca_ud_seg *) wqe)->av_addr = + cpu_to_be64(to_mah(wr->wr.ud.ah)->avdma); + ((struct mthca_ud_seg *) wqe)->dqpn = + cpu_to_be32(wr->wr.ud.remote_qpn); + ((struct mthca_ud_seg *) wqe)->qkey = + cpu_to_be32(wr->wr.ud.remote_qkey); + + wqe += sizeof (struct mthca_ud_seg); + size += sizeof (struct mthca_ud_seg) / 16; + break; + + case MLX: + err = build_mlx_header(dev, to_msqp(qp), ind, wr, + wqe - sizeof (struct mthca_next_seg), + wqe); + if (err) { + *bad_wr = wr; + goto out; + } + wqe += sizeof (struct mthca_data_seg); + size += sizeof (struct mthca_data_seg) / 16; + break; + } + + if (wr->num_sge > qp->sq.max_gs) { + mthca_err(dev, "too many gathers\n"); + err = -EINVAL; + *bad_wr = wr; + goto out; + } + + for (i = 0; i < wr->num_sge; ++i) { + ((struct mthca_data_seg *) wqe)->byte_count = + cpu_to_be32(wr->sg_list[i].length); + ((struct mthca_data_seg *) wqe)->lkey = + cpu_to_be32(wr->sg_list[i].lkey); + ((struct mthca_data_seg *) wqe)->addr = + cpu_to_be64(wr->sg_list[i].addr); + wqe += sizeof (struct mthca_data_seg); + size += sizeof (struct mthca_data_seg) / 16; + } + + /* Add one more inline data segment for ICRC */ + if (qp->transport == MLX) { + ((struct mthca_data_seg *) wqe)->byte_count = + cpu_to_be32((1 << 31) | 4); + ((u32 *) wqe)[1] = 0; + wqe += sizeof (struct mthca_data_seg); + size += sizeof (struct mthca_data_seg) / 16; + } + + qp->wrid[ind + qp->rq.max] = wr->wr_id; + + if (wr->opcode >= ARRAY_SIZE(opcode)) { + mthca_err(dev, "opcode invalid\n"); + err = -EINVAL; + *bad_wr = wr; + goto out; + } + + if (prev_wqe) { + ((struct mthca_next_seg *) prev_wqe)->nda_op = + cpu_to_be32(((ind << qp->sq.wqe_shift) + + qp->send_wqe_offset) | + opcode[wr->opcode]); + 
smp_wmb(); + ((struct mthca_next_seg *) prev_wqe)->ee_nds = + cpu_to_be32((size0 ? 0 : MTHCA_NEXT_DBD) | size); + } + + if (!size0) { + size0 = size; + op0 = opcode[wr->opcode]; + } + + ++ind; + if (unlikely(ind >= qp->sq.max)) + ind -= qp->sq.max; + } + +out: + if (nreq) { + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32(((qp->sq.next << qp->sq.wqe_shift) + + qp->send_wqe_offset) | f0 | op0); + doorbell[1] = cpu_to_be32((qp->qpn << 8) | size0); + + wmb(); + + mthca_write64(doorbell, + dev->kar + MTHCA_SEND_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); + } + + qp->sq.cur += nreq; + qp->sq.next = ind; + + spin_unlock_irqrestore(&qp->lock, flags); + return err; +} + +int mthca_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, + struct ib_recv_wr **bad_wr) +{ + struct mthca_dev *dev = to_mdev(ibqp->device); + struct mthca_qp *qp = to_mqp(ibqp); + unsigned long flags; + int err = 0; + int nreq; + int i; + int size; + int size0 = 0; + int ind; + void *wqe; + void *prev_wqe; + + spin_lock_irqsave(&qp->lock, flags); + + /* XXX check that state is OK to post receive */ + + ind = qp->rq.next; + + for (nreq = 0; wr; ++nreq, wr = wr->next) { + if (qp->rq.cur + nreq >= qp->rq.max) { + mthca_err(dev, "RQ %06x full\n", qp->qpn); + err = -ENOMEM; + *bad_wr = wr; + goto out; + } + + wqe = get_recv_wqe(qp, ind); + prev_wqe = qp->rq.last; + qp->rq.last = wqe; + + ((struct mthca_next_seg *) wqe)->nda_op = 0; + ((struct mthca_next_seg *) wqe)->ee_nds = + cpu_to_be32(MTHCA_NEXT_DBD); + ((struct mthca_next_seg *) wqe)->flags = + (wr->recv_flags & IB_RECV_SIGNALED) ? 
+ cpu_to_be32(MTHCA_NEXT_CQ_UPDATE) : 0; + + wqe += sizeof (struct mthca_next_seg); + size = sizeof (struct mthca_next_seg) / 16; + + if (wr->num_sge > qp->rq.max_gs) { + err = -EINVAL; + *bad_wr = wr; + goto out; + } + + for (i = 0; i < wr->num_sge; ++i) { + ((struct mthca_data_seg *) wqe)->byte_count = + cpu_to_be32(wr->sg_list[i].length); + ((struct mthca_data_seg *) wqe)->lkey = + cpu_to_be32(wr->sg_list[i].lkey); + ((struct mthca_data_seg *) wqe)->addr = + cpu_to_be64(wr->sg_list[i].addr); + wqe += sizeof (struct mthca_data_seg); + size += sizeof (struct mthca_data_seg) / 16; + } + + qp->wrid[ind] = wr->wr_id; + + if (prev_wqe) { + ((struct mthca_next_seg *) prev_wqe)->nda_op = + cpu_to_be32((ind << qp->rq.wqe_shift) | 1); + smp_wmb(); + ((struct mthca_next_seg *) prev_wqe)->ee_nds = + cpu_to_be32(MTHCA_NEXT_DBD | size); + } + + if (!size0) + size0 = size; + + ++ind; + if (unlikely(ind >= qp->rq.max)) + ind -= qp->rq.max; + } + +out: + if (nreq) { + u32 doorbell[2]; + + doorbell[0] = cpu_to_be32((qp->rq.next << qp->rq.wqe_shift) | size0); + doorbell[1] = cpu_to_be32((qp->qpn << 8) | nreq); + + wmb(); + + mthca_write64(doorbell, + dev->kar + MTHCA_RECEIVE_DOORBELL, + MTHCA_GET_DOORBELL_LOCK(&dev->doorbell_lock)); + } + + qp->rq.cur += nreq; + qp->rq.next = ind; + + spin_unlock_irqrestore(&qp->lock, flags); + return err; +} + +int mthca_free_err_wqe(struct mthca_qp *qp, int is_send, + int index, int *dbd, u32 *new_wqe) +{ + struct mthca_next_seg *next; + + if (is_send) + next = get_send_wqe(qp, index); + else + next = get_recv_wqe(qp, index); + + *dbd = !!(next->ee_nds & cpu_to_be32(MTHCA_NEXT_DBD)); + if (next->ee_nds & cpu_to_be32(0x3f)) + *new_wqe = (next->nda_op & cpu_to_be32(~0x3f)) | + (next->ee_nds & cpu_to_be32(0x3f)); + else + *new_wqe = 0; + + return 0; +} + +int __devinit mthca_init_qp_table(struct mthca_dev *dev) +{ + int err; + u8 status; + int i; + + spin_lock_init(&dev->qp_table.lock); + + /* + * We reserve 2 extra QPs per port for the special 
QPs. The + * special QP for port 1 has to be even, so round up. + */ + dev->qp_table.sqp_start = (dev->limits.reserved_qps + 1) & ~1UL; + err = mthca_alloc_init(&dev->qp_table.alloc, + dev->limits.num_qps, + (1 << 24) - 1, + dev->qp_table.sqp_start + + MTHCA_MAX_PORTS * 2); + if (err) + return err; + + err = mthca_array_init(&dev->qp_table.qp, + dev->limits.num_qps); + if (err) { + mthca_alloc_cleanup(&dev->qp_table.alloc); + return err; + } + + for (i = 0; i < 2; ++i) { + err = mthca_CONF_SPECIAL_QP(dev, i ? IB_QPT_GSI : IB_QPT_SMI, + dev->qp_table.sqp_start + i * 2, + &status); + if (err) + goto err_out; + if (status) { + mthca_warn(dev, "CONF_SPECIAL_QP returned " + "status %02x, aborting.\n", + status); + err = -EINVAL; + goto err_out; + } + } + return 0; + + err_out: + for (i = 0; i < 2; ++i) + mthca_CONF_SPECIAL_QP(dev, i, 0, &status); + + mthca_array_cleanup(&dev->qp_table.qp, dev->limits.num_qps); + mthca_alloc_cleanup(&dev->qp_table.alloc); + + return err; +} + +void __devexit mthca_cleanup_qp_table(struct mthca_dev *dev) +{ + int i; + u8 status; + + for (i = 0; i < 2; ++i) + mthca_CONF_SPECIAL_QP(dev, i, 0, &status); + + mthca_alloc_cleanup(&dev->qp_table.alloc); +} From roland@topspin.com Mon Dec 27 21:49:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:51:04 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDS025948 for ; Mon, 27 Dec 2004 21:49:39 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:55 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:55 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAFw-0000sg-Sf; Mon, 27 Dec 2004 21:50:55 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, 
openib-general@openib.org In-Reply-To: <200412272150.khYObxkGxtPP9Oju@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:52 -0800 Message-Id: <200412272150.vKuRYXlCFl5x8NAo@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][2/24] Add core InfiniBand support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:50:55.0548 (UTC) FILETIME=[329127C0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13114 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add implementation of core InfiniBand support. This can be thought of as a midlayer that provides an abstraction between low-level hardware drivers and upper level protocols (such as IP-over-InfiniBand). Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/Kconfig 2004-12-27 21:48:18.185289416 -0800 @@ -0,0 +1,10 @@ +menu "InfiniBand support" + +config INFINIBAND + tristate "InfiniBand support" + ---help--- + Core support for InfiniBand (IB). Make sure to also select + any protocols you wish to use as well as drivers for your + InfiniBand hardware. 
+ +endmenu --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/Makefile 2004-12-27 21:48:18.216284854 -0800 @@ -0,0 +1 @@ +obj-$(CONFIG_INFINIBAND) += core/ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/Makefile 2004-12-27 21:48:18.262278084 -0800 @@ -0,0 +1,6 @@ +EXTRA_CFLAGS += -Idrivers/infiniband/include + +obj-$(CONFIG_INFINIBAND) += ib_core.o + +ib_core-y := packer.o ud_header.o verbs.o sysfs.o \ + device.o fmr_pool.o cache.o --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/cache.c 2004-12-27 21:48:18.576231871 -0800 @@ -0,0 +1,328 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: cache.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include +#include + +#include "core_priv.h" + +struct ib_pkey_cache { + int table_len; + u16 table[0]; +}; + +struct ib_gid_cache { + int table_len; + union ib_gid table[0]; +}; + +struct ib_update_work { + struct work_struct work; + struct ib_device *device; + u8 port_num; +}; + +static inline int start_port(struct ib_device *device) +{ + return device->node_type == IB_NODE_SWITCH ? 0 : 1; +} + +static inline int end_port(struct ib_device *device) +{ + return device->node_type == IB_NODE_SWITCH ? 0 : device->phys_port_cnt; +} + +int ib_cached_gid_get(struct ib_device *device, + u8 port, + int index, + union ib_gid *gid) +{ + struct ib_gid_cache *cache; + unsigned long flags; + int ret = 0; + + if (port < start_port(device) || port > end_port(device)) + return -EINVAL; + + read_lock_irqsave(&device->cache.lock, flags); + + cache = device->cache.gid_cache[port - start_port(device)]; + + if (index < 0 || index >= cache->table_len) + ret = -EINVAL; + else + *gid = cache->table[index]; + + read_unlock_irqrestore(&device->cache.lock, flags); + + return ret; +} +EXPORT_SYMBOL(ib_cached_gid_get); + +int ib_cached_pkey_get(struct ib_device *device, + u8 port, + int index, + u16 *pkey) +{ + struct ib_pkey_cache *cache; + unsigned long flags; + int ret = 0; + + if (port < start_port(device) || port > end_port(device)) + return -EINVAL; + + read_lock_irqsave(&device->cache.lock, flags); + + cache = device->cache.pkey_cache[port - start_port(device)]; + + if (index < 0 || index >= cache->table_len) + ret = -EINVAL; + else + *pkey = cache->table[index]; + + read_unlock_irqrestore(&device->cache.lock, flags); + + return ret; 
+} +EXPORT_SYMBOL(ib_cached_pkey_get); + +int ib_cached_pkey_find(struct ib_device *device, + u8 port, + u16 pkey, + u16 *index) +{ + struct ib_pkey_cache *cache; + unsigned long flags; + int i; + int ret = -ENOENT; + + if (port < start_port(device) || port > end_port(device)) + return -EINVAL; + + read_lock_irqsave(&device->cache.lock, flags); + + cache = device->cache.pkey_cache[port - start_port(device)]; + + *index = -1; + + for (i = 0; i < cache->table_len; ++i) + if ((cache->table[i] & 0x7fff) == (pkey & 0x7fff)) { + *index = i; + ret = 0; + break; + } + + read_unlock_irqrestore(&device->cache.lock, flags); + + return ret; +} +EXPORT_SYMBOL(ib_cached_pkey_find); + +static void ib_cache_update(struct ib_device *device, + u8 port) +{ + struct ib_port_attr *tprops = NULL; + struct ib_pkey_cache *pkey_cache = NULL, *old_pkey_cache; + struct ib_gid_cache *gid_cache = NULL, *old_gid_cache; + int i; + int ret; + + tprops = kmalloc(sizeof *tprops, GFP_KERNEL); + if (!tprops) + return; + + ret = ib_query_port(device, port, tprops); + if (ret) { + printk(KERN_WARNING "ib_query_port failed (%d) for %s\n", + ret, device->name); + goto err; + } + + pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len * + sizeof *pkey_cache->table, GFP_KERNEL); + if (!pkey_cache) + goto err; + + pkey_cache->table_len = tprops->pkey_tbl_len; + + gid_cache = kmalloc(sizeof *gid_cache + tprops->gid_tbl_len * + sizeof *gid_cache->table, GFP_KERNEL); + if (!gid_cache) + goto err; + + gid_cache->table_len = tprops->gid_tbl_len; + + for (i = 0; i < pkey_cache->table_len; ++i) { + ret = ib_query_pkey(device, port, i, pkey_cache->table + i); + if (ret) { + printk(KERN_WARNING "ib_query_pkey failed (%d) for %s (index %d)\n", + ret, device->name, i); + goto err; + } + } + + for (i = 0; i < gid_cache->table_len; ++i) { + ret = ib_query_gid(device, port, i, gid_cache->table + i); + if (ret) { + printk(KERN_WARNING "ib_query_gid failed (%d) for %s (index %d)\n", + ret, device->name, i); + goto 
err; + } + } + + write_lock_irq(&device->cache.lock); + + old_pkey_cache = device->cache.pkey_cache[port - start_port(device)]; + old_gid_cache = device->cache.gid_cache [port - start_port(device)]; + + device->cache.pkey_cache[port - start_port(device)] = pkey_cache; + device->cache.gid_cache [port - start_port(device)] = gid_cache; + + write_unlock_irq(&device->cache.lock); + + kfree(old_pkey_cache); + kfree(old_gid_cache); + kfree(tprops); + return; + +err: + kfree(pkey_cache); + kfree(gid_cache); + kfree(tprops); +} + +static void ib_cache_task(void *work_ptr) +{ + struct ib_update_work *work = work_ptr; + + ib_cache_update(work->device, work->port_num); + kfree(work); +} + +static void ib_cache_event(struct ib_event_handler *handler, + struct ib_event *event) +{ + struct ib_update_work *work; + + if (event->event == IB_EVENT_PORT_ERR || + event->event == IB_EVENT_PORT_ACTIVE || + event->event == IB_EVENT_LID_CHANGE || + event->event == IB_EVENT_PKEY_CHANGE || + event->event == IB_EVENT_SM_CHANGE) { + work = kmalloc(sizeof *work, GFP_ATOMIC); + if (work) { + INIT_WORK(&work->work, ib_cache_task, work); + work->device = event->device; + work->port_num = event->element.port_num; + schedule_work(&work->work); + } + } +} + +void ib_cache_setup_one(struct ib_device *device) +{ + int p; + + rwlock_init(&device->cache.lock); + + device->cache.pkey_cache = + kmalloc(sizeof *device->cache.pkey_cache * + (end_port(device) - start_port(device) + 1), GFP_KERNEL); + device->cache.gid_cache = + kmalloc(sizeof *device->cache.gid_cache * + (end_port(device) - start_port(device) + 1), GFP_KERNEL); + + if (!device->cache.pkey_cache || !device->cache.gid_cache) { + printk(KERN_WARNING "Couldn't allocate cache " + "for %s\n", device->name); + goto err; + } + + for (p = 0; p <= end_port(device) - start_port(device); ++p) { + device->cache.pkey_cache[p] = NULL; + device->cache.gid_cache [p] = NULL; + ib_cache_update(device, p + start_port(device)); + } + + 
INIT_IB_EVENT_HANDLER(&device->cache.event_handler, + device, ib_cache_event); + if (ib_register_event_handler(&device->cache.event_handler)) + goto err_cache; + + return; + +err_cache: + for (p = 0; p <= end_port(device) - start_port(device); ++p) { + kfree(device->cache.pkey_cache[p]); + kfree(device->cache.gid_cache[p]); + } + +err: + kfree(device->cache.pkey_cache); + kfree(device->cache.gid_cache); +} + +void ib_cache_cleanup_one(struct ib_device *device) +{ + int p; + + ib_unregister_event_handler(&device->cache.event_handler); + flush_scheduled_work(); + + for (p = 0; p <= end_port(device) - start_port(device); ++p) { + kfree(device->cache.pkey_cache[p]); + kfree(device->cache.gid_cache[p]); + } + + kfree(device->cache.pkey_cache); + kfree(device->cache.gid_cache); +} + +struct ib_client cache_client = { + .name = "cache", + .add = ib_cache_setup_one, + .remove = ib_cache_cleanup_one +}; + +int __init ib_cache_setup(void) +{ + return ib_register_client(&cache_client); +} + +void __exit ib_cache_cleanup(void) +{ + ib_unregister_client(&cache_client); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/core_priv.h 2004-12-27 21:48:18.600228339 -0800 @@ -0,0 +1,52 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: core_priv.h 1349 2004-12-16 21:09:43Z roland $ + */ + +#ifndef _CORE_PRIV_H +#define _CORE_PRIV_H + +#include +#include + +#include + +int ib_device_register_sysfs(struct ib_device *device); +void ib_device_unregister_sysfs(struct ib_device *device); + +int ib_sysfs_setup(void); +void ib_sysfs_cleanup(void); + +int ib_cache_setup(void); +void ib_cache_cleanup(void); + +#endif /* _CORE_PRIV_H */ --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/device.c 2004-12-27 21:48:18.525239377 -0800 @@ -0,0 +1,614 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: device.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include +#include +#include + +#include + +#include "core_priv.h" + +MODULE_AUTHOR("Roland Dreier"); +MODULE_DESCRIPTION("core kernel InfiniBand API"); +MODULE_LICENSE("Dual BSD/GPL"); + +struct ib_client_data { + struct list_head list; + struct ib_client *client; + void * data; +}; + +static LIST_HEAD(device_list); +static LIST_HEAD(client_list); + +/* + * device_sem protects access to both device_list and client_list. + * There's no real point to using multiple locks or something fancier + * like an rwsem: we always access both lists, and we're always + * modifying one list or the other list. In any case this is not a + * hot path so there's no point in trying to optimize. 
+ */ +static DECLARE_MUTEX(device_sem); + +static int ib_device_check_mandatory(struct ib_device *device) +{ +#define IB_MANDATORY_FUNC(x) { offsetof(struct ib_device, x), #x } + static const struct { + size_t offset; + char *name; + } mandatory_table[] = { + IB_MANDATORY_FUNC(query_device), + IB_MANDATORY_FUNC(query_port), + IB_MANDATORY_FUNC(query_pkey), + IB_MANDATORY_FUNC(query_gid), + IB_MANDATORY_FUNC(alloc_pd), + IB_MANDATORY_FUNC(dealloc_pd), + IB_MANDATORY_FUNC(create_ah), + IB_MANDATORY_FUNC(destroy_ah), + IB_MANDATORY_FUNC(create_qp), + IB_MANDATORY_FUNC(modify_qp), + IB_MANDATORY_FUNC(destroy_qp), + IB_MANDATORY_FUNC(post_send), + IB_MANDATORY_FUNC(post_recv), + IB_MANDATORY_FUNC(create_cq), + IB_MANDATORY_FUNC(destroy_cq), + IB_MANDATORY_FUNC(poll_cq), + IB_MANDATORY_FUNC(req_notify_cq), + IB_MANDATORY_FUNC(get_dma_mr), + IB_MANDATORY_FUNC(dereg_mr) + }; + int i; + + for (i = 0; i < sizeof mandatory_table / sizeof mandatory_table[0]; ++i) { + if (!*(void **) ((void *) device + mandatory_table[i].offset)) { + printk(KERN_WARNING "Device %s is missing mandatory function %s\n", + device->name, mandatory_table[i].name); + return -EINVAL; + } + } + + return 0; +} + +static struct ib_device *__ib_device_get_by_name(const char *name) +{ + struct ib_device *device; + + list_for_each_entry(device, &device_list, core_list) + if (!strncmp(name, device->name, IB_DEVICE_NAME_MAX)) + return device; + + return NULL; +} + + +static int alloc_name(char *name) +{ + long *inuse; + char buf[IB_DEVICE_NAME_MAX]; + struct ib_device *device; + int i; + + inuse = (long *) get_zeroed_page(GFP_KERNEL); + if (!inuse) + return -ENOMEM; + + list_for_each_entry(device, &device_list, core_list) { + if (!sscanf(device->name, name, &i)) + continue; + if (i < 0 || i >= PAGE_SIZE * 8) + continue; + snprintf(buf, sizeof buf, name, i); + if (!strncmp(buf, device->name, IB_DEVICE_NAME_MAX)) + set_bit(i, inuse); + } + + i = find_first_zero_bit(inuse, PAGE_SIZE * 8); + free_page((unsigned 
long) inuse); + snprintf(buf, sizeof buf, name, i); + + if (__ib_device_get_by_name(buf)) + return -ENFILE; + + strlcpy(name, buf, IB_DEVICE_NAME_MAX); + return 0; +} + +/** + * ib_alloc_device - allocate an IB device struct + * @size:size of structure to allocate + * + * Low-level drivers should use ib_alloc_device() to allocate &struct + * ib_device. @size is the size of the structure to be allocated, + * including any private data used by the low-level driver. + * ib_dealloc_device() must be used to free structures allocated with + * ib_alloc_device(). + */ +struct ib_device *ib_alloc_device(size_t size) +{ + void *dev; + + BUG_ON(size < sizeof (struct ib_device)); + + dev = kmalloc(size, GFP_KERNEL); + if (!dev) + return NULL; + + memset(dev, 0, size); + + return dev; +} +EXPORT_SYMBOL(ib_alloc_device); + +/** + * ib_dealloc_device - free an IB device struct + * @device:structure to free + * + * Free a structure allocated with ib_alloc_device(). + */ +void ib_dealloc_device(struct ib_device *device) +{ + if (device->reg_state == IB_DEV_UNINITIALIZED) { + kfree(device); + return; + } + + BUG_ON(device->reg_state != IB_DEV_UNREGISTERED); + + ib_device_unregister_sysfs(device); +} +EXPORT_SYMBOL(ib_dealloc_device); + +static int add_client_context(struct ib_device *device, struct ib_client *client) +{ + struct ib_client_data *context; + unsigned long flags; + + context = kmalloc(sizeof *context, GFP_KERNEL); + if (!context) { + printk(KERN_WARNING "Couldn't allocate client context for %s/%s\n", + device->name, client->name); + return -ENOMEM; + } + + context->client = client; + context->data = NULL; + + spin_lock_irqsave(&device->client_data_lock, flags); + list_add(&context->list, &device->client_data_list); + spin_unlock_irqrestore(&device->client_data_lock, flags); + + return 0; +} + +/** + * ib_register_device - Register an IB device with IB core + * @device:Device to register + * + * Low-level drivers use ib_register_device() to register their + * devices 
with the IB core. All registered clients will receive a + * callback for each device that is added. @device must be allocated + * with ib_alloc_device(). + */ +int ib_register_device(struct ib_device *device) +{ + int ret; + + down(&device_sem); + + if (strchr(device->name, '%')) { + ret = alloc_name(device->name); + if (ret) + goto out; + } + + if (ib_device_check_mandatory(device)) { + ret = -EINVAL; + goto out; + } + + INIT_LIST_HEAD(&device->event_handler_list); + INIT_LIST_HEAD(&device->client_data_list); + spin_lock_init(&device->event_handler_lock); + spin_lock_init(&device->client_data_lock); + + ret = ib_device_register_sysfs(device); + if (ret) { + printk(KERN_WARNING "Couldn't register device %s with driver model\n", + device->name); + goto out; + } + + list_add_tail(&device->core_list, &device_list); + + device->reg_state = IB_DEV_REGISTERED; + + { + struct ib_client *client; + + list_for_each_entry(client, &client_list, list) + if (client->add && !add_client_context(device, client)) + client->add(device); + } + + out: + up(&device_sem); + return ret; +} +EXPORT_SYMBOL(ib_register_device); + +/** + * ib_unregister_device - Unregister an IB device + * @device:Device to unregister + * + * Unregister an IB device. All clients will receive a remove callback. 
+ */ +void ib_unregister_device(struct ib_device *device) +{ + struct ib_client *client; + struct ib_client_data *context, *tmp; + unsigned long flags; + + down(&device_sem); + + list_for_each_entry_reverse(client, &client_list, list) + if (client->remove) + client->remove(device); + + list_del(&device->core_list); + + up(&device_sem); + + spin_lock_irqsave(&device->client_data_lock, flags); + list_for_each_entry_safe(context, tmp, &device->client_data_list, list) + kfree(context); + spin_unlock_irqrestore(&device->client_data_lock, flags); + + device->reg_state = IB_DEV_UNREGISTERED; +} +EXPORT_SYMBOL(ib_unregister_device); + +/** + * ib_register_client - Register an IB client + * @client:Client to register + * + * Upper level users of the IB drivers can use ib_register_client() to + * register callbacks for IB device addition and removal. When an IB + * device is added, each registered client's add method will be called + * (in the order the clients were registered), and when a device is + * removed, each client's remove method will be called (in the reverse + * order that clients were registered). In addition, when + * ib_register_client() is called, the client will receive an add + * callback for all devices already registered. + */ +int ib_register_client(struct ib_client *client) +{ + struct ib_device *device; + + down(&device_sem); + + list_add_tail(&client->list, &client_list); + list_for_each_entry(device, &device_list, core_list) + if (client->add && !add_client_context(device, client)) + client->add(device); + + up(&device_sem); + + return 0; +} +EXPORT_SYMBOL(ib_register_client); + +/** + * ib_unregister_client - Unregister an IB client + * @client:Client to unregister + * + * Upper level users use ib_unregister_client() to remove their client + * registration. When ib_unregister_client() is called, the client + * will receive a remove callback for each IB device still registered. 
+ */ +void ib_unregister_client(struct ib_client *client) +{ + struct ib_client_data *context, *tmp; + struct ib_device *device; + unsigned long flags; + + down(&device_sem); + + list_for_each_entry(device, &device_list, core_list) { + if (client->remove) + client->remove(device); + + spin_lock_irqsave(&device->client_data_lock, flags); + list_for_each_entry_safe(context, tmp, &device->client_data_list, list) + if (context->client == client) { + list_del(&context->list); + kfree(context); + } + spin_unlock_irqrestore(&device->client_data_lock, flags); + } + list_del(&client->list); + + up(&device_sem); +} +EXPORT_SYMBOL(ib_unregister_client); + +/** + * ib_get_client_data - Get IB client context + * @device:Device to get context for + * @client:Client to get context for + * + * ib_get_client_data() returns client context set with + * ib_set_client_data(). + */ +void *ib_get_client_data(struct ib_device *device, struct ib_client *client) +{ + struct ib_client_data *context; + void *ret = NULL; + unsigned long flags; + + spin_lock_irqsave(&device->client_data_lock, flags); + list_for_each_entry(context, &device->client_data_list, list) + if (context->client == client) { + ret = context->data; + break; + } + spin_unlock_irqrestore(&device->client_data_lock, flags); + + return ret; +} +EXPORT_SYMBOL(ib_get_client_data); + +/** + * ib_set_client_data - Set IB client context + * @device:Device to set context for + * @client:Client to set context for + * @data:Context to set + * + * ib_set_client_data() sets client context that can be retrieved with + * ib_get_client_data(). 
+ */ +void ib_set_client_data(struct ib_device *device, struct ib_client *client, + void *data) +{ + struct ib_client_data *context; + unsigned long flags; + + spin_lock_irqsave(&device->client_data_lock, flags); + list_for_each_entry(context, &device->client_data_list, list) + if (context->client == client) { + context->data = data; + goto out; + } + + printk(KERN_WARNING "No client context found for %s/%s\n", + device->name, client->name); + +out: + spin_unlock_irqrestore(&device->client_data_lock, flags); +} +EXPORT_SYMBOL(ib_set_client_data); + +/** + * ib_register_event_handler - Register an IB event handler + * @event_handler:Handler to register + * + * ib_register_event_handler() registers an event handler that will be + * called back when asynchronous IB events occur (as defined in + * chapter 11 of the InfiniBand Architecture Specification). This + * callback may occur in interrupt context. + */ +int ib_register_event_handler (struct ib_event_handler *event_handler) +{ + unsigned long flags; + + spin_lock_irqsave(&event_handler->device->event_handler_lock, flags); + list_add_tail(&event_handler->list, + &event_handler->device->event_handler_list); + spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags); + + return 0; +} +EXPORT_SYMBOL(ib_register_event_handler); + +/** + * ib_unregister_event_handler - Unregister an event handler + * @event_handler:Handler to unregister + * + * Unregister an event handler registered with + * ib_register_event_handler(). 
+ */ +int ib_unregister_event_handler(struct ib_event_handler *event_handler) +{ + unsigned long flags; + + spin_lock_irqsave(&event_handler->device->event_handler_lock, flags); + list_del(&event_handler->list); + spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags); + + return 0; +} +EXPORT_SYMBOL(ib_unregister_event_handler); + +/** + * ib_dispatch_event - Dispatch an asynchronous event + * @event:Event to dispatch + * + * Low-level drivers must call ib_dispatch_event() to dispatch the + * event to all registered event handlers when an asynchronous event + * occurs. + */ +void ib_dispatch_event(struct ib_event *event) +{ + unsigned long flags; + struct ib_event_handler *handler; + + spin_lock_irqsave(&event->device->event_handler_lock, flags); + + list_for_each_entry(handler, &event->device->event_handler_list, list) + handler->handler(handler, event); + + spin_unlock_irqrestore(&event->device->event_handler_lock, flags); +} +EXPORT_SYMBOL(ib_dispatch_event); + +/** + * ib_query_device - Query IB device attributes + * @device:Device to query + * @device_attr:Device attributes + * + * ib_query_device() returns the attributes of a device through the + * @device_attr pointer. + */ +int ib_query_device(struct ib_device *device, + struct ib_device_attr *device_attr) +{ + return device->query_device(device, device_attr); +} +EXPORT_SYMBOL(ib_query_device); + +/** + * ib_query_port - Query IB port attributes + * @device:Device to query + * @port_num:Port number to query + * @port_attr:Port attributes + * + * ib_query_port() returns the attributes of a port through the + * @port_attr pointer. 
+ */ +int ib_query_port(struct ib_device *device, + u8 port_num, + struct ib_port_attr *port_attr) +{ + return device->query_port(device, port_num, port_attr); +} +EXPORT_SYMBOL(ib_query_port); + +/** + * ib_query_gid - Get GID table entry + * @device:Device to query + * @port_num:Port number to query + * @index:GID table index to query + * @gid:Returned GID + * + * ib_query_gid() fetches the specified GID table entry. + */ +int ib_query_gid(struct ib_device *device, + u8 port_num, int index, union ib_gid *gid) +{ + return device->query_gid(device, port_num, index, gid); +} +EXPORT_SYMBOL(ib_query_gid); + +/** + * ib_query_pkey - Get P_Key table entry + * @device:Device to query + * @port_num:Port number to query + * @index:P_Key table index to query + * @pkey:Returned P_Key + * + * ib_query_pkey() fetches the specified P_Key table entry. + */ +int ib_query_pkey(struct ib_device *device, + u8 port_num, u16 index, u16 *pkey) +{ + return device->query_pkey(device, port_num, index, pkey); +} +EXPORT_SYMBOL(ib_query_pkey); + +/** + * ib_modify_device - Change IB device attributes + * @device:Device to modify + * @device_modify_mask:Mask of attributes to change + * @device_modify:New attribute values + * + * ib_modify_device() changes a device's attributes as specified by + * the @device_modify_mask and @device_modify structure. + */ +int ib_modify_device(struct ib_device *device, + int device_modify_mask, + struct ib_device_modify *device_modify) +{ + return device->modify_device(device, device_modify_mask, + device_modify); +} +EXPORT_SYMBOL(ib_modify_device); + +/** + * ib_modify_port - Modifies the attributes for the specified port. + * @device: The device to modify. + * @port_num: The number of the port to modify. + * @port_modify_mask: Mask used to specify which attributes of the port + * to change. + * @port_modify: New attribute values for the port. 
+ * + * ib_modify_port() changes a port's attributes as specified by the + * @port_modify_mask and @port_modify structure. + */ +int ib_modify_port(struct ib_device *device, + u8 port_num, int port_modify_mask, + struct ib_port_modify *port_modify) +{ + return device->modify_port(device, port_num, port_modify_mask, + port_modify); +} +EXPORT_SYMBOL(ib_modify_port); + +static int __init ib_core_init(void) +{ + int ret; + + ret = ib_sysfs_setup(); + if (ret) + printk(KERN_WARNING "Couldn't create InfiniBand device class\n"); + + ret = ib_cache_setup(); + if (ret) { + printk(KERN_WARNING "Couldn't set up InfiniBand P_Key/GID cache\n"); + ib_sysfs_cleanup(); + } + + return ret; +} + +static void __exit ib_core_cleanup(void) +{ + ib_cache_cleanup(); + ib_sysfs_cleanup(); +} + +module_init(ib_core_init); +module_exit(ib_core_cleanup); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/fmr_pool.c 2004-12-27 21:48:18.551235551 -0800 @@ -0,0 +1,507 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: fmr_pool.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include +#include +#include +#include + +#include + +#include "core_priv.h" + +enum { + IB_FMR_MAX_REMAPS = 32, + + IB_FMR_HASH_BITS = 8, + IB_FMR_HASH_SIZE = 1 << IB_FMR_HASH_BITS, + IB_FMR_HASH_MASK = IB_FMR_HASH_SIZE - 1 +}; + +/* + * If an FMR is not in use, then the list member will point to either + * its pool's free_list (if the FMR can be mapped again; that is, + * remap_count < IB_FMR_MAX_REMAPS) or its pool's dirty_list (if the + * FMR needs to be unmapped before being remapped). In either of + * these cases it is a bug if the ref_count is not 0. In other words, + * if ref_count is > 0, then the list member must not be linked into + * either free_list or dirty_list. + * + * The cache_node member is used to link the FMR into a cache bucket + * (if caching is enabled). This is independent of the reference + * count of the FMR. When a valid FMR is released, its ref_count is + * decremented, and if ref_count reaches 0, the FMR is placed in + * either free_list or dirty_list as appropriate. However, it is not + * removed from the cache and may be "revived" if a call to + * ib_fmr_register_physical() occurs before the FMR is remapped. In + * this case we just increment the ref_count and remove the FMR from + * free_list/dirty_list. + * + * Before we remap an FMR from free_list, we remove it from the cache + * (to prevent another user from obtaining a stale FMR). 
When an FMR + * is released, we add it to the tail of the free list, so that our + * cache eviction policy is "least recently used." + * + * All manipulation of ref_count, list and cache_node is protected by + * pool_lock to maintain consistency. + */ + +struct ib_fmr_pool { + spinlock_t pool_lock; + + int pool_size; + int max_pages; + int dirty_watermark; + int dirty_len; + struct list_head free_list; + struct list_head dirty_list; + struct hlist_head *cache_bucket; + + void (*flush_function)(struct ib_fmr_pool *pool, + void * arg); + void *flush_arg; + + struct task_struct *thread; + + atomic_t req_ser; + atomic_t flush_ser; + + wait_queue_head_t force_wait; +}; + +static inline u32 ib_fmr_hash(u64 first_page) +{ + return jhash_2words((u32) first_page, + (u32) (first_page >> 32), + 0) & IB_FMR_HASH_MASK; +} + +/* Caller must hold pool_lock */ +static inline struct ib_pool_fmr *ib_fmr_cache_lookup(struct ib_fmr_pool *pool, + u64 *page_list, + int page_list_len, + u64 io_virtual_address) +{ + struct hlist_head *bucket; + struct ib_pool_fmr *fmr; + struct hlist_node *pos; + + if (!pool->cache_bucket) + return NULL; + + bucket = pool->cache_bucket + ib_fmr_hash(*page_list); + + hlist_for_each_entry(fmr, pos, bucket, cache_node) + if (io_virtual_address == fmr->io_virtual_address && + page_list_len == fmr->page_list_len && + !memcmp(page_list, fmr->page_list, + page_list_len * sizeof *page_list)) + return fmr; + + return NULL; +} + +static void ib_fmr_batch_release(struct ib_fmr_pool *pool) +{ + int ret; + struct ib_pool_fmr *fmr; + LIST_HEAD(unmap_list); + LIST_HEAD(fmr_list); + + spin_lock_irq(&pool->pool_lock); + + list_for_each_entry(fmr, &pool->dirty_list, list) { + hlist_del_init(&fmr->cache_node); + fmr->remap_count = 0; + list_add_tail(&fmr->fmr->list, &fmr_list); + +#ifdef DEBUG + if (fmr->ref_count != 0) { + printk(KERN_WARNING "Unmapping FMR %p with ref count %d", + fmr, fmr->ref_count); + } +#endif + } + + list_splice(&pool->dirty_list, &unmap_list); + 
INIT_LIST_HEAD(&pool->dirty_list); + pool->dirty_len = 0; + + spin_unlock_irq(&pool->pool_lock); + + if (list_empty(&unmap_list)) { + return; + } + + ret = ib_unmap_fmr(&fmr_list); + if (ret) + printk(KERN_WARNING "ib_unmap_fmr returned %d", ret); + + spin_lock_irq(&pool->pool_lock); + list_splice(&unmap_list, &pool->free_list); + spin_unlock_irq(&pool->pool_lock); +} + +static int ib_fmr_cleanup_thread(void *pool_ptr) +{ + struct ib_fmr_pool *pool = pool_ptr; + + do { + if (pool->dirty_len >= pool->dirty_watermark || + atomic_read(&pool->flush_ser) - atomic_read(&pool->req_ser) < 0) { + ib_fmr_batch_release(pool); + + atomic_inc(&pool->flush_ser); + wake_up_interruptible(&pool->force_wait); + + if (pool->flush_function) + pool->flush_function(pool, pool->flush_arg); + } + + set_current_state(TASK_INTERRUPTIBLE); + if (pool->dirty_len < pool->dirty_watermark && + atomic_read(&pool->flush_ser) - atomic_read(&pool->req_ser) >= 0 && + !kthread_should_stop()) + schedule(); + __set_current_state(TASK_RUNNING); + } while (!kthread_should_stop()); + + return 0; +} + +/** + * ib_create_fmr_pool - Create an FMR pool + * @pd:Protection domain for FMRs + * @params:FMR pool parameters + * + * Create a pool of FMRs. Return value is pointer to new pool or + * error code if creation failed. 
+ */ +struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd, + struct ib_fmr_pool_param *params) +{ + struct ib_device *device; + struct ib_fmr_pool *pool; + int i; + int ret; + + if (!params) + return ERR_PTR(-EINVAL); + + device = pd->device; + if (!device->alloc_fmr || !device->dealloc_fmr || + !device->map_phys_fmr || !device->unmap_fmr) { + printk(KERN_WARNING "Device %s does not support fast memory regions", + device->name); + return ERR_PTR(-ENOSYS); + } + + pool = kmalloc(sizeof *pool, GFP_KERNEL); + if (!pool) { + printk(KERN_WARNING "couldn't allocate pool struct"); + return ERR_PTR(-ENOMEM); + } + + pool->cache_bucket = NULL; + + pool->flush_function = params->flush_function; + pool->flush_arg = params->flush_arg; + + INIT_LIST_HEAD(&pool->free_list); + INIT_LIST_HEAD(&pool->dirty_list); + + if (params->cache) { + pool->cache_bucket = + kmalloc(IB_FMR_HASH_SIZE * sizeof *pool->cache_bucket, + GFP_KERNEL); + if (!pool->cache_bucket) { + printk(KERN_WARNING "Failed to allocate cache in pool"); + ret = -ENOMEM; + goto out_free_pool; + } + + for (i = 0; i < IB_FMR_HASH_SIZE; ++i) + INIT_HLIST_HEAD(pool->cache_bucket + i); + } + + pool->pool_size = 0; + pool->max_pages = params->max_pages_per_fmr; + pool->dirty_watermark = params->dirty_watermark; + pool->dirty_len = 0; + spin_lock_init(&pool->pool_lock); + atomic_set(&pool->req_ser, 0); + atomic_set(&pool->flush_ser, 0); + init_waitqueue_head(&pool->force_wait); + + pool->thread = kthread_create(ib_fmr_cleanup_thread, + pool, + "ib_fmr(%s)", + device->name); + if (IS_ERR(pool->thread)) { + printk(KERN_WARNING "couldn't start cleanup thread"); + ret = PTR_ERR(pool->thread); + goto out_free_pool; + } + + { + struct ib_pool_fmr *fmr; + struct ib_fmr_attr attr = { + .max_pages = params->max_pages_per_fmr, + .max_maps = IB_FMR_MAX_REMAPS, + .page_size = PAGE_SHIFT + }; + + for (i = 0; i < params->pool_size; ++i) { + fmr = kmalloc(sizeof *fmr + params->max_pages_per_fmr * sizeof (u64), + GFP_KERNEL); + if (!fmr) 
{ + printk(KERN_WARNING "failed to allocate fmr struct " + "for FMR %d", i); + goto out_fail; + } + + fmr->pool = pool; + fmr->remap_count = 0; + fmr->ref_count = 0; + INIT_HLIST_NODE(&fmr->cache_node); + + fmr->fmr = ib_alloc_fmr(pd, params->access, &attr); + if (IS_ERR(fmr->fmr)) { + printk(KERN_WARNING "fmr_create failed for FMR %d", i); + kfree(fmr); + goto out_fail; + } + + list_add_tail(&fmr->list, &pool->free_list); + ++pool->pool_size; + } + } + + return pool; + + out_free_pool: + kfree(pool->cache_bucket); + kfree(pool); + + return ERR_PTR(ret); + + out_fail: + ib_destroy_fmr_pool(pool); + + return ERR_PTR(-ENOMEM); +} +EXPORT_SYMBOL(ib_create_fmr_pool); + +/** + * ib_destroy_fmr_pool - Free FMR pool + * @pool:FMR pool to free + * + * Destroy an FMR pool and free all associated resources. + */ +int ib_destroy_fmr_pool(struct ib_fmr_pool *pool) +{ + struct ib_pool_fmr *fmr; + struct ib_pool_fmr *tmp; + int i; + + kthread_stop(pool->thread); + ib_fmr_batch_release(pool); + + i = 0; + list_for_each_entry_safe(fmr, tmp, &pool->free_list, list) { + ib_dealloc_fmr(fmr->fmr); + list_del(&fmr->list); + kfree(fmr); + ++i; + } + + if (i < pool->pool_size) + printk(KERN_WARNING "pool still has %d regions registered", + pool->pool_size - i); + + kfree(pool->cache_bucket); + kfree(pool); + + return 0; +} +EXPORT_SYMBOL(ib_destroy_fmr_pool); + +/** + * ib_flush_fmr_pool - Invalidate all unmapped FMRs + * @pool:FMR pool to flush + * + * Ensure that all unmapped FMRs are fully invalidated. + */ +int ib_flush_fmr_pool(struct ib_fmr_pool *pool) +{ + int serial; + + atomic_inc(&pool->req_ser); + /* + * It's OK if someone else bumps req_ser again here -- we'll + * just wait a little longer. 
+ */ + serial = atomic_read(&pool->req_ser); + + wake_up_process(pool->thread); + + if (wait_event_interruptible(pool->force_wait, + atomic_read(&pool->flush_ser) - + serial >= 0)) + return -EINTR; + + return 0; +} +EXPORT_SYMBOL(ib_flush_fmr_pool); + +/** + * ib_fmr_pool_map_phys - Map an FMR from an FMR pool + * @pool_handle:FMR pool to allocate FMR from + * @page_list:List of pages to map + * @list_len:Number of pages in @page_list + * @io_virtual_address:I/O virtual address for new FMR + * + * Map an FMR from an FMR pool. + */ +struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle, + u64 *page_list, + int list_len, + u64 *io_virtual_address) +{ + struct ib_fmr_pool *pool = pool_handle; + struct ib_pool_fmr *fmr; + unsigned long flags; + int result; + + if (list_len < 1 || list_len > pool->max_pages) + return ERR_PTR(-EINVAL); + + spin_lock_irqsave(&pool->pool_lock, flags); + fmr = ib_fmr_cache_lookup(pool, + page_list, + list_len, + *io_virtual_address); + if (fmr) { + /* found in cache */ + ++fmr->ref_count; + if (fmr->ref_count == 1) { + list_del(&fmr->list); + } + + spin_unlock_irqrestore(&pool->pool_lock, flags); + + return fmr; + } + + if (list_empty(&pool->free_list)) { + spin_unlock_irqrestore(&pool->pool_lock, flags); + return ERR_PTR(-EAGAIN); + } + + fmr = list_entry(pool->free_list.next, struct ib_pool_fmr, list); + list_del(&fmr->list); + hlist_del_init(&fmr->cache_node); + spin_unlock_irqrestore(&pool->pool_lock, flags); + + result = ib_map_phys_fmr(fmr->fmr, page_list, list_len, + *io_virtual_address); + + if (result) { + spin_lock_irqsave(&pool->pool_lock, flags); + list_add(&fmr->list, &pool->free_list); + spin_unlock_irqrestore(&pool->pool_lock, flags); + + printk(KERN_WARNING "fmr_map returns %d", + result); + + return ERR_PTR(result); + } + + ++fmr->remap_count; + fmr->ref_count = 1; + + if (pool->cache_bucket) { + fmr->io_virtual_address = *io_virtual_address; + fmr->page_list_len = list_len; + memcpy(fmr->page_list, page_list, list_len 
* sizeof(*page_list)); + + spin_lock_irqsave(&pool->pool_lock, flags); + hlist_add_head(&fmr->cache_node, + pool->cache_bucket + ib_fmr_hash(fmr->page_list[0])); + spin_unlock_irqrestore(&pool->pool_lock, flags); + } + + return fmr; +} +EXPORT_SYMBOL(ib_fmr_pool_map_phys); + +/** + * ib_fmr_pool_unmap - Unmap FMR + * @fmr:FMR to unmap + * + * Unmap an FMR. The FMR mapping may remain valid until the FMR is + * reused (or until ib_flush_fmr_pool() is called). + */ +int ib_fmr_pool_unmap(struct ib_pool_fmr *fmr) +{ + struct ib_fmr_pool *pool; + unsigned long flags; + + pool = fmr->pool; + + spin_lock_irqsave(&pool->pool_lock, flags); + + --fmr->ref_count; + if (!fmr->ref_count) { + if (fmr->remap_count < IB_FMR_MAX_REMAPS) { + list_add_tail(&fmr->list, &pool->free_list); + } else { + list_add_tail(&fmr->list, &pool->dirty_list); + ++pool->dirty_len; + wake_up_process(pool->thread); + } + } + +#ifdef DEBUG + if (fmr->ref_count < 0) + printk(KERN_WARNING "FMR %p has ref count %d < 0", + fmr, fmr->ref_count); +#endif + + spin_unlock_irqrestore(&pool->pool_lock, flags); + + return 0; +} +EXPORT_SYMBOL(ib_fmr_pool_unmap); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/packer.c 2004-12-27 21:48:18.385259982 -0800 @@ -0,0 +1,201 @@ +/* + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: packer.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include + +static u64 value_read(int offset, int size, void *structure) +{ + switch (size) { + case 1: return *(u8 *) (structure + offset); + case 2: return be16_to_cpup((__be16 *) (structure + offset)); + case 4: return be32_to_cpup((__be32 *) (structure + offset)); + case 8: return be64_to_cpup((__be64 *) (structure + offset)); + default: + printk(KERN_WARNING "Field size %d bits not handled\n", size * 8); + return 0; + } +} + +/** + * ib_pack - Pack a structure into a buffer + * @desc:Array of structure field descriptions + * @desc_len:Number of entries in @desc + * @structure:Structure to pack from + * @buf:Buffer to pack into + * + * ib_pack() packs a list of structure fields into a buffer, + * controlled by the array of fields in @desc. 
+ */ +void ib_pack(const struct ib_field *desc, + int desc_len, + void *structure, + void *buf) +{ + int i; + + for (i = 0; i < desc_len; ++i) { + if (desc[i].size_bits <= 32) { + int shift; + u32 val; + __be32 mask; + __be32 *addr; + + shift = 32 - desc[i].offset_bits - desc[i].size_bits; + if (desc[i].struct_size_bytes) + val = value_read(desc[i].struct_offset_bytes, + desc[i].struct_size_bytes, + structure) << shift; + else + val = 0; + + mask = cpu_to_be32(((1ull << desc[i].size_bits) - 1) << shift); + addr = (__be32 *) buf + desc[i].offset_words; + *addr = (*addr & ~mask) | (cpu_to_be32(val) & mask); + } else if (desc[i].size_bits <= 64) { + int shift; + u64 val; + __be64 mask; + __be64 *addr; + + shift = 64 - desc[i].offset_bits - desc[i].size_bits; + if (desc[i].struct_size_bytes) + val = value_read(desc[i].struct_offset_bytes, + desc[i].struct_size_bytes, + structure) << shift; + else + val = 0; + + mask = cpu_to_be64(((1ull << desc[i].size_bits) - 1) << shift); + addr = (__be64 *) ((__be32 *) buf + desc[i].offset_words); + *addr = (*addr & ~mask) | (cpu_to_be64(val) & mask); + } else { + if (desc[i].offset_bits % 8 || + desc[i].size_bits % 8) { + printk(KERN_WARNING "Structure field %s of size %d " + "bits is not byte-aligned\n", + desc[i].field_name, desc[i].size_bits); + } + + if (desc[i].struct_size_bytes) + memcpy(buf + desc[i].offset_words * 4 + + desc[i].offset_bits / 8, + structure + desc[i].struct_offset_bytes, + desc[i].size_bits / 8); + else + memset(buf + desc[i].offset_words * 4 + + desc[i].offset_bits / 8, + 0, + desc[i].size_bits / 8); + } + } +} +EXPORT_SYMBOL(ib_pack); + +static void value_write(int offset, int size, u64 val, void *structure) +{ + switch (size * 8) { + case 8: *( u8 *) (structure + offset) = val; break; + case 16: *(__be16 *) (structure + offset) = cpu_to_be16(val); break; + case 32: *(__be32 *) (structure + offset) = cpu_to_be32(val); break; + case 64: *(__be64 *) (structure + offset) = cpu_to_be64(val); break; + default: 
+ printk(KERN_WARNING "Field size %d bits not handled\n", size * 8); + } +} + +/** + * ib_unpack - Unpack a buffer into a structure + * @desc:Array of structure field descriptions + * @desc_len:Number of entries in @desc + * @buf:Buffer to unpack from + * @structure:Structure to unpack into + * + * ib_unpack() unpacks a list of structure fields from a buffer, + * controlled by the array of fields in @desc. + */ +void ib_unpack(const struct ib_field *desc, + int desc_len, + void *buf, + void *structure) +{ + int i; + + for (i = 0; i < desc_len; ++i) { + if (!desc[i].struct_size_bytes) + continue; + + if (desc[i].size_bits <= 32) { + int shift; + u32 val; + u32 mask; + __be32 *addr; + + shift = 32 - desc[i].offset_bits - desc[i].size_bits; + mask = ((1ull << desc[i].size_bits) - 1) << shift; + addr = (__be32 *) buf + desc[i].offset_words; + val = (be32_to_cpup(addr) & mask) >> shift; + value_write(desc[i].struct_offset_bytes, + desc[i].struct_size_bytes, + val, + structure); + } else if (desc[i].size_bits <= 64) { + int shift; + u64 val; + u64 mask; + __be64 *addr; + + shift = 64 - desc[i].offset_bits - desc[i].size_bits; + mask = ((1ull << desc[i].size_bits) - 1) << shift; + addr = (__be64 *) buf + desc[i].offset_words; + val = (be64_to_cpup(addr) & mask) >> shift; + value_write(desc[i].struct_offset_bytes, + desc[i].struct_size_bytes, + val, + structure); + } else { + if (desc[i].offset_bits % 8 || + desc[i].size_bits % 8) { + printk(KERN_WARNING "Structure field %s of size %d " + "bits is not byte-aligned\n", + desc[i].field_name, desc[i].size_bits); + } + + memcpy(structure + desc[i].struct_offset_bytes, + buf + desc[i].offset_words * 4 + + desc[i].offset_bits / 8, + desc[i].size_bits / 8); + } + } +} +EXPORT_SYMBOL(ib_unpack); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/sysfs.c 2004-12-27 21:48:18.498243351 -0800 @@ -0,0 +1,725 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. 
+ * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: sysfs.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include "core_priv.h" + +#include + +struct ib_port { + struct kobject kobj; + struct ib_device *ibdev; + struct attribute_group gid_group; + struct attribute **gid_attr; + struct attribute_group pkey_group; + struct attribute **pkey_attr; + u8 port_num; +}; + +struct port_attribute { + struct attribute attr; + ssize_t (*show)(struct ib_port *, struct port_attribute *, char *buf); + ssize_t (*store)(struct ib_port *, struct port_attribute *, + const char *buf, size_t count); +}; + +#define PORT_ATTR(_name, _mode, _show, _store) \ +struct port_attribute port_attr_##_name = __ATTR(_name, _mode, _show, _store) + +#define PORT_ATTR_RO(_name) \ +struct port_attribute port_attr_##_name = __ATTR_RO(_name) + +struct port_table_attribute { + struct port_attribute attr; + int index; +}; + +static ssize_t port_attr_show(struct kobject *kobj, + struct attribute *attr, char *buf) +{ + struct port_attribute *port_attr = + container_of(attr, struct port_attribute, attr); + struct ib_port *p = container_of(kobj, struct ib_port, kobj); + + if (!port_attr->show) + return 0; + + return port_attr->show(p, port_attr, buf); +} + +static struct sysfs_ops port_sysfs_ops = { + .show = port_attr_show +}; + +static ssize_t state_show(struct ib_port *p, struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + ssize_t ret; + + static const char *state_name[] = { + [IB_PORT_NOP] = "NOP", + [IB_PORT_DOWN] = "DOWN", + [IB_PORT_INIT] = "INIT", + [IB_PORT_ARMED] = "ARMED", + [IB_PORT_ACTIVE] = "ACTIVE", + [IB_PORT_ACTIVE_DEFER] = "ACTIVE_DEFER" + }; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + return sprintf(buf, "%d: %s\n", attr.state, + attr.state >= 0 && attr.state < ARRAY_SIZE(state_name) ? 
+ state_name[attr.state] : "UNKNOWN"); +} + +static ssize_t lid_show(struct ib_port *p, struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + ssize_t ret; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + return sprintf(buf, "0x%x\n", attr.lid); +} + +static ssize_t lid_mask_count_show(struct ib_port *p, + struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + ssize_t ret; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + return sprintf(buf, "%d\n", attr.lmc); +} + +static ssize_t sm_lid_show(struct ib_port *p, struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + ssize_t ret; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + return sprintf(buf, "0x%x\n", attr.sm_lid); +} + +static ssize_t sm_sl_show(struct ib_port *p, struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + ssize_t ret; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + return sprintf(buf, "%d\n", attr.sm_sl); +} + +static ssize_t cap_mask_show(struct ib_port *p, struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + ssize_t ret; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + return sprintf(buf, "0x%08x\n", attr.port_cap_flags); +} + +static ssize_t rate_show(struct ib_port *p, struct port_attribute *unused, + char *buf) +{ + struct ib_port_attr attr; + char *speed = ""; + int rate; + ssize_t ret; + + ret = ib_query_port(p->ibdev, p->port_num, &attr); + if (ret) + return ret; + + switch (attr.active_speed) { + case 2: speed = " DDR"; break; + case 4: speed = " QDR"; break; + } + + printk(KERN_ERR "width %d speed %d\n", attr.active_width, attr.active_speed); + + rate = 25 * ib_width_enum_to_int(attr.active_width) * attr.active_speed; + if (rate < 0) + return -EINVAL; + + return sprintf(buf, "%d%s Gb/sec (%dX%s)\n", + rate 
/ 10, rate % 10 ? ".5" : "", + ib_width_enum_to_int(attr.active_width), speed); +} + +static PORT_ATTR_RO(state); +static PORT_ATTR_RO(lid); +static PORT_ATTR_RO(lid_mask_count); +static PORT_ATTR_RO(sm_lid); +static PORT_ATTR_RO(sm_sl); +static PORT_ATTR_RO(cap_mask); +static PORT_ATTR_RO(rate); + +static struct attribute *port_default_attrs[] = { + &port_attr_state.attr, + &port_attr_lid.attr, + &port_attr_lid_mask_count.attr, + &port_attr_sm_lid.attr, + &port_attr_sm_sl.attr, + &port_attr_cap_mask.attr, + &port_attr_rate.attr, + NULL +}; + +static ssize_t show_port_gid(struct ib_port *p, struct port_attribute *attr, + char *buf) +{ + struct port_table_attribute *tab_attr = + container_of(attr, struct port_table_attribute, attr); + union ib_gid gid; + ssize_t ret; + + ret = ib_query_gid(p->ibdev, p->port_num, tab_attr->index, &gid); + if (ret) + return ret; + + return sprintf(buf, "%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n", + be16_to_cpu(((u16 *) gid.raw)[0]), + be16_to_cpu(((u16 *) gid.raw)[1]), + be16_to_cpu(((u16 *) gid.raw)[2]), + be16_to_cpu(((u16 *) gid.raw)[3]), + be16_to_cpu(((u16 *) gid.raw)[4]), + be16_to_cpu(((u16 *) gid.raw)[5]), + be16_to_cpu(((u16 *) gid.raw)[6]), + be16_to_cpu(((u16 *) gid.raw)[7])); +} + +static ssize_t show_port_pkey(struct ib_port *p, struct port_attribute *attr, + char *buf) +{ + struct port_table_attribute *tab_attr = + container_of(attr, struct port_table_attribute, attr); + u16 pkey; + ssize_t ret; + + ret = ib_query_pkey(p->ibdev, p->port_num, tab_attr->index, &pkey); + if (ret) + return ret; + + return sprintf(buf, "0x%04x\n", pkey); +} + +#define PORT_PMA_ATTR(_name, _counter, _width, _offset) \ +struct port_table_attribute port_pma_attr_##_name = { \ + .attr = __ATTR(_name, S_IRUGO, show_pma_counter, NULL), \ + .index = (_offset) | ((_width) << 16) | ((_counter) << 24) \ +} + +static ssize_t show_pma_counter(struct ib_port *p, struct port_attribute *attr, + char *buf) +{ + struct port_table_attribute *tab_attr = + 
container_of(attr, struct port_table_attribute, attr); + int offset = tab_attr->index & 0xffff; + int width = (tab_attr->index >> 16) & 0xff; + struct ib_mad *in_mad = NULL; + struct ib_mad *out_mad = NULL; + ssize_t ret; + + if (!p->ibdev->process_mad) + return sprintf(buf, "N/A (no PMA)\n"); + + in_mad = kmalloc(sizeof *in_mad, GFP_KERNEL); + out_mad = kmalloc(sizeof *out_mad, GFP_KERNEL); + if (!in_mad || !out_mad) { + ret = -ENOMEM; + goto out; + } + + memset(in_mad, 0, sizeof *in_mad); + in_mad->mad_hdr.base_version = 1; + in_mad->mad_hdr.mgmt_class = IB_MGMT_CLASS_PERF_MGMT; + in_mad->mad_hdr.class_version = 1; + in_mad->mad_hdr.method = IB_MGMT_METHOD_GET; + in_mad->mad_hdr.attr_id = cpu_to_be16(0x12); /* PortCounters */ + + in_mad->data[41] = p->port_num; /* PortSelect field */ + + if ((p->ibdev->process_mad(p->ibdev, IB_MAD_IGNORE_MKEY, p->port_num, 0xffff, + in_mad, out_mad) & + (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) != + (IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY)) { + ret = -EINVAL; + goto out; + } + + switch (width) { + case 4: + ret = sprintf(buf, "%u\n", (out_mad->data[40 + offset / 8] >> + (4 - (offset % 8))) & 0xf); + break; + case 8: + ret = sprintf(buf, "%u\n", out_mad->data[40 + offset / 8]); + break; + case 16: + ret = sprintf(buf, "%u\n", + be16_to_cpup((u16 *)(out_mad->data + 40 + offset / 8))); + break; + case 32: + ret = sprintf(buf, "%u\n", + be32_to_cpup((u32 *)(out_mad->data + 40 + offset / 8))); + break; + default: + ret = 0; + } + +out: + kfree(in_mad); + kfree(out_mad); + + return ret; +} + +static PORT_PMA_ATTR(symbol_error , 0, 16, 32); +static PORT_PMA_ATTR(link_error_recovery , 1, 8, 48); +static PORT_PMA_ATTR(link_downed , 2, 8, 56); +static PORT_PMA_ATTR(port_rcv_errors , 3, 16, 64); +static PORT_PMA_ATTR(port_rcv_remote_physical_errors, 4, 16, 80); +static PORT_PMA_ATTR(port_rcv_switch_relay_errors , 5, 16, 96); +static PORT_PMA_ATTR(port_xmit_discards , 6, 16, 112); +static PORT_PMA_ATTR(port_xmit_constraint_errors , 7, 8, 
128); +static PORT_PMA_ATTR(port_rcv_constraint_errors , 8, 8, 136); +static PORT_PMA_ATTR(local_link_integrity_errors , 9, 4, 152); +static PORT_PMA_ATTR(excessive_buffer_overrun_errors, 10, 4, 156); +static PORT_PMA_ATTR(VL15_dropped , 11, 16, 176); +static PORT_PMA_ATTR(port_xmit_data , 12, 32, 192); +static PORT_PMA_ATTR(port_rcv_data , 13, 32, 224); +static PORT_PMA_ATTR(port_xmit_packets , 14, 32, 256); +static PORT_PMA_ATTR(port_rcv_packets , 15, 32, 288); + +static struct attribute *pma_attrs[] = { + &port_pma_attr_symbol_error.attr.attr, + &port_pma_attr_link_error_recovery.attr.attr, + &port_pma_attr_link_downed.attr.attr, + &port_pma_attr_port_rcv_errors.attr.attr, + &port_pma_attr_port_rcv_remote_physical_errors.attr.attr, + &port_pma_attr_port_rcv_switch_relay_errors.attr.attr, + &port_pma_attr_port_xmit_discards.attr.attr, + &port_pma_attr_port_xmit_constraint_errors.attr.attr, + &port_pma_attr_port_rcv_constraint_errors.attr.attr, + &port_pma_attr_local_link_integrity_errors.attr.attr, + &port_pma_attr_excessive_buffer_overrun_errors.attr.attr, + &port_pma_attr_VL15_dropped.attr.attr, + &port_pma_attr_port_xmit_data.attr.attr, + &port_pma_attr_port_rcv_data.attr.attr, + &port_pma_attr_port_xmit_packets.attr.attr, + &port_pma_attr_port_rcv_packets.attr.attr, + NULL +}; + +static struct attribute_group pma_group = { + .name = "counters", + .attrs = pma_attrs +}; + +static void ib_port_release(struct kobject *kobj) +{ + struct ib_port *p = container_of(kobj, struct ib_port, kobj); + struct attribute *a; + int i; + + for (i = 0; (a = p->gid_attr[i]); ++i) { + kfree(a->name); + kfree(a); + } + + for (i = 0; (a = p->pkey_attr[i]); ++i) { + kfree(a->name); + kfree(a); + } + + kfree(p->gid_attr); + kfree(p); +} + +static struct kobj_type port_type = { + .release = ib_port_release, + .sysfs_ops = &port_sysfs_ops, + .default_attrs = port_default_attrs +}; + +static void ib_device_release(struct class_device *cdev) +{ + struct ib_device *dev = 
container_of(cdev, struct ib_device, class_dev); + + kfree(dev); +} + +static int ib_device_hotplug(struct class_device *cdev, char **envp, + int num_envp, char *buf, int size) +{ + struct ib_device *dev = container_of(cdev, struct ib_device, class_dev); + int i = 0, len = 0; + + if (add_hotplug_env_var(envp, num_envp, &i, buf, size, &len, + "NAME=%s", dev->name)) + return -ENOMEM; + + /* + * It might be nice to pass the node GUID to hotplug, but + * right now the only way to get it is to query the device + * provider, and this can crash during device removal because + * we will be running after driver removal has started. + * We could add a node_guid field to struct ib_device, or we + * could just let the hotplug script read the node GUID from + * sysfs when devices are added. + */ + + envp[i] = NULL; + return 0; +} + +static int alloc_group(struct attribute ***attr, + ssize_t (*show)(struct ib_port *, + struct port_attribute *, char *buf), + int len) +{ + struct port_table_attribute ***tab_attr = + (struct port_table_attribute ***) attr; + int i; + int ret; + + *tab_attr = kmalloc((1 + len) * sizeof *tab_attr, GFP_KERNEL); + if (!*tab_attr) + return -ENOMEM; + + memset(*tab_attr, 0, (1 + len) * sizeof *tab_attr); + + for (i = 0; i < len; ++i) { + (*tab_attr)[i] = kmalloc(sizeof *(*tab_attr)[i], GFP_KERNEL); + if (!(*tab_attr)[i]) { + ret = -ENOMEM; + goto err; + } + memset((*tab_attr)[i], 0, sizeof *(*tab_attr)[i]); + (*tab_attr)[i]->attr.attr.name = kmalloc(8, GFP_KERNEL); + if (!(*tab_attr)[i]->attr.attr.name) { + ret = -ENOMEM; + goto err; + } + + if (snprintf((*tab_attr)[i]->attr.attr.name, 8, "%d", i) >= 8) { + ret = -ENOMEM; + goto err; + } + + (*tab_attr)[i]->attr.attr.mode = S_IRUGO; + (*tab_attr)[i]->attr.attr.owner = THIS_MODULE; + (*tab_attr)[i]->attr.show = show; + (*tab_attr)[i]->index = i; + } + + return 0; + +err: + for (i = 0; i < len; ++i) { + if ((*tab_attr)[i]) + kfree((*tab_attr)[i]->attr.attr.name); + kfree((*tab_attr)[i]); + } + + 
kfree(*tab_attr); + + return ret; +} + +static int add_port(struct ib_device *device, int port_num) +{ + struct ib_port *p; + struct ib_port_attr attr; + int i; + int ret; + + ret = ib_query_port(device, port_num, &attr); + if (ret) + return ret; + + p = kmalloc(sizeof *p, GFP_KERNEL); + if (!p) + return -ENOMEM; + memset(p, 0, sizeof *p); + + p->ibdev = device; + p->port_num = port_num; + p->kobj.ktype = &port_type; + + p->kobj.parent = kobject_get(&device->ports_parent); + if (!p->kobj.parent) { + ret = -EBUSY; + goto err; + } + + ret = kobject_set_name(&p->kobj, "%d", port_num); + if (ret) + goto err_put; + + ret = kobject_register(&p->kobj); + if (ret) + goto err_put; + + ret = sysfs_create_group(&p->kobj, &pma_group); + if (ret) + goto err_put; + + ret = alloc_group(&p->gid_attr, show_port_gid, attr.gid_tbl_len); + if (ret) + goto err_remove_pma; + + p->gid_group.name = "gids"; + p->gid_group.attrs = p->gid_attr; + + ret = sysfs_create_group(&p->kobj, &p->gid_group); + if (ret) + goto err_free_gid; + + ret = alloc_group(&p->pkey_attr, show_port_pkey, attr.pkey_tbl_len); + if (ret) + goto err_remove_gid; + + p->pkey_group.name = "pkeys"; + p->pkey_group.attrs = p->pkey_attr; + + ret = sysfs_create_group(&p->kobj, &p->pkey_group); + if (ret) + goto err_free_pkey; + + list_add_tail(&p->kobj.entry, &device->port_list); + + return 0; + +err_free_pkey: + for (i = 0; i < attr.pkey_tbl_len; ++i) { + kfree(p->pkey_attr[i]->name); + kfree(p->pkey_attr[i]); + } + + kfree(p->pkey_attr); + +err_remove_gid: + sysfs_remove_group(&p->kobj, &p->gid_group); + +err_free_gid: + for (i = 0; i < attr.gid_tbl_len; ++i) { + kfree(p->gid_attr[i]->name); + kfree(p->gid_attr[i]); + } + + kfree(p->gid_attr); + +err_remove_pma: + sysfs_remove_group(&p->kobj, &pma_group); + +err_put: + kobject_put(&device->ports_parent); + +err: + kfree(p); + return ret; +} + +static ssize_t show_sys_image_guid(struct class_device *cdev, char *buf) +{ + struct ib_device *dev = container_of(cdev, struct 
ib_device, class_dev); + struct ib_device_attr attr; + ssize_t ret; + + ret = ib_query_device(dev, &attr); + if (ret) + return ret; + + return sprintf(buf, "%04x:%04x:%04x:%04x\n", + be16_to_cpu(((u16 *) &attr.sys_image_guid)[0]), + be16_to_cpu(((u16 *) &attr.sys_image_guid)[1]), + be16_to_cpu(((u16 *) &attr.sys_image_guid)[2]), + be16_to_cpu(((u16 *) &attr.sys_image_guid)[3])); +} + +static ssize_t show_node_guid(struct class_device *cdev, char *buf) +{ + struct ib_device *dev = container_of(cdev, struct ib_device, class_dev); + struct ib_device_attr attr; + ssize_t ret; + + ret = ib_query_device(dev, &attr); + if (ret) + return ret; + + return sprintf(buf, "%04x:%04x:%04x:%04x\n", + be16_to_cpu(((u16 *) &attr.node_guid)[0]), + be16_to_cpu(((u16 *) &attr.node_guid)[1]), + be16_to_cpu(((u16 *) &attr.node_guid)[2]), + be16_to_cpu(((u16 *) &attr.node_guid)[3])); +} + +static CLASS_DEVICE_ATTR(sys_image_guid, S_IRUGO, show_sys_image_guid, NULL); +static CLASS_DEVICE_ATTR(node_guid, S_IRUGO, show_node_guid, NULL); + +static struct class_device_attribute *ib_class_attributes[] = { + &class_device_attr_sys_image_guid, + &class_device_attr_node_guid +}; + +static struct class ib_class = { + .name = "infiniband", + .release = ib_device_release, + .hotplug = ib_device_hotplug, +}; + +int ib_device_register_sysfs(struct ib_device *device) +{ + struct class_device *class_dev = &device->class_dev; + int ret; + int i; + + class_dev->class = &ib_class; + class_dev->class_data = device; + strlcpy(class_dev->class_id, device->name, BUS_ID_SIZE); + + INIT_LIST_HEAD(&device->port_list); + + ret = class_device_register(class_dev); + if (ret) + goto err; + + for (i = 0; i < ARRAY_SIZE(ib_class_attributes); ++i) { + ret = class_device_create_file(class_dev, ib_class_attributes[i]); + if (ret) + goto err_unregister; + } + + device->ports_parent.parent = kobject_get(&class_dev->kobj); + if (!device->ports_parent.parent) { + ret = -EBUSY; + goto err_unregister; + } + ret = 
kobject_set_name(&device->ports_parent, "ports"); + if (ret) + goto err_put; + ret = kobject_register(&device->ports_parent); + if (ret) + goto err_put; + + if (device->node_type == IB_NODE_SWITCH) { + ret = add_port(device, 0); + if (ret) + goto err_put; + } else { + int i; + + for (i = 1; i <= device->phys_port_cnt; ++i) { + ret = add_port(device, i); + if (ret) + goto err_put; + } + } + + return 0; + +err_put: + { + struct kobject *p, *t; + struct ib_port *port; + + list_for_each_entry_safe(p, t, &device->port_list, entry) { + list_del(&p->entry); + port = container_of(p, struct ib_port, kobj); + sysfs_remove_group(p, &pma_group); + sysfs_remove_group(p, &port->pkey_group); + sysfs_remove_group(p, &port->gid_group); + kobject_unregister(p); + } + } + + kobject_put(&class_dev->kobj); + +err_unregister: + class_device_unregister(class_dev); + +err: + return ret; +} + +void ib_device_unregister_sysfs(struct ib_device *device) +{ + struct kobject *p, *t; + struct ib_port *port; + + list_for_each_entry_safe(p, t, &device->port_list, entry) { + list_del(&p->entry); + port = container_of(p, struct ib_port, kobj); + sysfs_remove_group(p, &pma_group); + sysfs_remove_group(p, &port->pkey_group); + sysfs_remove_group(p, &port->gid_group); + kobject_unregister(p); + } + + kobject_unregister(&device->ports_parent); + class_device_unregister(&device->class_dev); +} + +int ib_sysfs_setup(void) +{ + return class_register(&ib_class); +} + +void ib_sysfs_cleanup(void) +{ + class_unregister(&ib_class); +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/ud_header.c 2004-12-27 21:48:18.428253653 -0800 @@ -0,0 +1,365 @@ +/* + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ud_header.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include + +#include + +#define STRUCT_FIELD(header, field) \ + .struct_offset_bytes = offsetof(struct ib_unpacked_ ## header, field), \ + .struct_size_bytes = sizeof ((struct ib_unpacked_ ## header *) 0)->field, \ + .field_name = #header ":" #field + +static const struct ib_field lrh_table[] = { + { STRUCT_FIELD(lrh, virtual_lane), + .offset_words = 0, + .offset_bits = 0, + .size_bits = 4 }, + { STRUCT_FIELD(lrh, link_version), + .offset_words = 0, + .offset_bits = 4, + .size_bits = 4 }, + { STRUCT_FIELD(lrh, service_level), + .offset_words = 0, + .offset_bits = 8, + .size_bits = 4 }, + { RESERVED, + .offset_words = 0, + .offset_bits = 12, + .size_bits = 2 }, + { STRUCT_FIELD(lrh, link_next_header), + .offset_words = 0, + .offset_bits = 14, + .size_bits = 2 }, + { STRUCT_FIELD(lrh, destination_lid), + .offset_words = 0, + .offset_bits = 16, + .size_bits = 16 }, + { RESERVED, + .offset_words = 1, + .offset_bits = 0, + .size_bits = 5 }, + { STRUCT_FIELD(lrh, packet_length), + .offset_words = 1, + .offset_bits = 5, + .size_bits = 11 }, + { STRUCT_FIELD(lrh, source_lid), + .offset_words = 1, + .offset_bits = 16, + .size_bits = 16 } +}; + +static const struct ib_field grh_table[] = { + { STRUCT_FIELD(grh, ip_version), + .offset_words = 0, + .offset_bits = 0, + .size_bits = 4 }, + { STRUCT_FIELD(grh, traffic_class), + .offset_words = 0, + .offset_bits = 4, + .size_bits = 8 }, + { STRUCT_FIELD(grh, flow_label), + .offset_words = 0, + .offset_bits = 12, + .size_bits = 20 }, + { STRUCT_FIELD(grh, payload_length), + .offset_words = 1, + .offset_bits = 0, + .size_bits = 16 }, + { STRUCT_FIELD(grh, next_header), + .offset_words = 1, + .offset_bits = 16, + .size_bits = 8 }, + { STRUCT_FIELD(grh, hop_limit), + .offset_words = 1, + .offset_bits = 24, + .size_bits = 8 }, + { STRUCT_FIELD(grh, source_gid), + .offset_words = 2, + .offset_bits = 0, + .size_bits = 128 }, + { STRUCT_FIELD(grh, destination_gid), + 
.offset_words = 6, + .offset_bits = 0, + .size_bits = 128 } +}; + +static const struct ib_field bth_table[] = { + { STRUCT_FIELD(bth, opcode), + .offset_words = 0, + .offset_bits = 0, + .size_bits = 8 }, + { STRUCT_FIELD(bth, solicited_event), + .offset_words = 0, + .offset_bits = 8, + .size_bits = 1 }, + { STRUCT_FIELD(bth, mig_req), + .offset_words = 0, + .offset_bits = 9, + .size_bits = 1 }, + { STRUCT_FIELD(bth, pad_count), + .offset_words = 0, + .offset_bits = 10, + .size_bits = 2 }, + { STRUCT_FIELD(bth, transport_header_version), + .offset_words = 0, + .offset_bits = 12, + .size_bits = 4 }, + { STRUCT_FIELD(bth, pkey), + .offset_words = 0, + .offset_bits = 16, + .size_bits = 16 }, + { RESERVED, + .offset_words = 1, + .offset_bits = 0, + .size_bits = 8 }, + { STRUCT_FIELD(bth, destination_qpn), + .offset_words = 1, + .offset_bits = 8, + .size_bits = 24 }, + { STRUCT_FIELD(bth, ack_req), + .offset_words = 2, + .offset_bits = 0, + .size_bits = 1 }, + { RESERVED, + .offset_words = 2, + .offset_bits = 1, + .size_bits = 7 }, + { STRUCT_FIELD(bth, psn), + .offset_words = 2, + .offset_bits = 8, + .size_bits = 24 } +}; + +static const struct ib_field deth_table[] = { + { STRUCT_FIELD(deth, qkey), + .offset_words = 0, + .offset_bits = 0, + .size_bits = 32 }, + { RESERVED, + .offset_words = 1, + .offset_bits = 0, + .size_bits = 8 }, + { STRUCT_FIELD(deth, source_qpn), + .offset_words = 1, + .offset_bits = 8, + .size_bits = 24 } +}; + +/** + * ib_ud_header_init - Initialize UD header structure + * @payload_bytes:Length of packet payload + * @grh_present:GRH flag (if non-zero, GRH will be included) + * @header:Structure to initialize + * + * ib_ud_header_init() initializes the lrh.link_version, lrh.link_next_header, + * lrh.packet_length, grh.ip_version, grh.payload_length, + * grh.next_header, bth.opcode, bth.pad_count and + * bth.transport_header_version fields of a &struct ib_ud_header given + * the payload length and whether a GRH will be included. 
+ */ +void ib_ud_header_init(int payload_bytes, + int grh_present, + struct ib_ud_header *header) +{ + int header_len; + + memset(header, 0, sizeof *header); + + header_len = + IB_LRH_BYTES + + IB_BTH_BYTES + + IB_DETH_BYTES; + if (grh_present) { + header_len += IB_GRH_BYTES; + } + + header->lrh.link_version = 0; + header->lrh.link_next_header = + grh_present ? IB_LNH_IBA_GLOBAL : IB_LNH_IBA_LOCAL; + header->lrh.packet_length = (IB_LRH_BYTES + + IB_BTH_BYTES + + IB_DETH_BYTES + + payload_bytes + + 4 + /* ICRC */ + 3) / 4; /* round up */ + + header->grh_present = grh_present; + if (grh_present) { + header->lrh.packet_length += IB_GRH_BYTES / 4; + + header->grh.ip_version = 6; + header->grh.payload_length = + cpu_to_be16((IB_BTH_BYTES + + IB_DETH_BYTES + + payload_bytes + + 4 + /* ICRC */ + 3) & ~3); /* round up */ + header->grh.next_header = 0x1b; + } + + cpu_to_be16s(&header->lrh.packet_length); + + if (header->immediate_present) + header->bth.opcode = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE; + else + header->bth.opcode = IB_OPCODE_UD_SEND_ONLY; + header->bth.pad_count = (4 - payload_bytes) & 3; + header->bth.transport_header_version = 0; +} +EXPORT_SYMBOL(ib_ud_header_init); + +/** + * ib_ud_header_pack - Pack UD header struct into wire format + * @header:UD header struct + * @buf:Buffer to pack into + * + * ib_ud_header_pack() packs the UD header structure @header into wire + * format in the buffer @buf. 
+ */ +int ib_ud_header_pack(struct ib_ud_header *header, + void *buf) +{ + int len = 0; + + ib_pack(lrh_table, ARRAY_SIZE(lrh_table), + &header->lrh, buf); + len += IB_LRH_BYTES; + + if (header->grh_present) { + ib_pack(grh_table, ARRAY_SIZE(grh_table), + &header->grh, buf + len); + len += IB_GRH_BYTES; + } + + ib_pack(bth_table, ARRAY_SIZE(bth_table), + &header->bth, buf + len); + len += IB_BTH_BYTES; + + ib_pack(deth_table, ARRAY_SIZE(deth_table), + &header->deth, buf + len); + len += IB_DETH_BYTES; + + if (header->immediate_present) { + memcpy(buf + len, &header->immediate_data, sizeof header->immediate_data); + len += sizeof header->immediate_data; + } + + return len; +} +EXPORT_SYMBOL(ib_ud_header_pack); + +/** + * ib_ud_header_unpack - Unpack UD header struct from wire format + * @header:UD header struct + * @buf:Buffer to unpack from + * + * ib_ud_header_unpack() unpacks the UD header structure @header from wire + * format in the buffer @buf. + */ +int ib_ud_header_unpack(void *buf, + struct ib_ud_header *header) +{ + ib_unpack(lrh_table, ARRAY_SIZE(lrh_table), + buf, &header->lrh); + buf += IB_LRH_BYTES; + + if (header->lrh.link_version != 0) { + printk(KERN_WARNING "Invalid LRH.link_version %d\n", + header->lrh.link_version); + return -EINVAL; + } + + switch (header->lrh.link_next_header) { + case IB_LNH_IBA_LOCAL: + header->grh_present = 0; + break; + + case IB_LNH_IBA_GLOBAL: + header->grh_present = 1; + ib_unpack(grh_table, ARRAY_SIZE(grh_table), + buf, &header->grh); + buf += IB_GRH_BYTES; + + if (header->grh.ip_version != 6) { + printk(KERN_WARNING "Invalid GRH.ip_version %d\n", + header->grh.ip_version); + return -EINVAL; + } + if (header->grh.next_header != 0x1b) { + printk(KERN_WARNING "Invalid GRH.next_header 0x%02x\n", + header->grh.next_header); + return -EINVAL; + } + break; + + default: + printk(KERN_WARNING "Invalid LRH.link_next_header %d\n", + header->lrh.link_next_header); + return -EINVAL; + } + + ib_unpack(bth_table, ARRAY_SIZE(bth_table), + 
buf, &header->bth); + buf += IB_BTH_BYTES; + + switch (header->bth.opcode) { + case IB_OPCODE_UD_SEND_ONLY: + header->immediate_present = 0; + break; + case IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE: + header->immediate_present = 1; + break; + default: + printk(KERN_WARNING "Invalid BTH.opcode 0x%02x\n", + header->bth.opcode); + return -EINVAL; + } + + if (header->bth.transport_header_version != 0) { + printk(KERN_WARNING "Invalid BTH.transport_header_version %d\n", + header->bth.transport_header_version); + return -EINVAL; + } + + ib_unpack(deth_table, ARRAY_SIZE(deth_table), + buf, &header->deth); + buf += IB_DETH_BYTES; + + if (header->immediate_present) + memcpy(&header->immediate_data, buf, sizeof header->immediate_data); + + return 0; +} +EXPORT_SYMBOL(ib_ud_header_unpack); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/verbs.c 2004-12-27 21:48:18.453249974 -0800 @@ -0,0 +1,433 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: verbs.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include + +#include + +/* Protection domains */ + +struct ib_pd *ib_alloc_pd(struct ib_device *device) +{ + struct ib_pd *pd; + + pd = device->alloc_pd(device); + + if (!IS_ERR(pd)) { + pd->device = device; + atomic_set(&pd->usecnt, 0); + } + + return pd; +} +EXPORT_SYMBOL(ib_alloc_pd); + +int ib_dealloc_pd(struct ib_pd *pd) +{ + if (atomic_read(&pd->usecnt)) + return -EBUSY; + + return pd->device->dealloc_pd(pd); +} +EXPORT_SYMBOL(ib_dealloc_pd); + +/* Address handles */ + +struct ib_ah *ib_create_ah(struct ib_pd *pd, struct ib_ah_attr *ah_attr) +{ + struct ib_ah *ah; + + ah = pd->device->create_ah(pd, ah_attr); + + if (!IS_ERR(ah)) { + ah->device = pd->device; + ah->pd = pd; + atomic_inc(&pd->usecnt); + } + + return ah; +} +EXPORT_SYMBOL(ib_create_ah); + +int ib_modify_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr) +{ + return ah->device->modify_ah ? + ah->device->modify_ah(ah, ah_attr) : + -ENOSYS; +} +EXPORT_SYMBOL(ib_modify_ah); + +int ib_query_ah(struct ib_ah *ah, struct ib_ah_attr *ah_attr) +{ + return ah->device->query_ah ? 
+ ah->device->query_ah(ah, ah_attr) : + -ENOSYS; +} +EXPORT_SYMBOL(ib_query_ah); + +int ib_destroy_ah(struct ib_ah *ah) +{ + struct ib_pd *pd; + int ret; + + pd = ah->pd; + ret = ah->device->destroy_ah(ah); + if (!ret) + atomic_dec(&pd->usecnt); + + return ret; +} +EXPORT_SYMBOL(ib_destroy_ah); + +/* Queue pairs */ + +struct ib_qp *ib_create_qp(struct ib_pd *pd, + struct ib_qp_init_attr *qp_init_attr) +{ + struct ib_qp *qp; + + qp = pd->device->create_qp(pd, qp_init_attr); + + if (!IS_ERR(qp)) { + qp->device = pd->device; + qp->pd = pd; + qp->send_cq = qp_init_attr->send_cq; + qp->recv_cq = qp_init_attr->recv_cq; + qp->srq = qp_init_attr->srq; + qp->event_handler = qp_init_attr->event_handler; + qp->qp_context = qp_init_attr->qp_context; + atomic_inc(&pd->usecnt); + atomic_inc(&qp_init_attr->send_cq->usecnt); + atomic_inc(&qp_init_attr->recv_cq->usecnt); + if (qp_init_attr->srq) + atomic_inc(&qp_init_attr->srq->usecnt); + } + + return qp; +} +EXPORT_SYMBOL(ib_create_qp); + +int ib_modify_qp(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask) +{ + return qp->device->modify_qp(qp, qp_attr, qp_attr_mask); +} +EXPORT_SYMBOL(ib_modify_qp); + +int ib_query_qp(struct ib_qp *qp, + struct ib_qp_attr *qp_attr, + int qp_attr_mask, + struct ib_qp_init_attr *qp_init_attr) +{ + return qp->device->query_qp ? 
+ qp->device->query_qp(qp, qp_attr, qp_attr_mask, qp_init_attr) : + -ENOSYS; +} +EXPORT_SYMBOL(ib_query_qp); + +int ib_destroy_qp(struct ib_qp *qp) +{ + struct ib_pd *pd; + struct ib_cq *scq, *rcq; + struct ib_srq *srq; + int ret; + + pd = qp->pd; + scq = qp->send_cq; + rcq = qp->recv_cq; + srq = qp->srq; + + ret = qp->device->destroy_qp(qp); + if (!ret) { + atomic_dec(&pd->usecnt); + atomic_dec(&scq->usecnt); + atomic_dec(&rcq->usecnt); + if (srq) + atomic_dec(&srq->usecnt); + } + + return ret; +} +EXPORT_SYMBOL(ib_destroy_qp); + +/* Completion queues */ + +struct ib_cq *ib_create_cq(struct ib_device *device, + ib_comp_handler comp_handler, + void (*event_handler)(struct ib_event *, void *), + void *cq_context, int cqe) +{ + struct ib_cq *cq; + + cq = device->create_cq(device, cqe); + + if (!IS_ERR(cq)) { + cq->device = device; + cq->comp_handler = comp_handler; + cq->event_handler = event_handler; + cq->cq_context = cq_context; + atomic_set(&cq->usecnt, 0); + } + + return cq; +} +EXPORT_SYMBOL(ib_create_cq); + +int ib_destroy_cq(struct ib_cq *cq) +{ + if (atomic_read(&cq->usecnt)) + return -EBUSY; + + return cq->device->destroy_cq(cq); +} +EXPORT_SYMBOL(ib_destroy_cq); + +int ib_resize_cq(struct ib_cq *cq, + int cqe) +{ + int ret; + + if (!cq->device->resize_cq) + return -ENOSYS; + + ret = cq->device->resize_cq(cq, &cqe); + if (!ret) + cq->cqe = cqe; + + return ret; +} +EXPORT_SYMBOL(ib_resize_cq); + +/* Memory regions */ + +struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags) +{ + struct ib_mr *mr; + + mr = pd->device->get_dma_mr(pd, mr_access_flags); + + if (!IS_ERR(mr)) { + mr->device = pd->device; + mr->pd = pd; + atomic_inc(&pd->usecnt); + atomic_set(&mr->usecnt, 0); + } + + return mr; +} +EXPORT_SYMBOL(ib_get_dma_mr); + +struct ib_mr *ib_reg_phys_mr(struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start) +{ + struct ib_mr *mr; + + mr = pd->device->reg_phys_mr(pd, phys_buf_array, 
num_phys_buf, + mr_access_flags, iova_start); + + if (!IS_ERR(mr)) { + mr->device = pd->device; + mr->pd = pd; + atomic_inc(&pd->usecnt); + atomic_set(&mr->usecnt, 0); + } + + return mr; +} +EXPORT_SYMBOL(ib_reg_phys_mr); + +int ib_rereg_phys_mr(struct ib_mr *mr, + int mr_rereg_mask, + struct ib_pd *pd, + struct ib_phys_buf *phys_buf_array, + int num_phys_buf, + int mr_access_flags, + u64 *iova_start) +{ + struct ib_pd *old_pd; + int ret; + + if (!mr->device->rereg_phys_mr) + return -ENOSYS; + + if (atomic_read(&mr->usecnt)) + return -EBUSY; + + old_pd = mr->pd; + + ret = mr->device->rereg_phys_mr(mr, mr_rereg_mask, pd, + phys_buf_array, num_phys_buf, + mr_access_flags, iova_start); + + if (!ret && (mr_rereg_mask & IB_MR_REREG_PD)) { + atomic_dec(&old_pd->usecnt); + atomic_inc(&pd->usecnt); + } + + return ret; +} +EXPORT_SYMBOL(ib_rereg_phys_mr); + +int ib_query_mr(struct ib_mr *mr, struct ib_mr_attr *mr_attr) +{ + return mr->device->query_mr ? + mr->device->query_mr(mr, mr_attr) : -ENOSYS; +} +EXPORT_SYMBOL(ib_query_mr); + +int ib_dereg_mr(struct ib_mr *mr) +{ + struct ib_pd *pd; + int ret; + + if (atomic_read(&mr->usecnt)) + return -EBUSY; + + pd = mr->pd; + ret = mr->device->dereg_mr(mr); + if (!ret) + atomic_dec(&pd->usecnt); + + return ret; +} +EXPORT_SYMBOL(ib_dereg_mr); + +/* Memory windows */ + +struct ib_mw *ib_alloc_mw(struct ib_pd *pd) +{ + struct ib_mw *mw; + + if (!pd->device->alloc_mw) + return ERR_PTR(-ENOSYS); + + mw = pd->device->alloc_mw(pd); + if (!IS_ERR(mw)) { + mw->device = pd->device; + mw->pd = pd; + atomic_inc(&pd->usecnt); + } + + return mw; +} +EXPORT_SYMBOL(ib_alloc_mw); + +int ib_dealloc_mw(struct ib_mw *mw) +{ + struct ib_pd *pd; + int ret; + + pd = mw->pd; + ret = mw->device->dealloc_mw(mw); + if (!ret) + atomic_dec(&pd->usecnt); + + return ret; +} +EXPORT_SYMBOL(ib_dealloc_mw); + +/* "Fast" memory regions */ + +struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd, + int mr_access_flags, + struct ib_fmr_attr *fmr_attr) +{ + struct ib_fmr 
*fmr; + + if (!pd->device->alloc_fmr) + return ERR_PTR(-ENOSYS); + + fmr = pd->device->alloc_fmr(pd, mr_access_flags, fmr_attr); + if (!IS_ERR(fmr)) { + fmr->device = pd->device; + fmr->pd = pd; + atomic_inc(&pd->usecnt); + } + + return fmr; +} +EXPORT_SYMBOL(ib_alloc_fmr); + +int ib_unmap_fmr(struct list_head *fmr_list) +{ + struct ib_fmr *fmr; + + if (list_empty(fmr_list)) + return 0; + + fmr = list_entry(fmr_list->next, struct ib_fmr, list); + return fmr->device->unmap_fmr(fmr_list); +} +EXPORT_SYMBOL(ib_unmap_fmr); + +int ib_dealloc_fmr(struct ib_fmr *fmr) +{ + struct ib_pd *pd; + int ret; + + pd = fmr->pd; + ret = fmr->device->dealloc_fmr(fmr); + if (!ret) + atomic_dec(&pd->usecnt); + + return ret; +} +EXPORT_SYMBOL(ib_dealloc_fmr); + +/* Multicast groups */ + +int ib_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid) +{ + return qp->device->attach_mcast ? + qp->device->attach_mcast(qp, gid, lid) : + -ENOSYS; +} +EXPORT_SYMBOL(ib_attach_mcast); + +int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid) +{ + return qp->device->detach_mcast ? 
+ qp->device->detach_mcast(qp, gid, lid) : + -ENOSYS; +} +EXPORT_SYMBOL(ib_detach_mcast); From roland@topspin.com Mon Dec 27 21:49:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:51:03 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5nJDY025948 for ; Mon, 27 Dec 2004 21:49:41 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:00 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:50:59 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAG2-0000t8-0g; Mon, 27 Dec 2004 21:50:59 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272150.7LBVS92XE77zrCiS@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:50:57 -0800 Message-Id: <200412272150.vDVch42vu9imXkVM@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][5/24] Add InfiniBand MAD (management datagram) support Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:50:59.0814 (UTC) FILETIME=[351C1860:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13113 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add support for handling InfiniBand MADs (management datagrams), including sending and receiving MADs as well as passing MADs 
on to local agents. This is required for an SM (subnet manager) to discover and configure the host, since the SM's query MADs must be passed to the local SMA (subnet management agent). In addition, this support is used by upper level protocols to send queries to and receive responses from the SM. Signed-off-by: Roland Dreier --- linux-bk.orig/drivers/infiniband/core/Makefile 2004-12-27 21:48:18.262278084 -0800 +++ linux-bk/drivers/infiniband/core/Makefile 2004-12-27 21:48:19.838046137 -0800 @@ -1,6 +1,8 @@ EXTRA_CFLAGS += -Idrivers/infiniband/include -obj-$(CONFIG_INFINIBAND) += ib_core.o +obj-$(CONFIG_INFINIBAND) += ib_core.o ib_mad.o ib_core-y := packer.o ud_header.o verbs.o sysfs.o \ device.o fmr_pool.o cache.o + +ib_mad-y := mad.o smi.o agent.o --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/agent.c 2004-12-27 21:48:19.916034657 -0800 @@ -0,0 +1,399 @@ +/* + * Copyright (c) 2004 Mellanox Technologies Ltd. All rights reserved. + * Copyright (c) 2004 Infinicon Corporation. All rights reserved. + * Copyright (c) 2004 Intel Corporation. All rights reserved. + * Copyright (c) 2004 Topspin Corporation. All rights reserved. + * Copyright (c) 2004 Voltaire Corporation. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. 
+ * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: agent.c 1389 2004-12-27 22:56:47Z roland $ + */ + +#include + +#include + +#include + +#include "smi.h" +#include "agent_priv.h" +#include "mad_priv.h" + + +spinlock_t ib_agent_port_list_lock; +static LIST_HEAD(ib_agent_port_list); + +extern kmem_cache_t *ib_mad_cache; + + +/* + * Caller must hold ib_agent_port_list_lock + */ +static inline struct ib_agent_port_private * +__ib_get_agent_port(struct ib_device *device, int port_num, + struct ib_mad_agent *mad_agent) +{ + struct ib_agent_port_private *entry; + + BUG_ON(!(!!device ^ !!mad_agent)); /* Exactly one MUST be (!NULL) */ + + if (device) { + list_for_each_entry(entry, &ib_agent_port_list, port_list) { + if (entry->dr_smp_agent->device == device && + entry->port_num == port_num) + return entry; + } + } else { + list_for_each_entry(entry, &ib_agent_port_list, port_list) { + if ((entry->dr_smp_agent == mad_agent) || + (entry->lr_smp_agent == mad_agent) || + (entry->perf_mgmt_agent == mad_agent)) + return entry; + } + } + return NULL; +} + +static inline struct ib_agent_port_private * +ib_get_agent_port(struct ib_device *device, int port_num, + struct ib_mad_agent *mad_agent) +{ + struct ib_agent_port_private *entry; + unsigned long flags; + + spin_lock_irqsave(&ib_agent_port_list_lock, 
flags); + entry = __ib_get_agent_port(device, port_num, mad_agent); + spin_unlock_irqrestore(&ib_agent_port_list_lock, flags); + + return entry; +} + +int smi_check_local_dr_smp(struct ib_smp *smp, + struct ib_device *device, + int port_num) +{ + struct ib_agent_port_private *port_priv; + + if (smp->mgmt_class != IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) + return 1; + port_priv = ib_get_agent_port(device, port_num, NULL); + if (!port_priv) { + printk(KERN_DEBUG SPFX "smi_check_local_dr_smp %s port %d " + "not open\n", + device->name, port_num); + return 1; + } + + return smi_check_local_smp(port_priv->dr_smp_agent, smp); +} + +static int agent_mad_send(struct ib_mad_agent *mad_agent, + struct ib_agent_port_private *port_priv, + struct ib_mad_private *mad_priv, + struct ib_grh *grh, + struct ib_wc *wc) +{ + struct ib_agent_send_wr *agent_send_wr; + struct ib_sge gather_list; + struct ib_send_wr send_wr; + struct ib_send_wr *bad_send_wr; + struct ib_ah_attr ah_attr; + unsigned long flags; + int ret = 1; + + agent_send_wr = kmalloc(sizeof(*agent_send_wr), GFP_KERNEL); + if (!agent_send_wr) + goto out; + agent_send_wr->mad = mad_priv; + + /* PCI mapping */ + gather_list.addr = dma_map_single(mad_agent->device->dma_device, + &mad_priv->mad, + sizeof(mad_priv->mad), + DMA_TO_DEVICE); + gather_list.length = sizeof(mad_priv->mad); + gather_list.lkey = (*port_priv->mr).lkey; + + send_wr.next = NULL; + send_wr.opcode = IB_WR_SEND; + send_wr.sg_list = &gather_list; + send_wr.num_sge = 1; + send_wr.wr.ud.remote_qpn = wc->src_qp; /* DQPN */ + send_wr.wr.ud.timeout_ms = 0; + send_wr.send_flags = IB_SEND_SIGNALED | IB_SEND_SOLICITED; + + ah_attr.dlid = wc->slid; + ah_attr.port_num = mad_agent->port_num; + ah_attr.src_path_bits = wc->dlid_path_bits; + ah_attr.sl = wc->sl; + ah_attr.static_rate = 0; + ah_attr.ah_flags = 0; /* No GRH */ + if (mad_priv->mad.mad.mad_hdr.mgmt_class == IB_MGMT_CLASS_PERF_MGMT) { + if (wc->wc_flags & IB_WC_GRH) { + ah_attr.ah_flags = IB_AH_GRH; + /* Should sgid 
be looked up ? */ + ah_attr.grh.sgid_index = 0; + ah_attr.grh.hop_limit = grh->hop_limit; + ah_attr.grh.flow_label = be32_to_cpup( + &grh->version_tclass_flow) & 0xfffff; + ah_attr.grh.traffic_class = (be32_to_cpup( + &grh->version_tclass_flow) >> 20) & 0xff; + memcpy(ah_attr.grh.dgid.raw, + grh->sgid.raw, + sizeof(ah_attr.grh.dgid)); + } + } + + agent_send_wr->ah = ib_create_ah(mad_agent->qp->pd, &ah_attr); + if (IS_ERR(agent_send_wr->ah)) { + printk(KERN_ERR SPFX "No memory for address handle\n"); + kfree(agent_send_wr); + goto out; + } + + send_wr.wr.ud.ah = agent_send_wr->ah; + if (mad_priv->mad.mad.mad_hdr.mgmt_class == IB_MGMT_CLASS_PERF_MGMT) { + send_wr.wr.ud.pkey_index = wc->pkey_index; + send_wr.wr.ud.remote_qkey = IB_QP1_QKEY; + } else { /* for SMPs */ + send_wr.wr.ud.pkey_index = 0; + send_wr.wr.ud.remote_qkey = 0; + } + send_wr.wr.ud.mad_hdr = &mad_priv->mad.mad.mad_hdr; + send_wr.wr_id = (unsigned long)agent_send_wr; + + pci_unmap_addr_set(agent_send_wr, mapping, gather_list.addr); + + /* Send */ + spin_lock_irqsave(&port_priv->send_list_lock, flags); + if (ib_post_send_mad(mad_agent, &send_wr, &bad_send_wr)) { + spin_unlock_irqrestore(&port_priv->send_list_lock, flags); + dma_unmap_single(mad_agent->device->dma_device, + pci_unmap_addr(agent_send_wr, mapping), + sizeof(mad_priv->mad), + DMA_TO_DEVICE); + ib_destroy_ah(agent_send_wr->ah); + kfree(agent_send_wr); + } else { + list_add_tail(&agent_send_wr->send_list, + &port_priv->send_posted_list); + spin_unlock_irqrestore(&port_priv->send_list_lock, flags); + ret = 0; + } + +out: + return ret; +} + +int agent_send(struct ib_mad_private *mad, + struct ib_grh *grh, + struct ib_wc *wc, + struct ib_device *device, + int port_num) +{ + struct ib_agent_port_private *port_priv; + struct ib_mad_agent *mad_agent; + + port_priv = ib_get_agent_port(device, port_num, NULL); + if (!port_priv) { + printk(KERN_DEBUG SPFX "agent_send %s port %d not open\n", + device->name, port_num); + return 1; + } + + /* Get mad 
agent based on mgmt_class in MAD */ + switch (mad->mad.mad.mad_hdr.mgmt_class) { + case IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE: + mad_agent = port_priv->dr_smp_agent; + break; + case IB_MGMT_CLASS_SUBN_LID_ROUTED: + mad_agent = port_priv->lr_smp_agent; + break; + case IB_MGMT_CLASS_PERF_MGMT: + mad_agent = port_priv->perf_mgmt_agent; + break; + default: + return 1; + } + + return agent_mad_send(mad_agent, port_priv, mad, grh, wc); +} + +static void agent_send_handler(struct ib_mad_agent *mad_agent, + struct ib_mad_send_wc *mad_send_wc) +{ + struct ib_agent_port_private *port_priv; + struct ib_agent_send_wr *agent_send_wr; + unsigned long flags; + + /* Find matching MAD agent */ + port_priv = ib_get_agent_port(NULL, 0, mad_agent); + if (!port_priv) { + printk(KERN_ERR SPFX "agent_send_handler: no matching MAD " + "agent %p\n", mad_agent); + return; + } + + agent_send_wr = (struct ib_agent_send_wr *)(unsigned long)mad_send_wc->wr_id; + spin_lock_irqsave(&port_priv->send_list_lock, flags); + /* Remove completed send from posted send MAD list */ + list_del(&agent_send_wr->send_list); + spin_unlock_irqrestore(&port_priv->send_list_lock, flags); + + /* Unmap PCI */ + dma_unmap_single(mad_agent->device->dma_device, + pci_unmap_addr(agent_send_wr, mapping), + sizeof(agent_send_wr->mad->mad), + DMA_TO_DEVICE); + + ib_destroy_ah(agent_send_wr->ah); + + /* Release allocated memory */ + kmem_cache_free(ib_mad_cache, agent_send_wr->mad); + kfree(agent_send_wr); +} + +int ib_agent_port_open(struct ib_device *device, int port_num) +{ + int ret; + struct ib_agent_port_private *port_priv; + struct ib_mad_reg_req reg_req; + unsigned long flags; + + /* First, check if port already open for SMI */ + port_priv = ib_get_agent_port(device, port_num, NULL); + if (port_priv) { + printk(KERN_DEBUG SPFX "%s port %d already open\n", + device->name, port_num); + return 0; + } + + /* Create new device info */ + port_priv = kmalloc(sizeof *port_priv, GFP_KERNEL); + if (!port_priv) { + printk(KERN_ERR 
SPFX "No memory for ib_agent_port_private\n"); + ret = -ENOMEM; + goto error1; + } + + memset(port_priv, 0, sizeof *port_priv); + port_priv->port_num = port_num; + spin_lock_init(&port_priv->send_list_lock); + INIT_LIST_HEAD(&port_priv->send_posted_list); + + /* Obtain MAD agent for directed route SM class */ + reg_req.mgmt_class = IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE; + reg_req.mgmt_class_version = 1; + + port_priv->dr_smp_agent = ib_register_mad_agent(device, port_num, + IB_QPT_SMI, + NULL, 0, + &agent_send_handler, + NULL, NULL); + + if (IS_ERR(port_priv->dr_smp_agent)) { + ret = PTR_ERR(port_priv->dr_smp_agent); + goto error2; + } + + /* Obtain MAD agent for LID routed SM class */ + reg_req.mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED; + port_priv->lr_smp_agent = ib_register_mad_agent(device, port_num, + IB_QPT_SMI, + NULL, 0, + &agent_send_handler, + NULL, NULL); + if (IS_ERR(port_priv->lr_smp_agent)) { + ret = PTR_ERR(port_priv->lr_smp_agent); + goto error3; + } + + /* Obtain MAD agent for PerfMgmt class */ + reg_req.mgmt_class = IB_MGMT_CLASS_PERF_MGMT; + port_priv->perf_mgmt_agent = ib_register_mad_agent(device, port_num, + IB_QPT_GSI, + NULL, 0, + &agent_send_handler, + NULL, NULL); + if (IS_ERR(port_priv->perf_mgmt_agent)) { + ret = PTR_ERR(port_priv->perf_mgmt_agent); + goto error4; + } + + port_priv->mr = ib_get_dma_mr(port_priv->dr_smp_agent->qp->pd, + IB_ACCESS_LOCAL_WRITE); + if (IS_ERR(port_priv->mr)) { + printk(KERN_ERR SPFX "Couldn't get DMA MR\n"); + ret = PTR_ERR(port_priv->mr); + goto error5; + } + + spin_lock_irqsave(&ib_agent_port_list_lock, flags); + list_add_tail(&port_priv->port_list, &ib_agent_port_list); + spin_unlock_irqrestore(&ib_agent_port_list_lock, flags); + + return 0; + +error5: + ib_unregister_mad_agent(port_priv->perf_mgmt_agent); +error4: + ib_unregister_mad_agent(port_priv->lr_smp_agent); +error3: + ib_unregister_mad_agent(port_priv->dr_smp_agent); +error2: + kfree(port_priv); +error1: + return ret; +} + +int 
ib_agent_port_close(struct ib_device *device, int port_num) +{ + struct ib_agent_port_private *port_priv; + unsigned long flags; + + spin_lock_irqsave(&ib_agent_port_list_lock, flags); + port_priv = __ib_get_agent_port(device, port_num, NULL); + if (port_priv == NULL) { + spin_unlock_irqrestore(&ib_agent_port_list_lock, flags); + printk(KERN_ERR SPFX "Port %d not found\n", port_num); + return -ENODEV; + } + list_del(&port_priv->port_list); + spin_unlock_irqrestore(&ib_agent_port_list_lock, flags); + + ib_dereg_mr(port_priv->mr); + + ib_unregister_mad_agent(port_priv->perf_mgmt_agent); + ib_unregister_mad_agent(port_priv->lr_smp_agent); + ib_unregister_mad_agent(port_priv->dr_smp_agent); + kfree(port_priv); + + return 0; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/core/mad.c 2004-12-27 21:48:19.890038484 -0800 @@ -0,0 +1,2632 @@ +/* + * Copyright (c) 2004, Voltaire, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: mad.c 1389 2004-12-27 22:56:47Z roland $ + */ + +#include +#include + +#include + +#include "mad_priv.h" +#include "smi.h" +#include "agent.h" + + +MODULE_LICENSE("Dual BSD/GPL"); +MODULE_DESCRIPTION("kernel IB MAD API"); +MODULE_AUTHOR("Hal Rosenstock"); +MODULE_AUTHOR("Sean Hefty"); + + +kmem_cache_t *ib_mad_cache; +static struct list_head ib_mad_port_list; +static u32 ib_mad_client_id = 0; + +/* Port list lock */ +static spinlock_t ib_mad_port_list_lock; + + +/* Forward declarations */ +static int method_in_use(struct ib_mad_mgmt_method_table **method, + struct ib_mad_reg_req *mad_reg_req); +static void remove_mad_reg_req(struct ib_mad_agent_private *priv); +static int ib_mad_post_receive_mads(struct ib_mad_qp_info *qp_info, + struct ib_mad_private *mad); +static void cancel_mads(struct ib_mad_agent_private *mad_agent_priv); +static void ib_mad_complete_send_wr(struct ib_mad_send_wr_private *mad_send_wr, + struct ib_mad_send_wc *mad_send_wc); +static void timeout_sends(void *data); +static void local_completions(void *data); +static int solicited_mad(struct ib_mad *mad); +static int add_nonoui_reg_req(struct ib_mad_reg_req *mad_reg_req, + struct ib_mad_agent_private *agent_priv, + u8 mgmt_class); +static int add_oui_reg_req(struct ib_mad_reg_req *mad_reg_req, + struct ib_mad_agent_private *agent_priv); + +/* + * Returns a ib_mad_port_private structure or NULL for a device/port + * Assumes ib_mad_port_list_lock is being held + */ +static inline struct ib_mad_port_private * +__ib_get_mad_port(struct ib_device *device, int port_num) +{ + struct ib_mad_port_private *entry; + + list_for_each_entry(entry, &ib_mad_port_list, port_list) { + if (entry->device == device && entry->port_num 
== port_num) + return entry; + } + return NULL; +} + +/* + * Wrapper function to return a ib_mad_port_private structure or NULL + * for a device/port + */ +static inline struct ib_mad_port_private * +ib_get_mad_port(struct ib_device *device, int port_num) +{ + struct ib_mad_port_private *entry; + unsigned long flags; + + spin_lock_irqsave(&ib_mad_port_list_lock, flags); + entry = __ib_get_mad_port(device, port_num); + spin_unlock_irqrestore(&ib_mad_port_list_lock, flags); + + return entry; +} + +static inline u8 convert_mgmt_class(u8 mgmt_class) +{ + /* Alias IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE to 0 */ + return mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE ? + 0 : mgmt_class; +} + +static int get_spl_qp_index(enum ib_qp_type qp_type) +{ + switch (qp_type) + { + case IB_QPT_SMI: + return 0; + case IB_QPT_GSI: + return 1; + default: + return -1; + } +} + +static int vendor_class_index(u8 mgmt_class) +{ + return mgmt_class - IB_MGMT_CLASS_VENDOR_RANGE2_START; +} + +static int is_vendor_class(u8 mgmt_class) +{ + if ((mgmt_class < IB_MGMT_CLASS_VENDOR_RANGE2_START) || + (mgmt_class > IB_MGMT_CLASS_VENDOR_RANGE2_END)) + return 0; + return 1; +} + +static int is_vendor_oui(char *oui) +{ + if (oui[0] || oui[1] || oui[2]) + return 1; + return 0; +} + +static int is_vendor_method_in_use( + struct ib_mad_mgmt_vendor_class *vendor_class, + struct ib_mad_reg_req *mad_reg_req) +{ + struct ib_mad_mgmt_method_table *method; + int i; + + for (i = 0; i < MAX_MGMT_OUI; i++) { + if (!memcmp(vendor_class->oui[i], mad_reg_req->oui, 3)) { + method = vendor_class->method_table[i]; + if (method) { + if (method_in_use(&method, mad_reg_req)) + return 1; + else + break; + } + } + } + return 0; +} + +/* + * ib_register_mad_agent - Register to send/receive MADs + */ +struct ib_mad_agent *ib_register_mad_agent(struct ib_device *device, + u8 port_num, + enum ib_qp_type qp_type, + struct ib_mad_reg_req *mad_reg_req, + u8 rmpp_version, + ib_mad_send_handler send_handler, + ib_mad_recv_handler 
recv_handler, + void *context) +{ + struct ib_mad_port_private *port_priv; + struct ib_mad_agent *ret = ERR_PTR(-EINVAL); + struct ib_mad_agent_private *mad_agent_priv; + struct ib_mad_reg_req *reg_req = NULL; + struct ib_mad_mgmt_class_table *class; + struct ib_mad_mgmt_vendor_class_table *vendor; + struct ib_mad_mgmt_vendor_class *vendor_class; + struct ib_mad_mgmt_method_table *method; + int ret2, qpn; + unsigned long flags; + u8 mgmt_class, vclass; + + /* Validate parameters */ + qpn = get_spl_qp_index(qp_type); + if (qpn == -1) + goto error1; + + if (rmpp_version) + goto error1; /* XXX: until RMPP implemented */ + + /* Validate MAD registration request if supplied */ + if (mad_reg_req) { + if (mad_reg_req->mgmt_class_version >= MAX_MGMT_VERSION) + goto error1; + if (!recv_handler) + goto error1; + if (mad_reg_req->mgmt_class >= MAX_MGMT_CLASS) { + /* + * IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE is the only + * one in this range currently allowed + */ + if (mad_reg_req->mgmt_class != + IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) + goto error1; + } else if (mad_reg_req->mgmt_class == 0) { + /* + * Class 0 is reserved in IBA and is used for + * aliasing of IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE + */ + goto error1; + } else if (is_vendor_class(mad_reg_req->mgmt_class)) { + /* + * If class is in "new" vendor range, + * ensure supplied OUI is not zero + */ + if (!is_vendor_oui(mad_reg_req->oui)) + goto error1; + } + /* Make sure class supplied is consistent with QP type */ + if (qp_type == IB_QPT_SMI) { + if ((mad_reg_req->mgmt_class != + IB_MGMT_CLASS_SUBN_LID_ROUTED) && + (mad_reg_req->mgmt_class != + IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)) + goto error1; + } else { + if ((mad_reg_req->mgmt_class == + IB_MGMT_CLASS_SUBN_LID_ROUTED) || + (mad_reg_req->mgmt_class == + IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)) + goto error1; + } + } else { + /* No registration request supplied */ + if (!send_handler) + goto error1; + } + + /* Validate device and port */ + port_priv = ib_get_mad_port(device, 
port_num); + if (!port_priv) { + ret = ERR_PTR(-ENODEV); + goto error1; + } + + /* Allocate structures */ + mad_agent_priv = kmalloc(sizeof *mad_agent_priv, GFP_KERNEL); + if (!mad_agent_priv) { + ret = ERR_PTR(-ENOMEM); + goto error1; + } + + if (mad_reg_req) { + reg_req = kmalloc(sizeof *reg_req, GFP_KERNEL); + if (!reg_req) { + ret = ERR_PTR(-ENOMEM); + goto error2; + } + /* Make a copy of the MAD registration request */ + memcpy(reg_req, mad_reg_req, sizeof *reg_req); + } + + /* Now, fill in the various structures */ + memset(mad_agent_priv, 0, sizeof *mad_agent_priv); + mad_agent_priv->qp_info = &port_priv->qp_info[qpn]; + mad_agent_priv->reg_req = reg_req; + mad_agent_priv->rmpp_version = rmpp_version; + mad_agent_priv->agent.device = device; + mad_agent_priv->agent.recv_handler = recv_handler; + mad_agent_priv->agent.send_handler = send_handler; + mad_agent_priv->agent.context = context; + mad_agent_priv->agent.qp = port_priv->qp_info[qpn].qp; + mad_agent_priv->agent.port_num = port_num; + + spin_lock_irqsave(&port_priv->reg_lock, flags); + mad_agent_priv->agent.hi_tid = ++ib_mad_client_id; + + /* + * Make sure MAD registration (if supplied) + * is non overlapping with any existing ones + */ + if (mad_reg_req) { + mgmt_class = convert_mgmt_class(mad_reg_req->mgmt_class); + if (!is_vendor_class(mgmt_class)) { + class = port_priv->version[mad_reg_req-> + mgmt_class_version].class; + if (class) { + method = class->method_table[mgmt_class]; + if (method) { + if (method_in_use(&method, + mad_reg_req)) + goto error3; + } + } + ret2 = add_nonoui_reg_req(mad_reg_req, mad_agent_priv, + mgmt_class); + } else { + /* "New" vendor class range */ + vendor = port_priv->version[mad_reg_req-> + mgmt_class_version].vendor; + if (vendor) { + vclass = vendor_class_index(mgmt_class); + vendor_class = vendor->vendor_class[vclass]; + if (vendor_class) { + if (is_vendor_method_in_use( + vendor_class, + mad_reg_req)) + goto error3; + } + } + ret2 = add_oui_reg_req(mad_reg_req, 
mad_agent_priv); + } + if (ret2) { + ret = ERR_PTR(ret2); + goto error3; + } + } + + /* Add mad agent into port's agent list */ + list_add_tail(&mad_agent_priv->agent_list, &port_priv->agent_list); + spin_unlock_irqrestore(&port_priv->reg_lock, flags); + + spin_lock_init(&mad_agent_priv->lock); + INIT_LIST_HEAD(&mad_agent_priv->send_list); + INIT_LIST_HEAD(&mad_agent_priv->wait_list); + INIT_WORK(&mad_agent_priv->timed_work, timeout_sends, mad_agent_priv); + INIT_LIST_HEAD(&mad_agent_priv->local_list); + INIT_WORK(&mad_agent_priv->local_work, local_completions, + mad_agent_priv); + atomic_set(&mad_agent_priv->refcount, 1); + init_waitqueue_head(&mad_agent_priv->wait); + + return &mad_agent_priv->agent; + +error3: + spin_unlock_irqrestore(&port_priv->reg_lock, flags); + kfree(reg_req); +error2: + kfree(mad_agent_priv); +error1: + return ret; +} +EXPORT_SYMBOL(ib_register_mad_agent); + +static inline int is_snooping_sends(int mad_snoop_flags) +{ + return (mad_snoop_flags & + (/*IB_MAD_SNOOP_POSTED_SENDS | + IB_MAD_SNOOP_RMPP_SENDS |*/ + IB_MAD_SNOOP_SEND_COMPLETIONS /*| + IB_MAD_SNOOP_RMPP_SEND_COMPLETIONS*/)); +} + +static inline int is_snooping_recvs(int mad_snoop_flags) +{ + return (mad_snoop_flags & + (IB_MAD_SNOOP_RECVS /*| + IB_MAD_SNOOP_RMPP_RECVS*/)); +} + +static int register_snoop_agent(struct ib_mad_qp_info *qp_info, + struct ib_mad_snoop_private *mad_snoop_priv) +{ + struct ib_mad_snoop_private **new_snoop_table; + unsigned long flags; + int i; + + spin_lock_irqsave(&qp_info->snoop_lock, flags); + /* Check for empty slot in array. */ + for (i = 0; i < qp_info->snoop_table_size; i++) + if (!qp_info->snoop_table[i]) + break; + + if (i == qp_info->snoop_table_size) { + /* Grow table. 
*/ + new_snoop_table = kmalloc(sizeof mad_snoop_priv * + (qp_info->snoop_table_size + 1), + GFP_ATOMIC); + if (!new_snoop_table) { + i = -ENOMEM; + goto out; + } + if (qp_info->snoop_table) { + memcpy(new_snoop_table, qp_info->snoop_table, + sizeof mad_snoop_priv * + qp_info->snoop_table_size); + kfree(qp_info->snoop_table); + } + qp_info->snoop_table = new_snoop_table; + qp_info->snoop_table_size++; + } + qp_info->snoop_table[i] = mad_snoop_priv; + atomic_inc(&qp_info->snoop_count); +out: + spin_unlock_irqrestore(&qp_info->snoop_lock, flags); + return i; +} + +struct ib_mad_agent *ib_register_mad_snoop(struct ib_device *device, + u8 port_num, + enum ib_qp_type qp_type, + int mad_snoop_flags, + ib_mad_snoop_handler snoop_handler, + ib_mad_recv_handler recv_handler, + void *context) +{ + struct ib_mad_port_private *port_priv; + struct ib_mad_agent *ret; + struct ib_mad_snoop_private *mad_snoop_priv; + int qpn; + + /* Validate parameters */ + if ((is_snooping_sends(mad_snoop_flags) && !snoop_handler) || + (is_snooping_recvs(mad_snoop_flags) && !recv_handler)) { + ret = ERR_PTR(-EINVAL); + goto error1; + } + qpn = get_spl_qp_index(qp_type); + if (qpn == -1) { + ret = ERR_PTR(-EINVAL); + goto error1; + } + port_priv = ib_get_mad_port(device, port_num); + if (!port_priv) { + ret = ERR_PTR(-ENODEV); + goto error1; + } + /* Allocate structures */ + mad_snoop_priv = kmalloc(sizeof *mad_snoop_priv, GFP_KERNEL); + if (!mad_snoop_priv) { + ret = ERR_PTR(-ENOMEM); + goto error1; + } + + /* Now, fill in the various structures */ + memset(mad_snoop_priv, 0, sizeof *mad_snoop_priv); + mad_snoop_priv->qp_info = &port_priv->qp_info[qpn]; + mad_snoop_priv->agent.device = device; + mad_snoop_priv->agent.recv_handler = recv_handler; + mad_snoop_priv->agent.snoop_handler = snoop_handler; + mad_snoop_priv->agent.context = context; + mad_snoop_priv->agent.qp = port_priv->qp_info[qpn].qp; + mad_snoop_priv->agent.port_num = port_num; + mad_snoop_priv->mad_snoop_flags = mad_snoop_flags; + 
init_waitqueue_head(&mad_snoop_priv->wait); + mad_snoop_priv->snoop_index = register_snoop_agent( + &port_priv->qp_info[qpn], + mad_snoop_priv); + if (mad_snoop_priv->snoop_index < 0) { + ret = ERR_PTR(mad_snoop_priv->snoop_index); + goto error2; + } + + atomic_set(&mad_snoop_priv->refcount, 1); + return &mad_snoop_priv->agent; + +error2: + kfree(mad_snoop_priv); +error1: + return ret; +} +EXPORT_SYMBOL(ib_register_mad_snoop); + +static void unregister_mad_agent(struct ib_mad_agent_private *mad_agent_priv) +{ + struct ib_mad_port_private *port_priv; + unsigned long flags; + + /* Note that we could still be handling received MADs */ + + /* + * Canceling all sends results in dropping received response + * MADs, preventing us from queuing additional work + */ + cancel_mads(mad_agent_priv); + + port_priv = mad_agent_priv->qp_info->port_priv; + cancel_delayed_work(&mad_agent_priv->timed_work); + flush_workqueue(port_priv->wq); + + spin_lock_irqsave(&port_priv->reg_lock, flags); + remove_mad_reg_req(mad_agent_priv); + list_del(&mad_agent_priv->agent_list); + spin_unlock_irqrestore(&port_priv->reg_lock, flags); + + /* XXX: Cleanup pending RMPP receives for this agent */ + + atomic_dec(&mad_agent_priv->refcount); + wait_event(mad_agent_priv->wait, + !atomic_read(&mad_agent_priv->refcount)); + + if (mad_agent_priv->reg_req) + kfree(mad_agent_priv->reg_req); + kfree(mad_agent_priv); +} + +static void unregister_mad_snoop(struct ib_mad_snoop_private *mad_snoop_priv) +{ + struct ib_mad_qp_info *qp_info; + unsigned long flags; + + qp_info = mad_snoop_priv->qp_info; + spin_lock_irqsave(&qp_info->snoop_lock, flags); + qp_info->snoop_table[mad_snoop_priv->snoop_index] = NULL; + atomic_dec(&qp_info->snoop_count); + spin_unlock_irqrestore(&qp_info->snoop_lock, flags); + + atomic_dec(&mad_snoop_priv->refcount); + wait_event(mad_snoop_priv->wait, + !atomic_read(&mad_snoop_priv->refcount)); + + kfree(mad_snoop_priv); +} + +/* + * ib_unregister_mad_agent - Unregisters a client from 
using MAD services + */ +int ib_unregister_mad_agent(struct ib_mad_agent *mad_agent) +{ + struct ib_mad_agent_private *mad_agent_priv; + struct ib_mad_snoop_private *mad_snoop_priv; + + /* If the TID is zero, the agent can only snoop. */ + if (mad_agent->hi_tid) { + mad_agent_priv = container_of(mad_agent, + struct ib_mad_agent_private, + agent); + unregister_mad_agent(mad_agent_priv); + } else { + mad_snoop_priv = container_of(mad_agent, + struct ib_mad_snoop_private, + agent); + unregister_mad_snoop(mad_snoop_priv); + } + return 0; +} +EXPORT_SYMBOL(ib_unregister_mad_agent); + +static void dequeue_mad(struct ib_mad_list_head *mad_list) +{ + struct ib_mad_queue *mad_queue; + unsigned long flags; + + BUG_ON(!mad_list->mad_queue); + mad_queue = mad_list->mad_queue; + spin_lock_irqsave(&mad_queue->lock, flags); + list_del(&mad_list->list); + mad_queue->count--; + spin_unlock_irqrestore(&mad_queue->lock, flags); +} + +static void snoop_send(struct ib_mad_qp_info *qp_info, + struct ib_send_wr *send_wr, + struct ib_mad_send_wc *mad_send_wc, + int mad_snoop_flags) +{ + struct ib_mad_snoop_private *mad_snoop_priv; + unsigned long flags; + int i; + + spin_lock_irqsave(&qp_info->snoop_lock, flags); + for (i = 0; i < qp_info->snoop_table_size; i++) { + mad_snoop_priv = qp_info->snoop_table[i]; + if (!mad_snoop_priv || + !(mad_snoop_priv->mad_snoop_flags & mad_snoop_flags)) + continue; + + atomic_inc(&mad_snoop_priv->refcount); + spin_unlock_irqrestore(&qp_info->snoop_lock, flags); + mad_snoop_priv->agent.snoop_handler(&mad_snoop_priv->agent, + send_wr, mad_send_wc); + if (atomic_dec_and_test(&mad_snoop_priv->refcount)) + wake_up(&mad_snoop_priv->wait); + spin_lock_irqsave(&qp_info->snoop_lock, flags); + } + spin_unlock_irqrestore(&qp_info->snoop_lock, flags); +} + +static void snoop_recv(struct ib_mad_qp_info *qp_info, + struct ib_mad_recv_wc *mad_recv_wc, + int mad_snoop_flags) +{ + struct ib_mad_snoop_private *mad_snoop_priv; + unsigned long flags; + int i; + + 
spin_lock_irqsave(&qp_info->snoop_lock, flags); + for (i = 0; i < qp_info->snoop_table_size; i++) { + mad_snoop_priv = qp_info->snoop_table[i]; + if (!mad_snoop_priv || + !(mad_snoop_priv->mad_snoop_flags & mad_snoop_flags)) + continue; + + atomic_inc(&mad_snoop_priv->refcount); + spin_unlock_irqrestore(&qp_info->snoop_lock, flags); + mad_snoop_priv->agent.recv_handler(&mad_snoop_priv->agent, + mad_recv_wc); + if (atomic_dec_and_test(&mad_snoop_priv->refcount)) + wake_up(&mad_snoop_priv->wait); + spin_lock_irqsave(&qp_info->snoop_lock, flags); + } + spin_unlock_irqrestore(&qp_info->snoop_lock, flags); +} + +/* + * Return 0 if SMP is to be sent + * Return 1 if SMP was consumed locally (whether or not solicited) + * Return < 0 if error + */ +static int handle_outgoing_smp(struct ib_mad_agent_private *mad_agent_priv, + struct ib_smp *smp, + struct ib_send_wr *send_wr) +{ + int ret, alloc_flags; + unsigned long flags; + struct ib_mad_local_private *local; + struct ib_mad_private *mad_priv; + struct ib_device *device = mad_agent_priv->agent.device; + u8 port_num = mad_agent_priv->agent.port_num; + + if (!smi_handle_dr_smp_send(smp, device->node_type, port_num)) { + ret = -EINVAL; + printk(KERN_ERR PFX "Invalid directed route\n"); + goto out; + } + /* Check to post send on QP or process locally */ + ret = smi_check_local_dr_smp(smp, device, port_num); + if (!ret || !device->process_mad) + goto out; + + if (in_atomic() || irqs_disabled()) + alloc_flags = GFP_ATOMIC; + else + alloc_flags = GFP_KERNEL; + local = kmalloc(sizeof *local, alloc_flags); + if (!local) { + ret = -ENOMEM; + printk(KERN_ERR PFX "No memory for ib_mad_local_private\n"); + goto out; + } + local->mad_priv = NULL; + mad_priv = kmem_cache_alloc(ib_mad_cache, alloc_flags); + if (!mad_priv) { + ret = -ENOMEM; + printk(KERN_ERR PFX "No memory for local response MAD\n"); + kfree(local); + goto out; + } + ret = device->process_mad(device, 0, port_num, smp->dr_slid, + (struct ib_mad *)smp, + (struct ib_mad 
*)&mad_priv->mad); + switch (ret) + { + case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_REPLY: + /* + * See if response is solicited and + * there is a recv handler + */ + if (solicited_mad(&mad_priv->mad.mad) && + mad_agent_priv->agent.recv_handler) + local->mad_priv = mad_priv; + else + kmem_cache_free(ib_mad_cache, mad_priv); + break; + case IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED: + kmem_cache_free(ib_mad_cache, mad_priv); + break; + case IB_MAD_RESULT_SUCCESS: + kmem_cache_free(ib_mad_cache, mad_priv); + kfree(local); + ret = 0; + goto out; + default: + kmem_cache_free(ib_mad_cache, mad_priv); + kfree(local); + ret = -EINVAL; + goto out; + } + + local->send_wr = *send_wr; + local->send_wr.sg_list = local->sg_list; + memcpy(local->sg_list, send_wr->sg_list, + sizeof *send_wr->sg_list * send_wr->num_sge); + local->send_wr.next = NULL; + local->tid = send_wr->wr.ud.mad_hdr->tid; + local->wr_id = send_wr->wr_id; + /* Reference MAD agent until local completion handled */ + atomic_inc(&mad_agent_priv->refcount); + /* Queue local completion to local list */ + spin_lock_irqsave(&mad_agent_priv->lock, flags); + list_add_tail(&local->completion_list, &mad_agent_priv->local_list); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + queue_work(mad_agent_priv->qp_info->port_priv->wq, + &mad_agent_priv->local_work); + ret = 1; +out: + return ret; +} + +static int ib_send_mad(struct ib_mad_agent_private *mad_agent_priv, + struct ib_mad_send_wr_private *mad_send_wr) +{ + struct ib_mad_qp_info *qp_info; + struct ib_send_wr *bad_send_wr; + unsigned long flags; + int ret; + + /* Replace user's WR ID with our own to find WR upon completion */ + qp_info = mad_agent_priv->qp_info; + mad_send_wr->wr_id = mad_send_wr->send_wr.wr_id; + mad_send_wr->send_wr.wr_id = (unsigned long)&mad_send_wr->mad_list; + mad_send_wr->mad_list.mad_queue = &qp_info->send_queue; + + spin_lock_irqsave(&qp_info->send_queue.lock, flags); + if (qp_info->send_queue.count++ < 
qp_info->send_queue.max_active) { + list_add_tail(&mad_send_wr->mad_list.list, + &qp_info->send_queue.list); + spin_unlock_irqrestore(&qp_info->send_queue.lock, flags); + ret = ib_post_send(mad_agent_priv->agent.qp, + &mad_send_wr->send_wr, &bad_send_wr); + if (ret) { + printk(KERN_ERR PFX "ib_post_send failed: %d\n", ret); + dequeue_mad(&mad_send_wr->mad_list); + } + } else { + list_add_tail(&mad_send_wr->mad_list.list, + &qp_info->overflow_list); + spin_unlock_irqrestore(&qp_info->send_queue.lock, flags); + ret = 0; + } + return ret; +} + +/* + * ib_post_send_mad - Posts MAD(s) to the send queue of the QP associated + * with the registered client + */ +int ib_post_send_mad(struct ib_mad_agent *mad_agent, + struct ib_send_wr *send_wr, + struct ib_send_wr **bad_send_wr) +{ + int ret = -EINVAL; + struct ib_mad_agent_private *mad_agent_priv; + + /* Validate supplied parameters */ + if (!bad_send_wr) + goto error1; + + if (!mad_agent || !send_wr) + goto error2; + + if (!mad_agent->send_handler) + goto error2; + + mad_agent_priv = container_of(mad_agent, + struct ib_mad_agent_private, + agent); + + /* Walk list of send WRs and post each on send list */ + while (send_wr) { + unsigned long flags; + struct ib_send_wr *next_send_wr; + struct ib_mad_send_wr_private *mad_send_wr; + struct ib_smp *smp; + + /* Validate more parameters */ + if (send_wr->num_sge > IB_MAD_SEND_REQ_MAX_SG) + goto error2; + + if (send_wr->wr.ud.timeout_ms && !mad_agent->recv_handler) + goto error2; + + if (!send_wr->wr.ud.mad_hdr) { + printk(KERN_ERR PFX "MAD header must be supplied " + "in WR %p\n", send_wr); + goto error2; + } + + /* + * Save pointer to next work request to post in case the + * current one completes, and the user modifies the work + * request associated with the completion + */ + next_send_wr = (struct ib_send_wr *)send_wr->next; + + smp = (struct ib_smp *)send_wr->wr.ud.mad_hdr; + if (smp->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) { + ret = 
handle_outgoing_smp(mad_agent_priv, smp, send_wr); + if (ret < 0) /* error */ + goto error2; + else if (ret == 1) /* locally consumed */ + goto next; + } + + /* Allocate MAD send WR tracking structure */ + mad_send_wr = kmalloc(sizeof *mad_send_wr, + (in_atomic() || irqs_disabled()) ? + GFP_ATOMIC : GFP_KERNEL); + if (!mad_send_wr) { + printk(KERN_ERR PFX "No memory for " + "ib_mad_send_wr_private\n"); + ret = -ENOMEM; + goto error2; + } + + mad_send_wr->send_wr = *send_wr; + mad_send_wr->send_wr.sg_list = mad_send_wr->sg_list; + memcpy(mad_send_wr->sg_list, send_wr->sg_list, + sizeof *send_wr->sg_list * send_wr->num_sge); + mad_send_wr->send_wr.next = NULL; + mad_send_wr->tid = send_wr->wr.ud.mad_hdr->tid; + mad_send_wr->agent = mad_agent; + /* Timeout will be updated after send completes */ + mad_send_wr->timeout = msecs_to_jiffies(send_wr->wr. + ud.timeout_ms); + mad_send_wr->retry = 0; + /* One reference for each work request to QP + response */ + mad_send_wr->refcount = 1 + (mad_send_wr->timeout > 0); + mad_send_wr->status = IB_WC_SUCCESS; + + /* Reference MAD agent until send completes */ + atomic_inc(&mad_agent_priv->refcount); + spin_lock_irqsave(&mad_agent_priv->lock, flags); + list_add_tail(&mad_send_wr->agent_list, + &mad_agent_priv->send_list); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + + ret = ib_send_mad(mad_agent_priv, mad_send_wr); + if (ret) { + /* Fail send request */ + spin_lock_irqsave(&mad_agent_priv->lock, flags); + list_del(&mad_send_wr->agent_list); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + atomic_dec(&mad_agent_priv->refcount); + goto error2; + } +next: + send_wr = next_send_wr; + } + return 0; + +error2: + *bad_send_wr = send_wr; +error1: + return ret; +} +EXPORT_SYMBOL(ib_post_send_mad); + +/* + * ib_free_recv_mad - Returns data buffers used to receive + * a MAD to the access layer + */ +void ib_free_recv_mad(struct ib_mad_recv_wc *mad_recv_wc) +{ + struct ib_mad_recv_buf *entry; + struct 
ib_mad_private_header *mad_priv_hdr; + struct ib_mad_private *priv; + + mad_priv_hdr = container_of(mad_recv_wc, + struct ib_mad_private_header, + recv_wc); + priv = container_of(mad_priv_hdr, struct ib_mad_private, header); + + /* + * Walk receive buffer list associated with this WC + * No need to remove them from list of receive buffers + */ + list_for_each_entry(entry, &mad_recv_wc->recv_buf.list, list) { + /* Free previous receive buffer */ + kmem_cache_free(ib_mad_cache, priv); + mad_priv_hdr = container_of(mad_recv_wc, + struct ib_mad_private_header, + recv_wc); + priv = container_of(mad_priv_hdr, struct ib_mad_private, + header); + } + + /* Free last buffer */ + kmem_cache_free(ib_mad_cache, priv); +} +EXPORT_SYMBOL(ib_free_recv_mad); + +void ib_coalesce_recv_mad(struct ib_mad_recv_wc *mad_recv_wc, + void *buf) +{ + printk(KERN_ERR PFX "ib_coalesce_recv_mad() not implemented yet\n"); +} +EXPORT_SYMBOL(ib_coalesce_recv_mad); + +struct ib_mad_agent *ib_redirect_mad_qp(struct ib_qp *qp, + u8 rmpp_version, + ib_mad_send_handler send_handler, + ib_mad_recv_handler recv_handler, + void *context) +{ + return ERR_PTR(-EINVAL); /* XXX: for now */ +} +EXPORT_SYMBOL(ib_redirect_mad_qp); + +int ib_process_mad_wc(struct ib_mad_agent *mad_agent, + struct ib_wc *wc) +{ + printk(KERN_ERR PFX "ib_process_mad_wc() not implemented yet\n"); + return 0; +} +EXPORT_SYMBOL(ib_process_mad_wc); + +static int method_in_use(struct ib_mad_mgmt_method_table **method, + struct ib_mad_reg_req *mad_reg_req) +{ + int i; + + for (i = find_first_bit(mad_reg_req->method_mask, IB_MGMT_MAX_METHODS); + i < IB_MGMT_MAX_METHODS; + i = find_next_bit(mad_reg_req->method_mask, IB_MGMT_MAX_METHODS, + 1+i)) { + if ((*method)->agent[i]) { + printk(KERN_ERR PFX "Method %d already in use\n", i); + return -EINVAL; + } + } + return 0; +} + +static int allocate_method_table(struct ib_mad_mgmt_method_table **method) +{ + /* Allocate management method table */ + *method = kmalloc(sizeof **method, GFP_ATOMIC); + 
if (!*method) { + printk(KERN_ERR PFX "No memory for " + "ib_mad_mgmt_method_table\n"); + return -ENOMEM; + } + /* Clear management method table */ + memset(*method, 0, sizeof **method); + + return 0; +} + +/* + * Check to see if there are any methods still in use + */ +static int check_method_table(struct ib_mad_mgmt_method_table *method) +{ + int i; + + for (i = 0; i < IB_MGMT_MAX_METHODS; i++) + if (method->agent[i]) + return 1; + return 0; +} + +/* + * Check to see if there are any method tables for this class still in use + */ +static int check_class_table(struct ib_mad_mgmt_class_table *class) +{ + int i; + + for (i = 0; i < MAX_MGMT_CLASS; i++) + if (class->method_table[i]) + return 1; + return 0; +} + +static int check_vendor_class(struct ib_mad_mgmt_vendor_class *vendor_class) +{ + int i; + + for (i = 0; i < MAX_MGMT_OUI; i++) + if (vendor_class->method_table[i]) + return 1; + return 0; +} + +static int find_vendor_oui(struct ib_mad_mgmt_vendor_class *vendor_class, + char *oui) +{ + int i; + + for (i = 0; i < MAX_MGMT_OUI; i++) + /* Is there matching OUI for this vendor class ? 
*/ + if (!memcmp(vendor_class->oui[i], oui, 3)) + return i; + + return -1; +} + +static int check_vendor_table(struct ib_mad_mgmt_vendor_class_table *vendor) +{ + int i; + + for (i = 0; i < MAX_MGMT_VENDOR_RANGE2; i++) + if (vendor->vendor_class[i]) + return 1; + + return 0; +} + +static void remove_methods_mad_agent(struct ib_mad_mgmt_method_table *method, + struct ib_mad_agent_private *agent) +{ + int i; + + /* Remove any methods for this mad agent */ + for (i = 0; i < IB_MGMT_MAX_METHODS; i++) { + if (method->agent[i] == agent) { + method->agent[i] = NULL; + } + } +} + +static int add_nonoui_reg_req(struct ib_mad_reg_req *mad_reg_req, + struct ib_mad_agent_private *agent_priv, + u8 mgmt_class) +{ + struct ib_mad_port_private *port_priv; + struct ib_mad_mgmt_class_table **class; + struct ib_mad_mgmt_method_table **method; + int i, ret; + + port_priv = agent_priv->qp_info->port_priv; + class = &port_priv->version[mad_reg_req->mgmt_class_version].class; + if (!*class) { + /* Allocate management class table for "new" class version */ + *class = kmalloc(sizeof **class, GFP_ATOMIC); + if (!*class) { + printk(KERN_ERR PFX "No memory for " + "ib_mad_mgmt_class_table\n"); + ret = -ENOMEM; + goto error1; + } + /* Clear management class table */ + memset(*class, 0, sizeof(**class)); + /* Allocate method table for this management class */ + method = &(*class)->method_table[mgmt_class]; + if ((ret = allocate_method_table(method))) + goto error2; + } else { + method = &(*class)->method_table[mgmt_class]; + if (!*method) { + /* Allocate method table for this management class */ + if ((ret = allocate_method_table(method))) + goto error1; + } + } + + /* Now, make sure methods are not already in use */ + if (method_in_use(method, mad_reg_req)) + goto error3; + + /* Finally, add in methods being registered */ + for (i = find_first_bit(mad_reg_req->method_mask, + IB_MGMT_MAX_METHODS); + i < IB_MGMT_MAX_METHODS; + i = find_next_bit(mad_reg_req->method_mask, IB_MGMT_MAX_METHODS, + 
1+i)) { + (*method)->agent[i] = agent_priv; + } + return 0; + +error3: + /* Remove any methods for this mad agent */ + remove_methods_mad_agent(*method, agent_priv); + /* Now, check to see if there are any methods in use */ + if (!check_method_table(*method)) { + /* If not, release management method table */ + kfree(*method); + *method = NULL; + } + ret = -EINVAL; + goto error1; +error2: + kfree(*class); + *class = NULL; +error1: + return ret; +} + +static int add_oui_reg_req(struct ib_mad_reg_req *mad_reg_req, + struct ib_mad_agent_private *agent_priv) +{ + struct ib_mad_port_private *port_priv; + struct ib_mad_mgmt_vendor_class_table **vendor_table; + struct ib_mad_mgmt_vendor_class_table *vendor = NULL; + struct ib_mad_mgmt_vendor_class *vendor_class = NULL; + struct ib_mad_mgmt_method_table **method; + int i, ret = -ENOMEM; + u8 vclass; + + /* "New" vendor (with OUI) class */ + vclass = vendor_class_index(mad_reg_req->mgmt_class); + port_priv = agent_priv->qp_info->port_priv; + vendor_table = &port_priv->version[ + mad_reg_req->mgmt_class_version].vendor; + if (!*vendor_table) { + /* Allocate mgmt vendor class table for "new" class version */ + vendor = kmalloc(sizeof *vendor, GFP_ATOMIC); + if (!vendor) { + printk(KERN_ERR PFX "No memory for " + "ib_mad_mgmt_vendor_class_table\n"); + goto error1; + } + /* Clear management vendor class table */ + memset(vendor, 0, sizeof(*vendor)); + *vendor_table = vendor; + } + if (!(*vendor_table)->vendor_class[vclass]) { + /* Allocate table for this management vendor class */ + vendor_class = kmalloc(sizeof *vendor_class, GFP_ATOMIC); + if (!vendor_class) { + printk(KERN_ERR PFX "No memory for " + "ib_mad_mgmt_vendor_class\n"); + goto error2; + } + memset(vendor_class, 0, sizeof(*vendor_class)); + (*vendor_table)->vendor_class[vclass] = vendor_class; + } + for (i = 0; i < MAX_MGMT_OUI; i++) { + /* Is there matching OUI for this vendor class ? 
*/ + if (!memcmp((*vendor_table)->vendor_class[vclass]->oui[i], + mad_reg_req->oui, 3)) { + method = &(*vendor_table)->vendor_class[ + vclass]->method_table[i]; + BUG_ON(!*method); + goto check_in_use; + } + } + for (i = 0; i < MAX_MGMT_OUI; i++) { + /* OUI slot available ? */ + if (!is_vendor_oui((*vendor_table)->vendor_class[ + vclass]->oui[i])) { + method = &(*vendor_table)->vendor_class[ + vclass]->method_table[i]; + BUG_ON(*method); + /* Allocate method table for this OUI */ + if ((ret = allocate_method_table(method))) + goto error3; + memcpy((*vendor_table)->vendor_class[vclass]->oui[i], + mad_reg_req->oui, 3); + goto check_in_use; + } + } + printk(KERN_ERR PFX "All OUI slots in use\n"); + goto error3; + +check_in_use: + /* Now, make sure methods are not already in use */ + if (method_in_use(method, mad_reg_req)) + goto error4; + + /* Finally, add in methods being registered */ + for (i = find_first_bit(mad_reg_req->method_mask, + IB_MGMT_MAX_METHODS); + i < IB_MGMT_MAX_METHODS; + i = find_next_bit(mad_reg_req->method_mask, IB_MGMT_MAX_METHODS, + 1+i)) { + (*method)->agent[i] = agent_priv; + } + return 0; + +error4: + /* Remove any methods for this mad agent */ + remove_methods_mad_agent(*method, agent_priv); + /* Now, check to see if there are any methods in use */ + if (!check_method_table(*method)) { + /* If not, release management method table */ + kfree(*method); + *method = NULL; + } + ret = -EINVAL; +error3: + if (vendor_class) { + (*vendor_table)->vendor_class[vclass] = NULL; + kfree(vendor_class); + } +error2: + if (vendor) { + *vendor_table = NULL; + kfree(vendor); + } +error1: + return ret; +} + +static void remove_mad_reg_req(struct ib_mad_agent_private *agent_priv) +{ + struct ib_mad_port_private *port_priv; + struct ib_mad_mgmt_class_table *class; + struct ib_mad_mgmt_method_table *method; + struct ib_mad_mgmt_vendor_class_table *vendor; + struct ib_mad_mgmt_vendor_class *vendor_class; + int index; + u8 mgmt_class; + + /* + * Was MAD 
registration request supplied + * with original registration ? + */ + if (!agent_priv->reg_req) { + goto out; + } + + port_priv = agent_priv->qp_info->port_priv; + class = port_priv->version[ + agent_priv->reg_req->mgmt_class_version].class; + if (!class) + goto vendor_check; + + mgmt_class = convert_mgmt_class(agent_priv->reg_req->mgmt_class); + method = class->method_table[mgmt_class]; + if (method) { + /* Remove any methods for this mad agent */ + remove_methods_mad_agent(method, agent_priv); + /* Now, check to see if there are any methods still in use */ + if (!check_method_table(method)) { + /* If not, release management method table */ + kfree(method); + class->method_table[mgmt_class] = NULL; + /* Any management classes left ? */ + if (!check_class_table(class)) { + /* If not, release management class table */ + kfree(class); + port_priv->version[ + agent_priv->reg_req-> + mgmt_class_version].class = NULL; + } + } + } + +vendor_check: + vendor = port_priv->version[ + agent_priv->reg_req->mgmt_class_version].vendor; + if (!vendor) + goto out; + + mgmt_class = vendor_class_index(agent_priv->reg_req->mgmt_class); + vendor_class = vendor->vendor_class[mgmt_class]; + if (vendor_class) { + index = find_vendor_oui(vendor_class, agent_priv->reg_req->oui); + if (index == -1) + goto out; + method = vendor_class->method_table[index]; + if (method) { + /* Remove any methods for this mad agent */ + remove_methods_mad_agent(method, agent_priv); + /* + * Now, check to see if there are + * any methods still in use + */ + if (!check_method_table(method)) { + /* If not, release management method table */ + kfree(method); + vendor_class->method_table[index] = NULL; + memset(vendor_class->oui[index], 0, 3); + /* Any OUIs left ? */ + if (!check_vendor_class(vendor_class)) { + /* If not, release vendor class table */ + kfree(vendor_class); + vendor->vendor_class[mgmt_class] = NULL; + /* Any other vendor classes left ? 
*/ + if (!check_vendor_table(vendor)) { + kfree(vendor); + port_priv->version[ + agent_priv->reg_req-> + mgmt_class_version]. + vendor = NULL; + } + } + } + } + } + +out: + return; +} + +static int response_mad(struct ib_mad *mad) +{ + /* Trap represses are responses although response bit is reset */ + return ((mad->mad_hdr.method == IB_MGMT_METHOD_TRAP_REPRESS) || + (mad->mad_hdr.method & IB_MGMT_METHOD_RESP)); +} + +static int solicited_mad(struct ib_mad *mad) +{ + /* CM MADs are never solicited */ + if (mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_CM) { + return 0; + } + + /* XXX: Determine whether MAD is using RMPP */ + + /* Not using RMPP */ + /* Is this MAD a response to a previous MAD ? */ + return response_mad(mad); +} + +static struct ib_mad_agent_private * +find_mad_agent(struct ib_mad_port_private *port_priv, + struct ib_mad *mad, + int solicited) +{ + struct ib_mad_agent_private *mad_agent = NULL; + unsigned long flags; + + spin_lock_irqsave(&port_priv->reg_lock, flags); + + /* + * Whether MAD was solicited determines type of routing to + * MAD client. + */ + if (solicited) { + u32 hi_tid; + struct ib_mad_agent_private *entry; + + /* + * Routing is based on high 32 bits of transaction ID + * of MAD. 
+ */ + hi_tid = be64_to_cpu(mad->mad_hdr.tid) >> 32; + list_for_each_entry(entry, &port_priv->agent_list, + agent_list) { + if (entry->agent.hi_tid == hi_tid) { + mad_agent = entry; + break; + } + } + } else { + struct ib_mad_mgmt_class_table *class; + struct ib_mad_mgmt_method_table *method; + struct ib_mad_mgmt_vendor_class_table *vendor; + struct ib_mad_mgmt_vendor_class *vendor_class; + struct ib_vendor_mad *vendor_mad; + int index; + + /* + * Routing is based on version, class, and method + * For "newer" vendor MADs, also based on OUI + */ + if (mad->mad_hdr.class_version >= MAX_MGMT_VERSION) + goto out; + if (!is_vendor_class(mad->mad_hdr.mgmt_class)) { + class = port_priv->version[ + mad->mad_hdr.class_version].class; + if (!class) + goto out; + method = class->method_table[convert_mgmt_class( + mad->mad_hdr.mgmt_class)]; + if (method) + mad_agent = method->agent[mad->mad_hdr.method & + ~IB_MGMT_METHOD_RESP]; + } else { + vendor = port_priv->version[ + mad->mad_hdr.class_version].vendor; + if (!vendor) + goto out; + vendor_class = vendor->vendor_class[vendor_class_index( + mad->mad_hdr.mgmt_class)]; + if (!vendor_class) + goto out; + /* Find matching OUI */ + vendor_mad = (struct ib_vendor_mad *)mad; + index = find_vendor_oui(vendor_class, vendor_mad->oui); + if (index == -1) + goto out; + method = vendor_class->method_table[index]; + if (method) { + mad_agent = method->agent[mad->mad_hdr.method & + ~IB_MGMT_METHOD_RESP]; + } + } + } + + if (mad_agent) { + if (mad_agent->agent.recv_handler) + atomic_inc(&mad_agent->refcount); + else { + printk(KERN_NOTICE PFX "No receive handler for client " + "%p on port %d\n", + &mad_agent->agent, port_priv->port_num); + mad_agent = NULL; + } + } +out: + spin_unlock_irqrestore(&port_priv->reg_lock, flags); + + return mad_agent; +} + +static int validate_mad(struct ib_mad *mad, u32 qp_num) +{ + int valid = 0; + + /* Make sure MAD base version is understood */ + if (mad->mad_hdr.base_version != IB_MGMT_BASE_VERSION) { + 
printk(KERN_ERR PFX "MAD received with unsupported base " + "version %d\n", mad->mad_hdr.base_version); + goto out; + } + + /* Filter SMI packets sent to other than QP0 */ + if ((mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_LID_ROUTED) || + (mad->mad_hdr.mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)) { + if (qp_num == 0) + valid = 1; + } else { + /* Filter GSI packets sent to QP0 */ + if (qp_num != 0) + valid = 1; + } + +out: + return valid; +} + +/* + * Return start of fully reassembled MAD, or NULL, if MAD isn't assembled yet + */ +static struct ib_mad_private * +reassemble_recv(struct ib_mad_agent_private *mad_agent_priv, + struct ib_mad_private *recv) +{ + /* Until we have RMPP, all receives are reassembled!... */ + INIT_LIST_HEAD(&recv->header.recv_wc.recv_buf.list); + return recv; +} + +static struct ib_mad_send_wr_private* +find_send_req(struct ib_mad_agent_private *mad_agent_priv, + u64 tid) +{ + struct ib_mad_send_wr_private *mad_send_wr; + + list_for_each_entry(mad_send_wr, &mad_agent_priv->wait_list, + agent_list) { + if (mad_send_wr->tid == tid) + return mad_send_wr; + } + + /* + * It's possible to receive the response before we've + * been notified that the send has completed + */ + list_for_each_entry(mad_send_wr, &mad_agent_priv->send_list, + agent_list) { + if (mad_send_wr->tid == tid && mad_send_wr->timeout) { + /* Verify request has not been canceled */ + return (mad_send_wr->status == IB_WC_SUCCESS) ? 
+ mad_send_wr : NULL; + } + } + return NULL; +} + +static void ib_mad_complete_recv(struct ib_mad_agent_private *mad_agent_priv, + struct ib_mad_private *recv, + int solicited) +{ + struct ib_mad_send_wr_private *mad_send_wr; + struct ib_mad_send_wc mad_send_wc; + unsigned long flags; + + /* Fully reassemble receive before processing */ + recv = reassemble_recv(mad_agent_priv, recv); + if (!recv) { + if (atomic_dec_and_test(&mad_agent_priv->refcount)) + wake_up(&mad_agent_priv->wait); + return; + } + + /* Complete corresponding request */ + if (solicited) { + spin_lock_irqsave(&mad_agent_priv->lock, flags); + mad_send_wr = find_send_req(mad_agent_priv, + recv->mad.mad.mad_hdr.tid); + if (!mad_send_wr) { + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + ib_free_recv_mad(&recv->header.recv_wc); + if (atomic_dec_and_test(&mad_agent_priv->refcount)) + wake_up(&mad_agent_priv->wait); + return; + } + /* Timeout = 0 means that we won't wait for a response */ + mad_send_wr->timeout = 0; + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + + /* Defined behavior is to complete response before request */ + recv->header.recv_wc.wc->wr_id = mad_send_wr->wr_id; + mad_agent_priv->agent.recv_handler( + &mad_agent_priv->agent, + &recv->header.recv_wc); + atomic_dec(&mad_agent_priv->refcount); + + mad_send_wc.status = IB_WC_SUCCESS; + mad_send_wc.vendor_err = 0; + mad_send_wc.wr_id = mad_send_wr->wr_id; + ib_mad_complete_send_wr(mad_send_wr, &mad_send_wc); + } else { + mad_agent_priv->agent.recv_handler( + &mad_agent_priv->agent, + &recv->header.recv_wc); + if (atomic_dec_and_test(&mad_agent_priv->refcount)) + wake_up(&mad_agent_priv->wait); + } +} + +static void ib_mad_recv_done_handler(struct ib_mad_port_private *port_priv, + struct ib_wc *wc) +{ + struct ib_mad_qp_info *qp_info; + struct ib_mad_private_header *mad_priv_hdr; + struct ib_mad_private *recv, *response; + struct ib_mad_list_head *mad_list; + struct ib_mad_agent_private *mad_agent; + int solicited; + + 
response = kmem_cache_alloc(ib_mad_cache, GFP_KERNEL); + if (!response) + printk(KERN_ERR PFX "ib_mad_recv_done_handler no memory " + "for response buffer\n"); + + mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id; + qp_info = mad_list->mad_queue->qp_info; + dequeue_mad(mad_list); + + mad_priv_hdr = container_of(mad_list, struct ib_mad_private_header, + mad_list); + recv = container_of(mad_priv_hdr, struct ib_mad_private, header); + dma_unmap_single(port_priv->device->dma_device, + pci_unmap_addr(&recv->header, mapping), + sizeof(struct ib_mad_private) - + sizeof(struct ib_mad_private_header), + DMA_FROM_DEVICE); + + /* Setup MAD receive work completion from "normal" work completion */ + recv->header.recv_wc.wc = wc; + recv->header.recv_wc.mad_len = sizeof(struct ib_mad); + recv->header.recv_wc.recv_buf.mad = &recv->mad.mad; + recv->header.recv_wc.recv_buf.grh = &recv->grh; + + if (atomic_read(&qp_info->snoop_count)) + snoop_recv(qp_info, &recv->header.recv_wc, IB_MAD_SNOOP_RECVS); + + /* Validate MAD */ + if (!validate_mad(&recv->mad.mad, qp_info->qp->qp_num)) + goto out; + + if (recv->mad.mad.mad_hdr.mgmt_class == + IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) { + if (!smi_handle_dr_smp_recv(&recv->mad.smp, + port_priv->device->node_type, + port_priv->port_num, + port_priv->device->phys_port_cnt)) + goto out; + if (!smi_check_forward_dr_smp(&recv->mad.smp)) + goto local; + if (!smi_handle_dr_smp_send(&recv->mad.smp, + port_priv->device->node_type, + port_priv->port_num)) + goto out; + if (!smi_check_local_dr_smp(&recv->mad.smp, + port_priv->device, + port_priv->port_num)) + goto out; + } + +local: + /* Give driver "right of first refusal" on incoming MAD */ + if (port_priv->device->process_mad) { + int ret; + + if (!response) { + printk(KERN_ERR PFX "No memory for response MAD\n"); + /* + * Is it better to assume that + * it wouldn't be processed ? 
+ */ + goto out; + } + + ret = port_priv->device->process_mad(port_priv->device, 0, + port_priv->port_num, + wc->slid, + &recv->mad.mad, + &response->mad.mad); + if (ret & IB_MAD_RESULT_SUCCESS) { + if (ret & IB_MAD_RESULT_CONSUMED) + goto out; + if (ret & IB_MAD_RESULT_REPLY) { + /* Send response */ + if (!agent_send(response, &recv->grh, wc, + port_priv->device, + port_priv->port_num)) + response = NULL; + goto out; + } + } + } + + /* Determine corresponding MAD agent for incoming receive MAD */ + solicited = solicited_mad(&recv->mad.mad); + mad_agent = find_mad_agent(port_priv, &recv->mad.mad, solicited); + if (mad_agent) { + ib_mad_complete_recv(mad_agent, recv, solicited); + /* + * recv is freed up in error cases in ib_mad_complete_recv + * or via recv_handler in ib_mad_complete_recv() + */ + recv = NULL; + } + +out: + /* Post another receive request for this QP */ + if (response) { + ib_mad_post_receive_mads(qp_info, response); + if (recv) + kmem_cache_free(ib_mad_cache, recv); + } else + ib_mad_post_receive_mads(qp_info, recv); +} + +static void adjust_timeout(struct ib_mad_agent_private *mad_agent_priv) +{ + struct ib_mad_send_wr_private *mad_send_wr; + unsigned long delay; + + if (list_empty(&mad_agent_priv->wait_list)) { + cancel_delayed_work(&mad_agent_priv->timed_work); + } else { + mad_send_wr = list_entry(mad_agent_priv->wait_list.next, + struct ib_mad_send_wr_private, + agent_list); + + if (time_after(mad_agent_priv->timeout, + mad_send_wr->timeout)) { + mad_agent_priv->timeout = mad_send_wr->timeout; + cancel_delayed_work(&mad_agent_priv->timed_work); + delay = mad_send_wr->timeout - jiffies; + if ((long)delay <= 0) + delay = 1; + queue_delayed_work(mad_agent_priv->qp_info-> + port_priv->wq, + &mad_agent_priv->timed_work, delay); + } + } +} + +static void wait_for_response(struct ib_mad_agent_private *mad_agent_priv, + struct ib_mad_send_wr_private *mad_send_wr ) +{ + struct ib_mad_send_wr_private *temp_mad_send_wr; + struct list_head *list_item; + 
unsigned long delay; + + list_del(&mad_send_wr->agent_list); + + delay = mad_send_wr->timeout; + mad_send_wr->timeout += jiffies; + + list_for_each_prev(list_item, &mad_agent_priv->wait_list) { + temp_mad_send_wr = list_entry(list_item, + struct ib_mad_send_wr_private, + agent_list); + if (time_after(mad_send_wr->timeout, + temp_mad_send_wr->timeout)) + break; + } + list_add(&mad_send_wr->agent_list, list_item); + + /* Reschedule a work item if we have a shorter timeout */ + if (mad_agent_priv->wait_list.next == &mad_send_wr->agent_list) { + cancel_delayed_work(&mad_agent_priv->timed_work); + queue_delayed_work(mad_agent_priv->qp_info->port_priv->wq, + &mad_agent_priv->timed_work, delay); + } +} + +/* + * Process a send work completion + */ +static void ib_mad_complete_send_wr(struct ib_mad_send_wr_private *mad_send_wr, + struct ib_mad_send_wc *mad_send_wc) +{ + struct ib_mad_agent_private *mad_agent_priv; + unsigned long flags; + + mad_agent_priv = container_of(mad_send_wr->agent, + struct ib_mad_agent_private, agent); + + spin_lock_irqsave(&mad_agent_priv->lock, flags); + if (mad_send_wc->status != IB_WC_SUCCESS && + mad_send_wr->status == IB_WC_SUCCESS) { + mad_send_wr->status = mad_send_wc->status; + mad_send_wr->refcount -= (mad_send_wr->timeout > 0); + } + + if (--mad_send_wr->refcount > 0) { + if (mad_send_wr->refcount == 1 && mad_send_wr->timeout && + mad_send_wr->status == IB_WC_SUCCESS) { + wait_for_response(mad_agent_priv, mad_send_wr); + } + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + return; + } + + /* Remove send from MAD agent and notify client of completion */ + list_del(&mad_send_wr->agent_list); + adjust_timeout(mad_agent_priv); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + + if (mad_send_wr->status != IB_WC_SUCCESS ) + mad_send_wc->status = mad_send_wr->status; + mad_agent_priv->agent.send_handler(&mad_agent_priv->agent, + mad_send_wc); + + /* Release reference on agent taken when sending */ + if 
(atomic_dec_and_test(&mad_agent_priv->refcount)) + wake_up(&mad_agent_priv->wait); + + kfree(mad_send_wr); +} + +static void ib_mad_send_done_handler(struct ib_mad_port_private *port_priv, + struct ib_wc *wc) +{ + struct ib_mad_send_wr_private *mad_send_wr, *queued_send_wr; + struct ib_mad_list_head *mad_list; + struct ib_mad_qp_info *qp_info; + struct ib_mad_queue *send_queue; + struct ib_send_wr *bad_send_wr; + unsigned long flags; + int ret; + + mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id; + mad_send_wr = container_of(mad_list, struct ib_mad_send_wr_private, + mad_list); + send_queue = mad_list->mad_queue; + qp_info = send_queue->qp_info; + +retry: + queued_send_wr = NULL; + spin_lock_irqsave(&send_queue->lock, flags); + list_del(&mad_list->list); + + /* Move queued send to the send queue */ + if (send_queue->count-- > send_queue->max_active) { + mad_list = container_of(qp_info->overflow_list.next, + struct ib_mad_list_head, list); + queued_send_wr = container_of(mad_list, + struct ib_mad_send_wr_private, + mad_list); + list_del(&mad_list->list); + list_add_tail(&mad_list->list, &send_queue->list); + } + spin_unlock_irqrestore(&send_queue->lock, flags); + + /* Restore client wr_id in WC and complete send */ + wc->wr_id = mad_send_wr->wr_id; + if (atomic_read(&qp_info->snoop_count)) + snoop_send(qp_info, &mad_send_wr->send_wr, + (struct ib_mad_send_wc *)wc, + IB_MAD_SNOOP_SEND_COMPLETIONS); + ib_mad_complete_send_wr(mad_send_wr, (struct ib_mad_send_wc *)wc); + + if (queued_send_wr) { + ret = ib_post_send(qp_info->qp, &queued_send_wr->send_wr, + &bad_send_wr); + if (ret) { + printk(KERN_ERR PFX "ib_post_send failed: %d\n", ret); + mad_send_wr = queued_send_wr; + wc->status = IB_WC_LOC_QP_OP_ERR; + goto retry; + } + } +} + +static void mark_sends_for_retry(struct ib_mad_qp_info *qp_info) +{ + struct ib_mad_send_wr_private *mad_send_wr; + struct ib_mad_list_head *mad_list; + unsigned long flags; + + spin_lock_irqsave(&qp_info->send_queue.lock, 
flags); + list_for_each_entry(mad_list, &qp_info->send_queue.list, list) { + mad_send_wr = container_of(mad_list, + struct ib_mad_send_wr_private, + mad_list); + mad_send_wr->retry = 1; + } + spin_unlock_irqrestore(&qp_info->send_queue.lock, flags); +} + +static void mad_error_handler(struct ib_mad_port_private *port_priv, + struct ib_wc *wc) +{ + struct ib_mad_list_head *mad_list; + struct ib_mad_qp_info *qp_info; + struct ib_mad_send_wr_private *mad_send_wr; + int ret; + + /* Determine if failure was a send or receive */ + mad_list = (struct ib_mad_list_head *)(unsigned long)wc->wr_id; + qp_info = mad_list->mad_queue->qp_info; + if (mad_list->mad_queue == &qp_info->recv_queue) + /* + * Receive errors indicate that the QP has entered the error + * state - error handling/shutdown code will cleanup + */ + return; + + /* + * Send errors will transition the QP to SQE - move + * QP to RTS and repost flushed work requests + */ + mad_send_wr = container_of(mad_list, struct ib_mad_send_wr_private, + mad_list); + if (wc->status == IB_WC_WR_FLUSH_ERR) { + if (mad_send_wr->retry) { + /* Repost send */ + struct ib_send_wr *bad_send_wr; + + mad_send_wr->retry = 0; + ret = ib_post_send(qp_info->qp, &mad_send_wr->send_wr, + &bad_send_wr); + if (ret) + ib_mad_send_done_handler(port_priv, wc); + } else + ib_mad_send_done_handler(port_priv, wc); + } else { + struct ib_qp_attr *attr; + + /* Transition QP to RTS and fail offending send */ + attr = kmalloc(sizeof *attr, GFP_KERNEL); + if (attr) { + attr->qp_state = IB_QPS_RTS; + attr->cur_qp_state = IB_QPS_SQE; + ret = ib_modify_qp(qp_info->qp, attr, + IB_QP_STATE | IB_QP_CUR_STATE); + kfree(attr); + if (ret) + printk(KERN_ERR PFX "mad_error_handler - " + "ib_modify_qp to RTS : %d\n", ret); + else + mark_sends_for_retry(qp_info); + } + ib_mad_send_done_handler(port_priv, wc); + } +} + +/* + * IB MAD completion callback + */ +static void ib_mad_completion_handler(void *data) +{ + struct ib_mad_port_private *port_priv; + struct ib_wc 
wc; + + port_priv = (struct ib_mad_port_private *)data; + ib_req_notify_cq(port_priv->cq, IB_CQ_NEXT_COMP); + + while (ib_poll_cq(port_priv->cq, 1, &wc) == 1) { + if (wc.status == IB_WC_SUCCESS) { + switch (wc.opcode) { + case IB_WC_SEND: + ib_mad_send_done_handler(port_priv, &wc); + break; + case IB_WC_RECV: + ib_mad_recv_done_handler(port_priv, &wc); + break; + default: + BUG_ON(1); + break; + } + } else + mad_error_handler(port_priv, &wc); + } +} + +static void cancel_mads(struct ib_mad_agent_private *mad_agent_priv) +{ + unsigned long flags; + struct ib_mad_send_wr_private *mad_send_wr, *temp_mad_send_wr; + struct ib_mad_send_wc mad_send_wc; + struct list_head cancel_list; + + INIT_LIST_HEAD(&cancel_list); + + spin_lock_irqsave(&mad_agent_priv->lock, flags); + list_for_each_entry_safe(mad_send_wr, temp_mad_send_wr, + &mad_agent_priv->send_list, agent_list) { + if (mad_send_wr->status == IB_WC_SUCCESS) { + mad_send_wr->status = IB_WC_WR_FLUSH_ERR; + mad_send_wr->refcount -= (mad_send_wr->timeout > 0); + } + } + + /* Empty wait list to prevent receives from finding a request */ + list_splice_init(&mad_agent_priv->wait_list, &cancel_list); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + + /* Report all cancelled requests */ + mad_send_wc.status = IB_WC_WR_FLUSH_ERR; + mad_send_wc.vendor_err = 0; + + list_for_each_entry_safe(mad_send_wr, temp_mad_send_wr, + &cancel_list, agent_list) { + mad_send_wc.wr_id = mad_send_wr->wr_id; + mad_agent_priv->agent.send_handler(&mad_agent_priv->agent, + &mad_send_wc); + + list_del(&mad_send_wr->agent_list); + kfree(mad_send_wr); + atomic_dec(&mad_agent_priv->refcount); + } +} + +static struct ib_mad_send_wr_private* +find_send_by_wr_id(struct ib_mad_agent_private *mad_agent_priv, + u64 wr_id) +{ + struct ib_mad_send_wr_private *mad_send_wr; + + list_for_each_entry(mad_send_wr, &mad_agent_priv->wait_list, + agent_list) { + if (mad_send_wr->wr_id == wr_id) + return mad_send_wr; + } + + list_for_each_entry(mad_send_wr, 
&mad_agent_priv->send_list, + agent_list) { + if (mad_send_wr->wr_id == wr_id) + return mad_send_wr; + } + return NULL; +} + +void ib_cancel_mad(struct ib_mad_agent *mad_agent, + u64 wr_id) +{ + struct ib_mad_agent_private *mad_agent_priv; + struct ib_mad_send_wr_private *mad_send_wr; + struct ib_mad_send_wc mad_send_wc; + unsigned long flags; + + mad_agent_priv = container_of(mad_agent, struct ib_mad_agent_private, + agent); + spin_lock_irqsave(&mad_agent_priv->lock, flags); + mad_send_wr = find_send_by_wr_id(mad_agent_priv, wr_id); + if (!mad_send_wr) { + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + goto out; + } + + if (mad_send_wr->status == IB_WC_SUCCESS) + mad_send_wr->refcount -= (mad_send_wr->timeout > 0); + + if (mad_send_wr->refcount != 0) { + mad_send_wr->status = IB_WC_WR_FLUSH_ERR; + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + goto out; + } + + list_del(&mad_send_wr->agent_list); + adjust_timeout(mad_agent_priv); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + + mad_send_wc.status = IB_WC_WR_FLUSH_ERR; + mad_send_wc.vendor_err = 0; + mad_send_wc.wr_id = mad_send_wr->wr_id; + mad_agent_priv->agent.send_handler(&mad_agent_priv->agent, + &mad_send_wc); + + kfree(mad_send_wr); + if (atomic_dec_and_test(&mad_agent_priv->refcount)) + wake_up(&mad_agent_priv->wait); + +out: + return; +} +EXPORT_SYMBOL(ib_cancel_mad); + +static void local_completions(void *data) +{ + struct ib_mad_agent_private *mad_agent_priv; + struct ib_mad_local_private *local; + unsigned long flags; + struct ib_wc wc; + struct ib_mad_send_wc mad_send_wc; + + mad_agent_priv = (struct ib_mad_agent_private *)data; + + spin_lock_irqsave(&mad_agent_priv->lock, flags); + while (!list_empty(&mad_agent_priv->local_list)) { + local = list_entry(mad_agent_priv->local_list.next, + struct ib_mad_local_private, + completion_list); + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); + if (local->mad_priv) { + /* + * Defined behavior is to complete response + * 
before request + */ + wc.wr_id = local->wr_id; + wc.status = IB_WC_SUCCESS; + wc.opcode = IB_WC_RECV; + wc.vendor_err = 0; + wc.byte_len = sizeof(struct ib_mad); + wc.src_qp = IB_QP0; + wc.wc_flags = 0; + wc.pkey_index = 0; + wc.slid = IB_LID_PERMISSIVE; + wc.sl = 0; + wc.dlid_path_bits = 0; + local->mad_priv->header.recv_wc.wc = &wc; + local->mad_priv->header.recv_wc.mad_len = + sizeof(struct ib_mad); + INIT_LIST_HEAD(&local->mad_priv->header.recv_wc.recv_buf.list); + local->mad_priv->header.recv_wc.recv_buf.grh = NULL; + local->mad_priv->header.recv_wc.recv_buf.mad = + &local->mad_priv->mad.mad; + if (atomic_read(&mad_agent_priv->qp_info->snoop_count)) + snoop_recv(mad_agent_priv->qp_info, + &local->mad_priv->header.recv_wc, + IB_MAD_SNOOP_RECVS); + mad_agent_priv->agent.recv_handler( + &mad_agent_priv->agent, + &local->mad_priv->header.recv_wc); + } + + /* Complete send */ + mad_send_wc.status = IB_WC_SUCCESS; + mad_send_wc.vendor_err = 0; + mad_send_wc.wr_id = local->wr_id; + if (atomic_read(&mad_agent_priv->qp_info->snoop_count)) + snoop_send(mad_agent_priv->qp_info, &local->send_wr, + &mad_send_wc, + IB_MAD_SNOOP_SEND_COMPLETIONS); + mad_agent_priv->agent.send_handler(&mad_agent_priv->agent, + &mad_send_wc); + + spin_lock_irqsave(&mad_agent_priv->lock, flags); + list_del(&local->completion_list); + atomic_dec(&mad_agent_priv->refcount); + kfree(local); + } + spin_unlock_irqrestore(&mad_agent_priv->lock, flags); +} + +static void timeout_sends(void *data) +{ + struct ib_mad_agent_private *mad_agent_priv; + struct ib_mad_send_wr_private *mad_send_wr; + struct ib_mad_send_wc mad_send_wc; + unsigned long flags, delay; + + mad_agent_priv = (struct ib_mad_agent_private *)data; + + mad_send_wc.status = IB_WC_RESP_TIMEOUT_ERR; + mad_send_wc.vendor_err = 0; + + spin_lock_irqsave(&mad_agent_priv->lock, flags); + while (!list_empty(&mad_agent_priv->wait_list)) { + mad_send_wr = list_entry(mad_agent_priv->wait_list.next, + struct ib_mad_send_wr_private, + agent_list); + 
+		if (time_after(mad_send_wr->timeout, jiffies)) {
+			delay = mad_send_wr->timeout - jiffies;
+			if ((long)delay <= 0)
+				delay = 1;
+			queue_delayed_work(mad_agent_priv->qp_info->
+					   port_priv->wq,
+					   &mad_agent_priv->timed_work, delay);
+			break;
+		}
+
+		list_del(&mad_send_wr->agent_list);
+		spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+
+		mad_send_wc.wr_id = mad_send_wr->wr_id;
+		mad_agent_priv->agent.send_handler(&mad_agent_priv->agent,
+						   &mad_send_wc);
+
+		kfree(mad_send_wr);
+		atomic_dec(&mad_agent_priv->refcount);
+		spin_lock_irqsave(&mad_agent_priv->lock, flags);
+	}
+	spin_unlock_irqrestore(&mad_agent_priv->lock, flags);
+}
+
+static void ib_mad_thread_completion_handler(struct ib_cq *cq)
+{
+	struct ib_mad_port_private *port_priv = cq->cq_context;
+
+	queue_work(port_priv->wq, &port_priv->work);
+}
+
+/*
+ * Allocate receive MADs and post receive WRs for them
+ */
+static int ib_mad_post_receive_mads(struct ib_mad_qp_info *qp_info,
+				    struct ib_mad_private *mad)
+{
+	unsigned long flags;
+	int post, ret;
+	struct ib_mad_private *mad_priv;
+	struct ib_sge sg_list;
+	struct ib_recv_wr recv_wr, *bad_recv_wr;
+	struct ib_mad_queue *recv_queue = &qp_info->recv_queue;
+
+	/* Initialize common scatter list fields */
+	sg_list.length = sizeof *mad_priv - sizeof mad_priv->header;
+	sg_list.lkey = (*qp_info->port_priv->mr).lkey;
+
+	/* Initialize common receive WR fields */
+	recv_wr.next = NULL;
+	recv_wr.sg_list = &sg_list;
+	recv_wr.num_sge = 1;
+	recv_wr.recv_flags = IB_RECV_SIGNALED;
+
+	do {
+		/* Allocate and map receive buffer */
+		if (mad) {
+			mad_priv = mad;
+			mad = NULL;
+		} else {
+			mad_priv = kmem_cache_alloc(ib_mad_cache, GFP_KERNEL);
+			if (!mad_priv) {
+				printk(KERN_ERR PFX "No memory for receive buffer\n");
+				ret = -ENOMEM;
+				break;
+			}
+		}
+		sg_list.addr = dma_map_single(qp_info->port_priv->
+						device->dma_device,
+					&mad_priv->grh,
+					sizeof *mad_priv -
+						sizeof mad_priv->header,
+					DMA_FROM_DEVICE);
+		pci_unmap_addr_set(&mad_priv->header, mapping,
				   sg_list.addr);
+		recv_wr.wr_id = (unsigned long)&mad_priv->header.mad_list;
+		mad_priv->header.mad_list.mad_queue = recv_queue;
+
+		/* Post receive WR */
+		spin_lock_irqsave(&recv_queue->lock, flags);
+		post = (++recv_queue->count < recv_queue->max_active);
+		list_add_tail(&mad_priv->header.mad_list.list, &recv_queue->list);
+		spin_unlock_irqrestore(&recv_queue->lock, flags);
+		ret = ib_post_recv(qp_info->qp, &recv_wr, &bad_recv_wr);
+		if (ret) {
+			spin_lock_irqsave(&recv_queue->lock, flags);
+			list_del(&mad_priv->header.mad_list.list);
+			recv_queue->count--;
+			spin_unlock_irqrestore(&recv_queue->lock, flags);
+			dma_unmap_single(qp_info->port_priv->device->dma_device,
+					 pci_unmap_addr(&mad_priv->header,
+							mapping),
+					 sizeof *mad_priv -
+					   sizeof mad_priv->header,
+					 DMA_FROM_DEVICE);
+			kmem_cache_free(ib_mad_cache, mad_priv);
+			printk(KERN_ERR PFX "ib_post_recv failed: %d\n", ret);
+			break;
+		}
+	} while (post);
+
+	return ret;
+}
+
+/*
+ * Return all the posted receive MADs
+ */
+static void cleanup_recv_queue(struct ib_mad_qp_info *qp_info)
+{
+	struct ib_mad_private_header *mad_priv_hdr;
+	struct ib_mad_private *recv;
+	struct ib_mad_list_head *mad_list;
+
+	while (!list_empty(&qp_info->recv_queue.list)) {
+
+		mad_list = list_entry(qp_info->recv_queue.list.next,
+				      struct ib_mad_list_head, list);
+		mad_priv_hdr = container_of(mad_list,
+					    struct ib_mad_private_header,
+					    mad_list);
+		recv = container_of(mad_priv_hdr, struct ib_mad_private,
+				    header);
+
+		/* Remove from posted receive MAD list */
+		list_del(&mad_list->list);
+
+		/* Undo PCI mapping */
+		dma_unmap_single(qp_info->port_priv->device->dma_device,
+				 pci_unmap_addr(&recv->header, mapping),
+				 sizeof(struct ib_mad_private) -
+				 sizeof(struct ib_mad_private_header),
+				 DMA_FROM_DEVICE);
+		kmem_cache_free(ib_mad_cache, recv);
+	}
+
+	qp_info->recv_queue.count = 0;
+}
+
+/*
+ * Start the port
+ */
+static int ib_mad_port_start(struct ib_mad_port_private *port_priv)
+{
+	int ret, i;
+	struct ib_qp_attr *attr;
+
	struct ib_qp *qp;
+
+	attr = kmalloc(sizeof *attr, GFP_KERNEL);
+	if (!attr) {
+		printk(KERN_ERR PFX "Couldn't kmalloc ib_qp_attr\n");
+		return -ENOMEM;
+	}
+
+	for (i = 0; i < IB_MAD_QPS_CORE; i++) {
+		qp = port_priv->qp_info[i].qp;
+		/*
+		 * PKey index for QP1 is irrelevant but
+		 * one is needed for the Reset to Init transition
+		 */
+		attr->qp_state = IB_QPS_INIT;
+		attr->pkey_index = 0;
+		attr->qkey = (qp->qp_num == 0) ? 0 : IB_QP1_QKEY;
+		ret = ib_modify_qp(qp, attr, IB_QP_STATE |
+					     IB_QP_PKEY_INDEX | IB_QP_QKEY);
+		if (ret) {
+			printk(KERN_ERR PFX "Couldn't change QP%d state to "
+			       "INIT: %d\n", i, ret);
+			goto out;
+		}
+
+		attr->qp_state = IB_QPS_RTR;
+		ret = ib_modify_qp(qp, attr, IB_QP_STATE);
+		if (ret) {
+			printk(KERN_ERR PFX "Couldn't change QP%d state to "
+			       "RTR: %d\n", i, ret);
+			goto out;
+		}
+
+		attr->qp_state = IB_QPS_RTS;
+		attr->sq_psn = IB_MAD_SEND_Q_PSN;
+		ret = ib_modify_qp(qp, attr, IB_QP_STATE | IB_QP_SQ_PSN);
+		if (ret) {
+			printk(KERN_ERR PFX "Couldn't change QP%d state to "
+			       "RTS: %d\n", i, ret);
+			goto out;
+		}
+	}
+
+	ret = ib_req_notify_cq(port_priv->cq, IB_CQ_NEXT_COMP);
+	if (ret) {
+		printk(KERN_ERR PFX "Failed to request completion "
+		       "notification: %d\n", ret);
+		goto out;
+	}
+
+	for (i = 0; i < IB_MAD_QPS_CORE; i++) {
+		ret = ib_mad_post_receive_mads(&port_priv->qp_info[i], NULL);
+		if (ret) {
+			printk(KERN_ERR PFX "Couldn't post receive WRs\n");
+			goto out;
+		}
+	}
+out:
+	kfree(attr);
+	return ret;
+}
+
+static void qp_event_handler(struct ib_event *event, void *qp_context)
+{
+	struct ib_mad_qp_info *qp_info = qp_context;
+
+	/* It's worse than that! He's dead, Jim!
*/
+	printk(KERN_ERR PFX "Fatal error (%d) on MAD QP (%d)\n",
+		event->event, qp_info->qp->qp_num);
+}
+
+static void init_mad_queue(struct ib_mad_qp_info *qp_info,
+			   struct ib_mad_queue *mad_queue)
+{
+	mad_queue->qp_info = qp_info;
+	mad_queue->count = 0;
+	spin_lock_init(&mad_queue->lock);
+	INIT_LIST_HEAD(&mad_queue->list);
+}
+
+static void init_mad_qp(struct ib_mad_port_private *port_priv,
+			struct ib_mad_qp_info *qp_info)
+{
+	qp_info->port_priv = port_priv;
+	init_mad_queue(qp_info, &qp_info->send_queue);
+	init_mad_queue(qp_info, &qp_info->recv_queue);
+	INIT_LIST_HEAD(&qp_info->overflow_list);
+	spin_lock_init(&qp_info->snoop_lock);
+	qp_info->snoop_table = NULL;
+	qp_info->snoop_table_size = 0;
+	atomic_set(&qp_info->snoop_count, 0);
+}
+
+static int create_mad_qp(struct ib_mad_qp_info *qp_info,
+			 enum ib_qp_type qp_type)
+{
+	struct ib_qp_init_attr qp_init_attr;
+	int ret;
+
+	memset(&qp_init_attr, 0, sizeof qp_init_attr);
+	qp_init_attr.send_cq = qp_info->port_priv->cq;
+	qp_init_attr.recv_cq = qp_info->port_priv->cq;
+	qp_init_attr.sq_sig_type = IB_SIGNAL_ALL_WR;
+	qp_init_attr.rq_sig_type = IB_SIGNAL_ALL_WR;
+	qp_init_attr.cap.max_send_wr = IB_MAD_QP_SEND_SIZE;
+	qp_init_attr.cap.max_recv_wr = IB_MAD_QP_RECV_SIZE;
+	qp_init_attr.cap.max_send_sge = IB_MAD_SEND_REQ_MAX_SG;
+	qp_init_attr.cap.max_recv_sge = IB_MAD_RECV_REQ_MAX_SG;
+	qp_init_attr.qp_type = qp_type;
+	qp_init_attr.port_num = qp_info->port_priv->port_num;
+	qp_init_attr.qp_context = qp_info;
+	qp_init_attr.event_handler = qp_event_handler;
+	qp_info->qp = ib_create_qp(qp_info->port_priv->pd, &qp_init_attr);
+	if (IS_ERR(qp_info->qp)) {
+		printk(KERN_ERR PFX "Couldn't create ib_mad QP%d\n",
+		       get_spl_qp_index(qp_type));
+		ret = PTR_ERR(qp_info->qp);
+		goto error;
+	}
+	/* Use minimum queue sizes unless the CQ is resized */
+	qp_info->send_queue.max_active = IB_MAD_QP_SEND_SIZE;
+	qp_info->recv_queue.max_active = IB_MAD_QP_RECV_SIZE;
+	return 0;
+
+error:
+	return ret;
+}
+
+static void
destroy_mad_qp(struct ib_mad_qp_info *qp_info)
+{
+	ib_destroy_qp(qp_info->qp);
+	if (qp_info->snoop_table)
+		kfree(qp_info->snoop_table);
+}
+
+/*
+ * Open the port
+ * Create the QP, PD, MR, and CQ if needed
+ */
+static int ib_mad_port_open(struct ib_device *device,
+			    int port_num)
+{
+	int ret, cq_size;
+	struct ib_mad_port_private *port_priv;
+	unsigned long flags;
+	char name[sizeof "ib_mad123"];
+
+	/* First, check if port already open at MAD layer */
+	port_priv = ib_get_mad_port(device, port_num);
+	if (port_priv) {
+		printk(KERN_DEBUG PFX "%s port %d already open\n",
+		       device->name, port_num);
+		return 0;
+	}
+
+	/* Create new device info */
+	port_priv = kmalloc(sizeof *port_priv, GFP_KERNEL);
+	if (!port_priv) {
+		printk(KERN_ERR PFX "No memory for ib_mad_port_private\n");
+		return -ENOMEM;
+	}
+	memset(port_priv, 0, sizeof *port_priv);
+	port_priv->device = device;
+	port_priv->port_num = port_num;
+	spin_lock_init(&port_priv->reg_lock);
+	INIT_LIST_HEAD(&port_priv->agent_list);
+	init_mad_qp(port_priv, &port_priv->qp_info[0]);
+	init_mad_qp(port_priv, &port_priv->qp_info[1]);
+
+	cq_size = (IB_MAD_QP_SEND_SIZE + IB_MAD_QP_RECV_SIZE) * 2;
+	port_priv->cq = ib_create_cq(port_priv->device,
+				     (ib_comp_handler)
+					ib_mad_thread_completion_handler,
+				     NULL, port_priv, cq_size);
+	if (IS_ERR(port_priv->cq)) {
+		printk(KERN_ERR PFX "Couldn't create ib_mad CQ\n");
+		ret = PTR_ERR(port_priv->cq);
+		goto error3;
+	}
+
+	port_priv->pd = ib_alloc_pd(device);
+	if (IS_ERR(port_priv->pd)) {
+		printk(KERN_ERR PFX "Couldn't create ib_mad PD\n");
+		ret = PTR_ERR(port_priv->pd);
+		goto error4;
+	}
+
+	port_priv->mr = ib_get_dma_mr(port_priv->pd, IB_ACCESS_LOCAL_WRITE);
+	if (IS_ERR(port_priv->mr)) {
+		printk(KERN_ERR PFX "Couldn't get ib_mad DMA MR\n");
+		ret = PTR_ERR(port_priv->mr);
+		goto error5;
+	}
+
+	ret = create_mad_qp(&port_priv->qp_info[0], IB_QPT_SMI);
+	if (ret)
+		goto error6;
+	ret = create_mad_qp(&port_priv->qp_info[1], IB_QPT_GSI);
+	if (ret)
+		goto error7;
+
+	snprintf(name, sizeof name, "ib_mad%d", port_num);
+	port_priv->wq = create_singlethread_workqueue(name);
+	if (!port_priv->wq) {
+		ret = -ENOMEM;
+		goto error8;
+	}
+	INIT_WORK(&port_priv->work, ib_mad_completion_handler, port_priv);
+
+	ret = ib_mad_port_start(port_priv);
+	if (ret) {
+		printk(KERN_ERR PFX "Couldn't start port\n");
+		goto error9;
+	}
+
+	spin_lock_irqsave(&ib_mad_port_list_lock, flags);
+	list_add_tail(&port_priv->port_list, &ib_mad_port_list);
+	spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
+	return 0;
+
+error9:
+	destroy_workqueue(port_priv->wq);
+error8:
+	destroy_mad_qp(&port_priv->qp_info[1]);
+error7:
+	destroy_mad_qp(&port_priv->qp_info[0]);
+error6:
+	ib_dereg_mr(port_priv->mr);
+error5:
+	ib_dealloc_pd(port_priv->pd);
+error4:
+	ib_destroy_cq(port_priv->cq);
+	cleanup_recv_queue(&port_priv->qp_info[1]);
+	cleanup_recv_queue(&port_priv->qp_info[0]);
+error3:
+	kfree(port_priv);
+
+	return ret;
+}
+
+/*
+ * Close the port
+ * If there are no classes using the port, free the port
+ * resources (CQ, MR, PD, QP) and remove the port's info structure
+ */
+static int ib_mad_port_close(struct ib_device *device, int port_num)
+{
+	struct ib_mad_port_private *port_priv;
+	unsigned long flags;
+
+	spin_lock_irqsave(&ib_mad_port_list_lock, flags);
+	port_priv = __ib_get_mad_port(device, port_num);
+	if (port_priv == NULL) {
+		spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
+		printk(KERN_ERR PFX "Port %d not found\n", port_num);
+		return -ENODEV;
+	}
+	list_del(&port_priv->port_list);
+	spin_unlock_irqrestore(&ib_mad_port_list_lock, flags);
+
+	/* Stop processing completions.
*/
+	flush_workqueue(port_priv->wq);
+	destroy_workqueue(port_priv->wq);
+	destroy_mad_qp(&port_priv->qp_info[1]);
+	destroy_mad_qp(&port_priv->qp_info[0]);
+	ib_dereg_mr(port_priv->mr);
+	ib_dealloc_pd(port_priv->pd);
+	ib_destroy_cq(port_priv->cq);
+	cleanup_recv_queue(&port_priv->qp_info[1]);
+	cleanup_recv_queue(&port_priv->qp_info[0]);
+	/* XXX: Handle deallocation of MAD registration tables */
+
+	kfree(port_priv);
+
+	return 0;
+}
+
+static void ib_mad_init_device(struct ib_device *device)
+{
+	int ret, num_ports, cur_port, i, ret2;
+
+	if (device->node_type == IB_NODE_SWITCH) {
+		num_ports = 1;
+		cur_port = 0;
+	} else {
+		num_ports = device->phys_port_cnt;
+		cur_port = 1;
+	}
+	for (i = 0; i < num_ports; i++, cur_port++) {
+		ret = ib_mad_port_open(device, cur_port);
+		if (ret) {
+			printk(KERN_ERR PFX "Couldn't open %s port %d\n",
+			       device->name, cur_port);
+			goto error_device_open;
+		}
+		ret = ib_agent_port_open(device, cur_port);
+		if (ret) {
+			printk(KERN_ERR PFX "Couldn't open %s port %d "
+			       "for agents\n",
+			       device->name, cur_port);
+			goto error_device_open;
+		}
+	}
+
+	goto error_device_query;
+
+error_device_open:
+	while (i > 0) {
+		cur_port--;
+		ret2 = ib_agent_port_close(device, cur_port);
+		if (ret2) {
+			printk(KERN_ERR PFX "Couldn't close %s port %d "
+			       "for agents\n",
+			       device->name, cur_port);
+		}
+		ret2 = ib_mad_port_close(device, cur_port);
+		if (ret2) {
+			printk(KERN_ERR PFX "Couldn't close %s port %d\n",
+			       device->name, cur_port);
+		}
+		i--;
+	}
+
+error_device_query:
+	return;
+}
+
+static void ib_mad_remove_device(struct ib_device *device)
+{
+	int ret = 0, i, num_ports, cur_port, ret2;
+
+	if (device->node_type == IB_NODE_SWITCH) {
+		num_ports = 1;
+		cur_port = 0;
+	} else {
+		num_ports = device->phys_port_cnt;
+		cur_port = 1;
+	}
+	for (i = 0; i < num_ports; i++, cur_port++) {
+		ret2 = ib_agent_port_close(device, cur_port);
+		if (ret2) {
+			printk(KERN_ERR PFX "Couldn't close %s port %d "
+			       "for agents\n",
+			       device->name, cur_port);
+
			if (!ret)
+				ret = ret2;
+		}
+		ret2 = ib_mad_port_close(device, cur_port);
+		if (ret2) {
+			printk(KERN_ERR PFX "Couldn't close %s port %d\n",
+			       device->name, cur_port);
+			if (!ret)
+				ret = ret2;
+		}
+	}
+}
+
+static struct ib_client mad_client = {
+	.name = "mad",
+	.add = ib_mad_init_device,
+	.remove = ib_mad_remove_device
+};
+
+static int __init ib_mad_init_module(void)
+{
+	int ret;
+
+	spin_lock_init(&ib_mad_port_list_lock);
+	spin_lock_init(&ib_agent_port_list_lock);
+
+	ib_mad_cache = kmem_cache_create("ib_mad",
+					 sizeof(struct ib_mad_private),
+					 0,
+					 SLAB_HWCACHE_ALIGN,
+					 NULL,
+					 NULL);
+	if (!ib_mad_cache) {
+		printk(KERN_ERR PFX "Couldn't create ib_mad cache\n");
+		ret = -ENOMEM;
+		goto error1;
+	}
+
+	INIT_LIST_HEAD(&ib_mad_port_list);
+
+	if (ib_register_client(&mad_client)) {
+		printk(KERN_ERR PFX "Couldn't register ib_mad client\n");
+		ret = -EINVAL;
+		goto error2;
+	}
+
+	return 0;
+
+error2:
+	kmem_cache_destroy(ib_mad_cache);
+error1:
+	return ret;
+}
+
+static void __exit ib_mad_cleanup_module(void)
+{
+	ib_unregister_client(&mad_client);
+
+	if (kmem_cache_destroy(ib_mad_cache)) {
+		printk(KERN_DEBUG PFX "Failed to destroy ib_mad cache\n");
+	}
+}
+
+module_init(ib_mad_init_module);
+module_exit(ib_mad_cleanup_module);

From roland@topspin.com Mon Dec 27 21:52:43 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:53:00 -0800 (PST)
Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5q8ct028217 for ; Mon, 27 Dec 2004 21:52:42 -0800
Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:21 -0800
Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:21 -0800
Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGO-0000w7-T4; Mon, 27 Dec 2004 21:51:21 -0800
Cc:
linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org
In-Reply-To: <200412272151.S29WkrmlJifc5kHZ@topspin.com>
X-Mailer: Roland's Patchbomber
Date: Mon, 27 Dec 2004 21:51:20 -0800
Message-Id: <200412272151.hCBo2fMh4Io7Xy87@topspin.com>
Mime-Version: 1.0
To: davem@davemloft.net
From: Roland Dreier
X-SA-Exim-Connect-IP: 127.0.0.1
X-SA-Exim-Mail-From: roland@topspin.com
Subject: [PATCH][v5][24/24] InfiniBand MAINTAINERS entry
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200)
X-SA-Exim-Scanned: Yes (on eddore)
X-OriginalArrivalTime: 28 Dec 2004 05:51:21.0204 (UTC) FILETIME=[41DBF340:01C4ECA1]
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13119
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: roland@topspin.com
Precedence: bulk
X-list: netdev

Add OpenIB maintainers information to MAINTAINERS.
Signed-off-by: Roland Dreier

--- linux-bk.orig/MAINTAINERS	2004-12-27 21:47:44.140300651 -0800
+++ linux-bk/MAINTAINERS	2004-12-27 21:48:28.966702428 -0800
@@ -1081,6 +1081,17 @@
 L:	linux-fbdev-devel@lists.sourceforge.net
 S:	Maintained
 
+INFINIBAND SUBSYSTEM
+P:	Roland Dreier
+M:	roland@topspin.com
+P:	Sean Hefty
+M:	mshefty@ichips.intel.com
+P:	Hal Rosenstock
+M:	halr@voltaire.com
+L:	openib-general@openib.org
+W:	http://www.openib.org/
+S:	Supported
+
 INPUT (KEYBOARD, MOUSE, JOYSTICK) DRIVERS
 P:	Vojtech Pavlik
 M:	vojtech@suse.cz

From roland@topspin.com Mon Dec 27 21:52:39 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:52:58 -0800 (PST)
Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5q8cp028217 for ; Mon, 27 Dec 2004 21:52:38 -0800
Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:20 -0800
Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:19 -0800
Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGN-0000vp-F7; Mon, 27 Dec 2004 21:51:19 -0800
Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org
In-Reply-To: <200412272151.3Lde9MPbD7ODIUdu@topspin.com>
X-Mailer: Roland's Patchbomber
Date: Mon, 27 Dec 2004 21:51:19 -0800
Message-Id: <200412272151.zeKZJPoIEBr55elh@topspin.com>
Mime-Version: 1.0
To: davem@davemloft.net
From: Roland Dreier
X-SA-Exim-Connect-IP: 127.0.0.1
X-SA-Exim-Mail-From: roland@topspin.com
Subject: [PATCH][v5][22/24] Document InfiniBand ioctl use
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200)
X-SA-Exim-Scanned: Yes (on eddore)
X-OriginalArrivalTime: 28 Dec 2004 05:51:19.0876 (UTC) FILETIME=[41115040:01C4ECA1]
X-Virus-Scanned:
ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13117
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: roland@topspin.com
Precedence: bulk
X-list: netdev

Add the 0x1b ioctl magic number used by ib_umad module to Documentation/ioctl-number.txt.

Signed-off-by: Roland Dreier

--- linux-bk.orig/Documentation/ioctl-number.txt	2004-12-27 21:47:59.407053483 -0800
+++ linux-bk/Documentation/ioctl-number.txt	2004-12-27 21:48:28.036839302 -0800
@@ -72,6 +72,7 @@
 0x09	all	linux/md.h
 0x12	all	linux/fs.h
 		linux/blkpg.h
+0x1b	all	InfiniBand Subsystem
 0x20	all	drivers/cdrom/cm206.h
 0x22	all	scsi/sg.h
 '#'	00-3F	IEEE 1394 Subsystem	Block for the entire subsystem

From roland@topspin.com Mon Dec 27 21:52:36 2004
Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:53:00 -0800 (PST)
Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5q8cn028217 for ; Mon, 27 Dec 2004 21:52:34 -0800
Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:19 -0800
Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:19 -0800
Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGM-0000vg-Ld; Mon, 27 Dec 2004 21:51:19 -0800
Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org
In-Reply-To: <200412272151.JEARwZ80axXxZD2Q@topspin.com>
X-Mailer: Roland's Patchbomber
Date: Mon, 27 Dec 2004 21:51:18 -0800
Message-Id: <200412272151.3Lde9MPbD7ODIUdu@topspin.com>
Mime-Version: 1.0
To: davem@davemloft.net
From: Roland Dreier
X-SA-Exim-Connect-IP: 127.0.0.1
X-SA-Exim-Mail-From: roland@topspin.com
Subject: [PATCH][v5][21/24] Add InfiniBand userspace MAD support
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200)
X-SA-Exim-Scanned: Yes (on eddore)
X-OriginalArrivalTime: 28 Dec 2004 05:51:19.0439 (UTC) FILETIME=[40CEA1F0:01C4ECA1]
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13118
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: roland@topspin.com
Precedence: bulk
X-list: netdev

Add a driver that provides a character special device for each InfiniBand port. This device allows userspace to send and receive MADs via write() and read() (with some control operations implemented as ioctls). All operations are 32/64 clean and have been tested with 32-bit userspace running on a ppc64 kernel.

Signed-off-by: Roland Dreier

--- linux-bk.orig/drivers/infiniband/core/Makefile	2004-12-27 21:48:20.847897490 -0800
+++ linux-bk/drivers/infiniband/core/Makefile	2004-12-27 21:48:27.528914067 -0800
@@ -1,6 +1,6 @@
 EXTRA_CFLAGS += -Idrivers/infiniband/include
 
-obj-$(CONFIG_INFINIBAND) += ib_core.o ib_mad.o ib_sa.o
+obj-$(CONFIG_INFINIBAND) += ib_core.o ib_mad.o ib_sa.o ib_umad.o
 
 ib_core-y := packer.o ud_header.o verbs.o sysfs.o \
 	device.o fmr_pool.o cache.o
@@ -8,3 +8,5 @@
 ib_mad-y := mad.o smi.o agent.o
 
 ib_sa-y := sa_query.o
+
+ib_umad-y := user_mad.o
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-bk/drivers/infiniband/core/user_mad.c	2004-12-27 21:48:27.576907002 -0800
@@ -0,0 +1,738 @@
+/*
+ * Copyright (c) 2004 Topspin Communications. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.
You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ * $Id: user_mad.c 1389 2004-12-27 22:56:47Z roland $
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+#include
+#include
+
+MODULE_AUTHOR("Roland Dreier");
+MODULE_DESCRIPTION("InfiniBand userspace MAD packet access");
+MODULE_LICENSE("Dual BSD/GPL");
+
+enum {
+	IB_UMAD_MAX_PORTS = 256,
+	IB_UMAD_MAX_AGENTS = 32
+};
+
+struct ib_umad_port {
+	int devnum;
+	struct cdev dev;
+	struct class_device class_dev;
+	struct ib_device *ib_dev;
+	struct ib_umad_device *umad_dev;
+	u8 port_num;
+};
+
+struct ib_umad_device {
+	int start_port, end_port;
+	struct kref ref;
+	struct ib_umad_port port[0];
+};
+
+struct ib_umad_file {
+	struct ib_umad_port *port;
+	spinlock_t recv_lock;
+	struct list_head recv_list;
+	wait_queue_head_t recv_wait;
+	struct rw_semaphore agent_mutex;
+	struct ib_mad_agent *agent[IB_UMAD_MAX_AGENTS];
+	struct ib_mr *mr[IB_UMAD_MAX_AGENTS];
+};
+
+struct ib_umad_packet {
+	struct ib_user_mad mad;
+	struct ib_ah *ah;
+	struct list_head list;
+	DECLARE_PCI_UNMAP_ADDR(mapping)
+};
+
+static dev_t base_dev;
+static spinlock_t map_lock;
+static DECLARE_BITMAP(dev_map, IB_UMAD_MAX_PORTS);
+
+static void ib_umad_add_one(struct ib_device *device);
+static void ib_umad_remove_one(struct ib_device *device);
+
+static int queue_packet(struct ib_umad_file *file,
+			struct ib_mad_agent *agent,
+			struct ib_umad_packet *packet)
+{
+	int ret = 1;
+
+	down_read(&file->agent_mutex);
+	for (packet->mad.id = 0;
+	     packet->mad.id < IB_UMAD_MAX_AGENTS;
+	     packet->mad.id++)
+		if (agent == file->agent[packet->mad.id]) {
+			spin_lock_irq(&file->recv_lock);
+			list_add_tail(&packet->list, &file->recv_list);
+			spin_unlock_irq(&file->recv_lock);
+			wake_up_interruptible(&file->recv_wait);
+			ret = 0;
+			break;
+		}
+
+	up_read(&file->agent_mutex);
+
+	return ret;
+}
+
+static void send_handler(struct ib_mad_agent *agent,
+			 struct ib_mad_send_wc *send_wc)
+{
+	struct ib_umad_file *file =
agent->context;
+	struct ib_umad_packet *packet =
+		(void *) (unsigned long) send_wc->wr_id;
+
+	dma_unmap_single(agent->device->dma_device,
+			 pci_unmap_addr(packet, mapping),
+			 sizeof packet->mad.data,
+			 DMA_TO_DEVICE);
+	ib_destroy_ah(packet->ah);
+
+	if (send_wc->status == IB_WC_RESP_TIMEOUT_ERR) {
+		packet->mad.status = ETIMEDOUT;
+
+		if (!queue_packet(file, agent, packet))
+			return;
+	}
+
+	kfree(packet);
+}
+
+static void recv_handler(struct ib_mad_agent *agent,
+			 struct ib_mad_recv_wc *mad_recv_wc)
+{
+	struct ib_umad_file *file = agent->context;
+	struct ib_umad_packet *packet;
+
+	if (mad_recv_wc->wc->status != IB_WC_SUCCESS)
+		goto out;
+
+	packet = kmalloc(sizeof *packet, GFP_KERNEL);
+	if (!packet)
+		goto out;
+
+	memset(packet, 0, sizeof *packet);
+
+	memcpy(packet->mad.data, mad_recv_wc->recv_buf.mad, sizeof packet->mad.data);
+	packet->mad.status = 0;
+	packet->mad.qpn = cpu_to_be32(mad_recv_wc->wc->src_qp);
+	packet->mad.lid = cpu_to_be16(mad_recv_wc->wc->slid);
+	packet->mad.sl = mad_recv_wc->wc->sl;
+	packet->mad.path_bits = mad_recv_wc->wc->dlid_path_bits;
+	packet->mad.grh_present = !!(mad_recv_wc->wc->wc_flags & IB_WC_GRH);
+	if (packet->mad.grh_present) {
+		/* XXX parse GRH */
+		packet->mad.gid_index = 0;
+		packet->mad.hop_limit = 0;
+		packet->mad.traffic_class = 0;
+		memset(packet->mad.gid, 0, 16);
+		packet->mad.flow_label = 0;
+	}
+
+	if (queue_packet(file, agent, packet))
+		kfree(packet);
+
+out:
+	ib_free_recv_mad(mad_recv_wc);
+}
+
+static ssize_t ib_umad_read(struct file *filp, char __user *buf,
+			    size_t count, loff_t *pos)
+{
+	struct ib_umad_file *file = filp->private_data;
+	struct ib_umad_packet *packet;
+	ssize_t ret;
+
+	if (count < sizeof (struct ib_user_mad))
+		return -EINVAL;
+
+	spin_lock_irq(&file->recv_lock);
+
+	while (list_empty(&file->recv_list)) {
+		spin_unlock_irq(&file->recv_lock);
+
+		if (filp->f_flags & O_NONBLOCK)
+			return -EAGAIN;
+
+		if (wait_event_interruptible(file->recv_wait,
+					     !list_empty(&file->recv_list)))
+
			return -ERESTARTSYS;
+
+		spin_lock_irq(&file->recv_lock);
+	}
+
+	packet = list_entry(file->recv_list.next, struct ib_umad_packet, list);
+	list_del(&packet->list);
+
+	spin_unlock_irq(&file->recv_lock);
+
+	if (copy_to_user(buf, &packet->mad, sizeof packet->mad))
+		ret = -EFAULT;
+	else
+		ret = sizeof packet->mad;
+
+	kfree(packet);
+	return ret;
+}
+
+static ssize_t ib_umad_write(struct file *filp, const char __user *buf,
+			     size_t count, loff_t *pos)
+{
+	struct ib_umad_file *file = filp->private_data;
+	struct ib_umad_packet *packet;
+	struct ib_mad_agent *agent;
+	struct ib_ah_attr ah_attr;
+	struct ib_sge gather_list;
+	struct ib_send_wr *bad_wr, wr = {
+		.opcode = IB_WR_SEND,
+		.sg_list = &gather_list,
+		.num_sge = 1,
+		.send_flags = IB_SEND_SIGNALED,
+	};
+	u8 method;
+	u64 *tid;
+	int ret;
+
+	if (count < sizeof (struct ib_user_mad))
+		return -EINVAL;
+
+	packet = kmalloc(sizeof *packet, GFP_KERNEL);
+	if (!packet)
+		return -ENOMEM;
+
+	if (copy_from_user(&packet->mad, buf, sizeof packet->mad)) {
+		kfree(packet);
+		return -EFAULT;
+	}
+
+	if (packet->mad.id < 0 || packet->mad.id >= IB_UMAD_MAX_AGENTS) {
+		ret = -EINVAL;
+		goto err;
+	}
+
+	down_read(&file->agent_mutex);
+
+	agent = file->agent[packet->mad.id];
+	if (!agent) {
+		ret = -EINVAL;
+		goto err_up;
+	}
+
+	/*
+	 * If userspace is generating a request that will generate a
+	 * response, we need to make sure the high-order part of the
+	 * transaction ID matches the agent being used to send the
+	 * MAD.
+	 */
+	method = ((struct ib_mad_hdr *) packet->mad.data)->method;
+
+	if (!(method & IB_MGMT_METHOD_RESP) &&
+	    method != IB_MGMT_METHOD_TRAP_REPRESS &&
+	    method != IB_MGMT_METHOD_SEND) {
+		tid = &((struct ib_mad_hdr *) packet->mad.data)->tid;
+		*tid = cpu_to_be64(((u64) agent->hi_tid) << 32 |
+				   (be64_to_cpup(tid) & 0xffffffff));
+	}
+
+	memset(&ah_attr, 0, sizeof ah_attr);
+	ah_attr.dlid = be16_to_cpu(packet->mad.lid);
+	ah_attr.sl = packet->mad.sl;
+	ah_attr.src_path_bits = packet->mad.path_bits;
+	ah_attr.port_num = file->port->port_num;
+	if (packet->mad.grh_present) {
+		ah_attr.ah_flags = IB_AH_GRH;
+		memcpy(ah_attr.grh.dgid.raw, packet->mad.gid, 16);
+		ah_attr.grh.flow_label = packet->mad.flow_label;
+		ah_attr.grh.hop_limit = packet->mad.hop_limit;
+		ah_attr.grh.traffic_class = packet->mad.traffic_class;
+	}
+
+	packet->ah = ib_create_ah(agent->qp->pd, &ah_attr);
+	if (IS_ERR(packet->ah)) {
+		ret = PTR_ERR(packet->ah);
+		goto err_up;
+	}
+
+	gather_list.addr = dma_map_single(agent->device->dma_device,
+					  packet->mad.data,
+					  sizeof packet->mad.data,
+					  DMA_TO_DEVICE);
+	gather_list.length = sizeof packet->mad.data;
+	gather_list.lkey = file->mr[packet->mad.id]->lkey;
+	pci_unmap_addr_set(packet, mapping, gather_list.addr);
+
+	wr.wr.ud.mad_hdr = (struct ib_mad_hdr *) packet->mad.data;
+	wr.wr.ud.ah = packet->ah;
+	wr.wr.ud.remote_qpn = be32_to_cpu(packet->mad.qpn);
+	wr.wr.ud.remote_qkey = be32_to_cpu(packet->mad.qkey);
+	wr.wr.ud.timeout_ms = packet->mad.timeout_ms;
+
+	wr.wr_id = (unsigned long) packet;
+
+	ret = ib_post_send_mad(agent, &wr, &bad_wr);
+	if (ret) {
+		dma_unmap_single(agent->device->dma_device,
+				 pci_unmap_addr(packet, mapping),
+				 sizeof packet->mad.data,
+				 DMA_TO_DEVICE);
+		goto err_up;
+	}
+
+	up_read(&file->agent_mutex);
+
+	return sizeof packet->mad;
+
+err_up:
+	up_read(&file->agent_mutex);
+
+err:
+	kfree(packet);
+	return ret;
+}
+
+static unsigned int ib_umad_poll(struct file *filp, struct poll_table_struct *wait)
+{
+	struct ib_umad_file
*file = filp->private_data;
+
+	/* we will always be able to post a MAD send */
+	unsigned int mask = POLLOUT | POLLWRNORM;
+
+	poll_wait(filp, &file->recv_wait, wait);
+
+	if (!list_empty(&file->recv_list))
+		mask |= POLLIN | POLLRDNORM;
+
+	return mask;
+}
+
+static int ib_umad_reg_agent(struct ib_umad_file *file, unsigned long arg)
+{
+	struct ib_user_mad_reg_req ureq;
+	struct ib_mad_reg_req req;
+	struct ib_mad_agent *agent;
+	int agent_id;
+	int ret;
+
+	down_write(&file->agent_mutex);
+
+	if (copy_from_user(&ureq, (void __user *) arg, sizeof ureq)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	if (ureq.qpn != 0 && ureq.qpn != 1) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	for (agent_id = 0; agent_id < IB_UMAD_MAX_AGENTS; ++agent_id)
+		if (!file->agent[agent_id])
+			goto found;
+
+	ret = -ENOMEM;
+	goto out;
+
+found:
+	req.mgmt_class = ureq.mgmt_class;
+	req.mgmt_class_version = ureq.mgmt_class_version;
+	memcpy(req.method_mask, ureq.method_mask, sizeof req.method_mask);
+	memcpy(req.oui, ureq.oui, sizeof req.oui);
+
+	agent = ib_register_mad_agent(file->port->ib_dev, file->port->port_num,
+				      ureq.qpn ?
IB_QPT_GSI : IB_QPT_SMI,
+				      &req, 0, send_handler, recv_handler,
+				      file);
+	if (IS_ERR(agent)) {
+		ret = PTR_ERR(agent);
+		goto out;
+	}
+
+	file->agent[agent_id] = agent;
+
+	file->mr[agent_id] = ib_get_dma_mr(agent->qp->pd, IB_ACCESS_LOCAL_WRITE);
+	if (IS_ERR(file->mr[agent_id])) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	if (put_user(agent_id,
+		     (u32 __user *) (arg + offsetof(struct ib_user_mad_reg_req, id)))) {
+		ret = -EFAULT;
+		goto err_mr;
+	}
+
+	ret = 0;
+	goto out;
+
+err_mr:
+	ib_dereg_mr(file->mr[agent_id]);
+
+err:
+	file->agent[agent_id] = NULL;
+	ib_unregister_mad_agent(agent);
+
+out:
+	up_write(&file->agent_mutex);
+	return ret;
+}
+
+static int ib_umad_unreg_agent(struct ib_umad_file *file, unsigned long arg)
+{
+	u32 id;
+	int ret = 0;
+
+	down_write(&file->agent_mutex);
+
+	if (get_user(id, (u32 __user *) arg)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	if (id < 0 || id >= IB_UMAD_MAX_AGENTS || !file->agent[id]) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	ib_dereg_mr(file->mr[id]);
+	ib_unregister_mad_agent(file->agent[id]);
+	file->agent[id] = NULL;
+
+out:
+	up_write(&file->agent_mutex);
+	return ret;
+}
+
+static int ib_umad_ioctl(struct inode *inode, struct file *filp,
+			 unsigned int cmd, unsigned long arg)
+{
+	switch (cmd) {
+	case IB_USER_MAD_REGISTER_AGENT:
+		return ib_umad_reg_agent(filp->private_data, arg);
+	case IB_USER_MAD_UNREGISTER_AGENT:
+		return ib_umad_unreg_agent(filp->private_data, arg);
+	default:
+		return -ENOIOCTLCMD;
+	}
+}
+
+static int ib_umad_open(struct inode *inode, struct file *filp)
+{
+	struct ib_umad_port *port =
+		container_of(inode->i_cdev, struct ib_umad_port, dev);
+	struct ib_umad_file *file;
+
+	file = kmalloc(sizeof *file, GFP_KERNEL);
+	if (!file)
+		return -ENOMEM;
+
+	memset(file, 0, sizeof *file);
+
+	spin_lock_init(&file->recv_lock);
+	init_rwsem(&file->agent_mutex);
+	INIT_LIST_HEAD(&file->recv_list);
+	init_waitqueue_head(&file->recv_wait);
+
+	file->port = port;
+	filp->private_data = file;
+
+	return 0;
+}
+
+static int ib_umad_close(struct inode *inode, struct file *filp) +{ + struct ib_umad_file *file = filp->private_data; + int i; + + for (i = 0; i < IB_UMAD_MAX_AGENTS; ++i) + if (file->agent[i]) { + ib_dereg_mr(file->mr[i]); + ib_unregister_mad_agent(file->agent[i]); + } + + kfree(file); + + return 0; +} + +static struct file_operations umad_fops = { + .owner = THIS_MODULE, + .read = ib_umad_read, + .write = ib_umad_write, + .poll = ib_umad_poll, + .ioctl = ib_umad_ioctl, + .open = ib_umad_open, + .release = ib_umad_close +}; + +static struct ib_client umad_client = { + .name = "umad", + .add = ib_umad_add_one, + .remove = ib_umad_remove_one +}; + +static ssize_t show_dev(struct class_device *class_dev, char *buf) +{ + struct ib_umad_port *port = + container_of(class_dev, struct ib_umad_port, class_dev); + + return print_dev_t(buf, port->dev.dev); +} +static CLASS_DEVICE_ATTR(dev, S_IRUGO, show_dev, NULL); + +static ssize_t show_ibdev(struct class_device *class_dev, char *buf) +{ + struct ib_umad_port *port = + container_of(class_dev, struct ib_umad_port, class_dev); + + return sprintf(buf, "%s\n", port->ib_dev->name); +} +static CLASS_DEVICE_ATTR(ibdev, S_IRUGO, show_ibdev, NULL); + +static ssize_t show_port(struct class_device *class_dev, char *buf) +{ + struct ib_umad_port *port = + container_of(class_dev, struct ib_umad_port, class_dev); + + return sprintf(buf, "%d\n", port->port_num); +} +static CLASS_DEVICE_ATTR(port, S_IRUGO, show_port, NULL); + +static void ib_umad_release_dev(struct kref *ref) +{ + struct ib_umad_device *dev = + container_of(ref, struct ib_umad_device, ref); + + kfree(dev); +} + +static void ib_umad_release_port(struct class_device *class_dev) +{ + struct ib_umad_port *port = + container_of(class_dev, struct ib_umad_port, class_dev); + + cdev_del(&port->dev); + clear_bit(port->devnum, dev_map); + kref_put(&port->umad_dev->ref, ib_umad_release_dev); +} + +static struct class umad_class = { + .name = "infiniband_mad", + .release = 
ib_umad_release_port +}; + +static ssize_t show_abi_version(struct class *class, char *buf) +{ + return sprintf(buf, "%d\n", IB_USER_MAD_ABI_VERSION); +} +static CLASS_ATTR(abi_version, S_IRUGO, show_abi_version, NULL); + +static void ib_umad_add_one(struct ib_device *device) +{ + struct ib_umad_device *umad_dev; + int s, e, i; + + if (device->node_type == IB_NODE_SWITCH) + s = e = 0; + else { + s = 1; + e = device->phys_port_cnt; + } + + umad_dev = kmalloc(sizeof *umad_dev + + (e - s + 1) * sizeof (struct ib_umad_port), + GFP_KERNEL); + if (!umad_dev) + return; + + memset(umad_dev, 0, sizeof *umad_dev + + (e - s + 1) * sizeof (struct ib_umad_port)); + + kref_init(&umad_dev->ref); + + umad_dev->start_port = s; + umad_dev->end_port = e; + + for (i = s; i <= e; ++i) { + umad_dev->port[i - s].umad_dev = umad_dev; + kref_get(&umad_dev->ref); + + spin_lock(&map_lock); + umad_dev->port[i - s].devnum = + find_first_zero_bit(dev_map, IB_UMAD_MAX_PORTS); + if (umad_dev->port[i - s].devnum >= IB_UMAD_MAX_PORTS) { + spin_unlock(&map_lock); + goto err; + } + set_bit(umad_dev->port[i - s].devnum, dev_map); + spin_unlock(&map_lock); + + umad_dev->port[i - s].ib_dev = device; + umad_dev->port[i - s].port_num = i; + + cdev_init(&umad_dev->port[i - s].dev, &umad_fops); + umad_dev->port[i - s].dev.owner = THIS_MODULE; + kobject_set_name(&umad_dev->port[i - s].dev.kobj, + "umad%d", umad_dev->port[i - s].devnum); + if (cdev_add(&umad_dev->port[i - s].dev, base_dev + + umad_dev->port[i - s].devnum, 1)) + goto err; + + umad_dev->port[i - s].class_dev.class = &umad_class; + umad_dev->port[i - s].class_dev.dev = device->dma_device; + snprintf(umad_dev->port[i - s].class_dev.class_id, + BUS_ID_SIZE, "umad%d", umad_dev->port[i - s].devnum); + if (class_device_register(&umad_dev->port[i - s].class_dev)) + goto err_class; + + if (class_device_create_file(&umad_dev->port[i - s].class_dev, + &class_device_attr_dev)) + goto err_class; + if (class_device_create_file(&umad_dev->port[i - 
s].class_dev, + &class_device_attr_ibdev)) + goto err_class; + if (class_device_create_file(&umad_dev->port[i - s].class_dev, + &class_device_attr_port)) + goto err_class; + } + + ib_set_client_data(device, &umad_client, umad_dev); + + return; + +err_class: + cdev_del(&umad_dev->port[i - s].dev); + clear_bit(umad_dev->port[i - s].devnum, dev_map); + +err: + while (--i >= s) + class_device_unregister(&umad_dev->port[i - s].class_dev); + + kref_put(&umad_dev->ref, ib_umad_release_dev); +} + +static void ib_umad_remove_one(struct ib_device *device) +{ + struct ib_umad_device *umad_dev = ib_get_client_data(device, &umad_client); + int i; + + if (!umad_dev) + return; + + for (i = 0; i <= umad_dev->end_port - umad_dev->start_port; ++i) + class_device_unregister(&umad_dev->port[i].class_dev); + + kref_put(&umad_dev->ref, ib_umad_release_dev); +} + +static int __init ib_umad_init(void) +{ + int ret; + + spin_lock_init(&map_lock); + + ret = alloc_chrdev_region(&base_dev, 0, IB_UMAD_MAX_PORTS, + "infiniband_mad"); + if (ret) { + printk(KERN_ERR "user_mad: couldn't get device number\n"); + goto out; + } + + ret = class_register(&umad_class); + if (ret) { + printk(KERN_ERR "user_mad: couldn't create class infiniband_mad\n"); + goto out_chrdev; + } + + ret = class_create_file(&umad_class, &class_attr_abi_version); + if (ret) { + printk(KERN_ERR "user_mad: couldn't create abi_version attribute\n"); + goto out_class; + } + + ret = ib_register_client(&umad_client); + if (ret) { + printk(KERN_ERR "user_mad: couldn't register ib_umad client\n"); + goto out_class; + } + + /* Our ioctls are 32/64 clean */ + ret = register_ioctl32_conversion(IB_USER_MAD_REGISTER_AGENT, NULL); + ret |= register_ioctl32_conversion(IB_USER_MAD_UNREGISTER_AGENT, NULL); + if (ret) { + printk(KERN_ERR "user_mad: couldn't register ioctl32 conversions\n"); + goto out_client; + } + + return 0; + +out_client: + ib_unregister_client(&umad_client); + +out_class: + class_unregister(&umad_class); + +out_chrdev: + 
unregister_chrdev_region(base_dev, IB_UMAD_MAX_PORTS); + +out: + return ret; +} + +static void __exit ib_umad_cleanup(void) +{ + unregister_ioctl32_conversion(IB_USER_MAD_REGISTER_AGENT); + unregister_ioctl32_conversion(IB_USER_MAD_UNREGISTER_AGENT); + ib_unregister_client(&umad_client); + class_unregister(&umad_class); + unregister_chrdev_region(base_dev, IB_UMAD_MAX_PORTS); +} + +module_init(ib_umad_init); +module_exit(ib_umad_cleanup); --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/include/ib_user_mad.h 2004-12-27 21:48:27.631898908 -0800 @@ -0,0 +1,123 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. 
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * $Id: ib_user_mad.h 1389 2004-12-27 22:56:47Z roland $ + */ + +#ifndef IB_USER_MAD_H +#define IB_USER_MAD_H + +#include +#include + +/* + * Increment this value if any changes that break userspace ABI + * compatibility are made. + */ +#define IB_USER_MAD_ABI_VERSION 2 + +/* + * Make sure that all structs defined in this file remain laid out so + * that they pack the same way on 32-bit and 64-bit architectures (to + * avoid incompatibility between 32-bit userspace and 64-bit kernels). + */ + +/** + * ib_user_mad - MAD packet + * @data - Contents of MAD + * @id - ID of agent MAD received with/to be sent with + * @status - 0 on successful receive, ETIMEDOUT if no response + * received (transaction ID in data[] will be set to TID of original + * request) (ignored on send) + * @timeout_ms - Milliseconds to wait for response (unset on receive) + * @qpn - Remote QP number received from/to be sent to + * @qkey - Remote Q_Key to be sent with (unset on receive) + * @lid - Remote lid received from/to be sent to + * @sl - Service level received with/to be sent with + * @path_bits - Local path bits received with/to be sent with + * @grh_present - If set, GRH was received/should be sent + * @gid_index - Local GID index to send with (unset on receive) + * @hop_limit - Hop limit in GRH + * @traffic_class - Traffic class in GRH + * @gid - Remote GID in GRH + * @flow_label - Flow label in GRH + * + * All multi-byte quantities are stored in network (big endian) byte order. 
+ */ +struct ib_user_mad { + __u8 data[256]; + __u32 id; + __u32 status; + __u32 timeout_ms; + __u32 qpn; + __u32 qkey; + __u16 lid; + __u8 sl; + __u8 path_bits; + __u8 grh_present; + __u8 gid_index; + __u8 hop_limit; + __u8 traffic_class; + __u8 gid[16]; + __u32 flow_label; +}; + +/** + * ib_user_mad_reg_req - MAD registration request + * @id - Set by the kernel; used to identify agent in future requests. + * @qpn - Queue pair number; must be 0 or 1. + * @method_mask - The caller will receive unsolicited MADs for any method + * where @method_mask = 1. + * @mgmt_class - Indicates which management class of MADs should be received + * by the caller. This field is only required if the user wishes to + * receive unsolicited MADs, otherwise it should be 0. + * @mgmt_class_version - Indicates which version of MADs for the given + * management class to receive. + * @oui - Indicates IEEE OUI when mgmt_class is a vendor class + * in the range from 0x30 to 0x4f. Otherwise not used. + */ +struct ib_user_mad_reg_req { + __u32 id; + __u32 method_mask[4]; + __u8 qpn; + __u8 mgmt_class; + __u8 mgmt_class_version; + __u8 oui[3]; +}; + +#define IB_IOCTL_MAGIC 0x1b + +#define IB_USER_MAD_REGISTER_AGENT _IOWR(IB_IOCTL_MAGIC, 1, \ + struct ib_user_mad_reg_req) + +#define IB_USER_MAD_UNREGISTER_AGENT _IOW(IB_IOCTL_MAGIC, 2, __u32) + +#endif /* IB_USER_MAD_H */ From roland@topspin.com Mon Dec 27 21:52:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:52:58 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5q8cl028217 for ; Mon, 27 Dec 2004 21:52:31 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:18 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:18 -0800 Received: from localhost ([127.0.0.1] 
helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGL-0000vX-7S; Mon, 27 Dec 2004 21:51:18 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.XKAQkznglOHW39xJ@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:17 -0800 Message-Id: <200412272151.JEARwZ80axXxZD2Q@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][20/24] Add IPoIB multicast & partition code Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:18.0657 (UTC) FILETIME=[40574F10:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13116 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add functions for handling IPoIB multicast and multiple partitions. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_multicast.c 2004-12-27 21:48:27.157968669 -0800 @@ -0,0 +1,981 @@ +/* + * Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_multicast.c 1362 2004-12-18 15:56:29Z roland $ + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#include "ipoib.h" + +#ifdef CONFIG_INFINIBAND_IPOIB_DEBUG +int mcast_debug_level; + +module_param(mcast_debug_level, int, 0644); +MODULE_PARM_DESC(mcast_debug_level, + "Enable multicast debug tracing if > 0"); +#endif + +static DECLARE_MUTEX(mcast_mutex); + +/* Used for all multicast joins (broadcast, IPv4 mcast and IPv6 mcast) */ +struct ipoib_mcast { + struct ib_sa_mcmember_rec mcmember; + struct ipoib_ah *ah; + + struct rb_node rb_node; + struct list_head list; + struct completion done; + + int query_id; + struct ib_sa_query *query; + + unsigned long created; + unsigned long backoff; + + unsigned long flags; + unsigned char logcount; + + struct list_head neigh_list; + + struct sk_buff_head pkt_queue; + + struct net_device *dev; +}; + +struct ipoib_mcast_iter { + struct net_device *dev; + union ib_gid mgid; + unsigned long created; + unsigned int queuelen; + unsigned int complete; + unsigned int send_only; +}; + +static void ipoib_mcast_free(struct ipoib_mcast *mcast) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_neigh *neigh, *tmp; + unsigned long flags; + + ipoib_dbg_mcast(netdev_priv(dev), + "deleting multicast group " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + spin_lock_irqsave(&priv->lock, flags); + + list_for_each_entry_safe(neigh, tmp, &mcast->neigh_list, list) { + ipoib_put_ah(neigh->ah); + *to_ipoib_neigh(neigh->neighbour) = NULL; + neigh->neighbour->ops->destructor = NULL; + kfree(neigh); + } + + spin_unlock_irqrestore(&priv->lock, flags); + + if (mcast->ah) + ipoib_put_ah(mcast->ah); + + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + dev_kfree_skb_any(skb); + } + + kfree(mcast); +} + +static struct ipoib_mcast *ipoib_mcast_alloc(struct 
net_device *dev, + int can_sleep) +{ + struct ipoib_mcast *mcast; + + mcast = kmalloc(sizeof (*mcast), can_sleep ? GFP_KERNEL : GFP_ATOMIC); + if (!mcast) + return NULL; + + memset(mcast, 0, sizeof (*mcast)); + + init_completion(&mcast->done); + + mcast->dev = dev; + mcast->created = jiffies; + mcast->backoff = HZ; + mcast->logcount = 0; + + INIT_LIST_HEAD(&mcast->list); + INIT_LIST_HEAD(&mcast->neigh_list); + skb_queue_head_init(&mcast->pkt_queue); + + mcast->ah = NULL; + mcast->query = NULL; + + return mcast; +} + +static struct ipoib_mcast *__ipoib_mcast_find(struct net_device *dev, union ib_gid *mgid) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node *n = priv->multicast_tree.rb_node; + + while (n) { + struct ipoib_mcast *mcast; + int ret; + + mcast = rb_entry(n, struct ipoib_mcast, rb_node); + + ret = memcmp(mgid->raw, mcast->mcmember.mgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = n->rb_left; + else if (ret > 0) + n = n->rb_right; + else + return mcast; + } + + return NULL; +} + +static int __ipoib_mcast_add(struct net_device *dev, struct ipoib_mcast *mcast) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct rb_node **n = &priv->multicast_tree.rb_node, *pn = NULL; + + while (*n) { + struct ipoib_mcast *tmcast; + int ret; + + pn = *n; + tmcast = rb_entry(pn, struct ipoib_mcast, rb_node); + + ret = memcmp(mcast->mcmember.mgid.raw, tmcast->mcmember.mgid.raw, + sizeof (union ib_gid)); + if (ret < 0) + n = &pn->rb_left; + else if (ret > 0) + n = &pn->rb_right; + else + return -EEXIST; + } + + rb_link_node(&mcast->rb_node, pn, n); + rb_insert_color(&mcast->rb_node, &priv->multicast_tree); + + return 0; +} + +static int ipoib_mcast_join_finish(struct ipoib_mcast *mcast, + struct ib_sa_mcmember_rec *mcmember) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + int ret; + + mcast->mcmember = *mcmember; + + /* Set the cached Q_Key before we attach if it's the broadcast group */ + if 
(!memcmp(mcast->mcmember.mgid.raw, priv->dev->broadcast + 4, + sizeof (union ib_gid))) + priv->qkey = be32_to_cpu(priv->broadcast->mcmember.qkey); + + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + if (test_and_set_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) { + ipoib_warn(priv, "multicast group " IPOIB_GID_FMT + " already attached\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + return 0; + } + + ret = ipoib_mcast_attach(dev, be16_to_cpu(mcast->mcmember.mlid), + &mcast->mcmember.mgid); + if (ret < 0) { + ipoib_warn(priv, "couldn't attach QP to multicast group " + IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags); + return ret; + } + } + + { + /* + * For now we set static_rate to 0. This is not + * really correct: we should look at the rate + * component of the MC member record, compare it with + * the rate of our local port (calculated from the + * active link speed and link width) and set an + * inter-packet delay appropriately. 
+ */ + struct ib_ah_attr av = { + .dlid = be16_to_cpu(mcast->mcmember.mlid), + .port_num = priv->port, + .sl = mcast->mcmember.sl, + .static_rate = 0, + .ah_flags = IB_AH_GRH, + .grh = { + .flow_label = be32_to_cpu(mcast->mcmember.flow_label), + .hop_limit = mcast->mcmember.hop_limit, + .sgid_index = 0, + .traffic_class = mcast->mcmember.traffic_class + } + }; + + av.grh.dgid = mcast->mcmember.mgid; + + mcast->ah = ipoib_create_ah(dev, priv->pd, &av); + if (!mcast->ah) { + ipoib_warn(priv, "ib_address_create failed\n"); + } else { + ipoib_dbg_mcast(priv, "MGID " IPOIB_GID_FMT + " AV %p, LID 0x%04x, SL %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + mcast->ah->ah, + be16_to_cpu(mcast->mcmember.mlid), + mcast->mcmember.sl); + } + } + + /* actually send any queued packets */ + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + + if (!skb->dst || !skb->dst->neighbour) { + /* put pseudoheader back on for next time */ + skb_push(skb, sizeof (struct ipoib_pseudoheader)); + } + + if (dev_queue_xmit(skb)) + ipoib_warn(priv, "dev_queue_xmit failed to requeue packet\n"); + } + + return 0; +} + +static void +ipoib_mcast_sendonly_join_complete(int status, + struct ib_sa_mcmember_rec *mcmember, + void *mcast_ptr) +{ + struct ipoib_mcast *mcast = mcast_ptr; + struct net_device *dev = mcast->dev; + + if (!status) + ipoib_mcast_join_finish(mcast, mcmember); + else { + if (mcast->logcount++ < 20) + ipoib_dbg_mcast(netdev_priv(dev), "multicast join failed for " + IPOIB_GID_FMT ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), status); + + /* Flush out any queued packets */ + while (!skb_queue_empty(&mcast->pkt_queue)) { + struct sk_buff *skb = skb_dequeue(&mcast->pkt_queue); + + skb->dev = dev; + + dev_kfree_skb_any(skb); + } + + /* Clear the busy flag so we try again */ + clear_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags); + } + + complete(&mcast->done); +} + +static int ipoib_mcast_sendonly_join(struct 
ipoib_mcast *mcast) +{ + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { +#if 0 /* Some SMs don't support send-only yet */ + .join_state = 4 +#else + .join_state = 1 +#endif + }; + int ret = 0; + + if (!test_bit(IPOIB_FLAG_OPER_UP, &priv->flags)) { + ipoib_dbg_mcast(priv, "device shutting down, no multicast joins\n"); + return -ENODEV; + } + + if (test_and_set_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags)) { + ipoib_dbg_mcast(priv, "multicast entry busy, skipping\n"); + return -EBUSY; + } + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + ret = ib_sa_mcmember_rec_set(priv->ca, priv->port, &rec, + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE, + 1000, GFP_ATOMIC, + ipoib_mcast_sendonly_join_complete, + mcast, &mcast->query); + if (ret < 0) { + ipoib_warn(priv, "ib_sa_mcmember_rec_set failed (ret = %d)\n", + ret); + } else { + ipoib_dbg_mcast(priv, "no multicast record for " IPOIB_GID_FMT + ", starting join\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + mcast->query_id = ret; + } + + return ret; +} + +static void ipoib_mcast_join_complete(int status, + struct ib_sa_mcmember_rec *mcmember, + void *mcast_ptr) +{ + struct ipoib_mcast *mcast = mcast_ptr; + struct net_device *dev = mcast->dev; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg_mcast(priv, "join completion for " IPOIB_GID_FMT + " (status %d)\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), status); + + if (!status && !ipoib_mcast_join_finish(mcast, mcmember)) { + mcast->backoff = HZ; + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_work(ipoib_workqueue, &priv->mcast_task); + up(&mcast_mutex); + complete(&mcast->done); + return; + } + + if (status == -EINTR) { + complete(&mcast->done); + return; + } + + if (status && mcast->logcount++ < 20) { + if (status == -ETIMEDOUT 
|| status == -EINTR) { + ipoib_dbg_mcast(priv, "multicast join failed for " IPOIB_GID_FMT + ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + status); + } else { + ipoib_warn(priv, "multicast join failed for " + IPOIB_GID_FMT ", status %d\n", + IPOIB_GID_ARG(mcast->mcmember.mgid), + status); + } + } + + mcast->backoff *= 2; + if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS) + mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS; + + mcast->query = NULL; + + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) { + if (status == -ETIMEDOUT) + queue_work(ipoib_workqueue, &priv->mcast_task); + else + queue_delayed_work(ipoib_workqueue, &priv->mcast_task, + mcast->backoff * HZ); + } else + complete(&mcast->done); + up(&mcast_mutex); + + return; +} + +static void ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast, + int create) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { + .join_state = 1 + }; + ib_sa_comp_mask comp_mask; + int ret = 0; + + ipoib_dbg_mcast(priv, "joining MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + comp_mask = + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE; + + if (create) { + comp_mask |= + IB_SA_MCMEMBER_REC_QKEY | + IB_SA_MCMEMBER_REC_SL | + IB_SA_MCMEMBER_REC_FLOW_LABEL | + IB_SA_MCMEMBER_REC_TRAFFIC_CLASS; + + rec.qkey = priv->broadcast->mcmember.qkey; + rec.sl = priv->broadcast->mcmember.sl; + rec.flow_label = priv->broadcast->mcmember.flow_label; + rec.traffic_class = priv->broadcast->mcmember.traffic_class; + } + + ret = ib_sa_mcmember_rec_set(priv->ca, priv->port, &rec, comp_mask, + mcast->backoff * 1000, GFP_ATOMIC, + ipoib_mcast_join_complete, + mcast, &mcast->query); + + if (ret < 0) { + ipoib_warn(priv, "ib_sa_mcmember_rec_set failed, status %d\n", ret); + + 
mcast->backoff *= 2; + if (mcast->backoff > IPOIB_MAX_BACKOFF_SECONDS) + mcast->backoff = IPOIB_MAX_BACKOFF_SECONDS; + + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->mcast_task, + mcast->backoff); + up(&mcast_mutex); + } else + mcast->query_id = ret; +} + +void ipoib_mcast_join_task(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + + if (!test_bit(IPOIB_MCAST_RUN, &priv->flags)) + return; + + if (ib_query_gid(priv->ca, priv->port, 0, &priv->local_gid)) + ipoib_warn(priv, "ib_gid_entry_get() failed\n"); + else + memcpy(priv->dev->dev_addr + 4, priv->local_gid.raw, sizeof (union ib_gid)); + + if (!priv->broadcast) { + priv->broadcast = ipoib_mcast_alloc(dev, 1); + if (!priv->broadcast) { + ipoib_warn(priv, "failed to allocate broadcast group\n"); + down(&mcast_mutex); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(ipoib_workqueue, + &priv->mcast_task, HZ); + up(&mcast_mutex); + return; + } + + memcpy(priv->broadcast->mcmember.mgid.raw, priv->dev->broadcast + 4, + sizeof (union ib_gid)); + + spin_lock_irq(&priv->lock); + __ipoib_mcast_add(dev, priv->broadcast); + spin_unlock_irq(&priv->lock); + } + + if (!test_bit(IPOIB_MCAST_FLAG_ATTACHED, &priv->broadcast->flags)) { + ipoib_mcast_join(dev, priv->broadcast, 0); + return; + } + + while (1) { + struct ipoib_mcast *mcast = NULL; + + spin_lock_irq(&priv->lock); + list_for_each_entry(mcast, &priv->multicast_list, list) { + if (!test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) + && !test_bit(IPOIB_MCAST_FLAG_BUSY, &mcast->flags) + && !test_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) { + /* Found the next unjoined group */ + break; + } + } + spin_unlock_irq(&priv->lock); + + if (&mcast->list == &priv->multicast_list) { + /* All done */ + break; + } + + ipoib_mcast_join(dev, mcast, 1); + return; + } + + { + struct ib_port_attr attr; + + if (!ib_query_port(priv->ca, priv->port, 
&attr)) + priv->local_lid = attr.lid; + else + ipoib_warn(priv, "ib_query_port failed\n"); + } + + priv->mcast_mtu = ib_mtu_enum_to_int(priv->broadcast->mcmember.mtu) - + IPOIB_ENCAP_LEN; + dev->mtu = min(priv->mcast_mtu, priv->admin_mtu); + + ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); + + clear_bit(IPOIB_MCAST_RUN, &priv->flags); + netif_carrier_on(dev); +} + +int ipoib_mcast_start_thread(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + + ipoib_dbg_mcast(priv, "starting multicast thread\n"); + + down(&mcast_mutex); + if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_work(ipoib_workqueue, &priv->mcast_task); + up(&mcast_mutex); + + return 0; +} + +int ipoib_mcast_stop_thread(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_mcast *mcast; + + ipoib_dbg_mcast(priv, "stopping multicast thread\n"); + + down(&mcast_mutex); + clear_bit(IPOIB_MCAST_RUN, &priv->flags); + cancel_delayed_work(&priv->mcast_task); + up(&mcast_mutex); + + flush_workqueue(ipoib_workqueue); + + if (priv->broadcast && priv->broadcast->query) { + ib_sa_cancel_query(priv->broadcast->query_id, priv->broadcast->query); + priv->broadcast->query = NULL; + ipoib_dbg_mcast(priv, "waiting for bcast\n"); + wait_for_completion(&priv->broadcast->done); + } + + list_for_each_entry(mcast, &priv->multicast_list, list) { + if (mcast->query) { + ib_sa_cancel_query(mcast->query_id, mcast->query); + mcast->query = NULL; + ipoib_dbg_mcast(priv, "waiting for MGID " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + wait_for_completion(&mcast->done); + } + } + + return 0; +} + +int ipoib_mcast_leave(struct net_device *dev, struct ipoib_mcast *mcast) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ib_sa_mcmember_rec rec = { + .join_state = 1 + }; + int ret = 0; + + if (!test_and_clear_bit(IPOIB_MCAST_FLAG_ATTACHED, &mcast->flags)) + return 0; + + ipoib_dbg_mcast(priv, "leaving MGID " 
IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rec.mgid = mcast->mcmember.mgid; + rec.port_gid = priv->local_gid; + rec.pkey = be16_to_cpu(priv->pkey); + + /* Remove ourselves from the multicast group */ + ret = ipoib_mcast_detach(dev, be16_to_cpu(mcast->mcmember.mlid), + &mcast->mcmember.mgid); + if (ret) + ipoib_warn(priv, "ipoib_mcast_detach failed (result = %d)\n", ret); + + /* + * Just make one shot at leaving and don't wait for a reply; + * if we fail, too bad. + */ + ret = ib_sa_mcmember_rec_delete(priv->ca, priv->port, &rec, + IB_SA_MCMEMBER_REC_MGID | + IB_SA_MCMEMBER_REC_PORT_GID | + IB_SA_MCMEMBER_REC_PKEY | + IB_SA_MCMEMBER_REC_JOIN_STATE, + 0, GFP_ATOMIC, NULL, + mcast, &mcast->query); + if (ret < 0) + ipoib_warn(priv, "ib_sa_mcmember_rec_delete failed " + "for leave (result = %d)\n", ret); + + return 0; +} + +void ipoib_mcast_send(struct net_device *dev, union ib_gid *mgid, + struct sk_buff *skb) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct ipoib_mcast *mcast; + + /* + * We can only be called from ipoib_start_xmit, so we're + * inside tx_lock -- no need to save/restore flags. 
+ */ + spin_lock(&priv->lock); + + mcast = __ipoib_mcast_find(dev, mgid); + if (!mcast) { + /* Let's create a new send only group now */ + ipoib_dbg_mcast(priv, "setting up send only multicast group for " + IPOIB_GID_FMT "\n", IPOIB_GID_ARG(*mgid)); + + mcast = ipoib_mcast_alloc(dev, 0); + if (!mcast) { + ipoib_warn(priv, "unable to allocate memory for " + "multicast structure\n"); + dev_kfree_skb_any(skb); + goto out; + } + + set_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags); + mcast->mcmember.mgid = *mgid; + __ipoib_mcast_add(dev, mcast); + list_add_tail(&mcast->list, &priv->multicast_list); + } + + if (!mcast->ah) { + if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE) + skb_queue_tail(&mcast->pkt_queue, skb); + else + dev_kfree_skb_any(skb); + + if (mcast->query) + ipoib_dbg_mcast(priv, "no address vector, " + "but multicast join already started\n"); + else if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) + ipoib_mcast_sendonly_join(mcast); + + /* + * If lookup completes between here and out:, don't + * want to send packet twice. 
+ */ + mcast = NULL; + } + +out: + if (mcast && mcast->ah) { + if (skb->dst && + skb->dst->neighbour && + !*to_ipoib_neigh(skb->dst->neighbour)) { + struct ipoib_neigh *neigh = kmalloc(sizeof *neigh, GFP_ATOMIC); + + if (neigh) { + kref_get(&mcast->ah->ref); + neigh->ah = mcast->ah; + neigh->neighbour = skb->dst->neighbour; + *to_ipoib_neigh(skb->dst->neighbour) = neigh; + list_add_tail(&neigh->list, &mcast->neigh_list); + } + } + + ipoib_send(dev, skb, mcast->ah, IB_MULTICAST_QPN); + } + + spin_unlock(&priv->lock); +} + +void ipoib_mcast_dev_flush(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + LIST_HEAD(remove_list); + struct ipoib_mcast *mcast, *tmcast, *nmcast; + unsigned long flags; + + ipoib_dbg_mcast(priv, "flushing multicast list\n"); + + spin_lock_irqsave(&priv->lock, flags); + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + nmcast = ipoib_mcast_alloc(dev, 0); + if (nmcast) { + nmcast->flags = + mcast->flags & (1 << IPOIB_MCAST_FLAG_SENDONLY); + + nmcast->mcmember.mgid = mcast->mcmember.mgid; + + /* Add the new group in before the to-be-destroyed group */ + list_add_tail(&nmcast->list, &mcast->list); + list_del_init(&mcast->list); + + rb_replace_node(&mcast->rb_node, &nmcast->rb_node, + &priv->multicast_tree); + + list_add_tail(&mcast->list, &remove_list); + } else { + ipoib_warn(priv, "could not reallocate multicast group " + IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + } + } + + if (priv->broadcast) { + nmcast = ipoib_mcast_alloc(dev, 0); + if (nmcast) { + nmcast->mcmember.mgid = priv->broadcast->mcmember.mgid; + + rb_replace_node(&priv->broadcast->rb_node, + &nmcast->rb_node, + &priv->multicast_tree); + + list_add_tail(&priv->broadcast->list, &remove_list); + } + + priv->broadcast = nmcast; + } + + spin_unlock_irqrestore(&priv->lock, flags); + + list_for_each_entry(mcast, &remove_list, list) { + ipoib_mcast_leave(dev, mcast); + ipoib_mcast_free(mcast); + } +} + +void 
ipoib_mcast_dev_down(struct net_device *dev) +{ + struct ipoib_dev_priv *priv = netdev_priv(dev); + unsigned long flags; + + /* Delete broadcast since it will be recreated */ + if (priv->broadcast) { + ipoib_dbg_mcast(priv, "deleting broadcast group\n"); + + spin_lock_irqsave(&priv->lock, flags); + rb_erase(&priv->broadcast->rb_node, &priv->multicast_tree); + spin_unlock_irqrestore(&priv->lock, flags); + ipoib_mcast_leave(dev, priv->broadcast); + ipoib_mcast_free(priv->broadcast); + priv->broadcast = NULL; + } +} + +void ipoib_mcast_restart_task(void *dev_ptr) +{ + struct net_device *dev = dev_ptr; + struct ipoib_dev_priv *priv = netdev_priv(dev); + struct dev_mc_list *mclist; + struct ipoib_mcast *mcast, *tmcast; + LIST_HEAD(remove_list); + unsigned long flags; + + ipoib_dbg_mcast(priv, "restarting multicast task\n"); + + ipoib_mcast_stop_thread(dev); + + spin_lock_irqsave(&priv->lock, flags); + + /* + * Unfortunately, the networking core only gives us a list of all of + * the multicast hardware addresses. 
We need to figure out which ones + * are new and which ones have been removed + */ + + /* Clear out the found flag */ + list_for_each_entry(mcast, &priv->multicast_list, list) + clear_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); + + /* Mark all of the entries that are found or don't exist */ + for (mclist = dev->mc_list; mclist; mclist = mclist->next) { + union ib_gid mgid; + + memcpy(mgid.raw, mclist->dmi_addr + 4, sizeof mgid); + + /* Add in the P_Key */ + mgid.raw[4] = (priv->pkey >> 8) & 0xff; + mgid.raw[5] = priv->pkey & 0xff; + + mcast = __ipoib_mcast_find(dev, &mgid); + if (!mcast || test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + struct ipoib_mcast *nmcast; + + /* Not found or send-only group, let's add a new entry */ + ipoib_dbg_mcast(priv, "adding multicast entry for mgid " + IPOIB_GID_FMT "\n", IPOIB_GID_ARG(mgid)); + + nmcast = ipoib_mcast_alloc(dev, 0); + if (!nmcast) { + ipoib_warn(priv, "unable to allocate memory for multicast structure\n"); + continue; + } + + set_bit(IPOIB_MCAST_FLAG_FOUND, &nmcast->flags); + + nmcast->mcmember.mgid = mgid; + + if (mcast) { + /* Destroy the send only entry */ + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); + + rb_replace_node(&mcast->rb_node, + &nmcast->rb_node, + &priv->multicast_tree); + } else + __ipoib_mcast_add(dev, nmcast); + + list_add_tail(&nmcast->list, &priv->multicast_list); + } + + if (mcast) + set_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags); + } + + /* Remove all of the entries don't exist anymore */ + list_for_each_entry_safe(mcast, tmcast, &priv->multicast_list, list) { + if (!test_bit(IPOIB_MCAST_FLAG_FOUND, &mcast->flags) && + !test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags)) { + ipoib_dbg_mcast(priv, "deleting multicast group " IPOIB_GID_FMT "\n", + IPOIB_GID_ARG(mcast->mcmember.mgid)); + + rb_erase(&mcast->rb_node, &priv->multicast_tree); + + /* Move to the remove list */ + list_del(&mcast->list); + list_add_tail(&mcast->list, &remove_list); + } + } + 
spin_unlock_irqrestore(&priv->lock, flags); + + /* We have to cancel outside of the spinlock */ + list_for_each_entry(mcast, &remove_list, list) { + ipoib_mcast_leave(mcast->dev, mcast); + ipoib_mcast_free(mcast); + } + + if (test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + ipoib_mcast_start_thread(dev); +} + +struct ipoib_mcast_iter *ipoib_mcast_iter_init(struct net_device *dev) +{ + struct ipoib_mcast_iter *iter; + + iter = kmalloc(sizeof *iter, GFP_KERNEL); + if (!iter) + return NULL; + + iter->dev = dev; + memset(iter->mgid.raw, 0, sizeof iter->mgid); + + if (ipoib_mcast_iter_next(iter)) { + ipoib_mcast_iter_free(iter); + return NULL; + } + + return iter; +} + +void ipoib_mcast_iter_free(struct ipoib_mcast_iter *iter) +{ + kfree(iter); +} + +int ipoib_mcast_iter_next(struct ipoib_mcast_iter *iter) +{ + struct ipoib_dev_priv *priv = netdev_priv(iter->dev); + struct rb_node *n; + struct ipoib_mcast *mcast; + int ret = 1; + + spin_lock_irq(&priv->lock); + + n = rb_first(&priv->multicast_tree); + + while (n) { + mcast = rb_entry(n, struct ipoib_mcast, rb_node); + + if (memcmp(iter->mgid.raw, mcast->mcmember.mgid.raw, + sizeof (union ib_gid)) < 0) { + iter->mgid = mcast->mcmember.mgid; + iter->created = mcast->created; + iter->queuelen = skb_queue_len(&mcast->pkt_queue); + iter->complete = !!mcast->ah; + iter->send_only = !!(mcast->flags & (1 << IPOIB_MCAST_FLAG_SENDONLY)); + + ret = 0; + + break; + } + + n = rb_next(n); + } + + spin_unlock_irq(&priv->lock); + + return ret; +} + +void ipoib_mcast_iter_read(struct ipoib_mcast_iter *iter, + union ib_gid *mgid, + unsigned long *created, + unsigned int *queuelen, + unsigned int *complete, + unsigned int *send_only) +{ + *mgid = iter->mgid; + *created = iter->created; + *queuelen = iter->queuelen; + *complete = iter->complete; + *send_only = iter->send_only; +} --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/drivers/infiniband/ulp/ipoib/ipoib_vlan.c 2004-12-27 21:48:27.219959544 -0800 @@ -0,0 +1,177 @@ +/* + 
* Copyright (c) 2004 Topspin Communications. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ * + * $Id: ipoib_vlan.c 1349 2004-12-16 21:09:43Z roland $ + */ + +#include +#include + +#include +#include +#include + +#include + +#include "ipoib.h" + +static ssize_t show_parent(struct class_device *class_dev, char *buf) +{ + struct net_device *dev = + container_of(class_dev, struct net_device, class_dev); + struct ipoib_dev_priv *priv = netdev_priv(dev); + + return sprintf(buf, "%s\n", priv->parent->name); +} +static CLASS_DEVICE_ATTR(parent, S_IRUGO, show_parent, NULL); + +int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey) +{ + struct ipoib_dev_priv *ppriv, *priv; + char intf_name[IFNAMSIZ]; + int result; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + ppriv = netdev_priv(pdev); + + down(&ppriv->vlan_mutex); + + /* + * First ensure this isn't a duplicate. We check the parent device and + * then all of the child interfaces to make sure the Pkey doesn't match. + */ + if (ppriv->pkey == pkey) { + result = -ENOTUNIQ; + goto err; + } + + list_for_each_entry(priv, &ppriv->child_intfs, list) { + if (priv->pkey == pkey) { + result = -ENOTUNIQ; + goto err; + } + } + + snprintf(intf_name, sizeof intf_name, "%s.%04x", + ppriv->dev->name, pkey); + priv = ipoib_intf_alloc(intf_name); + if (!priv) { + result = -ENOMEM; + goto err; + } + + set_bit(IPOIB_FLAG_SUBINTERFACE, &priv->flags); + + priv->pkey = pkey; + + memcpy(priv->dev->dev_addr, ppriv->dev->dev_addr, INFINIBAND_ALEN); + priv->dev->broadcast[8] = pkey >> 8; + priv->dev->broadcast[9] = pkey & 0xff; + + result = ipoib_dev_init(priv->dev, ppriv->ca, ppriv->port); + if (result < 0) { + ipoib_warn(ppriv, "failed to initialize subinterface: " + "device %s, port %d", + ppriv->ca->name, ppriv->port); + goto device_init_failed; + } + + result = register_netdev(priv->dev); + if (result) { + ipoib_warn(priv, "failed to initialize; error %i", result); + goto register_failed; + } + + priv->parent = ppriv->dev; + + if (ipoib_create_debug_file(priv->dev)) + goto debug_failed; + + if 
(ipoib_add_pkey_attr(priv->dev)) + goto sysfs_failed; + + if (class_device_create_file(&priv->dev->class_dev, + &class_device_attr_parent)) + goto sysfs_failed; + + list_add_tail(&priv->list, &ppriv->child_intfs); + + up(&ppriv->vlan_mutex); + + return 0; + +sysfs_failed: + ipoib_delete_debug_file(priv->dev); + +debug_failed: + unregister_netdev(priv->dev); + +register_failed: + ipoib_dev_cleanup(priv->dev); + +device_init_failed: + free_netdev(priv->dev); + +err: + up(&ppriv->vlan_mutex); + return result; +} + +int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey) +{ + struct ipoib_dev_priv *ppriv, *priv, *tpriv; + int ret = -ENOENT; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + ppriv = netdev_priv(pdev); + + down(&ppriv->vlan_mutex); + list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) { + if (priv->pkey == pkey) { + unregister_netdev(priv->dev); + ipoib_dev_cleanup(priv->dev); + + list_del(&priv->list); + + kfree(priv); + + ret = 0; + break; + } + } + up(&ppriv->vlan_mutex); + + return ret; +} From roland@topspin.com Mon Dec 27 21:52:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:53:01 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5q8cr028217 for ; Mon, 27 Dec 2004 21:52:40 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:21 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 27 Dec 2004 21:51:20 -0800 Received: from localhost ([127.0.0.1] helo=eddore) by eddore with smtp (Exim 4.34) id 1CjAGN-0000vy-TH; Mon, 27 Dec 2004 21:51:20 -0800 Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <200412272151.zeKZJPoIEBr55elh@topspin.com> X-Mailer: Roland's Patchbomber Date: Mon, 27 Dec 2004 21:51:19 -0800 Message-Id: 
<200412272151.S29WkrmlJifc5kHZ@topspin.com> Mime-Version: 1.0 To: davem@davemloft.net From: Roland Dreier X-SA-Exim-Connect-IP: 127.0.0.1 X-SA-Exim-Mail-From: roland@topspin.com Subject: [PATCH][v5][23/24] Add InfiniBand Documentation files Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8bit X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 05:51:20.0876 (UTC) FILETIME=[41A9E6C0:01C4ECA1] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13120 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Add files to Documentation/infiniband that describe the tree under /sys/class/infiniband, the IPoIB driver and the userspace MAD access driver. Signed-off-by: Roland Dreier --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/Documentation/infiniband/ipoib.txt 2004-12-27 21:48:28.484773367 -0800 @@ -0,0 +1,56 @@ +IP OVER INFINIBAND + + The ib_ipoib driver is an implementation of the IP over InfiniBand + protocol as specified by the latest Internet-Drafts issued by the + IETF ipoib working group. It is a "native" implementation in the + sense of setting the interface type to ARPHRD_INFINIBAND and the + hardware address length to 20 (earlier proprietary implementations + masqueraded to the kernel as ethernet interfaces). + +Partitions and P_Keys + + When the IPoIB driver is loaded, it creates one interface for each + port using the P_Key at index 0. To create an interface with a + different P_Key, write the desired P_Key into the main interface's + /sys/class/net//create_child file. For example: + + echo 0x8001 > /sys/class/net/ib0/create_child + + This will create an interface named ib0.8001 with P_Key 0x8001. 
To + remove a subinterface, use the "delete_child" file: + + echo 0x8001 > /sys/class/net/ib0/delete_child + + The P_Key for any interface is given by the "pkey" file, and the + main interface for a subinterface is in "parent." + +Debugging Information + + By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set + to 'y', tracing messages are compiled into the driver. They are + turned on by setting the module parameters debug_level and + mcast_debug_level to 1. These parameters can be controlled at + runtime through files in /sys/module/ib_ipoib/. + + CONFIG_INFINIBAND_IPOIB_DEBUG also enables the "ipoib_debugfs" + virtual filesystem. By mounting this filesystem, for example with + + mkdir -p /ipoib_debugfs + mount -t ipoib_debugfs none /ipoib_debugfs + + it is possible to get statistics about multicast groups from the + files /ipoib_debugfs/ib0_mcg and so on. + + The performance impact of this option is negligible, so it + is safe to enable this option with debug_level set to 0 for normal + operation. + + CONFIG_INFINIBAND_IPOIB_DEBUG_DATA enables even more debug output in + the data path when data_debug_level is set to 1. However, even with + the output disabled, enabling this configuration option will affect + performance, because it adds tests to the fast path. + +References + + IETF IP over InfiniBand (ipoib) Working Group + http://ietf.org/html.charters/ipoib-charter.html --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/Documentation/infiniband/sysfs.txt 2004-12-27 21:48:28.513769099 -0800 @@ -0,0 +1,64 @@ +SYSFS FILES + + For each InfiniBand device, the InfiniBand drivers create the + following files under /sys/class/infiniband/: + + node_guid - Node GUID + sys_image_guid - System image GUID + + In addition, there is a "ports" subdirectory, with one subdirectory + for each port. 
For example, if mthca0 is a 2-port HCA, there will + be two directories: + + /sys/class/infiniband/mthca0/ports/1 + /sys/class/infiniband/mthca0/ports/2 + + (A switch will only have a single "0" subdirectory for switch port + 0; no subdirectory is created for normal switch ports) + + In each port subdirectory, the following files are created: + + cap_mask - Port capability mask + lid - Port LID + lid_mask_count - Port LID mask count + rate - Port data rate (active width * active speed) + sm_lid - Subnet manager LID for port's subnet + sm_sl - Subnet manager SL for port's subnet + state - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER) + + There is also a "counters" subdirectory, with files + + VL15_dropped + excessive_buffer_overrun_errors + link_downed + link_error_recovery + local_link_integrity_errors + port_rcv_constraint_errors + port_rcv_data + port_rcv_errors + port_rcv_packets + port_rcv_remote_physical_errors + port_rcv_switch_relay_errors + port_xmit_constraint_errors + port_xmit_data + port_xmit_discards + port_xmit_packets + symbol_error + + Each of these files contains the corresponding value from the port's + Performance Management PortCounters attribute, as described in + section 16.1.3.5 of the InfiniBand Architecture Specification. + + The "pkeys" and "gids" subdirectories contain one file for each + entry in the port's P_Key or GID table respectively. For example, + ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key + table. + +MTHCA + + The Mellanox HCA driver also creates the files: + + hw_rev - Hardware revision number + fw_ver - Firmware version + hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)", + or "MT25208" --- /dev/null 1970-01-01 00:00:00.000000000 +0000 +++ linux-bk/Documentation/infiniband/user_mad.txt 2004-12-27 21:48:28.543764684 -0800 @@ -0,0 +1,81 @@ +USERSPACE MAD ACCESS + +Device files + + Each port of each InfiniBand device has a "umad" device attached. 
+ For example, a two-port HCA will have two devices, while a switch + will have one device (for switch port 0). + +Creating MAD agents + + A MAD agent can be created by filling in a struct ib_user_mad_reg_req + and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file + descriptor for the appropriate device file. If the registration + request succeeds, a 32-bit id will be returned in the structure. + For example: + + struct ib_user_mad_reg_req req = { /* ... */ }; + ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req); + if (!ret) + my_agent = req.id; + else + perror("agent register"); + + Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT + ioctl. Also, all agents registered through a file descriptor will + be unregistered when the descriptor is closed. + +Receiving MADs + + MADs are received using read(). The buffer passed to read() must be + large enough to hold at least one struct ib_user_mad. For example: + + struct ib_user_mad mad; + ret = read(fd, &mad, sizeof mad); + if (ret != sizeof mad) + perror("read"); + + In addition to the actual MAD contents, the other struct ib_user_mad + fields will be filled in with information on the received MAD. For + example, the remote LID will be in mad.lid. + + If a send times out, a receive will be generated with mad.status set + to ETIMEDOUT. Otherwise when a MAD has been successfully received, + mad.status will be 0. + + poll()/select() may be used to wait until a MAD can be read. + +Sending MADs + + MADs are sent using write(). The agent ID for sending should be + filled into the id field of the MAD, the destination LID should be + filled into the lid field, and so on. For example: + + struct ib_user_mad mad; + + /* fill in mad.data */ + + mad.id = my_agent; /* req.id from agent registration */ + mad.lid = my_dest; /* in network byte order... */ + /* etc. 
*/ + + ret = write(fd, &mad, sizeof mad); + if (ret != sizeof mad) + perror("write"); + +/dev files + + To create the appropriate character device files automatically with + udev, a rule like + + KERNEL="umad*", NAME="infiniband/%k" + + can be used. This will create a device node named + + /dev/infiniband/umad0 + + for the first port, and so on. The InfiniBand device and port + associated with this device can be determined from the files + + /sys/class/infiniband_mad/umad0/ibdev + /sys/class/infiniband_mad/umad0/port From davem@davemloft.net Mon Dec 27 21:54:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:54:42 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5s9xs030549 for ; Mon, 27 Dec 2004 21:54:30 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CjAJK-0003G2-00; Mon, 27 Dec 2004 21:54:22 -0800 Date: Mon, 27 Dec 2004 21:54:21 -0800 From: "David S. 
Miller" To: Thomas Graf Cc: netdev@oss.sgi.com Subject: Re: [PATCH] PKT_SCHED: dsmark should ignore ECN bits Message-Id: <20041227215421.7f0326eb.davem@davemloft.net> In-Reply-To: <20041227193822.GK7884@postel.suug.ch> References: <20041227193822.GK7884@postel.suug.ch> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13121 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 27 Dec 2004 20:38:22 +0100 Thomas Graf wrote: > Taking ECN bits into account doesn't make sense. The two bits were > still unused at the time the code was written so this brings back the > old behaviour. > > Signed-off-by: Thomas Graf Looks good, I put this into my 2.4 tree as well. Thanks. From davem@davemloft.net Mon Dec 27 21:58:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 21:58:53 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS5wKwN003052 for ; Mon, 27 Dec 2004 21:58:40 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CjANJ-0003Gi-00; Mon, 27 Dec 2004 21:58:29 -0800 Date: Mon, 27 Dec 2004 21:58:29 -0800 From: "David S. 
Miller" To: Arnaldo Carvalho de Melo Cc: netdev@oss.sgi.com Subject: Re: [PATCH][INET] move inet_sock into inet_opt and rename it to inet_sock Message-Id: <20041227215829.688bbe74.davem@davemloft.net> In-Reply-To: <41D0C4C2.6030000@conectiva.com.br> References: <41CE198B.7040005@conectiva.com.br> <41D0C4C2.6030000@conectiva.com.br> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13122 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 28 Dec 2004 00:28:18 -0200 Arnaldo Carvalho de Melo wrote: > Now that 2.6.10 is out and your 2.6.11 queue was merged, > please take a look if this is acceptable, the following patches > will deal with udp_sock, tcp_sock, etc. > > This is the start of the series of patches that will introduce > struct connection_sock, reducing the memory used by non connected > protocols, such as UDP. > > It is available at: > > bk://kernel.bkbits.net/acme/connection_sock-2.6 This looks wonderful. Pulled, thanks a lot Arnaldo. 
From davej@redhat.com Mon Dec 27 22:01:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 22:01:21 -0800 (PST) Received: from mx1.redhat.com (mx1.redhat.com [66.187.233.31]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS60sZm005712 for ; Mon, 27 Dec 2004 22:01:14 -0800 Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11/8.12.11) with ESMTP id iBS62IWk010382; Tue, 28 Dec 2004 01:02:18 -0500 Received: from devserv.devel.redhat.com (devserv.devel.redhat.com [172.16.58.1]) by int-mx1.corp.redhat.com (8.11.6/8.11.6) with ESMTP id iBS62Ir25172; Tue, 28 Dec 2004 01:02:18 -0500 Received: from devserv.devel.redhat.com (localhost.localdomain [127.0.0.1]) by devserv.devel.redhat.com (8.12.11/8.12.10) with ESMTP id iBS61mT6008360; Tue, 28 Dec 2004 01:01:48 -0500 Received: (from davej@localhost) by devserv.devel.redhat.com (8.12.11/8.12.11/Submit) id iBS61mbt008358; Tue, 28 Dec 2004 01:01:48 -0500 X-Authentication-Warning: devserv.devel.redhat.com: davej set sender to davej@redhat.com using -f Date: Tue, 28 Dec 2004 01:01:48 -0500 From: Dave Jones To: Alan Cox Cc: "David S. Miller" , Patrick McHardy , torvalds@osdl.org, Linux Kernel Mailing List , netdev@oss.sgi.com Subject: Re: PATCH: kmalloc packet slab Message-ID: <20041228060148.GB5481@redhat.com> Mail-Followup-To: Dave Jones , Alan Cox , "David S. 
Miller" , Patrick McHardy , torvalds@osdl.org, Linux Kernel Mailing List , netdev@oss.sgi.com References: <1104156983.20944.25.camel@localhost.localdomain> <41D043AC.2070203@trash.net> <20041227142350.1cf444fe.davem@davemloft.net> <1104195085.20898.62.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104195085.20898.62.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13123 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davej@redhat.com Precedence: bulk X-list: netdev On Tue, Dec 28, 2004 at 12:51:28AM +0000, Alan Cox wrote: > On Llu, 2004-12-27 at 22:23, David S. Miller wrote: > > If we are really going to do something like this, it should > > be calculated properly and be determined per-interface > > type as netdevs are registered. > > Fine by me, I'm just going through plausible looking changes in the Red > Hat tree. You might want to slightly injure someone internally until > they drop that too 8) Internal injuries unnecessary. Regardless of outcome of this patch, Fedora will pick up whatever happens upstream instead of carrying this any longer. This and a few other patches have been stagnating in our tree for far longer than they should have been. 
Dave From davem@davemloft.net Mon Dec 27 22:54:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Mon, 27 Dec 2004 22:54:30 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBS6s3ER008257 for ; Mon, 27 Dec 2004 22:54:23 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CjBFJ-0003PS-00; Mon, 27 Dec 2004 22:54:17 -0800 Date: Mon, 27 Dec 2004 22:54:17 -0800 From: "David S. Miller" To: Roland Dreier Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org Subject: Re: [PATCH][v5][0/24] Latest IB patch queue Message-Id: <20041227225417.3ac7a0a6.davem@davemloft.net> In-Reply-To: <200412272150.IBRnA4AvjendsF8x@topspin.com> References: <200412272150.IBRnA4AvjendsF8x@topspin.com> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13124 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Mon, 27 Dec 2004 21:50:47 -0800 Roland Dreier wrote: > >>>>> "David" == David S Miller writes: > > David> Send it all over. > > OK, you asked for it... here's our latest tree, which should > incorporate all the feedback I've seen. W00t :-) All applied, thanks Roland. I'll run it through some build tests then toss it upstream. 
From l.bortot@inet.it Tue Dec 28 02:12:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 02:12:17 -0800 (PST) Received: from mid-2.inet.it (mid-2.inet.it [213.92.5.19]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSABmR7018076 for ; Tue, 28 Dec 2004 02:12:09 -0800 Received: from bove.bortot.it [::ffff:213.92.7.98] by mid-2.inet.it via I-SMTP-5.2.1-520 id ::ffff:213.92.7.98+X6Nmb5XwFmVYD; Tue, 28 Dec 2004 11:13:14 +0100 Message-ID: <41D131C4.9010201@inet.it> Date: Tue, 28 Dec 2004 11:13:24 +0100 From: Luca Bortot User-Agent: Mozilla Thunderbird 0.9 (Windows/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Francois Romieu CC: netdev@oss.sgi.com Subject: Re: jumbo on 8169 References: <41CFF27A.2070008@inet.it> <20041227123136.GA25187@electric-eye.fr.zoreil.com> <41D01562.4090606@inet.it> <20041227163802.GA27692@electric-eye.fr.zoreil.com> In-Reply-To: <20041227163802.GA27692@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13125 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: l.bortot@inet.it Precedence: bulk X-list: netdev Francois Romieu wrote: > TSO may make a difference for a TCP test. See ethtool help to enable it. > You can experiment with Tx csum/SG as well. I tried and benchmarked them, but I could not see any difference in either throughput or cpu usage. btw, I think I'll switch back to 1500... I'm using the linux box as my own router/firewall/file server, and the latter is why I went giga. But... samba isn't (yet) i/o asynchronous, so above 10-12 MB/s I basically wait for both disk and network latency, not the devices themselves. 
With 7k frames I end up with 50% disk usage, 20% network usage, 20% cpu idle; with 1500 I get the same on disk and network, but 100% cpu usage. Since I don't mind having spare cpu during file transfers, I prefer a 1500 mtu setup for better compatibility (did you know that MS ip stack sets the "don't fragment" bit by default? Can you guess what happens when the 7k packets go to the internet?) > I'll welcome a complete dmesg and lspci -vx as they pretty well describe > the working combinations. Linux version 2.6.10 (root@farm) (gcc version 3.4.2 20041017 (Red Hat 3.4.2-6.fc3)) #7 Mon Dec 27 14:13:23 CET 2004 BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 0000000017ff0000 (usable) BIOS-e820: 0000000017ff0000 - 0000000017ff3000 (ACPI NVS) BIOS-e820: 0000000017ff3000 - 0000000018000000 (ACPI data) BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved) 383MB LOWMEM available. On node 0 totalpages: 98288 DMA zone: 4096 pages, LIFO batch:1 Normal zone: 94192 pages, LIFO batch:16 HighMem zone: 0 pages, LIFO batch:1 DMI 2.3 present. ACPI: RSDP (v000 QDIGRP ) @ 0x000f7430 ACPI: RSDT (v001 AWARD AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x17ff3000 ACPI: FADT (v001 AWARD AWRDACPI 0x42302e31 AWRD 0x00000000) @ 0x17ff3040 ACPI: DSDT (v001 QDIGRP AWRDACPI 0x00001000 MSFT 0x0100000a) @ 0x00000000 Built 1 zonelists Kernel command line: ro root=LABEL=/ rhgb quiet acpi=force console=tty0 console=ttyS0,9600n8 Initializing CPU#0 PID hash table entries: 2048 (order: 11, 32768 bytes) Detected 800.068 MHz processor. 
Using tsc for high-res timesource Console: colour VGA+ 80x25 Dentry cache hash table entries: 65536 (order: 6, 262144 bytes) Inode-cache hash table entries: 32768 (order: 5, 131072 bytes) Memory: 385368k/393152k available (2114k kernel code, 7144k reserved, 614k data, 188k init, 0k highmem) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay loop... 1576.96 BogoMIPS (lpj=788480) Security Framework v1.0.0 initialized SELinux: Initializing. SELinux: Starting in permissive mode selinux_register_security: Registering secondary module capability Capability LSM initialized as secondary Mount-cache hash table entries: 512 (order: 0, 4096 bytes) CPU: After generic identify, caps: 0383f9ff 00000000 00000000 00000000 CPU: After vendor identify, caps: 0383f9ff 00000000 00000000 00000000 CPU: L1 I cache: 16K, L1 D cache: 16K CPU: L2 cache: 256K CPU: After all inits, caps: 0383f9ff 00000000 00000000 00000040 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU: Intel Pentium III (Coppermine) stepping 06 Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Checking 'hlt' instruction... OK. ACPI: setting ELCR to 0200 (from 1e20) checking if image is initramfs... 
it is Freeing initrd memory: 411k freed NET: Registered protocol family 16 PCI: PCI BIOS revision 2.10 entry at 0xfb230, last bus=1 PCI: Using configuration type 1 mtrr: v2.0 (20020519) ACPI: Subsystem revision 20041105 ACPI: Interpreter enabled ACPI: Using PIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (00:00) PCI: Probing PCI hardware (bus 00) ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs 1 3 4 5 6 7 10 *11 12 14 15) ACPI: PCI Interrupt Link [LNKB] (IRQs 1 3 4 5 6 7 10 11 *12 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 1 3 4 *5 6 7 10 11 12 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 1 3 4 5 6 7 *10 11 12 14 15) Linux Plug and Play Support v0.97 (c) Adam Belay pnp: PnP ACPI init pnp: PnP ACPI: found 11 devices usbcore: registered new driver usbfs usbcore: registered new driver hub PCI: Using ACPI for IRQ routing ** PCI interrupts are no longer routed automatically. If this ** causes a device to stop working, it is probably because the ** driver failed to call pci_enable_device(). As a temporary ** workaround, the "pci=routeirq" argument restores the old ** behavior. If this argument makes the device work again, ** please email the output of "lspci" to bjorn.helgaas@hp.com ** so I can fix the driver. apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac) apm: overridden by ACPI. audit: initializing netlink socket (disabled) audit(1104229026.676:0): initialized Total HugeTLB memory allocated, 0 VFS: Disk quotas dquot_6.5.1 Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) SELinux: Registering netfilter hooks Initializing Cryptographic API Activating ISA DMA hang workarounds. vesafb: probe of vesafb0 failed with error -6 ACPI: Processor [CPU0] (supports C1 C2) ACPI: Processor [CPU0] (supports 2 throttling states) isapnp: Scanning for PnP cards... 
isapnp: No Plug & Play device found Real Time Clock Driver v1.12 Linux agpgart interface v0.100 (c) Dave Jones agpgart: Detected VIA Apollo Pro 133 chipset agpgart: Maximum main memory to use for agp memory: 321M agpgart: AGP aperture is 4M @ 0xe6000000 serio: i8042 AUX port at 0x60,0x64 irq 12 serio: i8042 KBD port at 0x60,0x64 irq 1 Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing enabled ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered RAMDISK driver initialized: 16 RAM disks of 16384K size 1024 blocksize r8169 Gigabit Ethernet driver 1.6LK loaded ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 5 PCI: setting IRQ 5 as level-triggered ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 5 (level, low) -> IRQ 5 r8169: NAPI enabled eth0: Identified chip type is 'RTL8169s/8110s'. eth0: RTL8169 at 0xd881c000, 00:40:f4:b4:8b:44, IRQ 5 Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2 ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx VP_IDE: IDE controller at PCI slot 0000:00:07.1 VP_IDE: chipset revision 16 VP_IDE: not 100% native mode: will probe irqs later VP_IDE: VIA vt82c596b (rev 23) IDE UDMA66 controller on pci0000:00:07.1 ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:DMA, hdb:DMA ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:DMA, hdd:DMA Probing IDE interface ide0... hda: HDS722525VLAT80, ATA DISK drive hdb: ASUS CD-S340, ATAPI CD/DVD-ROM drive elevator: using anticipatory as default io scheduler ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 Probing IDE interface ide1... hdc: ST380021A, ATA DISK drive hdc: IRQ probe failed (0xbcfa) hdd: Maxtor 4G120J6, ATA DISK drive ide1 at 0x170-0x177,0x376 on irq 15 Probing IDE interface ide2... ide2: Wait for ready failed before probe ! 
Probing IDE interface ide3... ide3: Wait for ready failed before probe ! Probing IDE interface ide4... ide4: Wait for ready failed before probe ! Probing IDE interface ide5... ide5: Wait for ready failed before probe ! hda: max request size: 1024KiB hda: 488397168 sectors (250059 MB) w/7938KiB Cache, CHS=30401/255/63, UDMA(66) hda: cache flushes supported hda: hda1 hda2 hda3 hdc: max request size: 128KiB hdc: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(66) hdc: cache flushes not supported hdc: hdc1 hdd: max request size: 1024KiB hdd: 240121728 sectors (122942 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(66) hdd: cache flushes not supported hdd: hdd1 hdb: ATAPI 34X CD-ROM drive, 128kB Cache, UDMA(33) Uniform CD-ROM driver Revision: 3.20 ide-floppy driver 0.99.newide usbcore: registered new driver hiddev usbcore: registered new driver usbhid drivers/usb/input/hid-core.c: v2.0:USB HID core driver mice: PS/2 mouse device common for all mice md: md driver 0.90.1 MAX_MD_DEVS=256, MD_SB_DISKS=27 NET: Registered protocol family 2 IP: routing cache hash table of 4096 buckets, 32Kbytes TCP: Hash tables configured (established 32768 bind 65536) Initializing IPsec netlink socket NET: Registered protocol family 1 NET: Registered protocol family 17 ACPI wakeup devices: USB0 USB1 ACPI: (supports S0 S1 S4bios S5) Freeing unused kernel memory: 188k freed device-mapper: 4.3.0-ioctl (2004-09-30) initialised: dm-devel@redhat.com EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with ordered data mode. SELinux: Disabled at runtime. 
SELinux: Unregistering netfilter hooks inserting floppy driver for 2.6.10 FDC 0 is a post-1991 82077 8139too Fast Ethernet driver 0.9.27 ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 12 PCI: setting IRQ 12 as level-triggered ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 12 (level, low) -> IRQ 12 eth1: RealTek RTL8139 at 0xd800, 00:40:f4:b1:26:b8, IRQ 12 eth1: Identified 8139 chip type 'RTL-8100B/8139D' e100: Intel(R) PRO/100 Network Driver, 3.2.3-k2-NAPI e100: Copyright(c) 1999-2004 Intel Corporation ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10 PCI: setting IRQ 10 as level-triggered ACPI: PCI interrupt 0000:00:0c.0[A] -> GSI 10 (level, low) -> IRQ 10 e100: eth2: e100_probe: addr 0xe6602000, irq 10, MAC addr 00:A0:C9:1D:EC:2F ACPI: PCI interrupt 0000:00:0e.0[A] -> GSI 12 (level, low) -> IRQ 12 e100: eth3: e100_probe: addr 0xe6601000, irq 12, MAC addr 00:A0:C9:CD:46:43 ip_tables: (C) 2000-2002 Netfilter core team eth3: link up, 10Mbps, half-duplex, lpa 0x0000 e100: eth1: e100_watchdog: link up, 100Mbps, full-duplex ip_tables: (C) 2000-2002 Netfilter core team USB Universal Host Controller Interface driver v2.2 ACPI: PCI interrupt 0000:00:07.2[D] -> GSI 10 (level, low) -> IRQ 10 uhci_hcd 0000:00:07.2: UHCI Host Controller uhci_hcd 0000:00:07.2: irq 10, io base 0xd400 uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1 hub 1-0:1.0: USB hub found hub 1-0:1.0: 2 ports detected md: Autodetecting RAID arrays. md: autorun ... md: ... autorun DONE. ip_tables: (C) 2000-2002 Netfilter core team ACPI: Power Button (FF) [PWRF] ibm_acpi: ec object not found EXT3 FS on hda1, internal journal cdrom: open failed. kjournald starting. Commit interval 5 seconds EXT3 FS on dm-0, internal journal EXT3-fs: mounted filesystem with ordered data mode. Adding 655192k swap on /dev/hda2. 
Priority:-1 extents:1 IA-32 Microcode Update Driver: v1.14 microcode: CPU0 already at revision 0x8 (current=0x8) microcode: No suitable data for CPU0 parport_pc: Ignoring new-style parameters in presence of obsolete ones parport: PnPBIOS parport detected. parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE] pnp: Device 00:09 disabled. ip_tables: (C) 2000-2002 Netfilter core team ip_conntrack version 2.1 (3071 buckets, 24568 max) - 304 bytes per conntrack e100: eth0: e100_watchdog: link up, 10Mbps, half-duplex r8169: eth2: link up i2c /dev entries driver parport_pc: Ignoring new-style parameters in presence of obsolete ones pnp: Device 00:09 activated. parport: PnPBIOS parport detected. parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE] lp0: using parport0 (interrupt-driven). lp0: console ready Universal TUN/TAP device driver 1.5 (C)1999-2002 Maxim Krasnyansky 00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev 44) Flags: bus master, medium devsel, latency 0 Memory at e6000000 (32-bit, prefetchable) [size=4M] Capabilities: [a0] AGP version 1.0 00: 06 11 91 06 06 00 10 a2 44 00 00 06 00 00 00 00 10: 08 00 00 e6 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00 00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP] (prog-if 00 [Normal decode]) Flags: bus master, 66Mhz, medium devsel, latency 0 Bus: primary=00, secondary=01, subordinate=01, sec-latency=0 I/O behind bridge: 0000c000-0000cfff Memory behind bridge: e0000000-e1ffffff Prefetchable memory behind bridge: e2000000-e2ffffff Capabilities: [80] Power Management version 2 00: 06 11 98 85 07 00 30 22 00 00 04 06 00 00 01 00 10: 00 00 00 00 00 00 00 00 00 01 01 00 c0 c0 00 00 20: 00 e0 f0 e1 00 e2 f0 e2 00 00 00 00 00 00 00 00 30: 00 00 00 00 80 00 00 00 00 00 00 00 00 00 0c 00 00:07.0 ISA bridge: VIA Technologies, Inc. 
VT82C596 ISA [Mobile South] (rev 23) Subsystem: Quantum Designs (H.K.) Inc: Unknown device 0000 Flags: bus master, stepping, medium devsel, latency 0 00: 06 11 96 05 87 00 00 02 23 00 01 06 00 00 80 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 11 34 00 00 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00:07.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 10) (prog-if 8a [Master SecP PriP]) Flags: bus master, medium devsel, latency 64 I/O ports at d000 [size=16] Capabilities: [c0] Power Management version 2 00: 06 11 71 05 07 00 90 02 10 8a 01 01 00 40 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 c0 00 00 00 00 00 00 00 ff 00 00 00 00:07.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 11) (prog-if 00 [UHCI]) Subsystem: VIA Technologies, Inc. (Wrong ID) USB Controller Flags: bus master, medium devsel, latency 64, IRQ 10 I/O ports at d400 [size=32] Capabilities: [80] Power Management version 2 00: 06 11 38 30 07 00 10 02 11 00 03 0c 08 40 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 01 d4 00 00 00 00 00 00 00 00 00 00 25 09 34 12 30: 00 00 00 00 80 00 00 00 00 00 00 00 0a 04 00 00 00:07.3 Host bridge: VIA Technologies, Inc. VT82C596 Power Management (rev 30) Flags: medium devsel 00: 06 11 50 30 00 00 80 02 30 00 00 06 00 00 00 00 10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. 
RT8139 Flags: bus master, medium devsel, latency 64, IRQ 12 I/O ports at d800 [size=256] Memory at e6600000 (32-bit, non-prefetchable) [size=256] Capabilities: [50] Power Management version 2 00: ec 10 39 81 07 00 90 02 10 00 00 02 00 40 00 00 10: 01 d8 00 00 00 00 60 e6 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 ec 10 39 81 30: 00 00 00 00 50 00 00 00 00 00 00 00 0c 01 20 40 00:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10) Subsystem: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 5 I/O ports at dc00 [size=256] Memory at e6603000 (32-bit, non-prefetchable) [size=256] Expansion ROM at e3000000 [disabled] [size=128K] Capabilities: [dc] Power Management version 2 00: ec 10 69 81 17 00 b0 02 10 00 00 02 08 40 00 00 10: 01 dc 00 00 00 30 60 e6 00 00 00 00 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 ec 10 69 81 30: 00 00 00 e3 dc 00 00 00 00 00 00 00 05 01 20 40 00:0c.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 01) Flags: bus master, medium devsel, latency 64, IRQ 10 Memory at e6602000 (32-bit, prefetchable) [size=4K] I/O ports at e000 [size=32] Memory at e6500000 (32-bit, non-prefetchable) [size=1M] Expansion ROM at e4000000 [disabled] [size=1M] 00: 86 80 29 12 07 00 80 02 01 00 00 02 00 40 00 00 10: 08 20 60 e6 01 e0 00 00 00 00 50 e6 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 30: 00 00 00 e4 00 00 00 00 00 00 00 00 0a 01 08 38 00:0e.0 Ethernet controller: Intel Corp. 82557/8/9 [Ethernet Pro 100] (rev 05) Subsystem: Intel Corp. 
EtherExpress PRO/100+ Flags: bus master, medium devsel, latency 64, IRQ 12 Memory at e6601000 (32-bit, prefetchable) [size=4K] I/O ports at e400 [size=32] Memory at e6400000 (32-bit, non-prefetchable) [size=1M] Expansion ROM at e5000000 [disabled] [size=1M] Capabilities: [dc] Power Management version 1 00: 86 80 29 12 07 00 90 02 05 00 00 02 08 40 00 00 10: 08 10 60 e6 01 e4 00 00 00 00 40 e6 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 09 00 30: 00 00 00 e5 dc 00 00 00 00 00 00 00 0c 01 08 38 01:00.0 VGA compatible controller: ATI Technologies Inc 3D Rage IIC AGP (rev 7a) (prog-if 00 [VGA]) Subsystem: ATI Technologies Inc Rage 3D Pro AGP 2x XPERT 98 Flags: bus master, stepping, medium devsel, latency 64, IRQ 11 Memory at e2000000 (32-bit, prefetchable) [size=16M] I/O ports at c000 [size=256] Memory at e1000000 (32-bit, non-prefetchable) [size=4K] Capabilities: [5c] Power Management version 1 00: 02 10 5a 47 87 00 90 02 7a 00 00 03 08 40 00 00 10: 08 00 00 e2 01 c0 00 00 00 00 00 e1 00 00 00 00 20: 00 00 00 00 00 00 00 00 00 00 00 00 02 10 84 00 30: 00 00 00 00 5c 00 00 00 00 00 00 00 0b 01 08 00 From hadi@cyberus.ca Tue Dec 28 05:19:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 05:19:58 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSDJWG4029691 for ; Tue, 28 Dec 2004 05:19:52 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CjHHX-0005zE-8U for netdev@oss.sgi.com; Tue, 28 Dec 2004 08:20:59 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjHHV-0000J1-3w; Tue, 28 Dec 2004 08:20:57 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041227121658.GI7884@postel.suug.ch> References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104240053.1100.53.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 08:20:53 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13126 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Mon, 2004-12-27 at 07:16, Thomas Graf wrote: > > ChangeSet 1.2055.37.1, 2004/11/17 16:08:01-08:00, util@deuroconsult.ro > > > > [PKT_SCHED]: Allow using nfmark as key in U32 classifier. > > > > Signed-off-by: Catalin(ux aka Dino) BOIE > > Signed-off-by: David S. Miller > > I must have missed this one. This should have been implemented in the > metadata action module to make it available to all classifiers. You mean meta match. > We > should really stop to add more stuff to specific classifiers which have > to be removed once we have metamatch. I've made a proposal on paper > just need some more time to cook up a patch. I have cycles now. Are you working on this or should i invest some of my cycles in it? Lets fix this once and for all. 
cheers, jamal From hadi@cyberus.ca Tue Dec 28 05:26:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 05:26:59 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSDQVUY030316 for ; Tue, 28 Dec 2004 05:26:52 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjHOI-0003uN-Gb for netdev@oss.sgi.com; Tue, 28 Dec 2004 08:27:58 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjHOE-0001AS-SY; Tue, 28 Dec 2004 08:27:55 -0500 Subject: Re: [PATCH] PKT_SCHED: Fix cls indev validation From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041222142637.GE7884@postel.suug.ch> References: <20041219203050.GK17998@postel.suug.ch> <41C68CEF.3030803@trash.net> <1103552215.1048.333.camel@jzny.localdomain> <20041220200739.GX17998@postel.suug.ch> <41C7F833.4000909@trash.net> <20041222003142.GB7884@postel.suug.ch> <1103722366.1089.75.camel@jzny.localdomain> <20041222142637.GE7884@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104240471.1090.61.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 08:27:52 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13127 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-22 at 09:26, Thomas Graf wrote: > * jamal <1103722366.1089.75.camel@jzny.localdomain> 2004-12-22 08:32 > > On Tue, 2004-12-21 at 19:31, Thomas Graf wrote: > > > * Patrick McHardy <41C7F833.4000909@trash.net> 
2004-12-21 11:17
> > > > Could you make your patchset available somehow ?
> > >
> > > http://people.suug.ch/~tgr/patches/queue/
> > >
> > > Unfinished and untested.
> >
> > I just took a quick glimpse.
> >
> > 1)Recall: Policer will have to die at some point - only reason for its
> > existence is for backward compat.
> > New iproute2 code sooner than later stop using that interface so we can
> > kill it. I suspect we can kill it in a year or two and definitely the
> > day 2.7 comes out.
>
> I fully agree. The patchset looks like a beautification but it isn't.
> Its main purpose is to make changing consistent again.

Except there is a continuous stream of patches to cleanup cleanups ;->
That's where the cosmetics definition comes in. Let's just do it right
and get it over with.

> I tried
> achieving this with the existing API and the ifdef hell got horrible
> and ended up having over 60 lines of redundant code per classifier.
> I would rather be implementing new fancy stuff but fixing the existing
> issues comes first.
>
> > 2) The name tcf_attrs doesn't sound right - attributes are normally
> > data pieces not methods. Can't think of a good name.
>
> I feel the same, I was thinking of extensions but wasn't pleased either.
> Suggestions appreciated.
>
> > 3) What can i say? dang - this indev thing is getting out of control ;->
>
> Too late, it's there, we must maintain it ;->
>

I think it needs to be fixed. There's a clear bold warning that it would
die. We shouldn't keep building more walls and adding gardens around it.

>
> > I think its time we did this right than deferring.
>
> Indeed, what about this: I'll try and do a proposal on a new
> generic matching layer holding the action bits and providing
> backward compatibility to police/indev. We can then build the
> metadata match on top of it.

Ok, waiting to see this. Post it on the list.
cheers, jamal From hadi@cyberus.ca Tue Dec 28 05:30:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 05:31:03 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSDUZeP030916 for ; Tue, 28 Dec 2004 05:30:56 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjHSE-0007BV-Mv for netdev@oss.sgi.com; Tue, 28 Dec 2004 08:32:02 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjHSC-0001aE-Rm; Tue, 28 Dec 2004 08:32:01 -0500 Subject: Re: LLTX and netif_stop_queue From: jamal Reply-To: hadi@cyberus.ca To: Eric Lemoine Cc: Patrick McHardy , "David S. Miller" , roland@topspin.com, netdev@oss.sgi.com, openib-general@openib.org In-Reply-To: <5cac192f04122408102129af43@mail.gmail.com> References: <52llbwoaej.fsf@topspin.com> <20041217214432.07b7b21e.davem@davemloft.net> <1103484675.1050.158.camel@jzny.localdomain> <5cac192f04122210491d64d4b6@mail.gmail.com> <20041222202919.057b8331.davem@davemloft.net> <5cac192f0412230110628749e3@mail.gmail.com> <41CAF444.3000305@trash.net> <5cac192f04122408102129af43@mail.gmail.com> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104240717.1100.66.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 08:31:57 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13128 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-24 at 11:10, Eric Lemoine wrote: > Yes but requiring drivers to release a lock that they should not even > be aware of doesn't sound good. 
Another way would be to keep
> dev->queue_lock grabbed when entering start_xmit() and let the driver
> drop it (and re-acquire it before it returns) only if it wishes so.
> Although I don't like this too much either, that's the best way I can
> think of up to now...

I am not a big fan of that patch either, but I can't think of a cleaner
way to do it. The violation already happens with the LLTX flag. So maybe
a big warning that says "Do this only if your driver is LLTX enabled".
The other way to do it is to put in a check to see if LLTX is enabled
before releasing that lock - but why the extra cycles? Driver writer
should know.

cheers,
jamal

From tgraf@suug.ch Tue Dec 28 05:38:56 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 05:39:03 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSDcZ4p031572 for ; Tue, 28 Dec 2004 05:38:55 -0800
Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 26CBFF; Tue, 28 Dec 2004 14:39:41 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001) id DFE2D1C0EA; Tue, 28 Dec 2004 14:40:22 +0100 (CET)
Date: Tue, 28 Dec 2004 14:40:22 +0100
From: Thomas Graf
To: jamal
Cc: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
Message-ID: <20041228134022.GA32419@postel.suug.ch> References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104240053.1100.53.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13129 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104240053.1100.53.camel@jzny.localdomain> 2004-12-28 08:20 > On Mon, 2004-12-27 at 07:16, Thomas Graf wrote: > > > ChangeSet 1.2055.37.1, 2004/11/17 16:08:01-08:00, util@deuroconsult.ro > > > > > > [PKT_SCHED]: Allow using nfmark as key in U32 classifier. > > > > > > Signed-off-by: Catalin(ux aka Dino) BOIE > > > Signed-off-by: David S. Miller > > > > I must have missed this one. This should have been implemented in the > > metadata action module to make it available to all classifiers. > > You mean meta match. Yes, I was thinking about implementing it as action, any objections? > > We > > should really stop to add more stuff to specific classifiers which have > > to be removed once we have metamatch. I've made a proposal on paper > > just need some more time to cook up a patch. > > I have cycles now. Are you working on this or should i invest some of my > cycles in it? Lets fix this once and for all. I don't care, I'm still testing the patchset to make classifer changes consistent again. A few big changes to tcindex/route classifier need extensive testing. My thoughts on meta match: A match consists of a header, lvalue, rvalue and stats. The header contains the handle, the requested operand (eq,lt,gt) and a flag to invert the meaning of the match. 
A value consists of a type and data
where type specifies the actual metadata to be used. A few upper bits
of the type specify the kind, i.e. whether it's numeric or a
bytestring. Only values with matching upper bits can be compared.
Additionally a mask and shift operator can be configured. Why so
complicated? Comparing two kernel meta values can be useful, e.g.
realdev == indev when classifying on tunnels, comparing backlogs, etc.
Device matches should be made available as both numeric and bytestring
match to have a fastpath when ifindex is stable and a slower path which
is more flexible to changes.

From hadi@cyberus.ca Tue Dec 28 05:44:20 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 05:44:29 -0800 (PST)
Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSDhx4K032308 for ; Tue, 28 Dec 2004 05:44:20 -0800
Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CjHfC-0008Mu-Va for netdev@oss.sgi.com; Tue, 28 Dec 2004 08:45:27 -0500
Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjHf8-00036j-KV; Tue, 28 Dec 2004 08:45:22 -0500
Subject: Re: Lockup with 2.6.9-ac15 related to netconsole
From: jamal
Reply-To: hadi@cyberus.ca
To: Matt Mackall
Cc: Patrick McHardy , Francois Romieu , Mark Broadbent , linux-kernel@vger.kernel.org, netdev@oss.sgi.com
In-Reply-To: <20041222171836.GL5974@waste.org>
References: <52121.192.102.214.6.1103624620.squirrel@webmail.wetlettuce.com> <20041221123727.GA13606@electric-eye.fr.zoreil.com> <49295.192.102.214.6.1103635762.squirrel@webmail.wetlettuce.com> <20041221204853.GA20869@electric-eye.fr.zoreil.com> <20041221212737.GK5974@waste.org> <20041221225831.GA20910@electric-eye.fr.zoreil.com> <41C93FAB.9090708@trash.net> <41C9525F.4070805@trash.net> <20041222123940.GA4241@electric-eye.fr.zoreil.com>
<41C98B75.9020802@trash.net> <20041222171836.GL5974@waste.org> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104241519.1089.79.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 08:45:20 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13130 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-22 at 12:18, Matt Mackall wrote: > On Wed, Dec 22, 2004 at 03:57:57PM +0100, Patrick McHardy wrote: > > >Of course the patch is completely ugly and violates any layering > > >principle one could think of. It was not submitted for inclusion :o) > > > > Sure, but I think we should have a short-term workaround until > > a better solution has been invented. Maybe dropping the packets > > would be best for now, it only affects printks issued in paths > > starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing > > the packets might also cause reordering since not all packets > > are queued. > > When I mentioned queueing, I was thinking of a netpoll-private queue > that would be hooked to a softirq or some such so that it would be > pushed out as soon as possible. Dropping may be the better approach I think so - just junk those packets. 
cheers, jamal From hadi@cyberus.ca Tue Dec 28 05:58:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 05:59:11 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSDwcVt000656 for ; Tue, 28 Dec 2004 05:58:58 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjHtM-0005j4-4Q for netdev@oss.sgi.com; Tue, 28 Dec 2004 09:00:04 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjHtJ-0004lQ-3V; Tue, 28 Dec 2004 09:00:01 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041228134022.GA32419@postel.suug.ch> References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104242397.1090.94.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 08:59:57 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13131 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 08:40, Thomas Graf wrote: > * jamal <1104240053.1100.53.camel@jzny.localdomain> 2004-12-28 08:20 > > You mean meta match. > > Yes, I was thinking about implementing it as action, any objections? > If you implement the mothership match changes we discussed then it should go as a match (as opposed to action). 
As an action it's yet another deferral for later cleanup. So my
preference is to get the changes we discussed first, then this meta
match. I could whip out a meta action for setting values if you are
gonna work on the match piece.

> > > We
> > > should really stop to add more stuff to specific classifiers which have
> > > to be removed once we have metamatch. I've made a proposal on paper
> > > just need some more time to cook up a patch.
> >
> > I have cycles now. Are you working on this or should i invest some of my
> > cycles in it? Lets fix this once and for all.
>
> I don't care, I'm still testing the patchset to make classifier changes
> consistent again. A few big changes to tcindex/route classifier need
> extensive testing.
>

If you have started working on it I would prefer you make the changes.
BTW, I am talking about the top level match changes we discussed.

> My thoughts on meta match:
>
> A match consists of a header, lvalue, rvalue and stats. The header
> contains the handle, the requested operand (eq,lt,gt) and a flag to
> invert the meaning of the match. A value consists of a type and data
> where type specifies the actual metadata to be used. A few upper bits
> of the type specify the kind, i.e. whether it's numeric or a
> bytestring. Only values with matching upper bits can be compared.
> Additionally a mask and shift operator can be configured. Why so
> complicated? Comparing two kernel meta values can be useful, realdev
> == indev when classifying on tunnels, comparing backlogs, etc. Device
> matches should be made available as both numeric and bytestring match
> to have a fastpath when ifindex is stable and a slower path which is
> more flexible to changes.

Sounds reasonable at the high level. I am not sure I got the stats part.
Can you write out the BNF.
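The match layout quoted above (header with handle, operand and invert flag; two values whose type carries a kind in its upper bits; optional mask and shift) could be sketched in C roughly as follows. This is only an illustration of the description, not code from any posted patch: every name here (`meta_match`, `META_KIND_NUM`, the 4-bit kind field, the single `hits` counter standing in for "stats") is a made-up assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* All identifiers are hypothetical; they are not kernel or iproute2 names. */
#define META_KIND_SHIFT 12
#define META_KIND_MASK  (0xfu << META_KIND_SHIFT)
#define META_KIND_NUM   (0u << META_KIND_SHIFT)   /* numeric value */
#define META_KIND_STR   (1u << META_KIND_SHIFT)   /* bytestring value */

enum meta_op { META_EQ, META_LT, META_GT };

struct meta_hdr {
	uint32_t handle;
	enum meta_op op;   /* requested operand */
	bool invert;       /* invert the meaning of the match */
};

struct meta_value {
	uint16_t type;     /* upper bits: kind, lower bits: which metadata */
	uint32_t mask;     /* numeric only; 0 means "no mask" (assumption) */
	uint8_t shift;     /* numeric only; right shift applied after mask */
	union {
		uint32_t num;
		const char *str;
	} data;
};

struct meta_match {
	struct meta_hdr hdr;
	struct meta_value lvalue, rvalue;
	uint64_t hits;     /* assumed meaning of "stats": positive matches */
};

static uint32_t meta_num(const struct meta_value *v)
{
	uint32_t n = v->mask ? (v->data.num & v->mask) : v->data.num;
	return n >> v->shift;
}

/* Returns 1 on match, 0 on no match, -1 if the two kinds differ. */
static int meta_match_eval(struct meta_match *m)
{
	const struct meta_value *l = &m->lvalue, *r = &m->rvalue;
	int c;
	bool res;

	assert(m != NULL);
	if ((l->type & META_KIND_MASK) != (r->type & META_KIND_MASK))
		return -1;  /* only values with matching upper bits compare */

	if ((l->type & META_KIND_MASK) == META_KIND_STR) {
		c = strcmp(l->data.str, r->data.str);
		c = (c > 0) - (c < 0);
	} else {
		uint32_t a = meta_num(l), b = meta_num(r);
		c = (a > b) - (a < b);
	}

	switch (m->hdr.op) {
	case META_EQ: res = (c == 0); break;
	case META_LT: res = (c < 0);  break;
	case META_GT: res = (c > 0);  break;
	default:      return -1;
	}
	if (m->hdr.invert)
		res = !res;
	m->hits += res;
	return res;
}
```

Keeping the kind in the type's upper bits means the "can these two be compared" check is a single mask-and-compare before any data is touched, which is presumably the point of that part of the proposal.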
Here's what I was thinking for meta action:

tc action metaset [index ID] METAVARS
OPERATION := ADD|DEL|GET|DUMP
METAVARS := <[INDEV] [FWMARK] [TCINDEX] [PRIO] [CLASSID]>
INDEV := indev
FWMARK := fwmark
TCINDEX := tcindex
PRIO := prio
CLASSID := classid

I notice you throw in some mask and shift operator - not sure if I
could make use of it here.

cheers,
jamal

From hadi@znyx.com Tue Dec 28 06:03:49 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 06:03:56 -0800 (PST)
Received: from lotus.znyx.com (znx208-2-156-007.znyx.com [208.2.156.7]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSE3ScG001260 for ; Tue, 28 Dec 2004 06:03:49 -0800
Received: from [127.0.0.1] ([208.2.156.2]) by lotus.znyx.com (Lotus Domino Release 5.0.11) with ESMTP id 2004122806080184:104054 ; Tue, 28 Dec 2004 06:08:01 -0800
Subject: 2.4.x patchlet
From: Jamal Hadi Salim
Reply-To: hadi@znyx.com
To: "David S. Miller"
Cc: netdev@oss.sgi.com
Organization: ZNYX Networks
Message-Id: <1104242685.1090.97.camel@jzny.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2
Date: 28 Dec 2004 09:04:45 -0500
X-MIMETrack: Itemize by SMTP Server on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 12/28/2004 06:08:02 AM, Serialize by Router on Lotus/Znyx(Release 5.0.11 |July 24, 2002) at 12/28/2004 06:08:34 AM, Serialize complete at 12/28/2004 06:08:34 AM
Content-Type: multipart/mixed; boundary="=-gDswNaKcgwzQsr1L9zpL"
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13132
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@znyx.com
Precedence: bulk
X-list: netdev

--=-gDswNaKcgwzQsr1L9zpL
Content-Transfer-Encoding: 7bit
Content-Type: text/plain

Removes annoying compile of newer tools when done on 2.4.x

cheers,
jamal

--=-gDswNaKcgwzQsr1L9zpL
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=arp-eth-patchlet
Content-Type: text/plain; name=arp-eth-patchlet; charset=ISO-8859-1

--- 2429-pre3/include/linux/if_ether.h 2003-08-25 07:44:44.000000000 -0400
+++ 2429-pre3-bk1/include/linux/if_ether.h 2004-12-27 10:51:46.000000000 -0500
@@ -61,6 +61,8 @@
 #define ETH_P_IPV6 0x86DD /* IPv6 over bluebook */
 #define ETH_P_PPP_DISC 0x8863 /* PPPoE discovery messages */
 #define ETH_P_PPP_SES 0x8864 /* PPPoE session messages */
+#define ETH_P_MPLS_UC 0x8847 /* MPLS Unicast traffic */
+#define ETH_P_MPLS_MC 0x8848 /* MPLS Multicast traffic */
 #define ETH_P_ATMMPOA 0x884c /* MultiProtocol Over ATM */
 #define ETH_P_ATMFATE 0x8884 /* Frame-based ATM Transport
  * over Ethernet
--- 2429-pre3/include/linux/if_arp.h 2004-08-07 19:26:06.000000000 -0400
+++ 2429-pre3-bk1/include/linux/if_arp.h 2004-12-27 11:30:20.000000000 -0500
@@ -40,6 +40,7 @@
 #define ARPHRD_METRICOM 23 /* Metricom STRIP (new IANA id) */
 #define ARPHRD_IEEE1394 24 /* IEEE 1394 IPv4 - RFC 2734 */
 #define ARPHRD_EUI64 27 /* EUI-64 */
+#define ARPHRD_INFINIBAND 32 /* InfiniBand */
 
 /* Dummy types for non ARP hardware */
 #define ARPHRD_SLIP 256

--=-gDswNaKcgwzQsr1L9zpL--

From tgraf@suug.ch Tue Dec 28 06:49:05 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 06:49:12 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSEmia3003003 for ; Tue, 28 Dec 2004 06:49:05 -0800
Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E13C0F; Tue, 28 Dec 2004 15:49:51 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001) id BE7F11C0EA; Tue, 28 Dec 2004 15:50:33 +0100 (CET)
Date: Tue, 28 Dec 2004 15:50:33 +0100
From: Thomas Graf
To: jamal
Cc: "David S. Miller" , netdev@oss.sgi.com
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
Message-ID: <20041228145033.GB32419@postel.suug.ch>
References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1104242397.1090.94.camel@jzny.localdomain>
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13133
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

* jamal <1104242397.1090.94.camel@jzny.localdomain> 2004-12-28 08:59
> On Tue, 2004-12-28 at 08:40, Thomas Graf wrote:
> > * jamal <1104240053.1100.53.camel@jzny.localdomain> 2004-12-28 08:20
> > > > You mean meta match.
> >
> > Yes, I was thinking about implementing it as action, any objections?
> >
> >
>
> If you implement the mothership match changes we discussed then it
> should go as a match (as opposed to action). As an action its yet
> another deferral for later cleanup.

I'm getting more and more careful because we already suffer from
various limitations by design of underlying layers. I agree that the
best way would be to make it a generic match but we will end up
implementing logic expressions code for every layer over and over.

I have to think a little more about it, here's an up-to-date brain dump:
Classifier extensions should no longer be configured over classifier
specific TLV types but rather be part of a nested TLV. The extensions
should be changeable directly without going through the classifier
changing code, i.e. RTM_NEWFEXT/RTM_DELFEXT or something alike. It
should be possible to create logic relations between extensions like
match indev = "eth0" or (nfmark gt 10 or nfmark lt 4).
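
A minimal sketch of how a logic relation such as the one above could be
evaluated, assuming a small expression tree over packet metadata. Every
name here (struct pkt_meta, struct expr, expr_eval, example_match) is
hypothetical and chosen for illustration only; none of it is an existing
kernel or tc interface.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical packet metadata; field names are illustrative only. */
struct pkt_meta {
    const char *indev;   /* name of the input device */
    uint32_t nfmark;     /* netfilter mark */
};

enum expr_kind {
    EXPR_OR,             /* left || right */
    EXPR_AND,            /* left && right */
    EXPR_INDEV_EQ,       /* indev == name */
    EXPR_NFMARK_GT,      /* nfmark > value */
    EXPR_NFMARK_LT       /* nfmark < value */
};

/* One node of the expression tree (assumed layout). */
struct expr {
    enum expr_kind kind;
    const struct expr *left, *right; /* used by EXPR_OR / EXPR_AND */
    const char *name;                /* used by EXPR_INDEV_EQ */
    uint32_t value;                  /* used by EXPR_NFMARK_GT / _LT */
};

/* Recursively evaluate the tree; returns 1 on match, 0 otherwise. */
static int expr_eval(const struct expr *e, const struct pkt_meta *m)
{
    switch (e->kind) {
    case EXPR_OR:
        return expr_eval(e->left, m) || expr_eval(e->right, m);
    case EXPR_AND:
        return expr_eval(e->left, m) && expr_eval(e->right, m);
    case EXPR_INDEV_EQ:
        return strcmp(m->indev, e->name) == 0;
    case EXPR_NFMARK_GT:
        return m->nfmark > e->value;
    case EXPR_NFMARK_LT:
        return m->nfmark < e->value;
    }
    return 0;
}

/* Encodes: indev = "eth0" or (nfmark gt 10 or nfmark lt 4) */
static int example_match(const struct pkt_meta *m)
{
    static const struct expr gt = { .kind = EXPR_NFMARK_GT, .value = 10 };
    static const struct expr lt = { .kind = EXPR_NFMARK_LT, .value = 4 };
    static const struct expr marks = { .kind = EXPR_OR, .left = &gt, .right = &lt };
    static const struct expr dev = { .kind = EXPR_INDEV_EQ, .name = "eth0" };
    static const struct expr top = { .kind = EXPR_OR, .left = &dev, .right = &marks };

    return expr_eval(&top, m);
}
```

With this layout a packet arriving on eth0 matches regardless of its
mark, while a packet on another device matches only if its nfmark is
above 10 or below 4.
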
Doesn't sound too bad but we're actually just implementing things
on top of classifiers that should actually be on the same level.

> So my preference is to get the changes we discussed then this meta
> match.
> I could whip out meta action for setting values if you are gonna work
> on the match piece.

I was thinking of combining these by simply introducing an ASSIGN
operator so we don't have redundant code. We could make a generic
metadata api so netfilter could make use of it.

> Sounds reasonable at the high level. I am not sure i got the stats part.

Simple hit/success counters per match to be returned as separate TLV.

> Can you write out the BNF. Heres what i was thinking for meta action:

tc meta [NOT] VALUE OPERATOR VALUE
VALUE ::= { METAVARS | number | pattern }
OPERATOR ::= { = | > | < | assign }

where: typeof(METAVAR) for every value pair must be equal

From alan@lxorguk.ukuu.org.uk Tue Dec 28 07:30:18 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 07:30:24 -0800 (PST)
Received: from localhost.localdomain (clock-tower.bc.nu [81.2.110.250] (may be forged)) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSFTvjH004937 for ; Tue, 28 Dec 2004 07:30:18 -0800
Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.12.11/8.12.11) with ESMTP id iBSERBX8024769; Tue, 28 Dec 2004 14:27:13 GMT
Received: (from alan@localhost) by localhost.localdomain (8.12.11/8.12.11/Submit) id iBSER9L7024768; Tue, 28 Dec 2004 14:27:09 GMT
X-Authentication-Warning: localhost.localdomain: alan set sender to alan@lxorguk.ukuu.org.uk using -f
Subject: Re: [2.6 patch] /net/ax25/: some cleanups
From: Alan Cox
To: "David S.
Miller" Cc: Adrian Bunk , ralf@linux-mips.org, linux-hams@vger.kernel.org, netdev@oss.sgi.com, Linux Kernel Mailing List In-Reply-To: <20041227185151.2a7ceb71.davem@davemloft.net> References: <20041212211339.GX22324@stusta.de> <20041227185151.2a7ceb71.davem@davemloft.net> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1104237408.20944.70.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Tue, 28 Dec 2004 14:27:08 +0000 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13134 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Maw, 2004-12-28 at 02:51, David S. Miller wrote: > On Sun, 12 Dec 2004 22:13:39 +0100 > Adrian Bunk wrote: > > > The patch below contains the following cleanups: > > - make two needlessly global functions static > > - net/ax25/ax25_addr.c: remove the unused global function ax25digicmp Dave this function is only unused because a patch in 2.6.10 broke AX.25 protocol support by removing the device and path checks. AX.25 is a link layer protocol it is supposed to check the devices and arguably the path. 
From hadi@cyberus.ca Tue Dec 28 07:32:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 07:32:52 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSFWPHd005455 for ; Tue, 28 Dec 2004 07:32:46 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CjJM9-0001PA-EN for netdev@oss.sgi.com; Tue, 28 Dec 2004 10:33:53 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjJM7-0007yG-9L; Tue, 28 Dec 2004 10:33:51 -0500 Subject: patchlet From: jamal Reply-To: hadi@cyberus.ca To: James Morris Cc: netdev@oss.sgi.com Content-Type: multipart/mixed; boundary="=-6f9Mwp3iMWKucXOWo7fM" Organization: jamalopolous Message-Id: <1104248027.1100.110.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 10:33:48 -0500 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13135 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-6f9Mwp3iMWKucXOWo7fM Content-Type: text/plain Content-Transfer-Encoding: 7bit James, Attached a small patchlet that looked obvious - untested. 
cheers, jamal --=-6f9Mwp3iMWKucXOWo7fM Content-Disposition: attachment; filename=secp Content-Type: text/plain; name=secp; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- 2610-bk1/security/selinux/nlmsgtab.c 2004/12/28 04:01:14 1.1 +++ 2610-bk1/security/selinux/nlmsgtab.c 2004/12/28 04:05:39 @@ -56,6 +56,9 @@ { RTM_NEWTFILTER, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_DELTFILTER, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETTFILTER, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_NEWACTION, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, + { RTM_DELACTION, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, + { RTM_GETACTION, NETLINK_ROUTE_SOCKET__NLMSG_READ }, { RTM_NEWPREFIX, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETPREFIX, NETLINK_ROUTE_SOCKET__NLMSG_READ }, { RTM_GETMULTICAST, NETLINK_ROUTE_SOCKET__NLMSG_READ }, --=-6f9Mwp3iMWKucXOWo7fM-- From hadi@cyberus.ca Tue Dec 28 07:54:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 07:54:51 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSFsNcU006842 for ; Tue, 28 Dec 2004 07:54:44 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjJhP-0002H2-0n for netdev@oss.sgi.com; Tue, 28 Dec 2004 10:55:51 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjJhH-0002Ne-3y; Tue, 28 Dec 2004 10:55:43 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041228145033.GB32419@postel.suug.ch> References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228145033.GB32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104249339.1090.134.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 10:55:39 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13136 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 09:50, Thomas Graf wrote: > * jamal <1104242397.1090.94.camel@jzny.localdomain> 2004-12-28 08:59 > > If you implement the mothership match changes we discussed then it > > should go as a match (as opposed to action). As an action its yet > > another deferal for later cleanup. > > I'm getting more and more careful because we already suffer from > various limitations by design of underlying layers. I agree that the > best way would be to make it a generic match but we will end up > implementing logic expressions code for every layer over and over. Thats the point of this discussion ;-> We need to figure a way to do it with minimal damage ;-> I actually dont think its such a bad idea to change all classifiers this once as long as the backward compat applies. > I have to think a little more about it, here's an up-to-date brain dump: > Classifier extensions should no longer be configured over classifier > specific TLV types but rather be part of a nested TLV. The extesions > should be changeable directly without going through the classifier > changing code, i.e. 
RTM_NEWFEXT/RTM_DELFEXT or something alike.

Sensible. Also enforce that it gets configured when configuring the
filter - this way binding can happen at installation and not lookup time.
Now we need to find a clever way to do little coding .. I will give it
some thinking in the background. Clearly this has to be interleaved with
filtering as opposed to being at the end of filtering as actions are.

> It
> should be possible to create logic relations between extensions like
> match indev = "eth0" or (nfmark gt 10 or nfmark lt 4).
>

Ok, so several things from your text in requirements/desires department:
- Reusing existing classifiers is very valuable at that extension level.
  We actually could do it today with continue/reclassify except it will
  be faster if you could just point to the next classifier rule from the
  current classification. Not sure if this makes sense..
- The idea of continue/reclassify is also valuable for match extensions.
- match extensions desire a minimalist API. Need a "write one page of
  code" approach ..
- I like the logical relationships you have.

> Doesn't sound too bad but we're actually just implementing things
> on top of classifiers that should actually be on the same level.
>

Not in the case of the 2 pages of code for an extra match.

> > So my preference is to get the changes we discussed then this meta
> > match.
> > I could whip out meta action for setting values if you are gonna work
> > on the match piece.
>
> I was thinking of combining these by simply introducing an ASSIGN
> operator so we don't have redundant code.

That is still a hack.

> We could make a generic
> metadata api so netfilter could make use of it.
>

I can see tc using netfilter but not the other way round without a lot
of complexity.

> > Sounds reasonable at the high level. I am not sure i got the stats part.
>
> Simple hit/success counters per match to be returned as separate TLV.
>

Yes, but how do you use them to match?

> > Can you write out the BNF.
Heres what i was thinking for meta action: > > tc meta [NOT] VALUE OPERATOR VALUE hopefully within a filter since these are not really standalone. as in filter ... extmatch meta ... > VALUE ::= { METAVARS | number | pattern } > OPERATOR ::= { = | > | < | assign } > > where: typeof(METAVAR) for every value pair must be equal so how would tcindex fit here? cheers, jamal From tgraf@suug.ch Tue Dec 28 08:09:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 08:09:55 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSG9SmK007760 for ; Tue, 28 Dec 2004 08:09:48 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id DBE3584; Tue, 28 Dec 2004 17:10:34 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 229391C0EA; Tue, 28 Dec 2004 17:11:18 +0100 (CET) Date: Tue, 28 Dec 2004 17:11:17 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 
Message-ID: <20041228161117.GD32419@postel.suug.ch>
References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1104242397.1090.94.camel@jzny.localdomain>
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13137
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

> If you implement the mothership match changes we discussed then it
> should go as a match (as opposed to action). As an action its yet
> another deferral for later cleanup.

What about this...

We introduce a new API tcf_exts which holds all the things on top of
a classifier. There might be several instances per classifier, i.e. one
per filter for u32 and fw and one per hash bucket for route/tcindex,
etc.

The classifier no longer knows about action/police/meta/... you name it
but rather forwards the TLV TCA_..._EXTS to the extensions API. Backward
compatibility is provided in the API (very simple).

The extension infrastructure builds a tree to implement logic
expressions where each node can be one of the supported types.

The action code would be transformed to an extension which would
mean that there could be multiple action chains. Example:

cls -+ meta indev=ppp0
     |   \ meta assign nfmark=2
     |      \ action:gact redirect
     |
     \ meta nfmark=0x20
        \ action:mirred

Every extension node has a unique handle in the namespace of the tree.
Deletion of a node results in deletion of all sub nodes.

Configuration must go via change routine of classifier or via tp->get()
and some generic way to retrieve extension handle from classifier.

Thoughts?
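
A minimal sketch of the tree semantics described above (unique handle per
node, deleting a node also deletes all of its sub nodes), assuming a plain
first-child/next-sibling layout. The names struct ext_node, ext_add,
ext_free_subtree and ext_delete are hypothetical and are not part of any
proposed tcf_exts interface.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical extension node: first-child / next-sibling tree. */
struct ext_node {
    uint32_t handle;          /* unique within the tree */
    const char *desc;         /* e.g. "meta indev=ppp0" */
    struct ext_node *child;   /* first sub node */
    struct ext_node *sibling; /* next node on the same level */
};

/* Allocate a node and link it below parent (parent may be NULL for the root). */
static struct ext_node *ext_add(struct ext_node *parent, uint32_t handle,
                                const char *desc)
{
    struct ext_node *n = calloc(1, sizeof(*n));

    if (!n)
        return NULL;
    n->handle = handle;
    n->desc = desc;
    if (parent) {
        n->sibling = parent->child;
        parent->child = n;
    }
    return n;
}

/* Free n and everything below it; returns the number of nodes freed. */
static unsigned int ext_free_subtree(struct ext_node *n)
{
    unsigned int freed = 1;
    struct ext_node *c = n->child;

    while (c) {
        struct ext_node *next = c->sibling;

        freed += ext_free_subtree(c);
        c = next;
    }
    free(n);
    return freed;
}

/*
 * Find the node with the given handle somewhere below root, unlink it
 * and delete it together with all its sub nodes.
 * Returns the number of nodes freed, or 0 if the handle was not found.
 */
static unsigned int ext_delete(struct ext_node *root, uint32_t handle)
{
    struct ext_node **pp;

    for (pp = &root->child; *pp; pp = &(*pp)->sibling) {
        struct ext_node *n = *pp;
        unsigned int freed;

        if (n->handle == handle) {
            *pp = n->sibling;          /* unlink before freeing */
            return ext_free_subtree(n);
        }
        freed = ext_delete(n, handle); /* search below this node */
        if (freed)
            return freed;
    }
    return 0;
}
```

Building the example tree and deleting the "meta indev=ppp0" node would
free that node, the assign node and the gact node, leaving only the
"meta nfmark=0x20" chain below the classifier.
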
From hadi@cyberus.ca Tue Dec 28 08:35:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 08:36:03 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSGZbCg012350 for ; Tue, 28 Dec 2004 08:35:57 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjKLI-0008LR-Lm for netdev@oss.sgi.com; Tue, 28 Dec 2004 11:37:04 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjKLF-00080y-Hl; Tue, 28 Dec 2004 11:37:01 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041228161117.GD32419@postel.suug.ch> References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104251817.1090.164.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 11:36:58 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13138 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 11:11, Thomas Graf wrote: > > If you implement the mothership match changes we discussed then it > > should go as a match (as opposed to action). As an action its yet > > another deferal for later cleanup. > > What about this... 
>
> We introduce a new API tcf_exts which holds all the things on top of
> a classifier. There might be several instances per classifier, i.e. one
> per filter for u32 and fw and one per hash bucket for route/tcindex,
> etc.
>
> The classifier no longer knows about action/police/meta/... you name it
> but rather forwards the TLV TCA_..._EXTS to the extensions API. Backward
> compatibility is provided in the API (very simple).
>

I think this sounds cleaner but is major surgery.
The other issue i have with it is semantically i see classification
followed by actions. Classification may have extended classification
within it. Actions may also have extended actions within them.
The majority of the surgery is going to be in ensuring that you can mix
and match actions, extended filters and extended actions.

> The extension infrastructure builds a tree to implement logic
> expressions where each node can be one of the supported types.
>
> The action code would be transformed to an extension which would
> mean that there could be multiple action chains. Example:
>
> cls -+ meta indev=ppp0
>      |   \ meta assign nfmark=2
>      |      \ action:gact redirect
>      |
>      \ meta nfmark=0x20
>         \ action:mirred
>
> Every extension node has a unique handle in the namespace of the tree.
> Deletion of a node results in deletion of all sub nodes.
>
> Configuration must go via change routine of classifier or via tp->get()
> and some generic way to retrieve extension handle from classifier.
>
> Thoughts?

Note my concerns above - we are talking major splicing!

Heres another approach:
The classifier is blind while executing those actions.
Data that needs to be embedded within the classifier is:
struct {extmatch type:extmatch void_data}.
extmatch_classify(extmatchdatastruct,skb) is a generic call which does a
lookup on the type and calls the proper callback. Callbacks return
standard classifier ret codes.
So an indev matcher will take the skb and compare against the indev data
stored in struct->void_data.
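
A minimal sketch of the type-indexed dispatch described above, assuming a
small fixed table of callbacks. Only the name extmatch_classify comes from
the text; struct sk_buff_stub, struct extmatch, extmatch_register,
EXTMATCH_INDEV and the 1/0/-1 return values are hypothetical stand-ins
(the real callbacks would return the standard classifier codes and take a
struct sk_buff).

```c
#include <stddef.h>
#include <string.h>

#define EXTMATCH_MAX   8   /* assumed upper bound on matcher types */
#define EXTMATCH_INDEV 1   /* hypothetical type id for the indev matcher */

/* Hypothetical stand-in for the skb; only the field needed here. */
struct sk_buff_stub {
    const char *indev;
};

/* Data embedded in the classifier: a type plus opaque per-type data. */
struct extmatch {
    int type;
    void *data;
};

typedef int (*extmatch_fn)(const struct extmatch *em,
                           const struct sk_buff_stub *skb);

static extmatch_fn extmatch_ops[EXTMATCH_MAX];

/* Register a callback for a type; returns 0 on success, -1 on error. */
static int extmatch_register(int type, extmatch_fn fn)
{
    if (type < 0 || type >= EXTMATCH_MAX || fn == NULL || extmatch_ops[type])
        return -1;
    extmatch_ops[type] = fn;
    return 0;
}

/*
 * Generic entry point: look up the callback by type and call it.
 * Returns the callback's result, or -1 if no matcher is registered.
 */
static int extmatch_classify(const struct extmatch *em,
                             const struct sk_buff_stub *skb)
{
    if (em->type < 0 || em->type >= EXTMATCH_MAX || !extmatch_ops[em->type])
        return -1;
    return extmatch_ops[em->type](em, skb);
}

/* indev matcher: compare the skb's input device with the stored name. */
static int indev_match(const struct extmatch *em,
                       const struct sk_buff_stub *skb)
{
    return strcmp(skb->indev, (const char *)em->data) == 0;
}
```

The classifier itself only stores the struct extmatch and calls
extmatch_classify(); it never needs to know which matcher sits behind
the type.
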
Only other call i can see needed is a registration function. extended
matchers register a callback and type.
user space stuff is easy.
Now with above i dont see how to fit your logical expressions - but its
a simple change and fits the requirement of writing the one page
extended matcher. The same thought could be extended to actions.
Sounds too easy unless i am intoxicated with the double-doubles i have
been consuming last few hours;->

thoughts?

cheers,
jamal

From hadi@cyberus.ca Tue Dec 28 08:50:53 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 08:51:00 -0800 (PST)
Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSGoXYb013144 for ; Tue, 28 Dec 2004 08:50:53 -0800
Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CjKZi-0005xg-7F for netdev@oss.sgi.com; Tue, 28 Dec 2004 11:51:58 -0500
Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjKZd-0001oD-Ka; Tue, 28 Dec 2004 11:51:53 -0500
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
From: jamal
Reply-To: hadi@cyberus.ca
To: Thomas Graf
Cc: "David S.
Miller" , netdev@oss.sgi.com
In-Reply-To: <1104251817.1090.164.camel@jzny.localdomain>
References: <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain>
Content-Type: text/plain
Organization: jamalopolous
Message-Id: <1104252710.1090.171.camel@jzny.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2
Date: 28 Dec 2004 11:51:50 -0500
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13139
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

On Tue, 2004-12-28 at 11:36, jamal wrote:

I have to go out for a few hours, for completeness sake:

- initialization happens like you said with the extended matches TLV,
which results in building one of those extended match datastructs bound
to the filter. The binding part is easy, the hard part is how you
interleave u32 matches for example vs indev.

** Also i see your point that changing all the classifiers is painful,
but doing it this once so the next written classifier is easy is worth
the effort in my opinion.

cheers,
jamal

> Heres another approach:
> The classifier is blind while executing those actions.
> Data that needs to be embedded within the classifier is:
> struct {extmatch type:extmatch void_data}.
> extmatch_classify(extmatchdatastruct,skb) is a generic call which does a
> lookup on the type and calls the proper callback. Callbacks return
> standard classifier ret codes.
> So an indev matcher will take the skb and compare against the indev data
> stored in struct->void_data.
> Only other call i can see needed is a registration function.
extended > matchers register a callback and type. > user space stuff is easy. > Now with above i dont see how to fit your logical experssions - but its > a simple change and fits the requirement of writting the one page > extended matcher. The same thought could be extended to actions. > Sounds too easy unless i am intoxicated with the double-doubles i have > been conmsuming last few hours;-> > > thoughts? > > cheers, > jamal > > > > > From davem@davemloft.net Tue Dec 28 10:08:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 10:08:57 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSI8T4h015547 for ; Tue, 28 Dec 2004 10:08:50 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CjLiW-0004VQ-00; Tue, 28 Dec 2004 10:05:08 -0800 Date: Tue, 28 Dec 2004 10:05:07 -0800 From: "David S. 
Miller" To: Alan Cox Cc: bunk@stusta.de, ralf@linux-mips.org, linux-hams@vger.kernel.org, netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [2.6 patch] /net/ax25/: some cleanups Message-Id: <20041228100507.7b374b5e.davem@davemloft.net> In-Reply-To: <1104237408.20944.70.camel@localhost.localdomain> References: <20041212211339.GX22324@stusta.de> <20041227185151.2a7ceb71.davem@davemloft.net> <1104237408.20944.70.camel@localhost.localdomain> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13140 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 28 Dec 2004 14:27:08 +0000 Alan Cox wrote: > On Maw, 2004-12-28 at 02:51, David S. Miller wrote: > > On Sun, 12 Dec 2004 22:13:39 +0100 > > Adrian Bunk wrote: > > > > > The patch below contains the following cleanups: > > > - make two needlessly global functions static > > > - net/ax25/ax25_addr.c: remove the unused global function ax25digicmp > > Dave this function is only unused because a patch in 2.6.10 broke AX.25 > protocol support by removing the device and path checks. AX.25 is a link > layer protocol it is supposed to check the devices and arguably the > path. Send a patch to netdev and CC: me to fix this as is standard procedure for getting changes into the networking. 
:-) From acme@conectiva.com.br Tue Dec 28 10:35:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 10:35:29 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSIYrO6016655 for ; Tue, 28 Dec 2004 10:35:14 -0800 Received: from [200.163.203.158] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CjMEu-0000ll-00; Tue, 28 Dec 2004 16:38:36 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 2BC3F74B84; Tue, 28 Dec 2004 16:36:09 -0200 (BRST) Message-ID: <41D1A876.9050609@conectiva.com.br> Date: Tue, 28 Dec 2004 16:39:50 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. Miller" Cc: netdev@oss.sgi.com Subject: [PATCH][TCP] merge tcp_sock with tcp_opt Content-Type: multipart/mixed; boundary="------------010100030105070906050706" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13141 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------010100030105070906050706 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Hi David, Here is the tcp_sock one, please consider pulling from: bk://kernel.bkbits.net/acme/connection_sock-2.6 There are some cases where both sock and tcp_sock pointers are passed to functions, as one can be obtained from the other quite easily, a following patch will leave those functions receiving only a struct sock pointer. 
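
A minimal sketch of why one pointer can be obtained from the other,
assuming heavily stripped-down stand-ins for the kernel structures. The
field lists are hypothetical, and tcp_to_sk and tcp_header_len_of are
illustrative helpers that do not appear in the patch; the tcp_sk cast
mirrors the one the patch introduces.

```c
#include <stddef.h>

/* Stripped-down stand-ins; the real structures have many more fields. */
struct sock {
    int sk_state;
};

struct inet_sock {
    struct sock sk;            /* must be the first member */
    unsigned short sport;
};

struct tcp_sock {
    struct inet_sock inet;     /* must be the first member */
    int tcp_header_len;
};

/* The casts below are only valid while these offsets stay zero. */
_Static_assert(offsetof(struct inet_sock, sk) == 0, "sk must come first");
_Static_assert(offsetof(struct tcp_sock, inet) == 0, "inet must come first");

/* struct sock * -> struct tcp_sock *: same address, wider type. */
static inline struct tcp_sock *tcp_sk(const struct sock *sk)
{
    return (struct tcp_sock *)sk;
}

/* struct tcp_sock * -> struct sock * (hypothetical helper). */
static inline struct sock *tcp_to_sk(struct tcp_sock *tp)
{
    return &tp->inet.sk;
}

/* A function that needs TCP state can therefore take only the sock. */
static int tcp_header_len_of(const struct sock *sk)
{
    return tcp_sk(sk)->tcp_header_len;
}
```

Because the embedded structures sit at offset zero, both conversions are
free at run time, which is what makes passing only a struct sock pointer
practical.
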
Regards, - Arnaldo --------------010100030105070906050706 Content-Type: text/plain; name="tcp_sock.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="tcp_sock.patch" =================================================================== ChangeSet@1.2195, 2004-12-28 16:22:56-02:00, acme@conectiva.com.br [TCP] merge tcp_sock with tcp_opt No need for two structs, follow the new inet_sock layout style. Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller include/linux/ipv6.h | 3 include/linux/tcp.h | 14 -- include/net/tcp.h | 112 ++++++++++----------- include/net/tcp_ecn.h | 37 ++----- net/ipv4/ip_sockglue.c | 2 net/ipv4/syncookies.c | 6 - net/ipv4/tcp.c | 48 ++++----- net/ipv4/tcp_diag.c | 4 net/ipv4/tcp_input.c | 240 +++++++++++++++++++++++------------------------ net/ipv4/tcp_ipv4.c | 30 ++--- net/ipv4/tcp_minisocks.c | 12 +- net/ipv4/tcp_output.c | 63 ++++++------ net/ipv4/tcp_timer.c | 20 +-- net/ipv6/ipv6_sockglue.c | 4 net/ipv6/tcp_ipv6.c | 28 ++--- net/sunrpc/svcsock.c | 2 net/sunrpc/xprt.c | 3 17 files changed, 308 insertions(+), 320 deletions(-) diff -Nru a/include/linux/ipv6.h b/include/linux/ipv6.h --- a/include/linux/ipv6.h 2004-12-28 16:36:54 -02:00 +++ b/include/linux/ipv6.h 2004-12-28 16:36:54 -02:00 @@ -268,8 +268,7 @@ }; struct tcp6_sock { - struct inet_sock inet; - struct tcp_opt tcp; + struct tcp_sock tcp; struct ipv6_pinfo inet6; }; diff -Nru a/include/linux/tcp.h b/include/linux/tcp.h --- a/include/linux/tcp.h 2004-12-28 16:36:53 -02:00 +++ b/include/linux/tcp.h 2004-12-28 16:36:53 -02:00 @@ -214,7 +214,9 @@ TCP_BIC, }; -struct tcp_opt { +struct tcp_sock { + /* inet_sock has to be the first member of tcp_sock */ + struct inet_sock inet; int tcp_header_len; /* Bytes of tcp header to send */ /* @@ -438,15 +440,9 @@ } bictcp; }; -/* WARNING: don't change the layout of the members in tcp_sock! 
*/ -struct tcp_sock { - struct inet_sock inet; - struct tcp_opt tcp; -}; - -static inline struct tcp_opt * tcp_sk(const struct sock *__sk) +static inline struct tcp_sock *tcp_sk(const struct sock *sk) { - return &((struct tcp_sock *)__sk)->tcp; + return (struct tcp_sock *)sk; } #endif diff -Nru a/include/net/tcp.h b/include/net/tcp.h --- a/include/net/tcp.h 2004-12-28 16:36:53 -02:00 +++ b/include/net/tcp.h 2004-12-28 16:36:53 -02:00 @@ -808,17 +808,17 @@ TCP_ACK_PUSHED= 4 }; -static inline void tcp_schedule_ack(struct tcp_opt *tp) +static inline void tcp_schedule_ack(struct tcp_sock *tp) { tp->ack.pending |= TCP_ACK_SCHED; } -static inline int tcp_ack_scheduled(struct tcp_opt *tp) +static inline int tcp_ack_scheduled(struct tcp_sock *tp) { return tp->ack.pending&TCP_ACK_SCHED; } -static __inline__ void tcp_dec_quickack_mode(struct tcp_opt *tp) +static __inline__ void tcp_dec_quickack_mode(struct tcp_sock *tp) { if (tp->ack.quick && --tp->ack.quick == 0) { /* Leaving quickack mode we deflate ATO. 
*/ @@ -826,14 +826,14 @@ } } -extern void tcp_enter_quickack_mode(struct tcp_opt *tp); +extern void tcp_enter_quickack_mode(struct tcp_sock *tp); -static __inline__ void tcp_delack_init(struct tcp_opt *tp) +static __inline__ void tcp_delack_init(struct tcp_sock *tp) { memset(&tp->ack, 0, sizeof(tp->ack)); } -static inline void tcp_clear_options(struct tcp_opt *tp) +static inline void tcp_clear_options(struct tcp_sock *tp) { tp->tstamp_ok = tp->sack_ok = tp->wscale_ok = tp->snd_wscale = 0; } @@ -860,7 +860,7 @@ struct sk_buff *skb); extern void tcp_enter_frto(struct sock *sk); extern void tcp_enter_loss(struct sock *sk, int how); -extern void tcp_clear_retrans(struct tcp_opt *tp); +extern void tcp_clear_retrans(struct tcp_sock *tp); extern void tcp_update_metrics(struct sock *sk); extern void tcp_close(struct sock *sk, @@ -884,7 +884,7 @@ extern int tcp_listen_start(struct sock *sk); extern void tcp_parse_options(struct sk_buff *skb, - struct tcp_opt *tp, + struct tcp_sock *tp, int estab); /* @@ -980,7 +980,7 @@ static inline void tcp_clear_xmit_timer(struct sock *sk, int what) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); switch (what) { case TCP_TIME_RETRANS: @@ -1013,7 +1013,7 @@ */ static inline void tcp_reset_xmit_timer(struct sock *sk, int what, unsigned long when) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (when > TCP_RTO_MAX) { #ifdef TCP_DEBUG @@ -1053,7 +1053,7 @@ static inline void tcp_initialize_rcv_mss(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); unsigned int hint = min(tp->advmss, tp->mss_cache_std); hint = min(hint, tp->rcv_wnd/2); @@ -1063,19 +1063,19 @@ tp->ack.rcv_mss = hint; } -static __inline__ void __tcp_fast_path_on(struct tcp_opt *tp, u32 snd_wnd) +static __inline__ void __tcp_fast_path_on(struct tcp_sock *tp, u32 snd_wnd) { tp->pred_flags = htonl((tp->tcp_header_len << 26) | ntohl(TCP_FLAG_ACK) | snd_wnd); } -static __inline__ void 
tcp_fast_path_on(struct tcp_opt *tp) +static __inline__ void tcp_fast_path_on(struct tcp_sock *tp) { __tcp_fast_path_on(tp, tp->snd_wnd>>tp->snd_wscale); } -static inline void tcp_fast_path_check(struct sock *sk, struct tcp_opt *tp) +static inline void tcp_fast_path_check(struct sock *sk, struct tcp_sock *tp) { if (skb_queue_len(&tp->out_of_order_queue) == 0 && tp->rcv_wnd && @@ -1088,7 +1088,7 @@ * Rcv_nxt can be after the window if our peer push more data * than the offered window. */ -static __inline__ u32 tcp_receive_window(const struct tcp_opt *tp) +static __inline__ u32 tcp_receive_window(const struct tcp_sock *tp) { s32 win = tp->rcv_wup + tp->rcv_wnd - tp->rcv_nxt; @@ -1220,7 +1220,7 @@ } static inline void tcp_packets_out_inc(struct sock *sk, - struct tcp_opt *tp, + struct tcp_sock *tp, const struct sk_buff *skb) { int orig = tcp_get_pcount(&tp->packets_out); @@ -1230,7 +1230,7 @@ tcp_reset_xmit_timer(sk, TCP_TIME_RETRANS, tp->rto); } -static inline void tcp_packets_out_dec(struct tcp_opt *tp, +static inline void tcp_packets_out_dec(struct tcp_sock *tp, const struct sk_buff *skb) { tcp_dec_pcount(&tp->packets_out, skb); @@ -1250,7 +1250,7 @@ * "Packets left network, but not honestly ACKed yet" PLUS * "Packets fast retransmitted" */ -static __inline__ unsigned int tcp_packets_in_flight(const struct tcp_opt *tp) +static __inline__ unsigned int tcp_packets_in_flight(const struct tcp_sock *tp) { return (tcp_get_pcount(&tp->packets_out) - tcp_get_pcount(&tp->left_out) + @@ -1274,7 +1274,7 @@ * behave like Reno until low_window is reached, * then increase congestion window slowly */ -static inline __u32 tcp_recalc_ssthresh(struct tcp_opt *tp) +static inline __u32 tcp_recalc_ssthresh(struct tcp_sock *tp) { if (tcp_is_bic(tp)) { if (sysctl_tcp_bic_fast_convergence && @@ -1296,7 +1296,7 @@ /* Stop taking Vegas samples for now. 
*/ #define tcp_vegas_disable(__tp) ((__tp)->vegas.doing_vegas_now = 0) -static inline void tcp_vegas_enable(struct tcp_opt *tp) +static inline void tcp_vegas_enable(struct tcp_sock *tp) { /* There are several situations when we must "re-start" Vegas: * @@ -1328,9 +1328,9 @@ /* Should we be taking Vegas samples right now? */ #define tcp_vegas_enabled(__tp) ((__tp)->vegas.doing_vegas_now) -extern void tcp_ca_init(struct tcp_opt *tp); +extern void tcp_ca_init(struct tcp_sock *tp); -static inline void tcp_set_ca_state(struct tcp_opt *tp, u8 ca_state) +static inline void tcp_set_ca_state(struct tcp_sock *tp, u8 ca_state) { if (tcp_is_vegas(tp)) { if (ca_state == TCP_CA_Open) @@ -1345,7 +1345,7 @@ * The exception is rate halving phase, when cwnd is decreasing towards * ssthresh. */ -static inline __u32 tcp_current_ssthresh(struct tcp_opt *tp) +static inline __u32 tcp_current_ssthresh(struct tcp_sock *tp) { if ((1<<tp->ca_state)&(TCPF_CA_CWR|TCPF_CA_Recovery)) return tp->snd_ssthresh; @@ -1355,7 +1355,7 @@ (tp->snd_cwnd >> 2))); } -static inline void tcp_sync_left_out(struct tcp_opt *tp) +static inline void tcp_sync_left_out(struct tcp_sock *tp) { if (tp->sack_ok && (tcp_get_pcount(&tp->sacked_out) >= @@ -1372,7 +1372,7 @@ /* Congestion window validation. 
(RFC2861) */ -static inline void tcp_cwnd_validate(struct sock *sk, struct tcp_opt *tp) +static inline void tcp_cwnd_validate(struct sock *sk, struct tcp_sock *tp) { __u32 packets_out = tcp_get_pcount(&tp->packets_out); @@ -1391,7 +1391,7 @@ } /* Set slow start threshould and cwnd not falling to slow start */ -static inline void __tcp_enter_cwr(struct tcp_opt *tp) +static inline void __tcp_enter_cwr(struct tcp_sock *tp) { tp->undo_marker = 0; tp->snd_ssthresh = tcp_recalc_ssthresh(tp); @@ -1403,7 +1403,7 @@ TCP_ECN_queue_cwr(tp); } -static inline void tcp_enter_cwr(struct tcp_opt *tp) +static inline void tcp_enter_cwr(struct tcp_sock *tp) { tp->prior_ssthresh = 0; if (tp->ca_state < TCP_CA_CWR) { @@ -1412,23 +1412,23 @@ } } -extern __u32 tcp_init_cwnd(struct tcp_opt *tp, struct dst_entry *dst); +extern __u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst); /* Slow start with delack produces 3 packets of burst, so that * it is safe "de facto". */ -static __inline__ __u32 tcp_max_burst(const struct tcp_opt *tp) +static __inline__ __u32 tcp_max_burst(const struct tcp_sock *tp) { return 3; } -static __inline__ int tcp_minshall_check(const struct tcp_opt *tp) +static __inline__ int tcp_minshall_check(const struct tcp_sock *tp) { return after(tp->snd_sml,tp->snd_una) && !after(tp->snd_sml, tp->snd_nxt); } -static __inline__ void tcp_minshall_update(struct tcp_opt *tp, int mss, +static __inline__ void tcp_minshall_update(struct tcp_sock *tp, int mss, const struct sk_buff *skb) { if (skb->len < mss) @@ -1444,7 +1444,7 @@ */ static __inline__ int -tcp_nagle_check(const struct tcp_opt *tp, const struct sk_buff *skb, +tcp_nagle_check(const struct tcp_sock *tp, const struct sk_buff *skb, unsigned mss_now, int nonagle) { return (skb->len < mss_now && @@ -1460,7 +1460,7 @@ /* This checks if the data bearing packet SKB (usually sk->sk_send_head) * should be put on the wire right now. 
*/ -static __inline__ int tcp_snd_test(const struct tcp_opt *tp, +static __inline__ int tcp_snd_test(const struct tcp_sock *tp, struct sk_buff *skb, unsigned cur_mss, int nonagle) { @@ -1502,7 +1502,7 @@ !after(TCP_SKB_CB(skb)->end_seq, tp->snd_una + tp->snd_wnd)); } -static __inline__ void tcp_check_probe_timer(struct sock *sk, struct tcp_opt *tp) +static __inline__ void tcp_check_probe_timer(struct sock *sk, struct tcp_sock *tp) { if (!tcp_get_pcount(&tp->packets_out) && !tp->pending) tcp_reset_xmit_timer(sk, TCP_TIME_PROBE0, tp->rto); @@ -1519,7 +1519,7 @@ * The socket must be locked by the caller. */ static __inline__ void __tcp_push_pending_frames(struct sock *sk, - struct tcp_opt *tp, + struct tcp_sock *tp, unsigned cur_mss, int nonagle) { @@ -1536,12 +1536,12 @@ } static __inline__ void tcp_push_pending_frames(struct sock *sk, - struct tcp_opt *tp) + struct tcp_sock *tp) { __tcp_push_pending_frames(sk, tp, tcp_current_mss(sk, 1), tp->nonagle); } -static __inline__ int tcp_may_send_now(struct sock *sk, struct tcp_opt *tp) +static __inline__ int tcp_may_send_now(struct sock *sk, struct tcp_sock *tp) { struct sk_buff *skb = sk->sk_send_head; @@ -1550,12 +1550,12 @@ tcp_skb_is_last(sk, skb) ? TCP_NAGLE_PUSH : tp->nonagle)); } -static __inline__ void tcp_init_wl(struct tcp_opt *tp, u32 ack, u32 seq) +static __inline__ void tcp_init_wl(struct tcp_sock *tp, u32 ack, u32 seq) { tp->snd_wl1 = seq; } -static __inline__ void tcp_update_wl(struct tcp_opt *tp, u32 ack, u32 seq) +static __inline__ void tcp_update_wl(struct tcp_sock *tp, u32 ack, u32 seq) { tp->snd_wl1 = seq; } @@ -1586,7 +1586,7 @@ /* Prequeue for VJ style copy to user, combined with checksumming. 
*/ -static __inline__ void tcp_prequeue_init(struct tcp_opt *tp) +static __inline__ void tcp_prequeue_init(struct tcp_sock *tp) { tp->ucopy.task = NULL; tp->ucopy.len = 0; @@ -1604,7 +1604,7 @@ */ static __inline__ int tcp_prequeue(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (!sysctl_tcp_low_latency && tp->ucopy.task) { __skb_queue_tail(&tp->ucopy.prequeue, skb); @@ -1688,14 +1688,14 @@ tcp_destroy_sock(sk); } -static __inline__ void tcp_sack_reset(struct tcp_opt *tp) +static __inline__ void tcp_sack_reset(struct tcp_sock *tp) { tp->dsack = 0; tp->eff_sacks = 0; tp->num_sacks = 0; } -static __inline__ void tcp_build_and_update_options(__u32 *ptr, struct tcp_opt *tp, __u32 tstamp) +static __inline__ void tcp_build_and_update_options(__u32 *ptr, struct tcp_sock *tp, __u32 tstamp) { if (tp->tstamp_ok) { *ptr++ = __constant_htonl((TCPOPT_NOP << 24) | @@ -1790,7 +1790,7 @@ static inline void tcp_acceptq_queue(struct sock *sk, struct open_request *req, struct sock *child) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); req->sk = child; sk_acceptq_added(sk); @@ -1849,7 +1849,7 @@ return tcp_synq_len(sk) >> tcp_sk(sk)->listen_opt->max_qlen_log; } -static inline void tcp_synq_unlink(struct tcp_opt *tp, struct open_request *req, +static inline void tcp_synq_unlink(struct tcp_sock *tp, struct open_request *req, struct open_request **prev) { write_lock(&tp->syn_wait_lock); @@ -1866,7 +1866,7 @@ } static __inline__ void tcp_openreq_init(struct open_request *req, - struct tcp_opt *tp, + struct tcp_sock *tp, struct sk_buff *skb) { req->rcv_wnd = 0; /* So that tcp_send_synack() knows! */ @@ -1905,17 +1905,17 @@ wake_up(&tcp_lhash_wait); } -static inline int keepalive_intvl_when(const struct tcp_opt *tp) +static inline int keepalive_intvl_when(const struct tcp_sock *tp) { return tp->keepalive_intvl ? 
: sysctl_tcp_keepalive_intvl; } -static inline int keepalive_time_when(const struct tcp_opt *tp) +static inline int keepalive_time_when(const struct tcp_sock *tp) { return tp->keepalive_time ? : sysctl_tcp_keepalive_time; } -static inline int tcp_fin_time(const struct tcp_opt *tp) +static inline int tcp_fin_time(const struct tcp_sock *tp) { int fin_timeout = tp->linger2 ? : sysctl_tcp_fin_timeout; @@ -1925,7 +1925,7 @@ return fin_timeout; } -static inline int tcp_paws_check(const struct tcp_opt *tp, int rst) +static inline int tcp_paws_check(const struct tcp_sock *tp, int rst) { if ((s32)(tp->rcv_tsval - tp->ts_recent) >= 0) return 0; @@ -1962,7 +1962,7 @@ static inline int tcp_use_frto(const struct sock *sk) { - const struct tcp_opt *tp = tcp_sk(sk); + const struct tcp_sock *tp = tcp_sk(sk); /* F-RTO must be activated in sysctl and there must be some * unsent new data, and the advertised window should allow @@ -2014,7 +2014,7 @@ #define TCP_WESTWOOD_INIT_RTT (20*HZ) /* maybe too conservative?! */ #define TCP_WESTWOOD_RTT_MIN (HZ/20) /* 50ms */ -static inline void tcp_westwood_update_rtt(struct tcp_opt *tp, __u32 rtt_seq) +static inline void tcp_westwood_update_rtt(struct tcp_sock *tp, __u32 rtt_seq) { if (tcp_is_westwood(tp)) tp->westwood.rtt = rtt_seq; @@ -2035,19 +2035,19 @@ __tcp_westwood_slow_bw(sk, skb); } -static inline __u32 __tcp_westwood_bw_rttmin(const struct tcp_opt *tp) +static inline __u32 __tcp_westwood_bw_rttmin(const struct tcp_sock *tp) { return max((tp->westwood.bw_est) * (tp->westwood.rtt_min) / (__u32) (tp->mss_cache_std), 2U); } -static inline __u32 tcp_westwood_bw_rttmin(const struct tcp_opt *tp) +static inline __u32 tcp_westwood_bw_rttmin(const struct tcp_sock *tp) { return tcp_is_westwood(tp) ? 
__tcp_westwood_bw_rttmin(tp) : 0; } -static inline int tcp_westwood_ssthresh(struct tcp_opt *tp) +static inline int tcp_westwood_ssthresh(struct tcp_sock *tp) { __u32 ssthresh = 0; @@ -2060,7 +2060,7 @@ return (ssthresh != 0); } -static inline int tcp_westwood_cwnd(struct tcp_opt *tp) +static inline int tcp_westwood_cwnd(struct tcp_sock *tp) { __u32 cwnd = 0; diff -Nru a/include/net/tcp_ecn.h b/include/net/tcp_ecn.h --- a/include/net/tcp_ecn.h 2004-12-28 16:36:54 -02:00 +++ b/include/net/tcp_ecn.h 2004-12-28 16:36:54 -02:00 @@ -9,8 +9,7 @@ #define TCP_ECN_QUEUE_CWR 2 #define TCP_ECN_DEMAND_CWR 4 -static __inline__ void -TCP_ECN_queue_cwr(struct tcp_opt *tp) +static inline void TCP_ECN_queue_cwr(struct tcp_sock *tp) { if (tp->ecn_flags&TCP_ECN_OK) tp->ecn_flags |= TCP_ECN_QUEUE_CWR; @@ -19,16 +18,16 @@ /* Output functions */ -static __inline__ void -TCP_ECN_send_synack(struct tcp_opt *tp, struct sk_buff *skb) +static inline void TCP_ECN_send_synack(struct tcp_sock *tp, + struct sk_buff *skb) { TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_CWR; if (!(tp->ecn_flags&TCP_ECN_OK)) TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_ECE; } -static __inline__ void -TCP_ECN_send_syn(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) +static inline void TCP_ECN_send_syn(struct sock *sk, struct tcp_sock *tp, + struct sk_buff *skb) { tp->ecn_flags = 0; if (sysctl_tcp_ecn && !(sk->sk_route_caps & NETIF_F_TSO)) { @@ -45,8 +44,8 @@ th->ece = 1; } -static __inline__ void -TCP_ECN_send(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb, int tcp_header_len) +static inline void TCP_ECN_send(struct sock *sk, struct tcp_sock *tp, + struct sk_buff *skb, int tcp_header_len) { if (tp->ecn_flags & TCP_ECN_OK) { /* Not-retransmitted data segment: set ECT and inject CWR. 
*/ @@ -68,21 +67,18 @@ /* Input functions */ -static __inline__ void -TCP_ECN_accept_cwr(struct tcp_opt *tp, struct sk_buff *skb) +static inline void TCP_ECN_accept_cwr(struct tcp_sock *tp, struct sk_buff *skb) { if (skb->h.th->cwr) tp->ecn_flags &= ~TCP_ECN_DEMAND_CWR; } -static __inline__ void -TCP_ECN_withdraw_cwr(struct tcp_opt *tp) +static inline void TCP_ECN_withdraw_cwr(struct tcp_sock *tp) { tp->ecn_flags &= ~TCP_ECN_DEMAND_CWR; } -static __inline__ void -TCP_ECN_check_ce(struct tcp_opt *tp, struct sk_buff *skb) +static inline void TCP_ECN_check_ce(struct tcp_sock *tp, struct sk_buff *skb) { if (tp->ecn_flags&TCP_ECN_OK) { if (INET_ECN_is_ce(TCP_SKB_CB(skb)->flags)) @@ -95,30 +91,27 @@ } } -static __inline__ void -TCP_ECN_rcv_synack(struct tcp_opt *tp, struct tcphdr *th) +static inline void TCP_ECN_rcv_synack(struct tcp_sock *tp, struct tcphdr *th) { if ((tp->ecn_flags&TCP_ECN_OK) && (!th->ece || th->cwr)) tp->ecn_flags &= ~TCP_ECN_OK; } -static __inline__ void -TCP_ECN_rcv_syn(struct tcp_opt *tp, struct tcphdr *th) +static inline void TCP_ECN_rcv_syn(struct tcp_sock *tp, struct tcphdr *th) { if ((tp->ecn_flags&TCP_ECN_OK) && (!th->ece || !th->cwr)) tp->ecn_flags &= ~TCP_ECN_OK; } -static __inline__ int -TCP_ECN_rcv_ecn_echo(struct tcp_opt *tp, struct tcphdr *th) +static inline int TCP_ECN_rcv_ecn_echo(struct tcp_sock *tp, struct tcphdr *th) { if (th->ece && !th->syn && (tp->ecn_flags&TCP_ECN_OK)) return 1; return 0; } -static __inline__ void -TCP_ECN_openreq_child(struct tcp_opt *tp, struct open_request *req) +static inline void TCP_ECN_openreq_child(struct tcp_sock *tp, + struct open_request *req) { tp->ecn_flags = req->ecn_ok ? 
TCP_ECN_OK : 0; } diff -Nru a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c --- a/net/ipv4/ip_sockglue.c 2004-12-28 16:36:53 -02:00 +++ b/net/ipv4/ip_sockglue.c 2004-12-28 16:36:53 -02:00 @@ -429,7 +429,7 @@ if (err) break; if (sk->sk_type == SOCK_STREAM) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); #if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) if (sk->sk_family == PF_INET || (!((1 << sk->sk_state) & diff -Nru a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c --- a/net/ipv4/syncookies.c 2004-12-28 16:36:54 -02:00 +++ b/net/ipv4/syncookies.c 2004-12-28 16:36:54 -02:00 @@ -47,7 +47,7 @@ */ __u32 cookie_v4_init_sequence(struct sock *sk, struct sk_buff *skb, __u16 *mssp) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int mssind; const __u16 mss = *mssp; @@ -98,7 +98,7 @@ struct open_request *req, struct dst_entry *dst) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sock *child; child = tp->af_specific->syn_recv_sock(sk, skb, req, dst); @@ -114,7 +114,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, struct ip_options *opt) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); __u32 cookie = ntohl(skb->h.th->ack_seq) - 1; struct sock *ret = sk; struct open_request *req; diff -Nru a/net/ipv4/tcp.c b/net/ipv4/tcp.c --- a/net/ipv4/tcp.c 2004-12-28 16:36:53 -02:00 +++ b/net/ipv4/tcp.c 2004-12-28 16:36:53 -02:00 @@ -330,7 +330,7 @@ { unsigned int mask; struct sock *sk = sock->sk; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); poll_wait(file, sk->sk_sleep, wait); if (sk->sk_state == TCP_LISTEN) @@ -413,7 +413,7 @@ int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int answ; switch (cmd) { @@ -461,7 +461,7 @@ int tcp_listen_start(struct sock *sk) { struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock 
*tp = tcp_sk(sk); struct tcp_listen_opt *lopt; sk->sk_max_ack_backlog = 0; @@ -514,7 +514,7 @@ static void tcp_listen_stop (struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_listen_opt *lopt = tp->listen_opt; struct open_request *acc_req = tp->accept_queue; struct open_request *req; @@ -578,18 +578,18 @@ BUG_TRAP(!sk->sk_ack_backlog); } -static inline void tcp_mark_push(struct tcp_opt *tp, struct sk_buff *skb) +static inline void tcp_mark_push(struct tcp_sock *tp, struct sk_buff *skb) { TCP_SKB_CB(skb)->flags |= TCPCB_FLAG_PSH; tp->pushed_seq = tp->write_seq; } -static inline int forced_push(struct tcp_opt *tp) +static inline int forced_push(struct tcp_sock *tp) { return after(tp->write_seq, tp->pushed_seq + (tp->max_window >> 1)); } -static inline void skb_entail(struct sock *sk, struct tcp_opt *tp, +static inline void skb_entail(struct sock *sk, struct tcp_sock *tp, struct sk_buff *skb) { skb->csum = 0; @@ -605,7 +605,7 @@ tp->nonagle &= ~TCP_NAGLE_PUSH; } -static inline void tcp_mark_urg(struct tcp_opt *tp, int flags, +static inline void tcp_mark_urg(struct tcp_sock *tp, int flags, struct sk_buff *skb) { if (flags & MSG_OOB) { @@ -615,7 +615,7 @@ } } -static inline void tcp_push(struct sock *sk, struct tcp_opt *tp, int flags, +static inline void tcp_push(struct sock *sk, struct tcp_sock *tp, int flags, int mss_now, int nonagle) { if (sk->sk_send_head) { @@ -631,7 +631,7 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffset, size_t psize, int flags) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int mss_now; int err; ssize_t copied; @@ -760,7 +760,7 @@ #define TCP_PAGE(sk) (sk->sk_sndmsg_page) #define TCP_OFF(sk) (sk->sk_sndmsg_off) -static inline int select_size(struct sock *sk, struct tcp_opt *tp) +static inline int select_size(struct sock *sk, struct tcp_sock *tp) { int tmp = tp->mss_cache_std; @@ -778,7 +778,7 @@ size_t size) { struct iovec *iov; - 
struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; int iovlen, flags; int mss_now; @@ -1002,7 +1002,7 @@ struct msghdr *msg, int len, int flags, int *addr_len) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* No URG data to read. */ if (sock_flag(sk, SOCK_URGINLINE) || !tp->urg_data || @@ -1052,7 +1052,7 @@ */ static void cleanup_rbuf(struct sock *sk, int copied) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int time_to_ack = 0; #if TCP_DEBUG @@ -1107,7 +1107,7 @@ static void tcp_prequeue_process(struct sock *sk) { struct sk_buff *skb; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); NET_ADD_STATS_USER(LINUX_MIB_TCPPREQUEUED, skb_queue_len(&tp->ucopy.prequeue)); @@ -1154,7 +1154,7 @@ sk_read_actor_t recv_actor) { struct sk_buff *skb; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); u32 seq = tp->copied_seq; u32 offset; int copied = 0; @@ -1213,7 +1213,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, size_t len, int nonblock, int flags, int *addr_len) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int copied = 0; u32 peek_seq; u32 *seq; @@ -1719,7 +1719,7 @@ */ if (sk->sk_state == TCP_FIN_WAIT2) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (tp->linger2 < 0) { tcp_set_state(sk, TCP_CLOSE); tcp_send_active_reset(sk, GFP_ATOMIC); @@ -1773,7 +1773,7 @@ int tcp_disconnect(struct sock *sk, int flags) { struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int err = 0; int old_state = sk->sk_state; @@ -1835,7 +1835,7 @@ */ static int wait_for_connect(struct sock *sk, long timeo) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); DEFINE_WAIT(wait); int err; @@ -1883,7 +1883,7 @@ struct sock *tcp_accept(struct sock *sk, int flags, int *err) { - struct tcp_opt *tp = tcp_sk(sk); + 
struct tcp_sock *tp = tcp_sk(sk); struct open_request *req; struct sock *newsk; int error; @@ -1934,7 +1934,7 @@ int tcp_setsockopt(struct sock *sk, int level, int optname, char __user *optval, int optlen) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int val; int err = 0; @@ -2098,7 +2098,7 @@ /* Return information about state of tcp endpoint in API format. */ void tcp_get_info(struct sock *sk, struct tcp_info *info) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); u32 now = tcp_time_stamp; memset(info, 0, sizeof(*info)); @@ -2157,7 +2157,7 @@ int tcp_getsockopt(struct sock *sk, int level, int optname, char __user *optval, int __user *optlen) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int val, len; if (level != SOL_TCP) diff -Nru a/net/ipv4/tcp_diag.c b/net/ipv4/tcp_diag.c --- a/net/ipv4/tcp_diag.c 2004-12-28 16:36:53 -02:00 +++ b/net/ipv4/tcp_diag.c 2004-12-28 16:36:53 -02:00 @@ -56,7 +56,7 @@ int ext, u32 pid, u32 seq, u16 nlmsg_flags) { struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcpdiagmsg *r; struct nlmsghdr *nlh; struct tcp_info *info = NULL; @@ -512,7 +512,7 @@ { struct tcpdiag_entry entry; struct tcpdiagreq *r = NLMSG_DATA(cb->nlh); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_listen_opt *lopt; struct rtattr *bc = NULL; struct inet_sock *inet = inet_sk(sk); diff -Nru a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c --- a/net/ipv4/tcp_input.c 2004-12-28 16:36:54 -02:00 +++ b/net/ipv4/tcp_input.c 2004-12-28 16:36:54 -02:00 @@ -127,7 +127,8 @@ /* Adapt the MSS value used to make delayed ack decision to the * real world. 
*/ -static __inline__ void tcp_measure_rcv_mss(struct tcp_opt *tp, struct sk_buff *skb) +static inline void tcp_measure_rcv_mss(struct tcp_sock *tp, + struct sk_buff *skb) { unsigned int len, lss; @@ -170,7 +171,7 @@ } } -static void tcp_incr_quickack(struct tcp_opt *tp) +static void tcp_incr_quickack(struct tcp_sock *tp) { unsigned quickacks = tp->rcv_wnd/(2*tp->ack.rcv_mss); @@ -180,7 +181,7 @@ tp->ack.quick = min(quickacks, TCP_MAX_QUICKACKS); } -void tcp_enter_quickack_mode(struct tcp_opt *tp) +void tcp_enter_quickack_mode(struct tcp_sock *tp) { tcp_incr_quickack(tp); tp->ack.pingpong = 0; @@ -191,7 +192,7 @@ * and the session is not interactive. */ -static __inline__ int tcp_in_quickack_mode(struct tcp_opt *tp) +static __inline__ int tcp_in_quickack_mode(struct tcp_sock *tp) { return (tp->ack.quick && !tp->ack.pingpong); } @@ -236,8 +237,8 @@ */ /* Slow part of check#2. */ -static int -__tcp_grow_window(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) +static int __tcp_grow_window(struct sock *sk, struct tcp_sock *tp, + struct sk_buff *skb) { /* Optimize this! 
*/ int truesize = tcp_win_from_space(skb->truesize)/2; @@ -253,8 +254,8 @@ return 0; } -static __inline__ void -tcp_grow_window(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) +static inline void tcp_grow_window(struct sock *sk, struct tcp_sock *tp, + struct sk_buff *skb) { /* Check #1 */ if (tp->rcv_ssthresh < tp->window_clamp && @@ -281,7 +282,7 @@ static void tcp_fixup_rcvbuf(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int rcvmem = tp->advmss + MAX_TCP_HEADER + 16 + sizeof(struct sk_buff); /* Try to select rcvbuf so that 4 mss-sized segments @@ -299,7 +300,7 @@ */ static void tcp_init_buffer_space(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int maxwin; if (!(sk->sk_userlocks & SOCK_RCVBUF_LOCK)) @@ -330,7 +331,7 @@ tp->snd_cwnd_stamp = tcp_time_stamp; } -static void init_bictcp(struct tcp_opt *tp) +static void init_bictcp(struct tcp_sock *tp) { tp->bictcp.cnt = 0; @@ -340,7 +341,7 @@ } /* 5. Recalculate window clamp after socket hit its memory bounds. */ -static void tcp_clamp_window(struct sock *sk, struct tcp_opt *tp) +static void tcp_clamp_window(struct sock *sk, struct tcp_sock *tp) { struct sk_buff *skb; unsigned int app_win = tp->rcv_nxt - tp->copied_seq; @@ -388,7 +389,7 @@ * though this reference is out of date. A new paper * is pending. 
*/ -static void tcp_rcv_rtt_update(struct tcp_opt *tp, u32 sample, int win_dep) +static void tcp_rcv_rtt_update(struct tcp_sock *tp, u32 sample, int win_dep) { u32 new_sample = tp->rcv_rtt_est.rtt; long m = sample; @@ -421,7 +422,7 @@ tp->rcv_rtt_est.rtt = new_sample; } -static inline void tcp_rcv_rtt_measure(struct tcp_opt *tp) +static inline void tcp_rcv_rtt_measure(struct tcp_sock *tp) { if (tp->rcv_rtt_est.time == 0) goto new_measure; @@ -436,7 +437,7 @@ tp->rcv_rtt_est.time = tcp_time_stamp; } -static inline void tcp_rcv_rtt_measure_ts(struct tcp_opt *tp, struct sk_buff *skb) +static inline void tcp_rcv_rtt_measure_ts(struct tcp_sock *tp, struct sk_buff *skb) { if (tp->rcv_tsecr && (TCP_SKB_CB(skb)->end_seq - @@ -450,7 +451,7 @@ */ void tcp_rcv_space_adjust(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int time; int space; @@ -511,7 +512,7 @@ * each ACK we send, he increments snd_cwnd and transmits more of his * queue. -DaveM */ -static void tcp_event_data_recv(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) +static void tcp_event_data_recv(struct sock *sk, struct tcp_sock *tp, struct sk_buff *skb) { u32 now; @@ -558,7 +559,7 @@ /* When starting a new connection, pin down the current choice of * congestion algorithm. */ -void tcp_ca_init(struct tcp_opt *tp) +void tcp_ca_init(struct tcp_sock *tp) { if (sysctl_tcp_westwood) tp->adv_cong = TCP_WESTWOOD; @@ -579,7 +580,7 @@ * o min-filter RTT samples from a much longer window (forever for now) * to find the propagation delay (baseRTT) */ -static inline void vegas_rtt_calc(struct tcp_opt *tp, __u32 rtt) +static inline void vegas_rtt_calc(struct tcp_sock *tp, __u32 rtt) { __u32 vrtt = rtt + 1; /* Never allow zero rtt or baseRTT */ @@ -603,7 +604,7 @@ * To save cycles in the RFC 1323 implementation it was better to break * it up into three procedures. 
-- erics */ -static void tcp_rtt_estimator(struct tcp_opt *tp, __u32 mrtt) +static void tcp_rtt_estimator(struct tcp_sock *tp, __u32 mrtt) { long m = mrtt; /* RTT */ @@ -673,7 +674,7 @@ /* Calculate rto without backoff. This is the second half of Van Jacobson's * routine referred to above. */ -static __inline__ void tcp_set_rto(struct tcp_opt *tp) +static inline void tcp_set_rto(struct tcp_sock *tp) { /* Old crap is replaced with new one. 8) * @@ -697,7 +698,7 @@ /* NOTE: clamping at TCP_RTO_MIN is not required, current algo * guarantees that rto is higher. */ -static __inline__ void tcp_bound_rto(struct tcp_opt *tp) +static inline void tcp_bound_rto(struct tcp_sock *tp) { if (tp->rto > TCP_RTO_MAX) tp->rto = TCP_RTO_MAX; @@ -709,7 +710,7 @@ */ void tcp_update_metrics(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct dst_entry *dst = __sk_dst_get(sk); if (sysctl_tcp_nometrics_save) @@ -797,7 +798,7 @@ } /* Numbers are taken from RFC2414. */ -__u32 tcp_init_cwnd(struct tcp_opt *tp, struct dst_entry *dst) +__u32 tcp_init_cwnd(struct tcp_sock *tp, struct dst_entry *dst) { __u32 cwnd = (dst ? 
dst_metric(dst, RTAX_INITCWND) : 0); @@ -814,7 +815,7 @@ static void tcp_init_metrics(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct dst_entry *dst = __sk_dst_get(sk); if (dst == NULL) @@ -883,7 +884,7 @@ } } -static void tcp_update_reordering(struct tcp_opt *tp, int metric, int ts) +static void tcp_update_reordering(struct tcp_sock *tp, int metric, int ts) { if (metric > tp->reordering) { tp->reordering = min(TCP_MAX_REORDERING, metric); @@ -961,7 +962,7 @@ static int tcp_sacktag_write_queue(struct sock *sk, struct sk_buff *ack_skb, u32 prior_snd_una) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); unsigned char *ptr = ack_skb->h.raw + TCP_SKB_CB(ack_skb)->sacked; struct tcp_sack_block *sp = (struct tcp_sack_block *)(ptr+2); int num_sacks = (ptr[1] - TCPOLEN_SACK_BASE)>>3; @@ -1178,7 +1179,7 @@ */ void tcp_enter_frto(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; tp->frto_counter = 1; @@ -1215,7 +1216,7 @@ */ static void tcp_enter_frto_loss(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; int cnt = 0; @@ -1258,7 +1259,7 @@ init_bictcp(tp); } -void tcp_clear_retrans(struct tcp_opt *tp) +void tcp_clear_retrans(struct tcp_sock *tp) { tcp_set_pcount(&tp->left_out, 0); tcp_set_pcount(&tp->retrans_out, 0); @@ -1277,7 +1278,7 @@ */ void tcp_enter_loss(struct sock *sk, int how) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; int cnt = 0; @@ -1321,7 +1322,7 @@ TCP_ECN_queue_cwr(tp); } -static int tcp_check_sack_reneging(struct sock *sk, struct tcp_opt *tp) +static int tcp_check_sack_reneging(struct sock *sk, struct tcp_sock *tp) { struct sk_buff *skb; @@ -1344,18 +1345,18 @@ return 0; } -static inline int tcp_fackets_out(struct tcp_opt *tp) +static inline int tcp_fackets_out(struct tcp_sock *tp) { return IsReno(tp) ? 
tcp_get_pcount(&tp->sacked_out)+1 : tcp_get_pcount(&tp->fackets_out); } -static inline int tcp_skb_timedout(struct tcp_opt *tp, struct sk_buff *skb) +static inline int tcp_skb_timedout(struct tcp_sock *tp, struct sk_buff *skb) { return (tcp_time_stamp - TCP_SKB_CB(skb)->when > tp->rto); } -static inline int tcp_head_timedout(struct sock *sk, struct tcp_opt *tp) +static inline int tcp_head_timedout(struct sock *sk, struct tcp_sock *tp) { return tcp_get_pcount(&tp->packets_out) && tcp_skb_timedout(tp, skb_peek(&sk->sk_write_queue)); @@ -1454,8 +1455,7 @@ * Main question: may we further continue forward transmission * with the same cwnd? */ -static int -tcp_time_to_recover(struct sock *sk, struct tcp_opt *tp) +static int tcp_time_to_recover(struct sock *sk, struct tcp_sock *tp) { __u32 packets_out; @@ -1493,7 +1493,7 @@ * in assumption of absent reordering, interpret this as reordering. * The only another reason could be bug in receiver TCP. */ -static void tcp_check_reno_reordering(struct tcp_opt *tp, int addend) +static void tcp_check_reno_reordering(struct tcp_sock *tp, int addend) { u32 holes; @@ -1512,7 +1512,7 @@ /* Emulate SACKs for SACKless connection: account for a new dupack. */ -static void tcp_add_reno_sack(struct tcp_opt *tp) +static void tcp_add_reno_sack(struct tcp_sock *tp) { tcp_inc_pcount_explicit(&tp->sacked_out, 1); tcp_check_reno_reordering(tp, 0); @@ -1521,7 +1521,7 @@ /* Account for ACK, ACKing some data in Reno Recovery phase. */ -static void tcp_remove_reno_sacks(struct sock *sk, struct tcp_opt *tp, int acked) +static void tcp_remove_reno_sacks(struct sock *sk, struct tcp_sock *tp, int acked) { if (acked > 0) { /* One ACK acked hole. The rest eat duplicate ACKs. 
*/ @@ -1534,15 +1534,15 @@ tcp_sync_left_out(tp); } -static inline void tcp_reset_reno_sack(struct tcp_opt *tp) +static inline void tcp_reset_reno_sack(struct tcp_sock *tp) { tcp_set_pcount(&tp->sacked_out, 0); tcp_set_pcount(&tp->left_out, tcp_get_pcount(&tp->lost_out)); } /* Mark head of queue up as lost. */ -static void -tcp_mark_head_lost(struct sock *sk, struct tcp_opt *tp, int packets, u32 high_seq) +static void tcp_mark_head_lost(struct sock *sk, struct tcp_sock *tp, + int packets, u32 high_seq) { struct sk_buff *skb; int cnt = packets; @@ -1563,7 +1563,7 @@ /* Account newly detected lost packet(s) */ -static void tcp_update_scoreboard(struct sock *sk, struct tcp_opt *tp) +static void tcp_update_scoreboard(struct sock *sk, struct tcp_sock *tp) { if (IsFack(tp)) { int lost = tcp_get_pcount(&tp->fackets_out) - tp->reordering; @@ -1596,7 +1596,7 @@ /* CWND moderation, preventing bursts due to too big ACKs * in dubious situations. */ -static __inline__ void tcp_moderate_cwnd(struct tcp_opt *tp) +static inline void tcp_moderate_cwnd(struct tcp_sock *tp) { tp->snd_cwnd = min(tp->snd_cwnd, tcp_packets_in_flight(tp)+tcp_max_burst(tp)); @@ -1605,7 +1605,7 @@ /* Decrease cwnd each second ack. */ -static void tcp_cwnd_down(struct tcp_opt *tp) +static void tcp_cwnd_down(struct tcp_sock *tp) { int decr = tp->snd_cwnd_cnt + 1; __u32 limit; @@ -1635,7 +1635,7 @@ /* Nothing was retransmitted or returned timestamp is less * than timestamp of the first retransmission. */ -static __inline__ int tcp_packet_delayed(struct tcp_opt *tp) +static inline int tcp_packet_delayed(struct tcp_sock *tp) { return !tp->retrans_stamp || (tp->saw_tstamp && tp->rcv_tsecr && @@ -1645,7 +1645,7 @@ /* Undo procedures. 
*/ #if FASTRETRANS_DEBUG > 1 -static void DBGUNDO(struct sock *sk, struct tcp_opt *tp, const char *msg) +static void DBGUNDO(struct sock *sk, struct tcp_sock *tp, const char *msg) { struct inet_sock *inet = inet_sk(sk); printk(KERN_DEBUG "Undo %s %u.%u.%u.%u/%u c%u l%u ss%u/%u p%u\n", @@ -1659,7 +1659,7 @@ #define DBGUNDO(x...) do { } while (0) #endif -static void tcp_undo_cwr(struct tcp_opt *tp, int undo) +static void tcp_undo_cwr(struct tcp_sock *tp, int undo) { if (tp->prior_ssthresh) { tp->snd_cwnd = max(tp->snd_cwnd, tp->snd_ssthresh<<1); @@ -1675,14 +1675,14 @@ tp->snd_cwnd_stamp = tcp_time_stamp; } -static inline int tcp_may_undo(struct tcp_opt *tp) +static inline int tcp_may_undo(struct tcp_sock *tp) { return tp->undo_marker && (!tp->undo_retrans || tcp_packet_delayed(tp)); } /* People celebrate: "We love our President!" */ -static int tcp_try_undo_recovery(struct sock *sk, struct tcp_opt *tp) +static int tcp_try_undo_recovery(struct sock *sk, struct tcp_sock *tp) { if (tcp_may_undo(tp)) { /* Happy end! We did not retransmit anything @@ -1708,7 +1708,7 @@ } /* Try to undo cwnd reduction, because D-SACKs acked all retransmitted data */ -static void tcp_try_undo_dsack(struct sock *sk, struct tcp_opt *tp) +static void tcp_try_undo_dsack(struct sock *sk, struct tcp_sock *tp) { if (tp->undo_marker && !tp->undo_retrans) { DBGUNDO(sk, tp, "D-SACK"); @@ -1720,7 +1720,8 @@ /* Undo during fast recovery after partial ACK. */ -static int tcp_try_undo_partial(struct sock *sk, struct tcp_opt *tp, int acked) +static int tcp_try_undo_partial(struct sock *sk, struct tcp_sock *tp, + int acked) { /* Partial ACK arrived. Force Hoe's retransmit. */ int failed = IsReno(tp) || tcp_get_pcount(&tp->fackets_out)>tp->reordering; @@ -1748,7 +1749,7 @@ } /* Undo during loss recovery after partial ACK. 
*/ -static int tcp_try_undo_loss(struct sock *sk, struct tcp_opt *tp) +static int tcp_try_undo_loss(struct sock *sk, struct tcp_sock *tp) { if (tcp_may_undo(tp)) { struct sk_buff *skb; @@ -1769,7 +1770,7 @@ return 0; } -static __inline__ void tcp_complete_cwr(struct tcp_opt *tp) +static inline void tcp_complete_cwr(struct tcp_sock *tp) { if (tcp_westwood_cwnd(tp)) tp->snd_ssthresh = tp->snd_cwnd; @@ -1778,7 +1779,7 @@ tp->snd_cwnd_stamp = tcp_time_stamp; } -static void tcp_try_to_open(struct sock *sk, struct tcp_opt *tp, int flag) +static void tcp_try_to_open(struct sock *sk, struct tcp_sock *tp, int flag) { tcp_set_pcount(&tp->left_out, tcp_get_pcount(&tp->sacked_out)); @@ -1821,7 +1822,7 @@ tcp_fastretrans_alert(struct sock *sk, u32 prior_snd_una, int prior_packets, int flag) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int is_dupack = (tp->snd_una == prior_snd_una && !(flag&FLAG_NOT_DUP)); /* Some technical things: @@ -1970,7 +1971,7 @@ /* Read draft-ietf-tcplw-high-performance before mucking * with this code. (Superceeds RFC1323) */ -static void tcp_ack_saw_tstamp(struct tcp_opt *tp, int flag) +static void tcp_ack_saw_tstamp(struct tcp_sock *tp, int flag) { __u32 seq_rtt; @@ -1996,7 +1997,7 @@ tcp_bound_rto(tp); } -static void tcp_ack_no_tstamp(struct tcp_opt *tp, u32 seq_rtt, int flag) +static void tcp_ack_no_tstamp(struct tcp_sock *tp, u32 seq_rtt, int flag) { /* We don't have a timestamp. Can only use * packets that are not retransmitted to determine @@ -2016,8 +2017,8 @@ tcp_bound_rto(tp); } -static __inline__ void -tcp_ack_update_rtt(struct tcp_opt *tp, int flag, s32 seq_rtt) +static inline void tcp_ack_update_rtt(struct tcp_sock *tp, + int flag, s32 seq_rtt) { /* Note that peer MAY send zero echo. In this case it is ignored. (rfc1323) */ if (tp->saw_tstamp && tp->rcv_tsecr) @@ -2039,7 +2040,7 @@ * Unless BIC is enabled and congestion window is large * this behaves the same as the original Reno. 
*/ -static inline __u32 bictcp_cwnd(struct tcp_opt *tp) +static inline __u32 bictcp_cwnd(struct tcp_sock *tp) { /* orignal Reno behaviour */ if (!tcp_is_bic(tp)) @@ -2092,7 +2093,7 @@ /* This is Jacobson's slow start and congestion avoidance. * SIGCOMM '88, p. 328. */ -static __inline__ void reno_cong_avoid(struct tcp_opt *tp) +static inline void reno_cong_avoid(struct tcp_sock *tp) { if (tp->snd_cwnd <= tp->snd_ssthresh) { /* In "safe" area, increase. */ @@ -2141,7 +2142,7 @@ * a cwnd adjustment decision. The original Vegas implementation * assumed senders never went idle. */ -static void vegas_cong_avoid(struct tcp_opt *tp, u32 ack, u32 seq_rtt) +static void vegas_cong_avoid(struct tcp_sock *tp, u32 ack, u32 seq_rtt) { /* The key players are v_beg_snd_una and v_beg_snd_nxt. * @@ -2334,7 +2335,7 @@ tp->snd_cwnd_stamp = tcp_time_stamp; } -static inline void tcp_cong_avoid(struct tcp_opt *tp, u32 ack, u32 seq_rtt) +static inline void tcp_cong_avoid(struct tcp_sock *tp, u32 ack, u32 seq_rtt) { if (tcp_vegas_enabled(tp)) vegas_cong_avoid(tp, ack, seq_rtt); @@ -2346,7 +2347,7 @@ * RFC2988 recommends to restart timer to now+rto. */ -static __inline__ void tcp_ack_packets_out(struct sock *sk, struct tcp_opt *tp) +static inline void tcp_ack_packets_out(struct sock *sk, struct tcp_sock *tp) { if (!tcp_get_pcount(&tp->packets_out)) { tcp_clear_xmit_timer(sk, TCP_TIME_RETRANS); @@ -2367,7 +2368,7 @@ static int tcp_tso_acked(struct sock *sk, struct sk_buff *skb, __u32 now, __s32 *seq_rtt) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_skb_cb *scb = TCP_SKB_CB(skb); __u32 seq = tp->snd_una; __u32 packets_acked; @@ -2428,7 +2429,7 @@ /* Remove acknowledged frames from the retransmission queue. 
*/ static int tcp_clean_rtx_queue(struct sock *sk, __s32 *seq_rtt_p) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; __u32 now = tcp_time_stamp; int acked = 0; @@ -2525,7 +2526,7 @@ static void tcp_ack_probe(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* Was it a usable window open? */ @@ -2542,13 +2543,13 @@ } } -static __inline__ int tcp_ack_is_dubious(struct tcp_opt *tp, int flag) +static inline int tcp_ack_is_dubious(struct tcp_sock *tp, int flag) { return (!(flag & FLAG_NOT_DUP) || (flag & FLAG_CA_ALERT) || tp->ca_state != TCP_CA_Open); } -static __inline__ int tcp_may_raise_cwnd(struct tcp_opt *tp, int flag) +static inline int tcp_may_raise_cwnd(struct tcp_sock *tp, int flag) { return (!(flag & FLAG_ECE) || tp->snd_cwnd < tp->snd_ssthresh) && !((1<<tp->ca_state)&(TCPF_CA_Recovery|TCPF_CA_CWR)); @@ -2557,8 +2558,8 @@ /* Check that window update is acceptable. * The function assumes that snd_una<=ack<=snd_next. */ -static __inline__ int -tcp_may_update_window(struct tcp_opt *tp, u32 ack, u32 ack_seq, u32 nwin) +static inline int tcp_may_update_window(struct tcp_sock *tp, u32 ack, + u32 ack_seq, u32 nwin) { return (after(ack, tp->snd_una) || after(ack_seq, tp->snd_wl1) || @@ -2570,7 +2571,7 @@ * Window update algorithm, described in RFC793/RFC1122 (used in linux-2.2 * and in FreeBSD. NetBSD's one is even worse.) is wrong.
*/ -static int tcp_ack_update_window(struct sock *sk, struct tcp_opt *tp, +static int tcp_ack_update_window(struct sock *sk, struct tcp_sock *tp, struct sk_buff *skb, u32 ack, u32 ack_seq) { int flag = 0; @@ -2605,7 +2606,7 @@ static void tcp_process_frto(struct sock *sk, u32 prior_snd_una) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tcp_sync_left_out(tp); @@ -2654,7 +2655,7 @@ static void init_westwood(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tp->westwood.bw_ns_est = 0; tp->westwood.bw_est = 0; @@ -2678,7 +2679,7 @@ static void westwood_filter(struct sock *sk, __u32 delta) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tp->westwood.bw_ns_est = westwood_do_filter(tp->westwood.bw_ns_est, @@ -2696,7 +2697,7 @@ static inline __u32 westwood_update_rttmin(const struct sock *sk) { - const struct tcp_opt *tp = tcp_sk(sk); + const struct tcp_sock *tp = tcp_sk(sk); __u32 rttmin = tp->westwood.rtt_min; if (tp->westwood.rtt != 0 && @@ -2713,7 +2714,7 @@ static inline __u32 westwood_acked(const struct sock *sk) { - const struct tcp_opt *tp = tcp_sk(sk); + const struct tcp_sock *tp = tcp_sk(sk); return tp->snd_una - tp->westwood.snd_una; } @@ -2729,7 +2730,7 @@ static int westwood_new_window(const struct sock *sk) { - const struct tcp_opt *tp = tcp_sk(sk); + const struct tcp_sock *tp = tcp_sk(sk); __u32 left_bound; __u32 rtt; int ret = 0; @@ -2760,7 +2761,7 @@ static void __westwood_update_window(struct sock *sk, __u32 now) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); __u32 delta = now - tp->westwood.rtt_win_sx; if (delta) { @@ -2788,7 +2789,7 @@ void __tcp_westwood_fast_bw(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); westwood_update_window(sk, tcp_time_stamp); @@ -2805,24 +2806,24 @@ static void westwood_dupack_update(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct 
tcp_sock *tp = tcp_sk(sk); tp->westwood.accounted += tp->mss_cache_std; tp->westwood.cumul_ack = tp->mss_cache_std; } -static inline int westwood_may_change_cumul(struct tcp_opt *tp) +static inline int westwood_may_change_cumul(struct tcp_sock *tp) { return (tp->westwood.cumul_ack > tp->mss_cache_std); } -static inline void westwood_partial_update(struct tcp_opt *tp) +static inline void westwood_partial_update(struct tcp_sock *tp) { tp->westwood.accounted -= tp->westwood.cumul_ack; tp->westwood.cumul_ack = tp->mss_cache_std; } -static inline void westwood_complete_update(struct tcp_opt *tp) +static inline void westwood_complete_update(struct tcp_sock *tp) { tp->westwood.cumul_ack -= tp->westwood.accounted; tp->westwood.accounted = 0; @@ -2836,7 +2837,7 @@ static inline __u32 westwood_acked_count(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tp->westwood.cumul_ack = westwood_acked(sk); @@ -2869,7 +2870,7 @@ void __tcp_westwood_slow_bw(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); westwood_update_window(sk, tcp_time_stamp); @@ -2880,7 +2881,7 @@ /* This routine deals with incoming acks, but not outgoing ones. */ static int tcp_ack(struct sock *sk, struct sk_buff *skb, int flag) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); u32 prior_snd_una = tp->snd_una; u32 ack_seq = TCP_SKB_CB(skb)->seq; u32 ack = TCP_SKB_CB(skb)->ack_seq; @@ -2985,7 +2986,7 @@ * But, this can also be called on packets in the established flow when * the fast version below fails. */ -void tcp_parse_options(struct sk_buff *skb, struct tcp_opt *tp, int estab) +void tcp_parse_options(struct sk_buff *skb, struct tcp_sock *tp, int estab) { unsigned char *ptr; struct tcphdr *th = skb->h.th; @@ -3070,7 +3071,8 @@ /* Fast parse options. This hopes to only see timestamps. * If it is wrong it falls back on tcp_parse_options(). 
*/ -static __inline__ int tcp_fast_parse_options(struct sk_buff *skb, struct tcphdr *th, struct tcp_opt *tp) +static inline int tcp_fast_parse_options(struct sk_buff *skb, struct tcphdr *th, + struct tcp_sock *tp) { if (th->doff == sizeof(struct tcphdr)>>2) { tp->saw_tstamp = 0; @@ -3092,15 +3094,13 @@ return 1; } -static __inline__ void -tcp_store_ts_recent(struct tcp_opt *tp) +static inline void tcp_store_ts_recent(struct tcp_sock *tp) { tp->ts_recent = tp->rcv_tsval; tp->ts_recent_stamp = xtime.tv_sec; } -static __inline__ void -tcp_replace_ts_recent(struct tcp_opt *tp, u32 seq) +static inline void tcp_replace_ts_recent(struct tcp_sock *tp, u32 seq) { if (tp->saw_tstamp && !after(seq, tp->rcv_wup)) { /* PAWS bug workaround wrt. ACK frames, the PAWS discard @@ -3139,7 +3139,7 @@ * up to bandwidth of 18Gigabit/sec. 8) ] */ -static int tcp_disordered_ack(struct tcp_opt *tp, struct sk_buff *skb) +static int tcp_disordered_ack(struct tcp_sock *tp, struct sk_buff *skb) { struct tcphdr *th = skb->h.th; u32 seq = TCP_SKB_CB(skb)->seq; @@ -3158,7 +3158,7 @@ (s32)(tp->ts_recent - tp->rcv_tsval) <= (tp->rto*1024)/HZ); } -static __inline__ int tcp_paws_discard(struct tcp_opt *tp, struct sk_buff *skb) +static inline int tcp_paws_discard(struct tcp_sock *tp, struct sk_buff *skb) { return ((s32)(tp->ts_recent - tp->rcv_tsval) > TCP_PAWS_WINDOW && xtime.tv_sec < tp->ts_recent_stamp + TCP_PAWS_24DAYS && @@ -3178,7 +3178,7 @@ * (borrowed from freebsd) */ -static inline int tcp_sequence(struct tcp_opt *tp, u32 seq, u32 end_seq) +static inline int tcp_sequence(struct tcp_sock *tp, u32 seq, u32 end_seq) { return !before(end_seq, tp->rcv_wup) && !after(seq, tp->rcv_nxt + tcp_receive_window(tp)); @@ -3223,7 +3223,7 @@ */ static void tcp_fin(struct sk_buff *skb, struct sock *sk, struct tcphdr *th) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tcp_schedule_ack(tp); @@ -3303,7 +3303,7 @@ return 0; } -static __inline__ void tcp_dsack_set(struct tcp_opt *tp, u32 
seq, u32 end_seq) +static inline void tcp_dsack_set(struct tcp_sock *tp, u32 seq, u32 end_seq) { if (tp->sack_ok && sysctl_tcp_dsack) { if (before(seq, tp->rcv_nxt)) @@ -3318,7 +3318,7 @@ } } -static __inline__ void tcp_dsack_extend(struct tcp_opt *tp, u32 seq, u32 end_seq) +static inline void tcp_dsack_extend(struct tcp_sock *tp, u32 seq, u32 end_seq) { if (!tp->dsack) tcp_dsack_set(tp, seq, end_seq); @@ -3328,7 +3328,7 @@ static void tcp_send_dupack(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq && before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) { @@ -3350,7 +3350,7 @@ /* These routines update the SACK block as out-of-order packets arrive or * in-order packets close up the sequence space. */ -static void tcp_sack_maybe_coalesce(struct tcp_opt *tp) +static void tcp_sack_maybe_coalesce(struct tcp_sock *tp) { int this_sack; struct tcp_sack_block *sp = &tp->selective_acks[0]; @@ -3391,7 +3391,7 @@ static void tcp_sack_new_ofo_skb(struct sock *sk, u32 seq, u32 end_seq) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_sack_block *sp = &tp->selective_acks[0]; int cur_sacks = tp->num_sacks; int this_sack; @@ -3434,7 +3434,7 @@ /* RCV.NXT advances, some SACKs should be eaten. 
*/ -static void tcp_sack_remove(struct tcp_opt *tp) +static void tcp_sack_remove(struct tcp_sock *tp) { struct tcp_sack_block *sp = &tp->selective_acks[0]; int num_sacks = tp->num_sacks; @@ -3475,7 +3475,7 @@ */ static void tcp_ofo_queue(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); __u32 dsack_high = tp->rcv_nxt; struct sk_buff *skb; @@ -3513,7 +3513,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb) { struct tcphdr *th = skb->h.th; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int eaten = -1; if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) @@ -3821,7 +3821,7 @@ */ static void tcp_collapse_ofo_queue(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb = skb_peek(&tp->out_of_order_queue); struct sk_buff *head; u32 start, end; @@ -3866,7 +3866,7 @@ */ static int tcp_prune_queue(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); SOCK_DEBUG(sk, "prune_queue: c=%x\n", tp->copied_seq); @@ -3926,7 +3926,7 @@ */ void tcp_cwnd_application_limited(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (tp->ca_state == TCP_CA_Open && sk->sk_socket && !test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) { @@ -3950,7 +3950,7 @@ */ static void tcp_new_space(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (tcp_get_pcount(&tp->packets_out) < tp->snd_cwnd && !(sk->sk_userlocks & SOCK_SNDBUF_LOCK) && @@ -3981,7 +3981,7 @@ static void __tcp_data_snd_check(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (after(TCP_SKB_CB(skb)->end_seq, tp->snd_una + tp->snd_wnd) || tcp_packets_in_flight(tp) >= tp->snd_cwnd || @@ -4003,7 +4003,7 @@ */ static void __tcp_ack_snd_check(struct sock *sk, int ofo_possible) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock 
*tp = tcp_sk(sk); /* More than one full frame received... */ if (((tp->rcv_nxt - tp->rcv_wup) > tp->ack.rcv_mss @@ -4026,7 +4026,7 @@ static __inline__ void tcp_ack_snd_check(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (!tcp_ack_scheduled(tp)) { /* We sent a data segment already. */ return; @@ -4046,7 +4046,7 @@ static void tcp_check_urg(struct sock * sk, struct tcphdr * th) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); u32 ptr = ntohs(th->urg_ptr); if (ptr && !sysctl_tcp_stdurg) @@ -4113,7 +4113,7 @@ /* This is the 'fast' part of urgent handling. */ static void tcp_urg(struct sock *sk, struct sk_buff *skb, struct tcphdr *th) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* Check if we get a new urgent pointer - normally not. */ if (th->urg) @@ -4138,7 +4138,7 @@ static int tcp_copy_to_iovec(struct sock *sk, struct sk_buff *skb, int hlen) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int chunk = skb->len - hlen; int err; @@ -4206,7 +4206,7 @@ int tcp_rcv_established(struct sock *sk, struct sk_buff *skb, struct tcphdr *th, unsigned len) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* * Header prediction. 
@@ -4456,7 +4456,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, struct tcphdr *th, unsigned len) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int saved_clamp = tp->mss_clamp; tcp_parse_options(skb, tp, 0); @@ -4701,7 +4701,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb, struct tcphdr *th, unsigned len) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int queued = 0; tp->saw_tstamp = 0; diff -Nru a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c --- a/net/ipv4/tcp_ipv4.c 2004-12-28 16:36:53 -02:00 +++ b/net/ipv4/tcp_ipv4.c 2004-12-28 16:36:53 -02:00 @@ -568,7 +568,7 @@ tw = (struct tcp_tw_bucket *)sk2; if (TCP_IPV4_TW_MATCH(sk2, acookie, saddr, daddr, ports, dif)) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* With PAWS, it is safe from the viewpoint of data integrity. Even without PAWS it @@ -744,7 +744,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len) { struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sockaddr_in *usin = (struct sockaddr_in *)uaddr; struct rtable *rt; u32 daddr, nexthop; @@ -867,7 +867,7 @@ return (jhash_2words(raddr, (u32) rport, rnd) & (TCP_SYNQ_HSIZE - 1)); } -static struct open_request *tcp_v4_search_req(struct tcp_opt *tp, +static struct open_request *tcp_v4_search_req(struct tcp_sock *tp, struct open_request ***prevp, __u16 rport, __u32 raddr, __u32 laddr) @@ -893,7 +893,7 @@ static void tcp_v4_synq_add(struct sock *sk, struct open_request *req) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_listen_opt *lopt = tp->listen_opt; u32 h = tcp_v4_synq_hash(req->af.v4_req.rmt_addr, req->rmt_port, lopt->hash_rnd); @@ -918,7 +918,7 @@ { struct dst_entry *dst; struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* We are not 
interested in TCP_LISTEN and open_requests (SYN-ACKs * send out by Linux are always <576bytes so they should go through @@ -979,7 +979,7 @@ { struct iphdr *iph = (struct iphdr *)skb->data; struct tcphdr *th = (struct tcphdr *)(skb->data + (iph->ihl << 2)); - struct tcp_opt *tp; + struct tcp_sock *tp; struct inet_sock *inet; int type = skb->h.icmph->type; int code = skb->h.icmph->code; @@ -1393,7 +1393,7 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt tp; + struct tcp_sock tp; struct open_request *req; __u32 saddr = skb->nh.iph->saddr; __u32 daddr = skb->nh.iph->daddr; @@ -1550,7 +1550,7 @@ struct dst_entry *dst) { struct inet_sock *newinet; - struct tcp_opt *newtp; + struct tcp_sock *newtp; struct sock *newsk; if (sk_acceptq_is_full(sk)) @@ -1602,7 +1602,7 @@ { struct tcphdr *th = skb->h.th; struct iphdr *iph = skb->nh.iph; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sock *nsk; struct open_request **prev; /* Find possible connection requests. 
*/ @@ -1972,7 +1972,7 @@ int tcp_v4_remember_stamp(struct sock *sk) { struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct rtable *rt = (struct rtable *)__sk_dst_get(sk); struct inet_peer *peer = NULL; int release_it = 0; @@ -2040,7 +2040,7 @@ */ static int tcp_v4_init_sock(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); skb_queue_head_init(&tp->out_of_order_queue); tcp_init_xmit_timers(sk); @@ -2082,7 +2082,7 @@ int tcp_v4_destroy_sock(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tcp_clear_xmit_timers(sk); @@ -2131,7 +2131,7 @@ static void *listening_get_next(struct seq_file *seq, void *cur) { - struct tcp_opt *tp; + struct tcp_sock *tp; struct hlist_node *node; struct sock *sk = cur; struct tcp_iter_state* st = seq->private; @@ -2368,7 +2368,7 @@ switch (st->state) { case TCP_SEQ_STATE_OPENREQ: if (v) { - struct tcp_opt *tp = tcp_sk(st->syn_wait_sk); + struct tcp_sock *tp = tcp_sk(st->syn_wait_sk); read_unlock_bh(&tp->syn_wait_lock); } case TCP_SEQ_STATE_LISTENING: @@ -2473,7 +2473,7 @@ { int timer_active; unsigned long timer_expires; - struct tcp_opt *tp = tcp_sk(sp); + struct tcp_sock *tp = tcp_sk(sp); struct inet_sock *inet = inet_sk(sp); unsigned int dest = inet->daddr; unsigned int src = inet->rcv_saddr; diff -Nru a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c --- a/net/ipv4/tcp_minisocks.c 2004-12-28 16:36:54 -02:00 +++ b/net/ipv4/tcp_minisocks.c 2004-12-28 16:36:54 -02:00 @@ -123,7 +123,7 @@ tcp_timewait_state_process(struct tcp_tw_bucket *tw, struct sk_buff *skb, struct tcphdr *th, unsigned len) { - struct tcp_opt tp; + struct tcp_sock tp; int paws_reject = 0; tp.saw_tstamp = 0; @@ -327,7 +327,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo) { struct tcp_tw_bucket *tw = NULL; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int recycle_ok = 0; if 
(sysctl_tcp_tw_recycle && tp->ts_recent_stamp) @@ -690,7 +690,7 @@ struct sock *newsk = sk_alloc(PF_INET, GFP_ATOMIC, 0, sk->sk_prot->slab); if(newsk != NULL) { - struct tcp_opt *newtp; + struct tcp_sock *newtp; struct sk_filter *filter; memcpy(newsk, sk, sizeof(struct tcp_sock)); @@ -734,7 +734,7 @@ return NULL; } - /* Now setup tcp_opt */ + /* Now setup tcp_sock */ newtp = tcp_sk(newsk); newtp->pred_flags = 0; newtp->rcv_nxt = req->rcv_isn + 1; @@ -858,10 +858,10 @@ struct open_request **prev) { struct tcphdr *th = skb->h.th; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); u32 flg = tcp_flag_word(th) & (TCP_FLAG_RST|TCP_FLAG_SYN|TCP_FLAG_ACK); int paws_reject = 0; - struct tcp_opt ttp; + struct tcp_sock ttp; struct sock *child; ttp.saw_tstamp = 0; diff -Nru a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c --- a/net/ipv4/tcp_output.c 2004-12-28 16:36:54 -02:00 +++ b/net/ipv4/tcp_output.c 2004-12-28 16:36:54 -02:00 @@ -51,8 +51,8 @@ */ int sysctl_tcp_tso_win_divisor = 8; -static __inline__ -void update_send_head(struct sock *sk, struct tcp_opt *tp, struct sk_buff *skb) +static inline void update_send_head(struct sock *sk, struct tcp_sock *tp, + struct sk_buff *skb) { sk->sk_send_head = skb->next; if (sk->sk_send_head == (struct sk_buff *)&sk->sk_write_queue) @@ -67,7 +67,7 @@ * Anything in between SND.UNA...SND.UNA+SND.WND also can be already * invalid. OK, let's make this for now: */ -static __inline__ __u32 tcp_acceptable_seq(struct sock *sk, struct tcp_opt *tp) +static inline __u32 tcp_acceptable_seq(struct sock *sk, struct tcp_sock *tp) { if (!before(tp->snd_una+tp->snd_wnd, tp->snd_nxt)) return tp->snd_nxt; @@ -91,7 +91,7 @@ */ static __u16 tcp_advertise_mss(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct dst_entry *dst = __sk_dst_get(sk); int mss = tp->advmss; @@ -105,7 +105,7 @@ /* RFC2861. Reset CWND after idle period longer RTO to "restart window". 
* This is the first part of cwnd validation mechanism. */ -static void tcp_cwnd_restart(struct tcp_opt *tp, struct dst_entry *dst) +static void tcp_cwnd_restart(struct tcp_sock *tp, struct dst_entry *dst) { s32 delta = tcp_time_stamp - tp->lsndtime; u32 restart_cwnd = tcp_init_cwnd(tp, dst); @@ -124,7 +124,8 @@ tp->snd_cwnd_used = 0; } -static __inline__ void tcp_event_data_sent(struct tcp_opt *tp, struct sk_buff *skb, struct sock *sk) +static inline void tcp_event_data_sent(struct tcp_sock *tp, + struct sk_buff *skb, struct sock *sk) { u32 now = tcp_time_stamp; @@ -143,7 +144,7 @@ static __inline__ void tcp_event_ack_sent(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tcp_dec_quickack_mode(tp); tcp_clear_xmit_timer(sk, TCP_TIME_DACK); @@ -208,14 +209,14 @@ (*window_clamp) = min(65535U << (*rcv_wscale), *window_clamp); } -/* Chose a new window to advertise, update state in tcp_opt for the +/* Chose a new window to advertise, update state in tcp_sock for the * socket, and return result with RFC1323 scaling applied. The return * value can be stuffed directly into th->window for an outgoing * frame. */ static __inline__ u16 tcp_select_window(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); u32 cur_win = tcp_receive_window(tp); u32 new_win = __tcp_select_window(sk); @@ -267,7 +268,7 @@ { if (skb != NULL) { struct inet_sock *inet = inet_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_skb_cb *tcb = TCP_SKB_CB(skb); int tcp_header_size = tp->tcp_header_len; struct tcphdr *th; @@ -396,7 +397,7 @@ */ static void tcp_queue_skb(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* Advance write_seq and place onto the write_queue. 
*/ tp->write_seq = TCP_SKB_CB(skb)->end_seq; @@ -413,7 +414,7 @@ */ void tcp_push_one(struct sock *sk, unsigned cur_mss) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb = sk->sk_send_head; if (tcp_snd_test(tp, skb, cur_mss, TCP_NAGLE_PUSH)) { @@ -453,7 +454,7 @@ */ static int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *buff; int nsize; u16 flags; @@ -619,7 +620,7 @@ unsigned int tcp_sync_mss(struct sock *sk, u32 pmtu) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct dst_entry *dst = __sk_dst_get(sk); int mss_now; @@ -666,7 +667,7 @@ unsigned int tcp_current_mss(struct sock *sk, int large) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct dst_entry *dst = __sk_dst_get(sk); unsigned int do_large, mss_now; @@ -727,7 +728,7 @@ */ int tcp_write_xmit(struct sock *sk, int nonagle) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); unsigned int mss_now; /* If we are closed, the bytes will have to remain here. @@ -831,7 +832,7 @@ */ u32 __tcp_select_window(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* MSS for the peer's data. Previous verions used mss_clamp * here. I don't know if the value based on our guesses * of peer's MSS is better for the performance. It's more correct @@ -892,7 +893,7 @@ /* Attempt to collapse two adjacent SKB's during retransmission. 
*/ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int mss_now) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *next_skb = skb->next; /* The first test we must make is that neither of these two @@ -970,7 +971,7 @@ */ void tcp_simple_retransmit(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; unsigned int mss = tcp_current_mss(sk, 0); int lost = 0; @@ -1016,7 +1017,7 @@ */ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); unsigned int cur_mss = tcp_current_mss(sk, 0); int err; @@ -1140,7 +1141,7 @@ */ void tcp_xmit_retransmit_queue(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; int packet_cnt = tcp_get_pcount(&tp->lost_out); @@ -1235,7 +1236,7 @@ */ void tcp_send_fin(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb = skb_peek_tail(&sk->sk_write_queue); int mss_now; @@ -1281,7 +1282,7 @@ */ void tcp_send_active_reset(struct sock *sk, int priority) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; /* NOTE: No TCP options attached and we never retransmit this. */ @@ -1346,7 +1347,7 @@ struct sk_buff * tcp_make_synack(struct sock *sk, struct dst_entry *dst, struct open_request *req) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcphdr *th; int tcp_header_size; struct sk_buff *skb; @@ -1417,7 +1418,7 @@ static inline void tcp_connect_init(struct sock *sk) { struct dst_entry *dst = __sk_dst_get(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); /* We'll fix this up when we get a response from the other end. * See tcp_input.c:tcp_rcv_state_process case TCP_SYN_SENT. 
@@ -1466,7 +1467,7 @@ */ int tcp_connect(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *buff; tcp_connect_init(sk); @@ -1510,7 +1511,7 @@ */ void tcp_send_delayed_ack(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int ato = tp->ack.ato; unsigned long timeout; @@ -1562,7 +1563,7 @@ { /* If we have been reset, we may not send again. */ if (sk->sk_state != TCP_CLOSE) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *buff; /* We are not putting this on the write queue, so @@ -1605,7 +1606,7 @@ */ static int tcp_xmit_probe_skb(struct sock *sk, int urgent) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; /* We don't queue it, tcp_transmit_skb() sets ownership. */ @@ -1634,7 +1635,7 @@ int tcp_write_wakeup(struct sock *sk) { if (sk->sk_state != TCP_CLOSE) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sk_buff *skb; if ((skb = sk->sk_send_head) != NULL && @@ -1688,7 +1689,7 @@ */ void tcp_send_probe0(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int err; err = tcp_write_wakeup(sk); diff -Nru a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c --- a/net/ipv4/tcp_timer.c 2004-12-28 16:36:54 -02:00 +++ b/net/ipv4/tcp_timer.c 2004-12-28 16:36:54 -02:00 @@ -48,7 +48,7 @@ void tcp_init_xmit_timers(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); init_timer(&tp->retransmit_timer); tp->retransmit_timer.function=&tcp_write_timer; @@ -67,7 +67,7 @@ void tcp_clear_xmit_timers(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); tp->pending = 0; sk_stop_timer(sk, &tp->retransmit_timer); @@ -101,7 +101,7 @@ */ static int tcp_out_of_resources(struct sock *sk, int do_reset) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int orphans = 
atomic_read(&tcp_orphan_count); /* If peer does not open window for long time, or did not transmit @@ -154,7 +154,7 @@ /* A write timeout has occurred. Process the after effects. */ static int tcp_write_timeout(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int retry_until; if ((1 << sk->sk_state) & (TCPF_SYN_SENT | TCPF_SYN_RECV)) { @@ -208,7 +208,7 @@ static void tcp_delack_timer(unsigned long data) { struct sock *sk = (struct sock*)data; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); bh_lock_sock(sk); if (sock_owned_by_user(sk)) { @@ -268,7 +268,7 @@ static void tcp_probe_timer(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int max_probes; if (tcp_get_pcount(&tp->packets_out) || !sk->sk_send_head) { @@ -316,7 +316,7 @@ static void tcp_retransmit_timer(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (!tcp_get_pcount(&tp->packets_out)) goto out; @@ -418,7 +418,7 @@ static void tcp_write_timer(unsigned long data) { struct sock *sk = (struct sock*)data; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); int event; bh_lock_sock(sk); @@ -462,7 +462,7 @@ static void tcp_synack_timer(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_listen_opt *lopt = tp->listen_opt; int max_retries = tp->syn_retries ? : sysctl_tcp_synack_retries; int thresh = max_retries; @@ -573,7 +573,7 @@ static void tcp_keepalive_timer (unsigned long data) { struct sock *sk = (struct sock *) data; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); __u32 elapsed; /* Only process if socket is not in use. 
*/ diff -Nru a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c --- a/net/ipv6/ipv6_sockglue.c 2004-12-28 16:36:53 -02:00 +++ b/net/ipv6/ipv6_sockglue.c 2004-12-28 16:36:53 -02:00 @@ -164,7 +164,7 @@ ipv6_sock_mc_close(sk); if (sk->sk_protocol == IPPROTO_TCP) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); local_bh_disable(); sock_prot_dec_use(sk->sk_prot); @@ -281,7 +281,7 @@ retv = 0; if (sk->sk_type == SOCK_STREAM) { if (opt) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (!((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) && inet_sk(sk)->daddr != LOOPBACK4_IPV6) { diff -Nru a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c --- a/net/ipv6/tcp_ipv6.c 2004-12-28 16:36:53 -02:00 +++ b/net/ipv6/tcp_ipv6.c 2004-12-28 16:36:54 -02:00 @@ -235,7 +235,7 @@ static void tcp_v6_hash(struct sock *sk) { if (sk->sk_state != TCP_CLOSE) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (tp->af_specific == &ipv6_mapped) { tcp_prot.hash(sk); @@ -391,7 +391,7 @@ return c & (TCP_SYNQ_HSIZE - 1); } -static struct open_request *tcp_v6_search_req(struct tcp_opt *tp, +static struct open_request *tcp_v6_search_req(struct tcp_sock *tp, struct open_request ***prevp, __u16 rport, struct in6_addr *raddr, @@ -466,7 +466,7 @@ ipv6_addr_equal(&tw->tw_v6_daddr, saddr) && ipv6_addr_equal(&tw->tw_v6_rcv_saddr, daddr) && sk2->sk_bound_dev_if == sk->sk_bound_dev_if) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); if (tw->tw_ts_recent_stamp) { /* See comment in tcp_ipv4.c */ @@ -551,7 +551,7 @@ struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr; struct inet_sock *inet = inet_sk(sk); struct ipv6_pinfo *np = inet6_sk(sk); - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct in6_addr *saddr = NULL, *final_p = NULL, final; struct flowi fl; struct dst_entry *dst; @@ -741,7 +741,7 @@ struct ipv6_pinfo *np; struct sock *sk; int err; - struct tcp_opt *tp; + struct tcp_sock 
*tp; __u32 seq; sk = tcp_v6_lookup(&hdr->daddr, th->dest, &hdr->saddr, th->source, skb->dev->ifindex); @@ -1146,7 +1146,7 @@ { struct open_request *req, **prev; struct tcphdr *th = skb->h.th; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct sock *nsk; /* Find possible connection requests. */ @@ -1179,7 +1179,7 @@ static void tcp_v6_synq_add(struct sock *sk, struct open_request *req) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); struct tcp_listen_opt *lopt = tp->listen_opt; u32 h = tcp_v6_synq_hash(&req->af.v6_req.rmt_addr, req->rmt_port, lopt->hash_rnd); @@ -1202,7 +1202,7 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb) { struct ipv6_pinfo *np = inet6_sk(sk); - struct tcp_opt tmptp, *tp = tcp_sk(sk); + struct tcp_sock tmptp, *tp = tcp_sk(sk); struct open_request *req = NULL; __u32 isn = TCP_SKB_CB(skb)->when; @@ -1282,7 +1282,7 @@ struct ipv6_pinfo *newnp, *np = inet6_sk(sk); struct tcp6_sock *newtcp6sk; struct inet_sock *newinet; - struct tcp_opt *newtp; + struct tcp_sock *newtp; struct sock *newsk; struct ipv6_txoptions *opt; @@ -1297,7 +1297,7 @@ return NULL; newtcp6sk = (struct tcp6_sock *)newsk; - newtcp6sk->inet.pinet6 = &newtcp6sk->inet6; + inet_sk(newsk)->pinet6 = &newtcp6sk->inet6; newinet = inet_sk(newsk); newnp = inet6_sk(newsk); @@ -1390,7 +1390,7 @@ ~(NETIF_F_IP_CSUM | NETIF_F_TSO); newtcp6sk = (struct tcp6_sock *)newsk; - newtcp6sk->inet.pinet6 = &newtcp6sk->inet6; + inet_sk(newsk)->pinet6 = &newtcp6sk->inet6; newtp = tcp_sk(newsk); newinet = inet_sk(newsk); @@ -1497,7 +1497,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb) { struct ipv6_pinfo *np = inet6_sk(sk); - struct tcp_opt *tp; + struct tcp_sock *tp; struct sk_buff *opt_skb = NULL; /* Imagine: socket is IPv6. 
IPv4 packet arrives, @@ -1919,7 +1919,7 @@ */ static int tcp_v6_init_sock(struct sock *sk) { - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); skb_queue_head_init(&tp->out_of_order_queue); tcp_init_xmit_timers(sk); @@ -2007,7 +2007,7 @@ int timer_active; unsigned long timer_expires; struct inet_sock *inet = inet_sk(sp); - struct tcp_opt *tp = tcp_sk(sp); + struct tcp_sock *tp = tcp_sk(sp); struct ipv6_pinfo *np = inet6_sk(sp); dest = &np->daddr; diff -Nru a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c --- a/net/sunrpc/svcsock.c 2004-12-28 16:36:53 -02:00 +++ b/net/sunrpc/svcsock.c 2004-12-28 16:36:53 -02:00 @@ -1077,7 +1077,7 @@ svc_tcp_init(struct svc_sock *svsk) { struct sock *sk = svsk->sk_sk; - struct tcp_opt *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); svsk->sk_recvfrom = svc_tcp_recvfrom; svsk->sk_sendto = svc_tcp_sendto; diff -Nru a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c --- a/net/sunrpc/xprt.c 2004-12-28 16:36:54 -02:00 +++ b/net/sunrpc/xprt.c 2004-12-28 16:36:54 -02:00 @@ -1545,8 +1545,7 @@ sk->sk_no_check = UDP_CSUM_NORCV; xprt_set_connected(xprt); } else { - struct tcp_opt *tp = tcp_sk(sk); - tp->nonagle = 1; /* disable Nagle's algorithm */ + tcp_sk(sk)->nonagle = 1; /* disable Nagle's algorithm */ sk->sk_data_ready = tcp_data_ready; sk->sk_state_change = tcp_state_change; xprt_clear_connected(xprt); --------------010100030105070906050706-- From tgraf@suug.ch Tue Dec 28 11:24:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 11:24:41 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSJOEvq018337 for ; Tue, 28 Dec 2004 11:24:34 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id E9138F; Tue, 28 Dec 2004 20:25:20 +0100 (CET) Received: by postel.suug.ch (Postfix, 
from userid 10001) id ECAE61C0EA; Tue, 28 Dec 2004 20:26:03 +0100 (CET) Date: Tue, 28 Dec 2004 20:26:03 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. Message-ID: <20041228192603.GE32419@postel.suug.ch> References: <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <1104252710.1090.171.camel@jzny.localdomain> <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104252710.1090.171.camel@jzny.localdomain> <1104251817.1090.164.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13142 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104251817.1090.164.camel@jzny.localdomain> 2004-12-28 11:36 > I think this sounds cleaner but is major surgery. The other issue i have > with it is semantically i see classification followed by actions. > Classification may have extended classification within it. Actions may > also have extended actions within them. Hmm... right, let's just leave the action as-is, we don't gain much anyway. Nevertheless, actions should be part of the extensions API but we can safely put them at the end of the extensions match routine so it always comes last. So basically we end up with 1) cls specific matches 2) generic matches including logical expressions 3) action We can always change it later without touching a single classifier. 
> The classifier is blind while executing those actions.
> Data that needs to be embedded within the classifier is:
> struct {extmatch type:extmatch void_data}.

No, I'd really like to avoid having this but instead put a tcf_exts
struct into it which holds all the data needed by the generic part.
This also includes the tc_action pointer. This way we can get rid
of all action/policer related ifdefs in the classifiers, reduce the
chances of mistakes and don't need to touch the classifier when
changing something in the generic part.

> extmatch_classify(extmatchdatastruct,skb) is a generic call which does a
> lookup on the type and calls the proper callback. Callbacks return
> standard classifier ret codes.

Exactly, I called it tcf_exts_match with the following return definition:
 <0: error/unmatched filter, classifier must stop matching and move on to
     next filter or return mismatch if at the end.
  0: Normal match, classifier must do whatever it would do if a filter matches
 >0: Classifier return code (TC_ACT_*), classifier must stop immediately
     and pass on the return code.

So basically I export this API:

tcf_exts_validate
  Validates the input data and initializes the action, it must not
  change any attributes. Validated data is stored in a temporary
  tcf_exts structure provided by the classifier (local variable)

tcf_exts_change
  Applies the changes from tcf_exts_validate to the classifier, must
  not fail under any circumstance. Classifier provides both the
  destination pointer (held in classifier data) and the temporary
  buffer from tcf_exts_validate.

tcf_exts_match
  Runs all the matches and applies action.

tcf_exts_destroy
  Destroys a complete extensions configuration

tcf_exts_dump
  Dumps extensions into given skb.

tcf_exts_dump_stats
  Dumps statistics into given skb.

tcf_exts_is_predicative
  Returns 1 if a predicative extension is present, i.e. an extension
  which might cause further actions and thus overrule the regular
  tcf_result. In other words, returns 1 if a TC_ACT_* may be returned.

tcf_exts_available
  Returns 1 if at least 1 extension is present.

How does the extensions API know about the classifier specific TLV
types? Every classifier maintains a map which holds the types, this
is used in _validate, _dump and _dump_stats.

> - initialization happens like you said with extended matches TLV which
> result in building one of those extended match datastruct bound on the
> filter.

This is the part I'm unsure about. I want to keep it simple but also
powerful. The action part is clear and will be untouched. The generic
match part is more difficult, we either make userspace transfer the
complete tree every time or we introduce commands like in the
classifier. The second is probably better but a little bit more work.

> The binding part is easy, the hard part is how you interleave u32
> matches for example vs indev.

Value TLV:
  TCA_META_VALUE_TYPE, struct tcf_meta_type
  TCA_META_VALUE_DATA, variable

struct tcf_meta_type
{
    __u16 kind;
    __u16 len;
};

Where kind:

enum
{
    /* numeric types */
    TCF_META_I_NUMBER  = 0x100,
    TCF_META_I_NFMARK  = 0x101,
    TCF_META_I_TCINDEX = 0x102,

    /* variable length types */
    TCF_META_V_PATTERN = 0x200,
    TCF_META_V_INDEV   = 0x201,
};

A matching of the upper 4 bits means the values can be
compared. Of course userspace should check for this so
we only have to put in a BUG_ON assertion.

> ** Also i see your point that changing all the classifiers is painful,
> but doing it this once so the next written classifier is easy is worth
> the effort in my opinion.

Truly agreed, I did this work already and I'm now testing it. We can
easily put the generic match on top of it and all we have to do is add
TCA_XXX_EXTS for every classifier and add it to the map.
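
The comparability rule for meta value kinds can be sketched in plain C. This is only an illustration, not code from the patchset: the text says "upper 4 bits", but with the 16-bit values listed the distinguishing digit sits in bits 8-11, so the sketch assumes the whole high byte of kind identifies the class. TCF_META_CLASS, tcf_meta_comparable and tcf_meta_cmp_num are hypothetical names, and assert() stands in for the kernel's BUG_ON.

```c
#include <assert.h>
#include <stdint.h>

/* Kinds as listed in the mail. */
enum {
	/* numeric types */
	TCF_META_I_NUMBER  = 0x100,
	TCF_META_I_NFMARK  = 0x101,
	TCF_META_I_TCINDEX = 0x102,
	/* variable length types */
	TCF_META_V_PATTERN = 0x200,
	TCF_META_V_INDEV   = 0x201,
};

struct tcf_meta_type {
	uint16_t kind;
	uint16_t len;
};

/* Assumption: the class of a kind is carried in its high byte. */
#define TCF_META_CLASS(kind) ((uint16_t)(kind) >> 8)

/* Hypothetical helper: two kinds are comparable if their class matches. */
static int tcf_meta_comparable(uint16_t a, uint16_t b)
{
	return TCF_META_CLASS(a) == TCF_META_CLASS(b);
}

/*
 * Hypothetical helper: three-way compare of two numeric meta values.
 * Userspace is expected to have checked comparability already, so the
 * kernel side only asserts it (BUG_ON in real kernel code).
 */
static int tcf_meta_cmp_num(const struct tcf_meta_type *lt, uint32_t lv,
			    const struct tcf_meta_type *rt, uint32_t rv)
{
	assert(tcf_meta_comparable(lt->kind, rt->kind));
	if (lv < rv)
		return -1;
	return lv > rv;
}
```

With this layout an nfmark can be compared against a plain number, while comparing an nfmark against an indev name would trip the assertion.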
From roland@topspin.com Tue Dec 28 11:47:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 11:47:43 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSJlFpd019381 for ; Tue, 28 Dec 2004 11:47:36 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Tue, 28 Dec 2004 11:48:15 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 28 Dec 2004 11:48:14 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CjNKH-0001aQ-Vi; Tue, 28 Dec 2004 11:48:14 -0800 To: "David S. Miller" Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <200412272150.IBRnA4AvjendsF8x@topspin.com> <20041227225417.3ac7a0a6.davem@davemloft.net> From: Roland Dreier Date: Tue, 28 Dec 2004 11:48:13 -0800 In-Reply-To: <20041227225417.3ac7a0a6.davem@davemloft.net> (David S. Miller's message of "Mon, 27 Dec 2004 22:54:17 -0800") Message-ID: <52pt0unr0i.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Corporate Culture, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [PATCH][v5][0/24] Latest IB patch queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 19:48:14.0885 (UTC) FILETIME=[2B8B9D50:01C4ED16] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13143 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev David> W00t :-) All applied, thanks Roland. 
David> I'll run it through some build tests then toss it upstream. Very cool, thanks a lot. Let me know if you see any build failures -- I test on about 6 or 7 different archs/configs but the bug gods always seem to hide problems from me. Speaking of build failures, one of my test builds is cross-compiling for sparc64 with gcc 3.4.2, which adds __attribute__((warn_unused_result)) to copy_to_user() et al. The -Werror in the arch/sparc64 means the build fails with linux-2.6.10/arch/sparc64/kernel/sys_sparc32.c:1686: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result Of course binfmt_elf.c and compat_ioctl.c still have issues but those probably get more visibility... Thanks, Roland Check copy_to_user() return value in sys_sparc32.c and sys_sunos32.c. Signed-off-by: Roland Dreier Index: linux-2.6.10/arch/sparc64/kernel/sys_sparc32.c =================================================================== --- linux-2.6.10.orig/arch/sparc64/kernel/sys_sparc32.c 2004-12-24 13:35:00.000000000 -0800 +++ linux-2.6.10/arch/sparc64/kernel/sys_sparc32.c 2004-12-28 11:46:00.190457463 -0800 @@ -1683,7 +1683,8 @@ put_user(oldlen, (u32 __user *)(unsigned long) tmp.oldlenp)) error = -EFAULT; } - copy_to_user(args->__unused, tmp.__unused, sizeof(tmp.__unused)); + if (copy_to_user(args->__unused, tmp.__unused, sizeof(tmp.__unused))) + error = -EFAULT; } return error; #endif Index: linux-2.6.10/arch/sparc64/kernel/sys_sunos32.c =================================================================== --- linux-2.6.10.orig/arch/sparc64/kernel/sys_sunos32.c 2004-12-24 13:35:00.000000000 -0800 +++ linux-2.6.10/arch/sparc64/kernel/sys_sunos32.c 2004-12-28 11:47:03.954923634 -0800 @@ -291,7 +291,8 @@ put_user(ino, &dirent->d_ino); put_user(namlen, &dirent->d_namlen); put_user(reclen, &dirent->d_reclen); - copy_to_user(dirent->d_name, name, namlen); + if (copy_to_user(dirent->d_name, name, namlen)) + return -EFAULT; put_user(0, dirent->d_name + namlen); dirent 
= (void __user *) dirent + reclen; buf->curr = dirent; @@ -371,7 +372,8 @@ put_user(ino, &dirent->d_ino); put_user(namlen, &dirent->d_namlen); put_user(reclen, &dirent->d_reclen); - copy_to_user(dirent->d_name, name, namlen); + if (copy_to_user(dirent->d_name, name, namlen)) + return -EFAULT; put_user(0, dirent->d_name + namlen); dirent = (void __user *) dirent + reclen; buf->curr = dirent; From hadi@cyberus.ca Tue Dec 28 13:14:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 13:14:15 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSLDd5l025410 for ; Tue, 28 Dec 2004 13:14:04 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CjOgQ-0005Gi-E6 for netdev@oss.sgi.com; Tue, 28 Dec 2004 16:15:10 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjOgI-0004Te-Gf; Tue, 28 Dec 2004 16:15:02 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041228192603.GE32419@postel.suug.ch> References: <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <1104252710.1090.171.camel@jzny.localdomain> <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104268498.1090.254.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 16:14:58 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13144 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 14:26, Thomas Graf wrote: > * jamal <1104251817.1090.164.camel@jzny.localdomain> 2004-12-28 11:36 > Hmm... right, let's just leave the action as-is, we don't gain much > anyway. Nevertheless, actions should be part of the extensions API > but we can safely put them at the end of the extensions match routine > so it always comes last. So basically we end up with > > 1) cls specific matches > 2) generic matches including logical expressions > 3) action Add 4) action extensions as well. > We can always change it later without touching a single classifier. > > > The classifier is blind while executing those actions. > > Data that needs to be embeded within the classifier is: > > struct {extmatch type:extmatch void_data}. > > No, I'd really like to avoid having this but instead put a tcf_exts > struct into it which holds all the data needed by the generic part. 
> This also includes the tc_action pointer. This way we can get rid
> of all action/policer related ifdefs in the classifiers, reduce the
> chances of mistakes and don't need to touch classifier when changing
> something in the generic part.

Whatever you had before is fine for action/policer - with intent to kill
policer eventually.
What i objected to is the indev and any other thing that has to do with
classification helping - that's not where it should fit.
Take u32 for example: The fit for match extensions is really at the key
level not at a layer above.
We need a sel2 which has new keys (which is easy because that's
transported in a TLV).

> > extmatch_classify(extmatchdatastruct,skb) is a generic call which does a
> > lookup on the type and calls the proper callback. Callbacks return
> > standard classifier ret codes.
>
> Exactly, I called it tcf_exts_match with the following return definition:
>  <0: error/unmatched filter, classifier must stop matching and move on to
>      next filter or return mismatch if at the end.
>   0: Normal match, classifier must do whatever it would do if a filter matches
>  >0: Classifier return code (TC_ACT_*), classifier must stop immediately
>      and pass on the return code.
>

Why not reuse what already exists in terms of classifier/filter return
codes? They are pretty sufficient and cover all the cases.

> So basically I export this API:
>
> tcf_exts_validate
>   Validates the input data and initializes the action, it must not
>   change any attributes. Validated data is stored in a temporary
>   tcf_exts structure provided by the classifier (local variable)
>
> tcf_exts_change
>   Applies the changes from tcf_exts_validate to the classifier, must
>   not fail under any circumstance. Classifier provides both the
>   destination pointer (held in classifier data) and the temporary
>   buffer from tcf_exts_validate.
>
> tcf_exts_match
>   Runs all the matches and applies action.
>
> tcf_exts_destroy
>   Destroys a complete extensions configuration
>
> tcf_exts_dump
>   Dumps extensions into given skb.
>
> tcf_exts_dump_stats
>   Dumps statistics into given skb.
>
> tcf_exts_is_predicative
>   Returns 1 if a predicative extension is present, i.e. an extension
>   which might cause further actions and thus overrule the regular
>   tcf_result. In other words, returns 1 if a TC_ACT_* may be returned.
>
> tcf_exts_available
>   Returns 1 if at least 1 extension is present.
>
> How does the extensions API know about the classifier specific TLV
> types? Every classifier maintains a map which holds the types, this
> is used in _validate, _dump and _dump_stats.
>

Hrm, so someone writing the one page extension now has to fill in all
these functions? If yes this defeats the purpose of the exercise to
create a single page of code.
The user should really have to write only a match function and call a
registration function to hook up into the framework.

To be clear:

struct tcf_ematch_ops
{
    struct tcf_ematch_ops *next;
    char kind[IFNAMSIZ];
    u32 type;
    int (*classify)(struct sk_buff *, struct tcf_ematch_data *);
};

A global list of these is kept somewhere and a register/unregister
manipulation exists.
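
A minimal userspace sketch of such a register/unregister pair, assuming a singly linked global list keyed by type. Only tcf_register_ematch, tcf_unregister_ematch and the tcf_ematch_ops fields come from the text above; the stand-in sk_buff and tcf_ematch_data types, the helpers tcf_ematch_lookup and tcf_ematch_classify, the example callback mark_match, the errno choices and the absence of locking are all assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define IFNAMSIZ 16

/* Stand-ins so the sketch compiles outside the kernel (assumptions). */
struct sk_buff { unsigned int mark; };
struct tcf_ematch_data { unsigned int value; };

struct tcf_ematch_ops {
	struct tcf_ematch_ops *next;
	char kind[IFNAMSIZ];
	unsigned int type;
	int (*classify)(struct sk_buff *, struct tcf_ematch_data *);
};

/* Global list head; real kernel code would protect it with a lock. */
static struct tcf_ematch_ops *ematch_list;

/* Hypothetical helper: find the ops registered for a given type. */
static struct tcf_ematch_ops *tcf_ematch_lookup(unsigned int type)
{
	struct tcf_ematch_ops *o;

	for (o = ematch_list; o != NULL; o = o->next)
		if (o->type == type)
			return o;
	return NULL;
}

/* Add ops to the list; -EEXIST if the type is already taken. */
static int tcf_register_ematch(struct tcf_ematch_ops *ops)
{
	assert(ops != NULL && ops->classify != NULL);
	if (tcf_ematch_lookup(ops->type) != NULL)
		return -EEXIST;
	ops->next = ematch_list;
	ematch_list = ops;
	return 0;
}

/* Unlink ops from the list; -ENOENT if it was never registered. */
static int tcf_unregister_ematch(struct tcf_ematch_ops *ops)
{
	struct tcf_ematch_ops **pp;

	for (pp = &ematch_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ops) {
			*pp = ops->next;
			ops->next = NULL;
			return 0;
		}
	}
	return -ENOENT;
}

/* Hypothetical generic entry point: look up the type, run its callback. */
static int tcf_ematch_classify(unsigned int type, struct sk_buff *skb,
			       struct tcf_ematch_data *e)
{
	struct tcf_ematch_ops *o = tcf_ematch_lookup(type);

	return o != NULL ? o->classify(skb, e) : -ENOENT;
}

/* Example callback: 0 on match, -1 on mismatch (assumed convention). */
static int mark_match(struct sk_buff *skb, struct tcf_ematch_data *e)
{
	return skb->mark == e->value ? 0 : -1;
}
```

The classifier side then only needs the type and the opaque data from the TLV; it never looks inside the ematch itself.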

And the mythical one page code that needs writing:

---------------------------------------------------
/* The only logic the extension author writes: the match callback. */
int myveryownmatch(struct sk_buff *skb, struct tcf_ematch_data *e)
{
	return TCF_OK;
}

struct tcf_ematch_ops my_tcf_ematch_ops = {
	.next     = NULL,
	.kind     = "mymatch",
	.type     = TCF_MY_MATCHID,
	.classify = myveryownmatch
};

static int __init mymatch_init_module(void)
{
	int ret;

	ret = tcf_register_ematch(&my_tcf_ematch_ops);
	// check here, ematch may already be registered etc
	return ret;
}

static void __exit mymatch_cleanup_module(void)
{
	tcf_unregister_ematch(&my_tcf_ematch_ops);
}

module_init(mymatch_init_module);
module_exit(mymatch_cleanup_module);
----------------

If what you describe above is internal - accessible via classifier then
fine (other than tcf_exts_match) - although it looks excessive.
I don't see these things calling actions. They are interleaved between
matches. At completion of matches/filtering then you call the action
code.

> > - initialization happens like you said with extended matches TLV which
> > result in building one of those extended match datastruct bound on the
> > filter.
>
> This is the part I'm unsure about. I want to keep it simple but also
> powerful. The action part is clear and will be untouched. The generic
> match part is more difficult, we either make userspace transfer the
> complete tree every time or we introduce commands like in the
> classifier. Second is probably better but a little bit more work.

What's wrong with extended TLVs you mentioned earlier?

match u32 ..
ematch indev ...
match u32 ...
ematch meta tcindex ..

the ematches are essentially TLVs on their own and are owned by
the classifier. The classifier doesn't know what's in them. It just
calls generic code to execute them.

> > The binding part is easy, the hard part is how you interleave u32
> > matches for example vs indev.
>
> Value TLV:
>   TCA_META_VALUE_TYPE, struct tcf_meta_type
>   TCA_META_VALUE_DATA, variable
>
> struct tcf_meta_type
> {
>     __u16 kind;
>     __u16 len;
> };
>
> Where kind:
>
> enum
> {
>     /* numeric types */
>     TCF_META_I_NUMBER  = 0x100,
>     TCF_META_I_NFMARK  = 0x101,
>     TCF_META_I_TCINDEX = 0x102,
>
>     /* variable length types */
>     TCF_META_V_PATTERN = 0x200,
>     TCF_META_V_INDEV   = 0x201,
> };
>
> A matching of the upper 4 bits means the values can be
> compared. Of course userspace should check for this so
> we only have to put in a BUG_ON assertion.

I think you are only referring to one ematch kind above --> for metadata.
What i talked about is arbitrary (example i could put a quick hack to
grep strings without writing a full classifier). Essentially what you
have fits just fine - you may need two ids; one for IDing as meta match
and other as tcindex etc. The second one can be hidden.

> > ** Also i see your point that changing all the classifiers is painful,
> > but doing it this once so the next written classifier is easy is worth
> > the effort in my opinion.
>
> Truly agreed, I did this work already and I'm now testing it. We can
> easily put the generic match on top of it and all we have to do is add
> TCA_XXX_EXTS for every classifier and add it to the map.
>

Refer to what i talked about above. The extensions are little helpers
i.e. they can't live on their own - they need classifiers. Just review
and sync essentially. The action extensions as well fit the same way.

cheers, jamal From tgraf@suug.ch Tue Dec 28 14:08:52 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 14:09:06 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSM8VfD027230 for ; Tue, 28 Dec 2004 14:08:51 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 30B0BF; Tue, 28 Dec 2004 23:09:38 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 89D8B1C0EA; Tue, 28 Dec 2004 23:10:21 +0100 (CET) Date: Tue, 28 Dec 2004 23:10:21 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. Message-ID: <20041228221021.GF32419@postel.suug.ch> References: <1104252710.1090.171.camel@jzny.localdomain> <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104268498.1090.254.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13145 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104268498.1090.254.camel@jzny.localdomain> 2004-12-28 16:14 > Whatever you had before is fine for action/policer - with intent to kill > policer eventually. 

I left it in for now but I actually see no reason to do so. Old
iproute2 binaries should do just fine with the action backward
compatibility code?

> What i objected to is the indev and any other thing that has to do with
> classification helping - thats not where it should fit.
> Take u32 for example: The fit for match extensions is really at the key
> level not at a layer above.
> We need a sel2 which has new keys (which is easy because thats
> transported in a TLV).

Take a look at http://people.suug.ch/~tgr/patches/queue/03_tcf_exts_u32.diff

The extensions are on the same level as the selector. The patchset still
has errors in the patches for route and tcindex since it's non-trivial
to adapt them to allow changing parameters on-the-fly. The rest is
tested and works perfectly fine. I can create a subset or we can just
take the first few patches for now and do the development on u32/fw
and port it later.

> Why not reuse what already exists in terms of classifier/filter return
> codes? They are pretty sufficient and cover all the cases.

I do reuse them. TC_ACT_* from include/linux/pkt_cls.h

> Hrm, so someone writing the one page extension now has to fill in all
> these functions?

No, that's just how the classifier accesses the extensions API.

> [ematch api]

Exactly, this would be the API visible to the matches.

> If what you describe above is internal - accessible via classifier then
> fine (other than tcf_exts_match) - although it looks excessive.

The validate/change split is needed to implement consistent changes
in classifiers. The current way causes corruption in classifier data
whenever an action configuration fails.

> I dont see these things calling actions. They are interleaved between
> matches. At completion of matches/filtering then you call the action
> code.

Right, tcf_exts_match calls the generic matches and at the very end
the action.

> Whats wrong with extended TLVs you mentioned earlier?
>
> match u32 ..
> ematch indev ...
> match u32 ...
> ematch meta tcindex .. > > the ematches are essentially TLVs on their own and are owned by > the classifier. The classifier doesnt know whats in them. It just > calls generic code to execute them. They should go into TCA_XXX_EXTS as embeded TLVs. The problem is not how to do it but rather how far to go. Do we want userspace to be able to delete a single generic match? Do we want to only allow replacing all matches? We will hit the limit of skbs at some point if we keep on encapsulating. ;-> > I think you are only refering to one ematch kind above --> for metadata. Correct. This would be the generic match for metadata. > What i talked about is arbitrary (example i could put a quick hack to > grep strings without writting a full classifier). Essentially what you > have fits just fine - you may need two ids; one for IDing as meta match > and other as tcindex etc. The second one can be hidden. I don't get this. From davem@davemloft.net Tue Dec 28 14:17:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 14:17:37 -0800 (PST) Received: from cheetah.davemloft.net (mail@adsl-63-197-226-105.dsl.snfc21.pacbell.net [63.197.226.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSMH9hu027847 for ; Tue, 28 Dec 2004 14:17:30 -0800 Received: from localhost ([127.0.0.1] helo=cheetah.davemloft.net ident=davem) by cheetah.davemloft.net with smtp (Exim 3.36 #1 (Debian)) id 1CjPeR-0004x2-00; Tue, 28 Dec 2004 14:17:11 -0800 Date: Tue, 28 Dec 2004 14:17:10 -0800 From: "David S. 
Miller" To: Roland Dreier Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org Subject: Re: [PATCH][v5][0/24] Latest IB patch queue Message-Id: <20041228141710.4daebcfb.davem@davemloft.net> In-Reply-To: <52pt0unr0i.fsf@topspin.com> References: <200412272150.IBRnA4AvjendsF8x@topspin.com> <20041227225417.3ac7a0a6.davem@davemloft.net> <52pt0unr0i.fsf@topspin.com> X-Mailer: Sylpheed version 1.0.0rc (GTK+ 1.2.10; sparc-unknown-linux-gnu) X-Face: "_;p5u5aPsO,_Vsx"^v-pEq09'CU4&Dc1$fQExov$62l60cgCc%FnIwD=.UF^a>?5'9Kn[;433QFVV9M..2eN.@4ZWPGbdi<=?[:T>y?SD(R*-3It"Vj:)"dP Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13146 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: davem@davemloft.net Precedence: bulk X-list: netdev On Tue, 28 Dec 2004 11:48:13 -0800 Roland Dreier wrote: > Speaking of build failures, one of my test builds is cross-compiling > for sparc64 with gcc 3.4.2, which adds __attribute__((warn_unused_result)) > to copy_to_user() et al. The -Werror in the arch/sparc64 means the > build fails with Thanks, I'll check that out. I believe that you didn't test the sparc64 build of the infiniband stuff because arch/sparc64/Kconfig needs to explicitly include the infiniband Kconfig since it does not use drivers/Kconfig. You didn't send me any such changes. There are a few platforms which also are in this situation. I added the sparc64 one to my tree while integrating your changes, but the others need to be attended to if you wish infiniband to be configurable on them. 
From alan@lxorguk.ukuu.org.uk Tue Dec 28 14:57:12 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 14:57:18 -0800 (PST) Received: from localhost.localdomain (clock-tower.bc.nu [81.2.110.250] (may be forged)) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSMupwn029285 for ; Tue, 28 Dec 2004 14:57:12 -0800 Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by localhost.localdomain (8.12.11/8.12.11) with ESMTP id iBSLs9Cx026132; Tue, 28 Dec 2004 21:54:10 GMT Received: (from alan@localhost) by localhost.localdomain (8.12.11/8.12.11/Submit) id iBSLs8wS026130; Tue, 28 Dec 2004 21:54:08 GMT X-Authentication-Warning: localhost.localdomain: alan set sender to alan@lxorguk.ukuu.org.uk using -f Subject: Re: [2.6 patch] /net/ax25/: some cleanups From: Alan Cox To: "David S. Miller" Cc: bunk@stusta.de, ralf@linux-mips.org, linux-hams@vger.kernel.org, netdev@oss.sgi.com, Linux Kernel Mailing List In-Reply-To: <20041228100507.7b374b5e.davem@davemloft.net> References: <20041212211339.GX22324@stusta.de> <20041227185151.2a7ceb71.davem@davemloft.net> <1104237408.20944.70.camel@localhost.localdomain> <20041228100507.7b374b5e.davem@davemloft.net> Content-Type: text/plain Content-Transfer-Encoding: 7bit Message-Id: <1104270846.26109.3.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2) Date: Tue, 28 Dec 2004 21:54:06 +0000 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13147 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: alan@lxorguk.ukuu.org.uk Precedence: bulk X-list: netdev On Maw, 2004-12-28 at 18:05, David S. Miller wrote: > Send a patch to netdev and CC: me to fix this as is standard > procedure for getting changes into the networking. :-) Attached - revert to 2.6.9 behaviour. 
I suspect given that someone made the change for a reason that there should probably be a sysctl to switch between AX.25 (as 2.6.9) and "kind of routed AX.25-ish" (as 2.6.10). --- ../linux.vanilla-2.6.10/net/ax25/af_ax25.c 2004-12-25 21:15:46.000000000 +0000 +++ net/ax25/af_ax25.c 2004-12-26 22:07:44.000000000 +0000 @@ -207,8 +207,16 @@ continue; if (s->ax25_dev == NULL) continue; - if (ax25cmp(&s->source_addr, src_addr) == 0 && - ax25cmp(&s->dest_addr, dest_addr) == 0) { + if (ax25cmp(&s->source_addr, src_addr) == 0 && ax25cmp(&s->dest_addr, dest_addr) == 0 && s->ax25_dev->dev == dev) { + if (digi != NULL && digi->ndigi != 0) { + if (s->digipeat == NULL) + continue; + if (ax25digicmp(s->digipeat, digi) != 0) + continue; + } else { + if (s->digipeat != NULL && s->digipeat->ndigi != 0) + continue; + } ax25_cb_hold(s); spin_unlock_bh(&ax25_list_lock); From hadi@cyberus.ca Tue Dec 28 15:05:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 15:05:44 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSN5GHV030105 for ; Tue, 28 Dec 2004 15:05:37 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CjQQS-0002zB-1n for netdev@oss.sgi.com; Tue, 28 Dec 2004 18:06:48 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjQQM-0000ih-It; Tue, 28 Dec 2004 18:06:42 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041228221021.GF32419@postel.suug.ch> References: <1104252710.1090.171.camel@jzny.localdomain> <200412270715.iBR7Fffe026855@hera.kernel.org> <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104275197.1100.276.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 18:06:37 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13148 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 17:10, Thomas Graf wrote: > * jamal <1104268498.1090.254.camel@jzny.localdomain> 2004-12-28 16:14 > > Whatever you had before is fine for action/policer - with intent to kill > > policer eventually. > > I left it in for now but I see no reason why to do so actually. Old > iproute2 binaries should do just fine with the action backward > compatibility code? > Its maintainance work. Nothing it provides isnt provided by new policer. > > What i objected to is the indev and any other thing that has to do with > > classification helping - thats not where it should fit. > > Take u32 for example: The fit for match extensions is really at the key > > level not at a layer above. > > We need a sel2 which has new keys (which is easy because thats > > transported in a TLV). 
>
> Take a look at http://people.suug.ch/~tgr/patches/queue/03_tcf_exts_u32.diff
>
> The extensions are on the same level as the selector.

Ok, i see we may be talking about two separate things:-> that patch is
fine for policer/action (I noticed you removed indev - good thing).
It is not the proper spot for the matches and in fact should go in as a
separate patch altogether (relation is very minimal).

For the matches, the checks are going to be per key _not_ at the
selector level; i.e:

struct tc_newu32_key
{
        __u32 mask;
        __u32 val;
        int off;
        int offmask;
        /* pointer to extended matches here */
};

Since these keys are packed in a selector and the selector is what gets
transported from/to user space we need a selector2 which packs these new
keys instead. Makes sense? i.e need a TCA_U32_SEL2 where the extended
matches are stored.

Backward compatibility: New TC transports them in addition to TCA_U32_SEL
and old kernels ignore them. Old TC doesnt send them so new kernels can
handle them just fine. Beauty of TLVs.

Your check in the classifier is:

if (matched) {
        if (NULL != key.pointertoematch) {
                ret = call the generic ematch function
                if ret == OK
                        continue with next match
                else
                        failed
        }
}

u32_change receives the extended matches and populates them accordingly.
There is no need for a dump function to exist for them. They get shipped
back the same way they came in - user space knows how to dump them.

A key.pointertoematch could in fact be a linked list of these items.
So the struct looks like:

struct tcf_ematch_info
{
        struct tcf_ematch_info *next;
        __u32 type;
        void *data;
        /* may need a datalen here for dumping back to user space */
};

Makes sense?

Back to what i said earlier i can now write a single page of code
to scan for word "Thomas" if i get a match on TCP port 25 for all IP
addresses... i.e metadata is a subset of all this.
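
For illustration, a minimal userspace sketch of the per-key check described above. This is not kernel code and is not taken from any posted patch: the names newu32_key, ematch_info, EMATCH_SUBSTR, ematch_run and key_match are hypothetical, the substring matcher is a naive scan standing in for a real string ematch, and offmask handling is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ematch kind, for illustration only. */
enum { EMATCH_SUBSTR = 1 };

/* Assumed layout, following the tcf_ematch_info idea above. */
struct ematch_info {
	struct ematch_info *next;	/* next extended match on this key */
	uint32_t type;			/* which generic match to run */
	const void *data;		/* match-private configuration */
	size_t datalen;			/* length of data, needed for dumping */
};

/* Assumed layout, following the tc_newu32_key idea above. */
struct newu32_key {
	uint32_t mask;
	uint32_t val;
	int off;
	int offmask;			/* unused in this sketch */
	struct ematch_info *ematches;	/* NULL if the key has none */
};

/* Run one extended match against the packet; returns 1 on match. */
static int ematch_run(const struct ematch_info *em,
		      const uint8_t *pkt, size_t len)
{
	size_t i;

	switch (em->type) {
	case EMATCH_SUBSTR:
		if (em->datalen == 0 || em->datalen > len)
			return 0;
		for (i = 0; i + em->datalen <= len; i++)
			if (memcmp(pkt + i, em->data, em->datalen) == 0)
				return 1;
		return 0;
	default:
		return 0;	/* unknown kinds never match */
	}
}

/* A key matches if its masked word matches AND every attached ematch does. */
static int key_match(const struct newu32_key *key,
		     const uint8_t *pkt, size_t len)
{
	const struct ematch_info *em;
	uint32_t word;

	if (key->off < 0 || (size_t)key->off + sizeof(word) > len)
		return 0;
	memcpy(&word, pkt + key->off, sizeof(word));
	if ((word ^ key->val) & key->mask)
		return 0;

	for (em = key->ematches; em != NULL; em = em->next)
		if (!ematch_run(em, pkt, len))
			return 0;
	return 1;
}
```

A classifier built this way would call key_match() for each key in the selector in order and give up at the first key that fails, so old keys without ematches behave exactly as before.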
cheers, jamal From tgraf@suug.ch Tue Dec 28 15:17:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 15:17:54 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSNHQI2030913 for ; Tue, 28 Dec 2004 15:17:47 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4C108F; Wed, 29 Dec 2004 00:18:33 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 5BAAF1C0EA; Wed, 29 Dec 2004 00:19:16 +0100 (CET) Date: Wed, 29 Dec 2004 00:19:16 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. Message-ID: <20041228231916.GG32419@postel.suug.ch> References: <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104275197.1100.276.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13149 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104275197.1100.276.camel@jzny.localdomain> 2004-12-28 18:06 > Its maintainance work. Nothing it provides isnt provided by > new policer. I'll remove it. 

> It is not the proper spot for the matches and in fact
> should go in as a separate patch altogether (relation is very minimal).
>
> For the matches, the checks are going to be per key _not_ at the
> selector level; i.e:
>
> struct tc_newu32_key
> {
>         __u32 mask;
>         __u32 val;
>         int off;
>         int offmask;
>         pointer to extendedmatches here.
> };
>
> Since these keys are packed in a selector and the selector is what gets
> transported from/to user space we need a selector2 which packs these new
> keys instead. Makes sense? i.e need a TCA_U32_SEL2 where the extended
> matches are stored.

Why? I don't get that. Generic matches must only be considered if all
keys of u32 match. u32 keys are just ANDed matches; if one fails we can
directly declare the classifier as unmatched. The only thing we would
gain is that we could add multiple generic matches but with lack of real
logical expressions. I'd rather implement some simple form of logical
expression in the generic part so all classifiers can benefit.

> Makes sense?

Not for me. ;->

> Back to what i said earlier i can now write a single page of code
> to scan for word "Thomas" if i get a match on TCP port 25 for all IP
> addresses... i.e metadata is a subset of all this.

Agreed, and smart as you are you simply take the Knuth-Morris-Pratt
code out of my EGP patch to get some performance.
;-> From roland@topspin.com Tue Dec 28 15:23:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 15:23:42 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSNNEdt031516 for ; Tue, 28 Dec 2004 15:23:35 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Tue, 28 Dec 2004 15:24:44 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 28 Dec 2004 15:24:44 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CjQhn-0002JP-Ma; Tue, 28 Dec 2004 15:24:44 -0800 To: "David S. Miller" Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <200412272150.IBRnA4AvjendsF8x@topspin.com> <20041227225417.3ac7a0a6.davem@davemloft.net> <52pt0unr0i.fsf@topspin.com> <20041228141710.4daebcfb.davem@davemloft.net> From: Roland Dreier Date: Tue, 28 Dec 2004 15:24:43 -0800 In-Reply-To: <20041228141710.4daebcfb.davem@davemloft.net> (David S. 
Miller's message of "Tue, 28 Dec 2004 14:17:10 -0800") Message-ID: <52pt0uhupw.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Corporate Culture, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [PATCH][v5][0/24] Latest IB patch queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 28 Dec 2004 23:24:44.0104 (UTC) FILETIME=[69B92480:01C4ED34] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13150 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev David> I believe that you didn't test the sparc64 build of the David> infiniband stuff because arch/sparc64/Kconfig needs to David> explicitly include the infiniband Kconfig since it does not David> use drivers/Kconfig. You didn't send me any such changes. Actually I did test the build (and Tom Duffy at Sun has actually run the drivers on his system), but I forgot to include the required Kconfig change -- I just have it in my local test tree. David> There are a few platforms which also are in this situation. David> I added the sparc64 one to my tree while integrating your David> changes, but the others need to be attended to if you wish David> infiniband to be configurable on them. I think sparc64 is the only such platform where InfiniBand is likely to be of much interest. However I'll check out all of arch/ and send patches to hook up drivers/infiniband/ to the relevant maintainers once IB makes it upstream. 
Thanks, Roland From hadi@cyberus.ca Tue Dec 28 15:38:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 15:38:30 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBSNc37g032383 for ; Tue, 28 Dec 2004 15:38:23 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjQw8-0007Qt-Bg for netdev@oss.sgi.com; Tue, 28 Dec 2004 18:39:32 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjQw5-0004lQ-NW; Tue, 28 Dec 2004 18:39:29 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041228231916.GG32419@postel.suug.ch> References: <20041227121658.GI7884@postel.suug.ch> <1104240053.1100.53.camel@jzny.localdomain> <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104277165.1100.291.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 18:39:25 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13151 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 18:19, Thomas Graf wrote: > * jamal 
<1104275197.1100.276.camel@jzny.localdomain> 2004-12-28 18:06
> > Its maintenance work. Nothing it provides isnt provided by
> > new policer.
>
> I'll remove it.
>

Well one of your goals was to reduce the ifdef clutter. That goal is
already achieved. So I would keep it as is for now but kill it in the
future. I am hoping we are talking about the same thing here - the old
policer ;->

> > Since these keys are packed in a selector and the selector is what gets
> > transported from/to user space we need a selector2 which packs these new
> > keys instead. Makes sense? i.e need a TCA_U32_SEL2 where the extended
> > matches are stored.
>
> Why? I don't get that. Generic matches must only be considered if all
> keys of u32 match. u32 keys are just ANDed matches if one fails we can
> directly declare the classifier as unmatched.

yes - and that still applies here but you can now interleave - as i
mentioned earlier:

match u32 ..
ematch string "Thomas" ...
match u32 ...
ematch meta tcindex ..

I dont wanna go into details of whether we could actually make the new
keys do more than just strict AND from left to right - but you can see
the potential to "fix" this if we are defining a new key ;->
The same applies for interleaving of actions and eactions.

> The only thing we would
> gain is that we could add multiple generic matches but with lack of real
> logical expressions. I'd rather implement some simple form of logical
> expression in the generic part so all classifiers can benefit.

Ok, the logical expressions are the tricky part. But refer to what i am
saying above. You still need to be backward compatible. But for the new
keys you could go onto the adventurous side. I havent given the logical
expressions much thought but i will in the background

> > Makes sense?
>
> Not for me. ;->
>
> > Back to what i said earlier i can now write a single page of code
> > to scan for word "Thomas" if i get a match on TCP port 25 for all IP
> > addresses...
i.e metadata is a subset of all this. > > Agreed, and smart as you are you simply take the Knuth-Morris-Pratt > code out of my EGP patch to get some performance. ;-> Of course;-> Ive worked long enough in Linux to appreciate the definition of TheLinuxWay(tm) ;-> I also get to inherit your bugs this way. cheers, jamal From tgraf@suug.ch Tue Dec 28 16:08:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 16:08:08 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT07dPS001290 for ; Tue, 28 Dec 2004 16:08:00 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id EBFAAF; Wed, 29 Dec 2004 01:08:46 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 379F11C0EA; Wed, 29 Dec 2004 01:09:28 +0100 (CET) Date: Wed, 29 Dec 2004 01:09:28 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 
Message-ID: <20041229000928.GH32419@postel.suug.ch> References: <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104277165.1100.291.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13152 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104277165.1100.291.camel@jzny.localdomain> 2004-12-28 18:39 > On Tue, 2004-12-28 at 18:19, Thomas Graf wrote: > > Why? I don't get that. Generic matches must only be considered if all > > keys of u32 match. u32 keys are just ANDed matches if one fails we can > > directly declare the classifier as unmatched. > > yes - and that still applies here but you can now interleaf - as i > mentioned earlier: > > match u32 .. > ematch string "Thomas" ... > match u32 ... > ematch meta tcindex .. Yes but the only avantage of this is that a u32 match can be made dependant on a ematch. Is this really worth special handling? It requires special handling not needed for any of the other classifiers. I understand your point but don't agree at the moment. 
I might change my mind tomorrow ;->

> I dont wanna go into details of whether we could actually make the new
> keys do more than just strict AND from left to right - but you can see
> the potential to "fix" this if we are defining a new key ;->

We should rather do it on cls_api level. Unfortunately it's not that
simple, but the current status of having one classifier kind per prio and
no way to interconnect them must be changed at some point.

> Ok, the logical expressions are the tricky part. But refer to what i am
> saying above. You still need to be backward compatible. But for the new
> keys you could go onto the adventurous side. I havent given the logical
> expressions much thought but i will in the background

Implementing logical expressions directly into u32 would be bad but
we could have u32 hold an expression tree rather than the ematch
directly which means you could do

match u32 ..
(ematch meta nfmark .. or string "...")
match u32 ..

From hadi@cyberus.ca Tue Dec 28 17:12:30 2004
Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 17:12:38 -0800 (PST)
Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT1CAX8007068 for ; Tue, 28 Dec 2004 17:12:30 -0800
Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjSPD-0007Q0-9D for netdev@oss.sgi.com; Tue, 28 Dec 2004 20:13:39 -0500
Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjSPA-0007uz-Cc; Tue, 28 Dec 2004 20:13:36 -0500
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
From: jamal
Reply-To: hadi@cyberus.ca
To: Thomas Graf
Cc: "David S.
Miller" , netdev@oss.sgi.com In-Reply-To: <20041229000928.GH32419@postel.suug.ch> References: <20041228134022.GA32419@postel.suug.ch> <1104242397.1090.94.camel@jzny.localdomain> <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104282811.1090.314.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 28 Dec 2004 20:13:31 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13153 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Tue, 2004-12-28 at 19:09, Thomas Graf wrote: > * jamal <1104277165.1100.291.camel@jzny.localdomain> 2004-12-28 18:39 > > On Tue, 2004-12-28 at 18:19, Thomas Graf wrote: > > match u32 .. > > ematch string "Thomas" ... > > match u32 ... > > ematch meta tcindex .. > > Yes but the only avantage of this is that a u32 match can be > made dependant on a ematch. Is this really worth special > handling? It requires special handling not needed for any > of the other classifiers. > > I understand your point but don't agree at the moment. I > might change my mind tomorrow ;-> no problem ;-> I think the effort is the same as in doing it the other way - only result is more sophisticated. I havent investigated the other classifiers - u32 tends to be more complex, so solving it on u32 solves all the others typically. 

> > I dont wanna go into details of whether we could actually make the new
> > keys do more than just strict AND from left to right - but you can see
> > the potential to "fix" this if we are defining a new key ;->
>
> We should rather do it on cls_api level, unfortunately it's not that
> simple but the current status of having one classifier kind per prio and
> no way to interconnect them must be changed at some point.
>

Remember two levels:

1) the classifier logical expressions (u32 and rsvp for example) - those
belong to cls api.

if u32 match .. ... AND rsvp ... OR route ...

evaluation is left to right with some brute logical OR and ANDs via
the continue and reclassify codes.

2) This issue is at the single classifier/filter level, so its fair to
push that into the classifier logic. An extended match cannot live by
itself. Its parasitic on a real classifier - so the scope MUST be
restricted to classifier as a matter of fact.

> > Ok, the logical expressions are the tricky part. But refer to what i am
> > saying above. You still need to be backward compatible. But for the new
> > keys you could go onto the adventurous side. I havent given the logical
> > expressions much thought but i will in the background
>
> Implementing logical expressions directly into u32 would be bad but
> we could have u32 hold an expression tree rather than the ematch
> directly which means you could do
>
> match u32 ..
> (ematch meta nfmark .. or string "...")
> match u32 ..
>

Or you could have:

match u32 OR
(ematch meta nfmark .. or string "...")
match u32 ..

Recall, the opportunity to do more in terms of logical expressions
within u32 exists because we can introduce more interesting keys ...

Anyways, I am going to introduce extended actions. I need to write a
few stupid little actions not worth the whole API (such as a checksum
validator/recomputer which i need to follow the pedit action for
stateless NAT). It would use exactly the same interleaving logic as
what weve discussed.
I dont know where you and Patrick are with the code but it think it would be safe for me to work off 2610-bk1. Tommorow i am working on some e1000 stuff - but day after i should be able to touch the action code. cheers, jamal From shaeffer@neuralscape.com Tue Dec 28 17:28:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 17:28:40 -0800 (PST) Received: from neuralscape.com (synapse.neuralscape.com [198.144.200.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT1SDTv008027 for ; Tue, 28 Dec 2004 17:28:33 -0800 Received: from shaeffer by neuralscape.com with local (Exim 4.12) id 1CjSdN-0004uc-00; Tue, 28 Dec 2004 17:28:17 -0800 Date: Tue, 28 Dec 2004 17:28:17 -0800 From: Karen Shaeffer To: Roland Dreier Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org Subject: Re: [PATCH][v5][0/24] Latest IB patch queue Message-ID: <20041229012817.GA18863@synapse.neuralscape.com> References: <200412272150.IBRnA4AvjendsF8x@topspin.com> <20041227225417.3ac7a0a6.davem@davemloft.net> <52pt0unr0i.fsf@topspin.com> <20041228141710.4daebcfb.davem@davemloft.net> <52pt0uhupw.fsf@topspin.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <52pt0uhupw.fsf@topspin.com> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13154 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shaeffer@neuralscape.com Precedence: bulk X-list: netdev On Tue, Dec 28, 2004 at 03:24:43PM -0800, Roland Dreier wrote: > > I think sparc64 is the only such platform where InfiniBand is likely > to be of much interest. However I'll check out all of arch/ and send > patches to hook up drivers/infiniband/ to the relevant maintainers > once IB makes it upstream. > Hi Roland, I am interested in Infiniband with x86_64 Opterons. 
Thanks, Karen -- Karen Shaeffer Neuralscape, Palo Alto, Ca. 94306 shaeffer@neuralscape.com http://www.neuralscape.com From roland@topspin.com Tue Dec 28 17:35:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 17:35:11 -0800 (PST) Received: from umhlanga.STRATNET.NET (umhlanga.stratnet.net [12.162.17.40]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT1Yh7u008702 for ; Tue, 28 Dec 2004 17:35:04 -0800 Received: from exch-1.topspincom.com ([12.162.17.3]) by umhlanga.STRATNET.NET with Microsoft SMTPSVC(5.0.2195.5329); Tue, 28 Dec 2004 17:36:13 -0800 Received: from eddore ([10.10.253.169]) by exch-1.topspincom.com with Microsoft SMTPSVC(5.0.2195.5329); Tue, 28 Dec 2004 17:36:13 -0800 Received: from roland by eddore with local (Exim 4.34) id 1CjSl2-0002mU-N9; Tue, 28 Dec 2004 17:36:13 -0800 To: Karen Shaeffer Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com, openib-general@openib.org X-Message-Flag: Warning: May contain useful information References: <200412272150.IBRnA4AvjendsF8x@topspin.com> <20041227225417.3ac7a0a6.davem@davemloft.net> <52pt0unr0i.fsf@topspin.com> <20041228141710.4daebcfb.davem@davemloft.net> <52pt0uhupw.fsf@topspin.com> <20041229012817.GA18863@synapse.neuralscape.com> From: Roland Dreier Date: Tue, 28 Dec 2004 17:36:12 -0800 In-Reply-To: <20041229012817.GA18863@synapse.neuralscape.com> (Karen Shaeffer's message of "Tue, 28 Dec 2004 17:28:17 -0800") Message-ID: <52hdm5j377.fsf@topspin.com> User-Agent: Gnus/5.1006 (Gnus v5.10.6) XEmacs/21.4 (Corporate Culture, linux) MIME-Version: 1.0 X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: roland@topspin.com Subject: Re: [PATCH][v5][0/24] Latest IB patch queue Content-Type: text/plain; charset=us-ascii X-SA-Exim-Version: 4.1 (built Tue, 17 Aug 2004 11:06:07 +0200) X-SA-Exim-Scanned: Yes (on eddore) X-OriginalArrivalTime: 29 Dec 2004 01:36:13.0192 (UTC) FILETIME=[C7FC7080:01C4ED46] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 
X-Virus-Status: Clean X-archive-position: 13155 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: roland@topspin.com Precedence: bulk X-list: netdev Roland> I think sparc64 is the only such platform where InfiniBand Roland> is likely to be of much interest. However I'll check out Roland> all of arch/ and send patches to hook up Roland> drivers/infiniband/ to the relevant maintainers once IB Roland> makes it upstream. Karen> I am interested in Infiniband with x86_64 Opterons. OK, the current code should work well for you -- x86_64 is probably the most-tested architecture. "such platform[s]" in my comment above referred to architectures where arch/xxx/Kconfig does _not_ include drivers/Kconfig; arch/x86_64/Kconfig does include that file. So no change is required to use the current IB patches on x86_64. I believe the only architectures that both support PCI and do not include drivers/Kconfig in their arch Kconfig are arm, sparc, sparc64 and v850. Perhaps I'm wrong, but of those four architectures, sparc64 seems to be the only one where there would be any interest in using IB. 
- Roland From juhl-lkml@dif.dk Tue Dec 28 17:49:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 17:49:21 -0800 (PST) Received: from mail.dif.dk (mail.dif.dk [193.138.115.101]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT1mrwW009569 for ; Tue, 28 Dec 2004 17:49:14 -0800 Received: from localhost (localhost [127.0.0.1]) by mail.dif.dk (Postfix) with ESMTP id 75DD5FFC91 for ; Wed, 29 Dec 2004 02:53:56 +0100 (CET) Received: from mail.dif.dk ([127.0.0.1]) by localhost (saerimmer [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26011-09 for ; Wed, 29 Dec 2004 02:53:55 +0100 (CET) Received: from diftmgw2.backbone.dif.dk (diftmgw2.backbone.dif.dk [10.227.136.246]) by mail.dif.dk (Postfix) with ESMTP id B6FABFFC5D for ; Wed, 29 Dec 2004 02:53:55 +0100 (CET) Received: from DIFPST1A.backbone.dif.dk ([10.227.136.220]) by diftmgw2.backbone.dif.dk with InterScan Messaging Security Suite; Wed, 29 Dec 2004 02:49:20 +0100 Received: from [172.16.2.11] (10.227.136.29 [10.227.136.29]) by DIFPST1A.backbone.dif.dk with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72) id ZGMPDPB7; Wed, 29 Dec 2004 02:50:14 +0100 Date: Wed, 29 Dec 2004 03:01:20 +0100 (CET) From: Jesper Juhl To: Networking Team Cc: linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel Subject: Patch: add loglevel to printk's in net/ipv4/route.c Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new at dif.dk X-Virus-Status: Clean X-archive-position: 13156 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: juhl-lkml@dif.dk Precedence: bulk X-list: netdev Small patch below adds loglevels to a few printk's in net/ipv4/route.c Signed-off-by: Jesper Juhl diff -up linux-2.6.10-orig/net/ipv4/route.c linux-2.6.10/net/ipv4/route.c --- linux-2.6.10-orig/net/ipv4/route.c 2004-12-24 22:35:40.000000000 +0100 +++ linux-2.6.10/net/ipv4/route.c 2004-12-29 02:55:03.000000000 +0100 @@ -889,8 +889,8 @@ restart: printk(KERN_DEBUG "rt_cache @%02x: %u.%u.%u.%u", hash, NIPQUAD(rt->rt_dst)); for (trt = rt->u.rt_next; trt; trt = trt->u.rt_next) - printk(" . %u.%u.%u.%u", NIPQUAD(trt->rt_dst)); - printk("\n"); + printk(KERN_DEBUG " . 
%u.%u.%u.%u", NIPQUAD(trt->rt_dst)); + printk(KERN_DEBUG "\n"); } #endif rt_hash_table[hash].chain = rt; @@ -1802,11 +1802,11 @@ martian_source: unsigned char *p = skb->mac.raw; printk(KERN_WARNING "ll header: "); for (i = 0; i < dev->hard_header_len; i++, p++) { - printk("%02x", *p); + printk(KERN_WARNING "%02x", *p); if (i < (dev->hard_header_len - 1)) printk(":"); } - printk("\n"); + printk(KERN_WARNING "\n"); } } #endif From acme@conectiva.com.br Tue Dec 28 17:58:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 17:58:46 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT1wB3Z010266 for ; Tue, 28 Dec 2004 17:58:31 -0800 Received: from [200.163.203.158] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CjT9d-0001ka-00; Wed, 29 Dec 2004 00:01:37 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 704DE74C5B; Tue, 28 Dec 2004 23:59:08 -0200 (BRST) Message-ID: <41D2104D.3010406@conectiva.com.br> Date: Wed, 29 Dec 2004 00:02:53 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jesper Juhl Cc: Networking Team , linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c References: In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13157 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Jesper Juhl wrote: > Small patch below adds loglevels to a few printk's in net/ipv4/route.c > > > Signed-off-by: Jesper Juhl > > diff -up linux-2.6.10-orig/net/ipv4/route.c linux-2.6.10/net/ipv4/route.c > --- linux-2.6.10-orig/net/ipv4/route.c 2004-12-24 22:35:40.000000000 +0100 > +++ linux-2.6.10/net/ipv4/route.c 2004-12-29 02:55:03.000000000 +0100 > @@ -889,8 +889,8 @@ restart: > printk(KERN_DEBUG "rt_cache @%02x: %u.%u.%u.%u", hash, > NIPQUAD(rt->rt_dst)); > for (trt = rt->u.rt_next; trt; trt = trt->u.rt_next) > - printk(" . %u.%u.%u.%u", NIPQUAD(trt->rt_dst)); > - printk("\n"); > + printk(KERN_DEBUG " . %u.%u.%u.%u", NIPQUAD(trt->rt_dst)); > + printk(KERN_DEBUG "\n"); > } > #endif > rt_hash_table[hash].chain = rt; > @@ -1802,11 +1802,11 @@ martian_source: > unsigned char *p = skb->mac.raw; > printk(KERN_WARNING "ll header: "); > for (i = 0; i < dev->hard_header_len; i++, p++) { > - printk("%02x", *p); > + printk(KERN_WARNING "%02x", *p); > if (i < (dev->hard_header_len - 1)) > printk(":"); > } > - printk("\n"); > + printk(KERN_WARNING "\n"); Are you sure the output is much improved? 
;) - Arnaldo From juhl-lkml@dif.dk Tue Dec 28 18:01:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 18:01:54 -0800 (PST) Received: from mail.dif.dk (mail.dif.dk [193.138.115.101]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT21QRj010863 for ; Tue, 28 Dec 2004 18:01:47 -0800 Received: from localhost (localhost [127.0.0.1]) by mail.dif.dk (Postfix) with ESMTP id 1804BFFC5D for ; Wed, 29 Dec 2004 03:06:30 +0100 (CET) Received: from mail.dif.dk ([127.0.0.1]) by localhost (saerimmer [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26273-10 for ; Wed, 29 Dec 2004 03:06:28 +0100 (CET) Received: from diftmgw2.backbone.dif.dk (diftmgw2.backbone.dif.dk [10.227.136.246]) by mail.dif.dk (Postfix) with ESMTP id B3BADFFC89 for ; Wed, 29 Dec 2004 03:06:28 +0100 (CET) Received: from DIFPST1A.backbone.dif.dk ([10.227.136.220]) by diftmgw2.backbone.dif.dk with InterScan Messaging Security Suite; Wed, 29 Dec 2004 03:01:52 +0100 Received: from [172.16.2.11] (10.227.136.29 [10.227.136.29]) by DIFPST1A.backbone.dif.dk with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72) id ZGMPDPCL; Wed, 29 Dec 2004 03:02:46 +0100 Date: Wed, 29 Dec 2004 03:13:52 +0100 (CET) From: Jesper Juhl To: Arnaldo Carvalho de Melo Cc: Jesper Juhl , Networking Team , linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c In-Reply-To: <41D2104D.3010406@conectiva.com.br> Message-ID: References: <41D2104D.3010406@conectiva.com.br> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new at dif.dk X-Virus-Status: Clean X-archive-position: 13158 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: juhl-lkml@dif.dk Precedence: bulk X-list: netdev On Wed, 29 Dec 2004, Arnaldo Carvalho de Melo wrote: > Jesper Juhl wrote: > > Small patch below adds loglevels to a few printk's in net/ipv4/route.c > > [...] > > Are you sure the output is much improved? ;) > It doesn't make much difference, it's mostly for completeness/correctness. -- Jesper From acme@conectiva.com.br Tue Dec 28 18:05:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 18:05:50 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT25Mwv011472 for ; Tue, 28 Dec 2004 18:05:42 -0800 Received: from [200.163.203.158] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CjTGw-0001lw-00; Wed, 29 Dec 2004 00:09:10 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 0B98A74C62; Wed, 29 Dec 2004 00:06:37 -0200 (BRST) Message-ID: <41D2120E.8030008@conectiva.com.br> Date: Wed, 29 Dec 2004 00:10:22 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Jesper Juhl Cc: Networking Team , linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c References: <41D2104D.3010406@conectiva.com.br> In-Reply-To: Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13159 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Jesper Juhl wrote: > On Wed, 29 Dec 2004, Arnaldo Carvalho de Melo wrote: > > >>Jesper Juhl wrote: >> >>>Small patch below adds loglevels to a few printk's in net/ipv4/route.c >>> > > [...] > >>Are you sure the output is much improved? ;) >> > > It doesn't make much difference, it's mostly for completeness/correctness. No, it does a helluva difference, give it a try :-) - Arnaldo From joern@wohnheim.fh-wedel.de Tue Dec 28 18:12:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 18:12:16 -0800 (PST) Received: from moskovskaya.fh-wedel.de (mail.fh-wedel.de [213.39.232.198]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT2BmXU012090 for ; Tue, 28 Dec 2004 18:12:09 -0800 Received: from wohnheim.fh-wedel.de ([213.39.233.138]:36222) by moskovskaya.fh-wedel.de with esmtp (Exim 4.34) id 1CjTKa-0000K5-0n; Wed, 29 Dec 2004 03:12:56 +0100 Received: from joern by wohnheim.fh-wedel.de with local (Exim 3.35 #1 (Debian)) id 1CjTKa-0008DI-00; Wed, 29 Dec 2004 03:12:56 +0100 Date: Wed, 29 Dec 2004 03:12:56 +0100 From: =?iso-8859-1?Q?J=F6rn?= Engel To: Arnaldo Carvalho de Melo Cc: Jesper Juhl , Networking Team , linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c Message-ID: <20041229021256.GD29323@wohnheim.fh-wedel.de> References: <41D2104D.3010406@conectiva.com.br> <41D2120E.8030008@conectiva.com.br> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <41D2120E.8030008@conectiva.com.br> User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13160 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: joern@wohnheim.fh-wedel.de Precedence: bulk X-list: netdev On Wed, 29 December 2004 00:10:22 -0200, Arnaldo Carvalho de Melo wrote: > > > >It doesn't make much difference, it's mostly for completeness/correctness. > > No, it does a helluva difference, give it a try :-) hint: look for "\n" Jörn -- It is better to die of hunger having lived without grief and fear, than to live with a troubled spirit amid abundance. -- Epictetus From acme@conectiva.com.br Tue Dec 28 18:19:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 18:19:29 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT2J08H012741 for ; Tue, 28 Dec 2004 18:19:21 -0800 Received: from [200.163.203.158] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CjTTn-0001o5-00; Wed, 29 Dec 2004 00:22:27 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 5139A74C79; Wed, 29 Dec 2004 00:19:58 -0200 (BRST) Message-ID: <41D2152F.8080501@conectiva.com.br> Date: Wed, 29 Dec 2004 00:23:43 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. 
User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: =?ISO-8859-1?Q?J=F6rn_Engel?= Cc: Jesper Juhl , Networking Team , linux-net , "David S. Miller" , Alexey Kuznetsov , linux-kernel Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c References: <41D2104D.3010406@conectiva.com.br> <41D2120E.8030008@conectiva.com.br> <20041229021256.GD29323@wohnheim.fh-wedel.de> In-Reply-To: <20041229021256.GD29323@wohnheim.fh-wedel.de> Content-Type: text/plain; charset=iso-8859-1; format=flowed X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBT2J08H012741 X-archive-position: 13161 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Jörn Engel wrote: > On Wed, 29 December 2004 00:10:22 -0200, Arnaldo Carvalho de Melo wrote: > >>>It doesn't make much difference, it's mostly for completeness/correctness. 
>> >>No, it does a helluva difference, give it a try :-) > > > hint: look for "\n" hint2: Or the _lack_ of "\n" 8) - Arnaldo From juhl-lkml@dif.dk Tue Dec 28 18:33:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 18:34:05 -0800 (PST) Received: from mail.dif.dk (mail.dif.dk [193.138.115.101]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT2Xbwo013606 for ; Tue, 28 Dec 2004 18:33:57 -0800 Received: from localhost (localhost [127.0.0.1]) by mail.dif.dk (Postfix) with ESMTP id D0478FFC99 for ; Wed, 29 Dec 2004 03:38:40 +0100 (CET) Received: from mail.dif.dk ([127.0.0.1]) by localhost (saerimmer [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 26795-08 for ; Wed, 29 Dec 2004 03:38:40 +0100 (CET) Received: from diftmgw2.backbone.dif.dk (diftmgw2.backbone.dif.dk [10.227.136.246]) by mail.dif.dk (Postfix) with ESMTP id 1115EFFC8A for ; Wed, 29 Dec 2004 03:38:40 +0100 (CET) Received: from DIFPST1A.backbone.dif.dk ([10.227.136.220]) by diftmgw2.backbone.dif.dk with InterScan Messaging Security Suite; Wed, 29 Dec 2004 03:34:03 +0100 Received: from [172.16.2.11] (10.227.136.29 [10.227.136.29]) by DIFPST1A.backbone.dif.dk with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72) id ZGMPDPDR; Wed, 29 Dec 2004 03:34:58 +0100 Date: Wed, 29 Dec 2004 03:46:04 +0100 (CET) From: Jesper Juhl To: Arnaldo Carvalho de Melo Cc: =?ISO-8859-1?Q?J=F6rn_Engel?= , Jesper Juhl , Networking Team , linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c In-Reply-To: <41D2152F.8080501@conectiva.com.br> Message-ID: References: <41D2104D.3010406@conectiva.com.br> <41D2120E.8030008@conectiva.com.br> <20041229021256.GD29323@wohnheim.fh-wedel.de> <41D2152F.8080501@conectiva.com.br> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-1 X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new at dif.dk X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by oss.sgi.com id iBT2Xbwo013606 X-archive-position: 13162 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: juhl-lkml@dif.dk Precedence: bulk X-list: netdev On Wed, 29 Dec 2004, Arnaldo Carvalho de Melo wrote: > > > Jörn Engel wrote: > > On Wed, 29 December 2004 00:10:22 -0200, Arnaldo Carvalho de Melo wrote: > > > > > > It doesn't make much difference, it's mostly for > > > > completeness/correctness. > > > > > > No, it does a helluva difference, give it a try :-) > > > > > > hint: look for "\n" > > hint2: Or the _lack_ of "\n" 8) > Ok, obviously something's wrong, but it's currently 03:44 here, so I'll take a look at it tomorrow (or quite possibly the day after since I have things to do). Thank you for commenting, I'll dig into it at the first opportunity I have. 
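The hints above can be illustrated with a small user-space sketch. It assumes that printk() of this era interprets a "<N>" loglevel tag only at the start of a line and copies it verbatim anywhere else; that is an assumption consistent with the hints in this thread, not something checked against the 2.6.10 source. struct klog, klog_init() and sim_printk() are made-up names for illustration, not kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEFAULT_LEVEL '4'   /* stand-in for the default message loglevel */

/* Hypothetical miniature log buffer; not a kernel structure. */
struct klog {
    char buf[256];
    size_t len;
    int at_line_start;      /* 1 when the next char starts a new line */
};

static void klog_init(struct klog *k)
{
    memset(k, 0, sizeof(*k));
    k->at_line_start = 1;
}

static void klog_emit(struct klog *k, char c)
{
    assert(k->len + 1 < sizeof(k->buf));  /* keep room for the NUL */
    k->buf[k->len++] = c;
    k->buf[k->len] = '\0';
}

/* Simplified printk: a "<N>" tag counts as a loglevel only at line start. */
static void sim_printk(struct klog *k, const char *s)
{
    const char *p;

    for (p = s; *p; p++) {
        if (k->at_line_start) {
            int tagged = p[0] == '<' && p[1] >= '0' && p[1] <= '7' &&
                         p[2] == '>';

            if (!tagged) {              /* no tag: prepend the default */
                klog_emit(k, '<');
                klog_emit(k, DEFAULT_LEVEL);
                klog_emit(k, '>');
            }
            k->at_line_start = 0;
        }
        klog_emit(k, *p);               /* mid-line tags are copied as-is */
        if (*p == '\n')
            k->at_line_start = 1;
    }
}
```

Under that assumption, feeding the patched martian_source sequence ("<4>ll header: ", "<4>00", ":", "<4>01", "<4>\n") through sim_printk() leaves `<4>ll header: <4>00:<4>01<4>` in the buffer on a single line, while the unpatched sequence leaves `<4>ll header: 00:01`. The tags added to the continuation calls end up as literal text inside the line instead of starting new records.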
-- Jesper From shekhark@juniper.net Tue Dec 28 19:16:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Tue, 28 Dec 2004 19:16:29 -0800 (PST) Received: from borg.juniper.net (borg.juniper.net [207.17.137.119]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBT3G2pU015409 for ; Tue, 28 Dec 2004 19:16:22 -0800 Received: from unknown (HELO alpha.jnpr.net) (172.24.18.126) by borg.juniper.net with ESMTP; 28 Dec 2004 19:17:27 -0800 X-BrightmailFiltered: true X-Ironport-AV: i="3.88,89,1102320000"; d="scan'208"; a="91547572:sNHT21622024" Received: from gluon.jnpr.net ([172.24.15.23]) by alpha.jnpr.net with Microsoft SMTPSVC(6.0.3790.211); Tue, 28 Dec 2004 19:17:27 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.5.6944.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Subject: 2.6 IPSec Throughput puzzle Date: Tue, 28 Dec 2004 19:17:26 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: 2.6 IPSec Throughput puzzle Thread-Index: AcTtTxvAS0WKgI7LSkOoXI0MudOdUgAAzHdA From: "Shekhar Kshirsagar" To: "Networking Team" X-OriginalArrivalTime: 29 Dec 2004 03:17:27.0215 (UTC) FILETIME=[EC62CBF0:01C4ED54] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBT3G2pU015409 X-archive-position: 13163 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shekhark@juniper.net Precedence: bulk X-list: netdev Hi, I have been trying to do some performance benchmarking for 2.6.10 IPSec with 2.6 GHz P5s connected back-to-back with Gigabit connection. I'm using iperf for performance test. I'm really puzzled with the performance results I'm getting. The performance drop with AH seems high, but worst is performance drop with null-esp in transport mode. 
Another strange observation is that DES throughput is greater than null encryption throughput.

Throughput without IPSec      : 936 MBits/s ( 25% CPU Util)
Transport mode AH - SHA1      : 398 MBits/s (100% CPU Util)
Transport mode ESP - null/SHA1:  62 MBits/s (100% CPU Util)
Transport mode ESP - des/SHA1 : 111 MBits/s (100% CPU Util)
Transport mode ESP - 3des/SHA1:  54 MBits/s (100% CPU Util)
Transport mode ESP - aes/SHA1 : 192 MBits/s (100% CPU Util)

Do these numbers sound reasonable?
(I don't have any iptable rules)

Thanks,
Shekhar

From rich@phekda.gotadsl.co.uk Wed Dec 29 02:15:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 02:16:11 -0800 (PST) Received: from smtp.nildram.co.uk (smtp.nildram.co.uk [195.112.4.54]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTAFbbI002759 for ; Wed, 29 Dec 2004 02:15:58 -0800 Received: from katrina.int.phekda.gotadsl.co.uk (82-133-110-3.dyn.gotadsl.co.uk [82.133.110.3]) by smtp.nildram.co.uk (Postfix) with ESMTP id C09432555A5; Wed, 29 Dec 2004 10:17:04 +0000 (GMT) Received: from [192.168.1.4] (katrina.int.phekda.gotadsl.co.uk [192.168.1.4]) by katrina.int.phekda.gotadsl.co.uk (Postfix) with ESMTP id 5802F323; Wed, 29 Dec 2004 10:17:51 +0000 (GMT) Message-ID: <41D2844E.5070204@phekda.gotadsl.co.uk> Date: Wed, 29 Dec 2004 10:17:50 +0000 From: Richard Dawe User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020 X-Accept-Language: en, de, fr MIME-Version: 1.0 To: Francois Romieu Cc: netdev@oss.sgi.com, Me Subject: Re: Acer Aspire 1524WLMi and RealTek 8169 - very slow References: <41A09541.5040405@phekda.gotadsl.co.uk> <41A0F0D5.9050702@phekda.gotadsl.co.uk> <20041121205814.GA22460@electric-eye.fr.zoreil.com> <41A24F35.5080106@phekda.gotadsl.co.uk> <20041122213008.GA9618@electric-eye.fr.zoreil.com> In-Reply-To: <20041122213008.GA9618@electric-eye.fr.zoreil.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 
2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13164 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: rich@phekda.gotadsl.co.uk Precedence: bulk X-list: netdev Hello. Francois Romieu wrote: > Richard Dawe : [snip] >>Sadly my box won't boot, if I disable ACPI. > > > Can you give a look at: > http://forums.gentoo.org/viewtopic.php?t=122145%22%22 > > Ac*r, laptop, acpi and x64 are making me paranoid. [snip] I patched up my ACPI DSDT to compile without errors, rebuilt 2.6.10 final with the patched ACPI DSDT built-in. Network performance is still appalling (massive packet loss). IRQ routing seems to be disabled in 2.6.10. I got a warning about an unhandled interrupt for the VIA 8255 (I think that's the right number) on shutdown. I'm wondering if there is some bad interaction between the VIA chipset and the 8110 chipset. Another datapoint: On a Netgear switch at work, I had to put the chipset into 100Mbps half-duplex and set the MTU to 576 bytes, to get any decent performance out of it. Anyway, I need to do more investigation. Thanks, bye, Rich =] -- Richard Dawe [ http://homepages.nildram.co.uk/~phekda/richdawe/ ] "You can't evaluate a man by logic alone." 
-- McCoy, "I, Mudd", Star Trek From ahu@outpost.ds9a.nl Wed Dec 29 04:10:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 04:11:08 -0800 (PST) Received: from outpost.ds9a.nl (postfix@outpost.ds9a.nl [213.244.168.210]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTCAWVr010249 for ; Wed, 29 Dec 2004 04:10:52 -0800 Received: by outpost.ds9a.nl (Postfix, from userid 1000) id 3211A4436; Wed, 29 Dec 2004 13:12:00 +0100 (CET) Date: Wed, 29 Dec 2004 13:12:00 +0100 From: bert hubert To: Shekhar Kshirsagar Cc: Networking Team Subject: Re: 2.6 IPSec Throughput puzzle Message-ID: <20041229121200.GA12199@outpost.ds9a.nl> Mail-Followup-To: bert hubert , Shekhar Kshirsagar , Networking Team References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.28i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13165 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ahu@ds9a.nl Precedence: bulk X-list: netdev On Tue, Dec 28, 2004 at 07:17:26PM -0800, Shekhar Kshirsagar wrote: > I'm really puzzled with the performance results I'm getting. The > performance drop with AH seems high, but worst is performance drop with > null-esp in transport mode. Another strange observation is that DES > throughput is greater than null encryption throughput. Thanks for doing these benchmarks! I did some myself some time ago, but my hardware isn't representative of anything (consisting of a pentium pro 200 against a P3 1GHz). 
> Throughput without IPSec : 936 MBits/s ( 25% CPU Util) > Transport mode AH - SHA1 : 398 MBits/s (100% CPU Util) > Transport mode ESP - null/SHA1: 62 MBits/s (100% CPU Util) > Transport mode ESP - des/SHA1 : 111 MBits/s (100% CPU Util) > Transport mode ESP - 3des/SHA1: 54 MBits/s (100% CPU Util) > Transport mode ESP - aes/SHA1 : 192 MBits/s (100% CPU Util) > > Do these numbers sound reasonable? > (I don't have any iptable rules) It is very easy to use oprofile these days, I suggest you profile for a bit, should easily tell you what the culprit is. 62MBit/s sounds very low. Good luck! -- http://www.PowerDNS.com Open source, database driven DNS Software http://lartc.org Linux Advanced Routing & Traffic Control HOWTO From tgraf@suug.ch Wed Dec 29 04:47:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 04:47:21 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTCkrIf015217 for ; Wed, 29 Dec 2004 04:47:13 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 2BA01F; Wed, 29 Dec 2004 13:48:00 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 7F1011C0EA; Wed, 29 Dec 2004 13:48:42 +0100 (CET) Date: Wed, 29 Dec 2004 13:48:42 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 
Message-ID: <20041229124842.GI32419@postel.suug.ch> References: <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104282811.1090.314.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13166 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104282811.1090.314.camel@jzny.localdomain> 2004-12-28 20:13 > no problem ;-> I think the effort is the same as in doing it the other > way - only result is more sophisticated. I havent investigated the other > classifiers - u32 tends to be more complex, so solving it on u32 solves > all the others typically. OK, I've changed my mind after some thinking. It's a little bit more work but it's worth it. The ematch TLV contains an array of ematches; every ematch contains the logic relation to the next (2 bits) and a flag to invert the meaning (1 bit). A special ematch containing an index exists to implement precedence. 
    A AND (B1 OR B2) AND C AND D

             ------->-PUSH-------
    -->--   /       -->--        \   -->--
   /     \ /       /     \        \ /     \
+-------+-------+-------+-------+-------+--------+
| A AND | B AND | C AND | D END | B1 OR | B2 END |
+-------+-------+-------+-------+-------+--------+
                   \                      /
                    --------<-POP---------

A simple check that a jump index is never smaller than the current
index (excluding backward jumps via stack) catches endless loops
and avoids the use of a ttl.

> Remember two levels:
> 1) the classifier logical expressions (u32 and rsvp for example) - those
> belong to cls api.
>
> if u32 match .. ... AND rsvp ... OR route ...
> evaluation is left to right with some brute logical OR and ANDs via the
> continue and reclassify codes.
>
> 2) This issue is at a the single classifier/filter level, so its fair to
> push that into the classifier logic. an extended match cannot live by
> itself. Its parasitic on a real classifier - so the scope MUST be
> restricted to classifier as a matter of fact.

Absolutely agreed, I did a context switch without prior warning. ;->

> Or you could have:
> match u32 OR
> (ematch meta nfmark .. or string "...")
> match u32 ..
>
> Recall, the opportunity to do more in terms of logical expressions within u32
> exists because we can introduce more interesting keys ...

We could reuse the 8 unused bits after nkeys in tc_u32sel too. iproute2 sets
them to 0 so we can simply use them without anyone noticing.

> I dont know where you and Patrick are with the code but it think it
> would be safe for me to work off 2610-bk1. Tommorow i am working on some
> e1000 stuff - but day after i should be able to touch the action code.

I'm not touching the action bits but Patrick is as far as I know.
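The array layout described above can be sketched in user-space C. This is only an illustration of the idea in this message, not the eventual kernel code: struct em_entry, em_validate(), em_eval() and EM_MAX_DEPTH are hypothetical names, leaf entries carry a canned 0/1 result in place of a real match, and evaluation is assumed to be plain left to right with short-circuiting inside each sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Relation of an entry to the next one (fits in 2 bits). */
enum em_rel { EM_END = 0, EM_AND = 1, EM_OR = 2 };

/* Hypothetical entry layout, for illustration only. */
struct em_entry {
    unsigned rel     : 2;  /* EM_END / EM_AND / EM_OR */
    unsigned invert  : 1;  /* negate this entry's result */
    unsigned is_jump : 1;  /* container: evaluate the sequence at 'arg' */
    int arg;               /* jump: target index; leaf: canned result 0/1 */
};

#define EM_MAX_DEPTH 8

/* Reject backward/self jumps and a list that does not end in END. */
static int em_validate(const struct em_entry *m, size_t n)
{
    size_t i;

    if (n == 0 || m[n - 1].rel != EM_END)
        return 0;
    for (i = 0; i < n; i++)
        if (m[i].is_jump && (m[i].arg <= (int)i || (size_t)m[i].arg >= n))
            return 0;
    return 1;
}

/* Returns 1/0, or -1 if nesting is deeper than EM_MAX_DEPTH. */
static int em_eval(const struct em_entry *m, size_t n)
{
    size_t stack[EM_MAX_DEPTH];
    size_t sp = 0, i = 0;

    assert(em_validate(m, n));
    for (;;) {
        const struct em_entry *cur = &m[i];
        int res;

        if (cur->is_jump) {
            if (sp == EM_MAX_DEPTH)
                return -1;
            stack[sp++] = i;            /* PUSH the container's index */
            i = (size_t)cur->arg;
            continue;
        }
        res = cur->arg != 0;
        for (;;) {
            if (cur->invert)
                res = !res;
            /* Keep going only if the relation still needs the next entry. */
            if ((cur->rel == EM_AND && res) || (cur->rel == EM_OR && !res))
                break;
            if (sp == 0)
                return res;             /* top-level sequence finished */
            i = stack[--sp];            /* POP back to the container */
            cur = &m[i];
        }
        i++;
    }
}
```

With the six entries from the diagram (A, B as a jump to index 4, C, D, B1, B2), em_validate() accepts the array because the only jump points forward, and it rejects the same array if B's target is changed to its own index. Because jumps only go forward and the last entry must be END, the walk cannot loop, which is the property the forward-only check is meant to give.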
From hadi@cyberus.ca Wed Dec 29 06:19:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 06:20:01 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTEJYjc018488 for ; Wed, 29 Dec 2004 06:19:54 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CjehA-00053H-SF for netdev@oss.sgi.com; Wed, 29 Dec 2004 09:21:00 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cjeh7-0005ap-2R; Wed, 29 Dec 2004 09:20:57 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com, Patrick McHardy In-Reply-To: <20041229124842.GI32419@postel.suug.ch> References: <20041228161117.GD32419@postel.suug.ch> <1104251817.1090.164.camel@jzny.localdomain> <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104330054.1089.329.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 29 Dec 2004 09:20:54 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13167 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-29 at 07:48, Thomas Graf wrote: > * jamal 
<1104282811.1090.314.camel@jzny.localdomain> 2004-12-28 20:13
..
> The ematch TLV contains an array of ematches, every
> ematch contains the logic relation to the next (2 bits) and a flag to
> invert the meaning (1 bit). A special ematch containing an index exists
> to implement precedence.
>
>     A AND (B1 OR B2) AND C AND D
>
>
>              ------->-PUSH-------
>     -->--   /       -->--        \   -->--
>    /     \ /       /     \        \ /     \
> +-------+-------+-------+-------+-------+--------+
> | A AND | B AND | C AND | D END | B1 OR | B2 END |
> +-------+-------+-------+-------+-------+--------+
>                    \                      /
>                     --------<-POP---------
>
> A simple check that a jump index is never smaller than the current
> index (excluding backward jumps via stack) catches endless loops
> and avoids the use of a ttl.
>

Sounds good.
Given the opportunity: I think we need to put those flags as well for the u32 keys (and other classifiers) so we can have similar logic?
Also in the case of u32 (to take this opportunity) i would like to stash state into a 16 bit ID to help in pretty printing the matches. So if we have an extra 32 bits for flags:ID probably 8 bits for your need for flags, 16 bits for private Id and maybe another 8 bit for something else like versioning...
Thoughts?

> > Recall, the opportunity to do more in terms of logical expressions within u32
> > exists because we can introduce more interesting keys ...
>
> We could reuse the 8 unused bits after nkeys in tc_u32sel too. iproute2 sets
> them to 0 so we can simply use them without anyone noticing.

I would recommend just introducing the extra 32 bits per key. So much easier.

> > I dont know where you and Patrick are with the code but it think it
> > would be safe for me to work off 2610-bk1. Tommorow i am working on some
> > e1000 stuff - but day after i should be able to touch the action code.
>
> I'm not touching the action bits but Patrick is as far as I know.

Ok, don't wanna conflict with work he's doing - so i will wait until tomorrow to see what he's up to. CCed him.
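The extra 32 bits per key suggested above could be split as in the sketch below. Only the 8/16/8 split comes from this message; the bit positions, the key_ext_* helper names and the KEY_EXT_* flag values are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [31..24] flags | [23..8] private id | [7..0] version */
#define KEY_EXT_REL_MASK 0x03u  /* assumed: 2-bit relation to the next key */
#define KEY_EXT_INVERT   0x04u  /* assumed: 1-bit invert flag */

static uint32_t key_ext_pack(uint8_t flags, uint16_t id, uint8_t version)
{
    return ((uint32_t)flags << 24) | ((uint32_t)id << 8) | version;
}

static uint8_t key_ext_flags(uint32_t w)   { return (uint8_t)(w >> 24); }
static uint16_t key_ext_id(uint32_t w)     { return (uint16_t)(w >> 8); }
static uint8_t key_ext_version(uint32_t w) { return (uint8_t)w; }

/* Round-trip check showing how the helpers are meant to be used. */
static void key_ext_selftest(void)
{
    uint32_t w = key_ext_pack(KEY_EXT_INVERT | 0x01u, 0x1234, 1);

    assert(key_ext_flags(w) & KEY_EXT_INVERT);
    assert((key_ext_flags(w) & KEY_EXT_REL_MASK) == 0x01u);
    assert(key_ext_id(w) == 0x1234);
    assert(key_ext_version(w) == 1);
}
```

Keeping the ID in its own 16-bit field means a pretty-printer can recover it without knowing anything about the flag bits, which seems to be the point of asking for a separate word per key rather than reusing the spare bits in tc_u32sel.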
cheers,
jamal

From tgraf@suug.ch Wed Dec 29 07:00:13 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 07:00:19 -0800 (PST)
Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTExq1g020151 for ; Wed, 29 Dec 2004 07:00:13 -0800
Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 7DD40F; Wed, 29 Dec 2004 16:00:58 +0100 (CET)
Received: by postel.suug.ch (Postfix, from userid 10001) id 7EB561C0EA; Wed, 29 Dec 2004 16:01:40 +0100 (CET)
Date: Wed, 29 Dec 2004 16:01:40 +0100
From: Thomas Graf
To: jamal
Cc: "David S. Miller" , netdev@oss.sgi.com, Patrick McHardy
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
Message-ID: <20041229150140.GJ32419@postel.suug.ch>
References: <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1104330054.1089.329.camel@jzny.localdomain>
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13168
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: tgraf@suug.ch
Precedence: bulk
X-list: netdev

* jamal <1104330054.1089.329.camel@jzny.localdomain> 2004-12-29 09:20
> Given the opportunity: I think we need to put those flags as well for
> the u32 keys(and other
classifiers) so we can have similar logic?

Sounds reasonable and easy to do if we introduce a new selector TLV.

Speaking of the other classifiers:

fw: Currently a list of ORed matches, nfmark transported via handle.
    We could change it to transfer the nfmark via a TLV and implement
    the same logic as in u32 (simple). The problem is mainly how to
    guarantee backwards compatibility, the handle would no longer
    tell about the nfmark being matched. OTOH, fw is no longer needed
    once we have metadata match. Adding an "always-true" classifier with
    ematch extension will completely replace it (except for the old
    path with netfilter disabled).

tcindex: No changes required and partly replaced with metadata match
         but not completely. It would still be perfectly fine to map
         dscp values to classids.

route4: Also partly replaced with metadata match but we would lose
        the excellent fast paths. We can leave it as-is and have metadata
        match for more complex matches (slow) and route4 for simple but
        fast uses.

rsvp: Could theoretically be replaced with new u32 and extensive use
      of continue/reclassify but that's quite difficult. It's very
      specialized (and currently quite vulnerable to pskbs) and the use
      of it is clearly towards fast flow redirection. No need to change
      this.

So, the conclusion for me is that we can focus on u32 and new
classifiers and eventually make fw obsolete in the future.

> Also in the case of u32 (to take this opportunity) i would like to stash
> state into a 16 bit ID to help in pretty printing the matches.

Not sure what you mean. Which "state"?

> So if we have an extra 32 bits for flags:ID probably 8 bits for your
> need for flags, 16 bits for private Id and maybe another 8 bit for
> something else like versioning...

Basically what I need is 3 bits for logic relations and at least 8
for precedence index or 4 bits and reuse one of the already existing
fields unused when key is used as container for sub keys. So, 8 bits
will suit me very well.

 +---+---+-----+
 | I | C |  R  |
 +---+---+-----+

 I := 1   Match is inverted
      0   Match is straight

 C := 1   Container Key
      0   Normal Key

 R := 0 0 Last Key
      0 1 AND
      1 0 OR

While we're at it we should increase nkeys to 16bit.

From hadi@cyberus.ca Wed Dec 29 07:52:38 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 07:52:46 -0800 (PST)
Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTFqIIA022217 for ; Wed, 29 Dec 2004 07:52:38 -0800
Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Cjg8v-0006Uy-Bu for netdev@oss.sgi.com; Wed, 29 Dec 2004 10:53:45 -0500
Received: from [216.209.86.2] (helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Cjg8s-0001zw-E2; Wed, 29 Dec 2004 10:53:42 -0500
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
From: jamal
Reply-To: hadi@cyberus.ca
To: Thomas Graf
Cc: "David S. Miller" , netdev@oss.sgi.com
In-Reply-To: <20041229150140.GJ32419@postel.suug.ch>
References: <20041228192603.GE32419@postel.suug.ch> <1104268498.1090.254.camel@jzny.localdomain> <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch>
Content-Type: text/plain
Organization: jamalopolous
Message-Id: <1104335620.1025.22.camel@jzny.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2
Date: 29 Dec 2004 10:53:40 -0500
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13169
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

On Wed, 2004-12-29 at 10:01, Thomas Graf wrote:
> * jamal <1104330054.1089.329.camel@jzny.localdomain> 2004-12-29 09:20
> > Given the opportunity: I think we need to put those flags as well for
> > the u32 keys(and other classifiers) so we can have similar logic?
>
> Sounds reasonable and easy to do if we introduce a new selector TLV.

Shouldn't be an issue, and it is both backward and forward compatible.

> Speaking of the other classifiers:
>
> fw: Currently a list of ORed matches, nfmark transported via handle.
>     We could change it to transfer the nfmark via a TLV and implement
>     the same logic as in u32 (simple). The problem is mainly how to
>     guarantee backwards compatibility, the handle would no longer
>     tell about the nfmark being matched. OTOH, fw is no longer needed
>     once we have metadata match. Adding a "always-true" classifier with
>     ematch extension will completely replace it (except for the old
>     path with netfilter disabled).
>

I would suspect we would end up killing fwmark or it will get deprecated
for lack of use. So probably safer to leave it alone.

> tcindex: No changes required and partly replaced with metadata match
>          but not completely. It would still be perfectly fine to map
>          dscp values to classids.
>

This is a tricky one since it has those special cases.

> route4: Also partly replaced with metadata match but we would lose
>         the exellent fast paths. We can leave it as-is and have metadata
>         match for more complex matches (slow) and route4 for simple but
>         fast uses.
>

nod.

> rsvp: Could theoretically be replaced with new u32 and extensive use
>       of continue/reclassify but that's quite difficult. It's very
>       specialized (and currently quite vulnerable to pskbs) and the use
>       of it is clearly towards fast flow redirection. No need to change
>       this.
>
> So, the conclusion for me is that we can focus on u32 and new
> classifiers and eventually make fw obsolete in the future.
>

Geez, I should have read to the end first ;->
So i agree with you.

> > Also in the case of u32 (to take this opportunity) i would like to stash
> > state inot a 16 bit ID to help in pretty printing the matches.
>
> Not sure what you mean. Which "state"?
>

One example of what state you could store:
In the case where i enter something readable in english, the display
back is raw; example:

match ip src 10.0.0.210/32

gets displayed as:

match 0a0000d2/ffffffff at 12

And a lot of times it's tricky to find exactly what "at 12" means.
If i store some ID that would tell me "IP" when i dump then i can
pretty print it in english in user space using ip_print().

> > So if we have an extra 32 bits for flags:ID probably 8 bits for your
> > need for flags, 16 bits for private Id and maybe another 8 bit for
> > something else like versioning...
>
> Basically what I need is 3 bits for logic relations and at least 8
> for precedence index or 4 bits and reuse one of the already existing
> fields unused when key is used as container for sub keys. So, 8 bits
> will suit me very well.
>
>  +---+---+-----+
>  | I | C |  R  |
>  +---+---+-----+
>
>  I := 1   Match is inverted
>       0   Match is straight
>
>  C := 1   Container Key
>       0   Normal Key
>
>  R := 0 0 Last Key
>       0 1 AND
>       1 0 OR
>
> While we're at it we should increase nkeys to 16bit.
>

Sounds good to me since we have a new sel. It may end up being tricky to
be both fast and backward compat; but we'll see what fun awaits when you
start coding.

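For illustration only: a minimal sketch of the extra 32 bits per key
discussed above, split into 8 bits of flags, a 16 bit private ID and 8
bits for versioning. The thread fixes neither bit positions nor field
order, so every macro, value and function name below (KEYF_*, KEYID_*,
key_ext_*, key_format) is an assumption made for the example, as is the
choice to keep the key value in host byte order.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical bit positions inside the 8-bit flags field. */
#define KEYF_REL_MASK   0x03u /* R: 00 last key, 01 AND, 10 OR */
#define KEYF_REL_LAST   0x00u
#define KEYF_REL_AND    0x01u
#define KEYF_REL_OR     0x02u
#define KEYF_CONTAINER  0x04u /* C: key is a container for sub keys */
#define KEYF_INVERT     0x08u /* I: match result is inverted */

/* Hypothetical private IDs that user space could store for dumps. */
#define KEYID_RAW       0u
#define KEYID_IP_SRC    1u

/* Pack flags (bits 31..24), private ID (bits 23..8), version (bits 7..0). */
static inline uint32_t key_ext_pack(uint8_t flags, uint16_t id, uint8_t version)
{
	return ((uint32_t)flags << 24) | ((uint32_t)id << 8) | version;
}

static inline uint8_t key_ext_flags(uint32_t ext)
{
	return (uint8_t)(ext >> 24);
}

static inline uint16_t key_ext_id(uint32_t ext)
{
	return (uint16_t)(ext >> 8);
}

static inline uint8_t key_ext_version(uint32_t ext)
{
	return (uint8_t)ext;
}

/*
 * Format one key for a dump. With a known ID and a full mask the key is
 * printed in readable form, otherwise in the raw value/mask/offset form.
 * val and mask are taken in host byte order in this sketch.
 */
int key_format(char *buf, size_t len, uint32_t val, uint32_t mask,
	       int off, uint32_t ext)
{
	if (key_ext_id(ext) == KEYID_IP_SRC && mask == 0xffffffffu)
		return snprintf(buf, len, "match ip src %u.%u.%u.%u/32",
				(unsigned int)(val >> 24),
				(unsigned int)((val >> 16) & 0xff),
				(unsigned int)((val >> 8) & 0xff),
				(unsigned int)(val & 0xff));

	return snprintf(buf, len, "match %08x/%08x at %d",
			(unsigned int)val, (unsigned int)mask, off);
}
```

With the ID set to KEYID_IP_SRC, the key 0a0000d2/ffffffff at 12 is
formatted as "match ip src 10.0.0.210/32"; with KEYID_RAW it falls back
to the raw form quoted in the message.
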
cheers, jamal From yoshfuji@linux-ipv6.org Wed Dec 29 09:38:16 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 09:38:25 -0800 (PST) Received: from yue.st-paulia.net (yue.linux-ipv6.org [203.178.140.15]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTHbt3a029667 for ; Wed, 29 Dec 2004 09:38:16 -0800 Received: from localhost (localhost [127.0.0.1]) by yue.st-paulia.net (Postfix) with ESMTP id E6DB433CC2; Thu, 30 Dec 2004 02:39:39 +0900 (JST) Date: Thu, 30 Dec 2004 02:39:39 +0900 (JST) Message-Id: <20041230.023939.88211501.yoshfuji@linux-ipv6.org> To: jgarzik@pobox.com Cc: herbert@gondor.apana.org.au, acme@conectiva.com.br, netdev@oss.sgi.com, yoshfuji@linux-ipv6.org Subject: Re: Badness in dst_release (continuing saga) From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= In-Reply-To: <41CFDE1D.6000802@pobox.com> References: <41CFDE1D.6000802@pobox.com> Organization: USAGI Project X-URL: http://www.yoshifuji.org/%7Ehideaki/ X-Fingerprint: 9022 65EB 1ECF 3AD1 0BDF 80D8 4807 F894 E062 0EEA X-PGP-Key-URL: http://www.yoshifuji.org/%7Ehideaki/hideaki@yoshifuji.org.asc X-Face: "5$Al-.M>NJ%a'@hhZdQm:."qn~PA^gq4o*>iCFToq*bAi#4FRtx}enhuQKz7fNqQz\BYU] $~O_5m-9'}MIs`XGwIEscw;e5b>n"B_?j/AkL~i/MEaZBLP X-Mailer: Mew version 2.2 on Emacs 20.7 / Mule 4.1 (AOI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13170 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: yoshfuji@linux-ipv6.org Precedence: bulk X-list: netdev In article <41CFDE1D.6000802@pobox.com> (at Mon, 27 Dec 2004 05:04:13 -0500), Jeff Garzik says: > Ok, it's happened again. x86 SMP, 2.6.10-rc3-bk9. > > I have left the machine up without rebooting, those with accounts may > ssh in. 
First message occurs on Dec 24 in /var/log/messages.1, and they > continue in /var/log/messages and dmesg. ssh to gw.yyz.us. : > Badness in dst_release at include/net/dst.h:149 > [] ip6_dst_check+0x64/0x6a [ipv6] > [] ip6_dst_lookup+0x1a7/0x1c1 [ipv6] > [] udpv6_sendmsg+0x297/0x931 [ipv6] Ok, thanks. Due to badness of timing (I was almost offline for last a few days), I could not login that machine before you rebooted. However, I must have got some hints from information I was collecting. I will look into the code (and come up with some debugging code at least). Thank you again. --yoshfuji From akpm@osdl.org Wed Dec 29 13:46:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 13:47:07 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTLkbE4009792 for ; Wed, 29 Dec 2004 13:46:57 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iBTLm3631980 for ; Wed, 29 Dec 2004 13:48:03 -0800 Date: Wed, 29 Dec 2004 13:48:01 -0800 From: Andrew Morton To: netdev@oss.sgi.com Subject: Fw: [Bugme-new] [Bug 3960] New: Few errors in pktgen.c Message-Id: <20041229134801.7e31e1cf.akpm@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13171 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Begin forwarded message: Date: Wed, 29 Dec 2004 04:13:51 -0800 From: bugme-daemon@osdl.org To: bugme-new@lists.osdl.org Subject: [Bugme-new] [Bug 3960] New: Few errors in pktgen.c http://bugme.osdl.org/show_bug.cgi?id=3960 Summary: Few errors in pktgen.c Kernel Version: >=2.6.10 Status: NEW 
Severity: normal Owner: acme@conectiva.com.br Submitter: nmalykh@bilim.com Distribution: All Hardware Environment: All Software Environment: All Problem Description: Incorrect parameters handling in pktgen.c Steps to reproduce: Try to use pktgen without srcmac, src_mac_count and MACSRC_RND. Source MAC header fields in generated frames will be 00:00:00:00:00:00 instead interface MAC I've correct version of this module. ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From domen@coderock.org Wed Dec 29 14:14:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 14:14:52 -0800 (PST) Received: from trashy.coderock.org (postfix@coderock.org [193.77.147.115]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTMENxs011369 for ; Wed, 29 Dec 2004 14:14:43 -0800 Received: by trashy.coderock.org (Postfix, from userid 780) id 58A941E8C0; Wed, 29 Dec 2004 23:15:53 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by trashy.coderock.org (Postfix) with ESMTP id 248611E3F5; Wed, 29 Dec 2004 23:15:52 +0100 (CET) Received: from localhost (coderock.org [193.77.147.115]) by trashy.coderock.org (Postfix) with ESMTP id 66FBF1E3D6; Wed, 29 Dec 2004 23:15:40 +0100 (CET) Date: Wed, 29 Dec 2004 23:16:15 +0100 From: Domen Puncer To: jgarzik@pobox.com Cc: netdev@oss.sgi.com Subject: [patch] arcnet: remove casts Message-ID: <20041229221615.GF26245@nd47.coderock.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13172 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: domen@coderock.org Precedence: bulk X-list: netdev Remove casts of (void *) pointers. 
drivers/net/arcnet/arc-rawmode.c | 4 ++-- drivers/net/arcnet/arc-rimi.c | 14 +++++++------- drivers/net/arcnet/arcnet.c | 30 +++++++++++++++--------------- drivers/net/arcnet/com20020.c | 6 +++--- drivers/net/arcnet/com90io.c | 4 ++-- drivers/net/arcnet/com90xx.c | 8 ++++---- drivers/net/arcnet/rfc1051.c | 8 ++++---- drivers/net/arcnet/rfc1201.c | 12 ++++++------ 8 files changed, 43 insertions(+), 43 deletions(-) Signed-off-by: Domen Puncer diff -pruNX dontdiff c/drivers/net/arcnet/arc-rawmode.c a/drivers/net/arcnet/arc-rawmode.c --- c/drivers/net/arcnet/arc-rawmode.c 2004-12-29 22:46:51.000000000 +0100 +++ a/drivers/net/arcnet/arc-rawmode.c 2004-12-29 23:09:19.000000000 +0100 @@ -87,7 +87,7 @@ MODULE_LICENSE("GPL"); static void rx(struct net_device *dev, int bufnum, struct archdr *pkthdr, int length) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct sk_buff *skb; struct archdr *pkt = pkthdr; int ofs; @@ -168,7 +168,7 @@ static int build_header(struct sk_buff * static int prepare_tx(struct net_device *dev, struct archdr *pkt, int length, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct arc_hardware *hard = &pkt->hard; int ofs; diff -pruNX dontdiff c/drivers/net/arcnet/arc-rimi.c a/drivers/net/arcnet/arc-rimi.c --- c/drivers/net/arcnet/arc-rimi.c 2004-10-19 13:52:57.000000000 +0200 +++ a/drivers/net/arcnet/arc-rimi.c 2004-12-29 23:09:19.000000000 +0100 @@ -230,7 +230,7 @@ err_free_irq: */ static int arcrimi_reset(struct net_device *dev, int really_reset) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *ioaddr = lp->mem_start + 0x800; BUGMSG(D_INIT, "Resetting %s (status=%02Xh)\n", dev->name, ASTATUS()); @@ -251,7 +251,7 @@ static int arcrimi_reset(struct net_devi static void arcrimi_setmask(struct net_device *dev, int mask) { - struct arcnet_local *lp = (struct 
arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *ioaddr = lp->mem_start + 0x800; AINTMASK(mask); @@ -259,7 +259,7 @@ static void arcrimi_setmask(struct net_d static int arcrimi_status(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *ioaddr = lp->mem_start + 0x800; return ASTATUS(); @@ -267,7 +267,7 @@ static int arcrimi_status(struct net_dev static void arcrimi_command(struct net_device *dev, int cmd) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *ioaddr = lp->mem_start + 0x800; ACOMMAND(cmd); @@ -276,7 +276,7 @@ static void arcrimi_command(struct net_d static void arcrimi_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *memaddr = lp->mem_start + 0x800 + bufnum * 512 + offset; TIME("memcpy_toio", count, memcpy_toio(memaddr, buf, count)); } @@ -285,7 +285,7 @@ static void arcrimi_copy_to_card(struct static void arcrimi_copy_from_card(struct net_device *dev, int bufnum, int offset, void *buf, int count) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *memaddr = lp->mem_start + 0x800 + bufnum * 512 + offset; TIME("memcpy_fromio", count, memcpy_fromio(buf, memaddr, count)); } @@ -331,7 +331,7 @@ static int __init arc_rimi_init(void) static void __exit arc_rimi_exit(void) { struct net_device *dev = my_dev; - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; unregister_netdev(dev); iounmap(lp->mem_start); diff -pruNX dontdiff c/drivers/net/arcnet/arcnet.c a/drivers/net/arcnet/arcnet.c --- c/drivers/net/arcnet/arcnet.c 2004-12-29 22:46:51.000000000 +0100 +++ a/drivers/net/arcnet/arcnet.c 2004-12-29 
23:09:19.000000000 +0100 @@ -182,7 +182,7 @@ EXPORT_SYMBOL(arcnet_dump_skb); void arcnet_dump_packet(struct net_device *dev, int bufnum, char *desc, int take_arcnet_lock) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int i, length; unsigned long flags = 0; static uint8_t buf[512]; @@ -245,7 +245,7 @@ void arcnet_unregister_proto(struct ArcP */ static void release_arcbuf(struct net_device *dev, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int i; lp->buf_queue[lp->first_free_buf++] = bufnum; @@ -267,7 +267,7 @@ static void release_arcbuf(struct net_de */ static int get_arcbuf(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int buf = -1, i; if (!atomic_dec_and_test(&lp->buf_lock)) { @@ -368,7 +368,7 @@ struct net_device *alloc_arcdev(char *na */ static int arcnet_open(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int count, newmtu, error; BUGMSG(D_INIT,"opened."); @@ -468,7 +468,7 @@ static int arcnet_open(struct net_device /* The inverse routine to arcnet_open - shuts down the card. 
*/ static int arcnet_close(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; netif_stop_queue(dev); @@ -489,7 +489,7 @@ static int arcnet_header(struct sk_buff unsigned short type, void *daddr, void *saddr, unsigned len) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; uint8_t _daddr, proto_num; struct ArcProto *proto; @@ -547,7 +547,7 @@ static int arcnet_header(struct sk_buff static int arcnet_rebuild_header(struct sk_buff *skb) { struct net_device *dev = skb->dev; - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int status = 0; /* default is failure */ unsigned short type; uint8_t daddr=0; @@ -592,7 +592,7 @@ static int arcnet_rebuild_header(struct /* Called by the kernel in order to transmit a packet. */ static int arcnet_send_packet(struct sk_buff *skb, struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct archdr *pkt; struct arc_rfc1201 *soft; struct ArcProto *proto; @@ -675,7 +675,7 @@ static int arcnet_send_packet(struct sk_ */ static int go_tx(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; BUGMSG(D_DURING, "go_tx: status=%Xh, intmask=%Xh, next_tx=%d, cur_tx=%d\n", ASTATUS(), lp->intmask, lp->next_tx, lp->cur_tx); @@ -706,7 +706,7 @@ static int go_tx(struct net_device *dev) static void arcnet_timeout(struct net_device *dev) { unsigned long flags; - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int status = ASTATUS(); char *msg; @@ -755,7 +755,7 @@ irqreturn_t arcnet_interrupt(int irq, vo BUGMSG(D_DURING, "in arcnet_interrupt\n"); - lp = (struct arcnet_local *) dev->priv; + lp = dev->priv; if (!lp) BUG(); @@ -990,7 +990,7 @@ irqreturn_t arcnet_interrupt(int irq, 
vo */ void arcnet_rx(struct net_device *dev, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct archdr pkt; struct arc_rfc1201 *soft; int length, ofs; @@ -1054,7 +1054,7 @@ void arcnet_rx(struct net_device *dev, i */ static struct net_device_stats *arcnet_get_stats(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; return &lp->stats; } @@ -1071,7 +1071,7 @@ static void null_rx(struct net_device *d static int null_build_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, uint8_t daddr) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; BUGMSG(D_PROTO, "tx: can't build header for encap %02Xh; load a protocol driver.\n", @@ -1086,7 +1086,7 @@ static int null_build_header(struct sk_b static int null_prepare_tx(struct net_device *dev, struct archdr *pkt, int length, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct arc_hardware newpkt; BUGMSG(D_PROTO, "tx: no encap for this host; load a protocol driver.\n"); diff -pruNX dontdiff c/drivers/net/arcnet/com20020.c a/drivers/net/arcnet/com20020.c --- c/drivers/net/arcnet/com20020.c 2004-12-29 22:46:51.000000000 +0100 +++ a/drivers/net/arcnet/com20020.c 2004-12-29 23:11:53.000000000 +0100 @@ -159,7 +159,7 @@ int com20020_found(struct net_device *de /* Initialize the rest of the device structure. 
*/ - lp = (struct arcnet_local *) dev->priv; + lp = dev->priv; lp->hw.owner = THIS_MODULE; lp->hw.command = com20020_command; @@ -233,7 +233,7 @@ int com20020_found(struct net_device *de */ static int com20020_reset(struct net_device *dev, int really_reset) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; u_int ioaddr = dev->base_addr; u_char inbyte; @@ -300,7 +300,7 @@ static int com20020_status(struct net_de static void com20020_close(struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int ioaddr = dev->base_addr; /* disable transmitter */ diff -pruNX dontdiff c/drivers/net/arcnet/com90io.c a/drivers/net/arcnet/com90io.c --- c/drivers/net/arcnet/com90io.c 2004-02-18 13:27:39.000000000 +0100 +++ a/drivers/net/arcnet/com90io.c 2004-12-29 23:09:19.000000000 +0100 @@ -248,7 +248,7 @@ static int __init com90io_found(struct n return -EBUSY; } - lp = (struct arcnet_local *) (dev->priv); + lp = dev->priv; lp->card_name = "COM90xx I/O"; lp->hw.command = com90io_command; lp->hw.status = com90io_status; @@ -290,7 +290,7 @@ static int __init com90io_found(struct n */ static int com90io_reset(struct net_device *dev, int really_reset) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; short ioaddr = dev->base_addr; BUGMSG(D_INIT, "Resetting %s (status=%02Xh)\n", dev->name, ASTATUS()); diff -pruNX dontdiff c/drivers/net/arcnet/com90xx.c a/drivers/net/arcnet/com90xx.c --- c/drivers/net/arcnet/com90xx.c 2004-10-19 13:52:58.000000000 +0200 +++ a/drivers/net/arcnet/com90xx.c 2004-12-29 23:09:19.000000000 +0100 @@ -529,7 +529,7 @@ static void com90xx_setmask(struct net_d */ int com90xx_reset(struct net_device *dev, int really_reset) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; short ioaddr = dev->base_addr; BUGMSG(D_INIT, "Resetting 
(status=%02Xh)\n", ASTATUS()); @@ -565,7 +565,7 @@ int com90xx_reset(struct net_device *dev static void com90xx_copy_to_card(struct net_device *dev, int bufnum, int offset, void *buf, int count) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *memaddr = lp->mem_start + bufnum * 512 + offset; TIME("memcpy_toio", count, memcpy_toio(memaddr, buf, count)); } @@ -574,7 +574,7 @@ static void com90xx_copy_to_card(struct static void com90xx_copy_from_card(struct net_device *dev, int bufnum, int offset, void *buf, int count) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; void __iomem *memaddr = lp->mem_start + bufnum * 512 + offset; TIME("memcpy_fromio", count, memcpy_fromio(buf, memaddr, count)); } @@ -600,7 +600,7 @@ static void __exit com90xx_exit(void) for (count = 0; count < numcards; count++) { dev = cards[count]; - lp = (struct arcnet_local *) dev->priv; + lp = dev->priv; unregister_netdev(dev); free_irq(dev->irq, dev); diff -pruNX dontdiff c/drivers/net/arcnet/rfc1051.c a/drivers/net/arcnet/rfc1051.c --- c/drivers/net/arcnet/rfc1051.c 2004-12-29 22:46:51.000000000 +0100 +++ a/drivers/net/arcnet/rfc1051.c 2004-12-29 23:09:19.000000000 +0100 @@ -88,7 +88,7 @@ MODULE_LICENSE("GPL"); */ static unsigned short type_trans(struct sk_buff *skb, struct net_device *dev) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct archdr *pkt = (struct archdr *) skb->data; struct arc_rfc1051 *soft = &pkt->soft.rfc1051; int hdr_size = ARC_HDR_SIZE + RFC1051_HDR_SIZE; @@ -125,7 +125,7 @@ static unsigned short type_trans(struct static void rx(struct net_device *dev, int bufnum, struct archdr *pkthdr, int length) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct sk_buff *skb; struct archdr *pkt = pkthdr; int ofs; @@ -169,7 +169,7 @@ static void 
rx(struct net_device *dev, i static int build_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, uint8_t daddr) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int hdr_size = ARC_HDR_SIZE + RFC1051_HDR_SIZE; struct archdr *pkt = (struct archdr *) skb_push(skb, hdr_size); struct arc_rfc1051 *soft = &pkt->soft.rfc1051; @@ -220,7 +220,7 @@ static int build_header(struct sk_buff * static int prepare_tx(struct net_device *dev, struct archdr *pkt, int length, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct arc_hardware *hard = &pkt->hard; int ofs; diff -pruNX dontdiff c/drivers/net/arcnet/rfc1201.c a/drivers/net/arcnet/rfc1201.c --- c/drivers/net/arcnet/rfc1201.c 2004-12-29 22:46:51.000000000 +0100 +++ a/drivers/net/arcnet/rfc1201.c 2004-12-29 23:09:19.000000000 +0100 @@ -92,7 +92,7 @@ static unsigned short type_trans(struct { struct archdr *pkt = (struct archdr *) skb->data; struct arc_rfc1201 *soft = &pkt->soft.rfc1201; - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int hdr_size = ARC_HDR_SIZE + RFC1201_HDR_SIZE; /* Pull off the arcnet header. 
*/ @@ -134,7 +134,7 @@ static unsigned short type_trans(struct static void rx(struct net_device *dev, int bufnum, struct archdr *pkthdr, int length) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct sk_buff *skb; struct archdr *pkt = pkthdr; struct arc_rfc1201 *soft = &pkthdr->soft.rfc1201; @@ -376,7 +376,7 @@ static void rx(struct net_device *dev, i static int build_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, uint8_t daddr) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int hdr_size = ARC_HDR_SIZE + RFC1201_HDR_SIZE; struct archdr *pkt = (struct archdr *) skb_push(skb, hdr_size); struct arc_rfc1201 *soft = &pkt->soft.rfc1201; @@ -443,7 +443,7 @@ static int build_header(struct sk_buff * static void load_pkt(struct net_device *dev, struct arc_hardware *hard, struct arc_rfc1201 *soft, int softlen, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; int ofs; /* assume length <= XMTU: someone should have handled that by now. 
*/ @@ -476,7 +476,7 @@ static void load_pkt(struct net_device * static int prepare_tx(struct net_device *dev, struct archdr *pkt, int length, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; const int maxsegsize = XMTU - RFC1201_HDR_SIZE; struct Outgoing *out; @@ -511,7 +511,7 @@ static int prepare_tx(struct net_device static int continue_tx(struct net_device *dev, int bufnum) { - struct arcnet_local *lp = (struct arcnet_local *) dev->priv; + struct arcnet_local *lp = dev->priv; struct Outgoing *out = &lp->outgoing; struct arc_hardware *hard = &out->pkt->hard; struct arc_rfc1201 *soft = &out->pkt->soft.rfc1201, *newsoft; From acme@conectiva.com.br Wed Dec 29 14:27:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 14:27:30 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTMR17x012309 for ; Wed, 29 Dec 2004 14:27:22 -0800 Received: from [200.163.203.158] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CjmLH-0005A1-00; Wed, 29 Dec 2004 20:30:55 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 2E4CC75079; Wed, 29 Dec 2004 20:28:21 -0200 (BRST) Message-ID: <41D3306F.7080605@conectiva.com.br> Date: Wed, 29 Dec 2004 20:32:15 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: "David S. 
Miller"
Cc: Networking Team , James.Bottomley@HansenPartnership.com
Subject: [IPV6] fix inet6_sk for non IPV6 builds
Content-Type: multipart/mixed; boundary="------------090801080902000902020809"
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13173
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: acme@conectiva.com.br
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.
--------------090801080902000902020809
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi David,

Please apply this patch; the problem was noted by James Bottomley, who
doesn't enable CONFIG_IPV6.

Signed-off-by: Arnaldo Carvalho de Melo

- Arnaldo

--------------090801080902000902020809
Content-Type: text/plain; name="inet6_sk.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="inet6_sk.patch"

===== include/linux/ipv6.h 1.23 vs edited =====
--- 1.23/include/linux/ipv6.h 2004-12-27 23:56:33 -02:00
+++ edited/include/linux/ipv6.h 2004-12-29 20:22:45 -02:00
@@ -273,6 +273,7 @@
 	struct ipv6_pinfo inet6;
 };
 
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk)
 {
 	return inet_sk(__sk)->pinet6;
@@ -283,7 +284,6 @@
 	return &((struct raw6_sock *)__sk)->raw6;
 }
 
-#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
 #define __ipv6_only_sock(sk) (inet6_sk(sk)->ipv6only)
 #define ipv6_only_sock(sk) ((sk)->sk_family == PF_INET6 && __ipv6_only_sock(sk))
 #else

--------------090801080902000902020809--

From juhl-lkml@dif.dk Wed Dec 29 15:26:58 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 15:27:06 -0800 (PST)
Received: from mail.dif.dk (mail.dif.dk [193.138.115.101]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTNQbat015014 for ; Wed, 29 Dec 2004 15:26:58 -0800
Received: from localhost (localhost [127.0.0.1]) by mail.dif.dk (Postfix) with ESMTP id 8A890FFC93 for ; Thu, 30 Dec 2004 00:31:47 +0100 (CET) Received: from mail.dif.dk ([127.0.0.1]) by localhost (saerimmer [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 25655-03 for ; Thu, 30 Dec 2004 00:31:46 +0100 (CET) Received: from diftmgw2.backbone.dif.dk (diftmgw2.backbone.dif.dk [10.227.136.246]) by mail.dif.dk (Postfix) with ESMTP id 2FCB4FFC99 for ; Thu, 30 Dec 2004 00:31:46 +0100 (CET) Received: from DIFPST1A.backbone.dif.dk ([10.227.136.220]) by diftmgw2.backbone.dif.dk with InterScan Messaging Security Suite; Thu, 30 Dec 2004 00:27:07 +0100 Received: from [172.16.2.11] (10.227.136.29 [10.227.136.29]) by DIFPST1A.backbone.dif.dk with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2657.72) id ZGMPDSL0; Thu, 30 Dec 2004 00:28:02 +0100 Date: Thu, 30 Dec 2004 00:39:10 +0100 (CET) From: Jesper Juhl To: Jesper Juhl Cc: Arnaldo Carvalho de Melo , =?ISO-8859-1?Q?J=F6rn_Engel?= , Networking Team , linux-net , "David S. 
Miller" , Alexey Kuznetsov , linux-kernel
Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c
In-Reply-To:
Message-ID:
References: <41D2104D.3010406@conectiva.com.br> <41D2120E.8030008@conectiva.com.br> <20041229021256.GD29323@wohnheim.fh-wedel.de> <41D2152F.8080501@conectiva.com.br>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=iso-8859-1
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Scanned: by amavisd-new at dif.dk
X-Virus-Status: Clean
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from QUOTED-PRINTABLE to 8bit by oss.sgi.com id iBTNQbat015014
X-archive-position: 13174
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: juhl-lkml@dif.dk
Precedence: bulk
X-list: netdev

On Wed, 29 Dec 2004, Jesper Juhl wrote:

> On Wed, 29 Dec 2004, Arnaldo Carvalho de Melo wrote:
>
> >
> >
> > Jörn Engel wrote:
> > > On Wed, 29 December 2004 00:10:22 -0200, Arnaldo Carvalho de Melo wrote:
> > >
> > > > > It doesn't make much difference, it's mostly for
> > > > > completeness/correctness.
> > > >
> > > > No, it does a helluva difference, give it a try :-)
> > >
> > >
> > > hint: look for "\n"
> >
> > hint2: Or the _lack_ of "\n" 8)
> >
>
> Ok, obviously something's wrong, but it's currently 03:44 here, so I'll
> take a look at it tomorrow (or quite possibly the day after since I have
> things to do).
> Thank you for commenting, I'll dig into it at the first opportunity I have.
> >

Ok, this is a bit embarrassing. Looking at the patch now after getting some
sleep, it's quite obvious that it is wrong. I should have slept on it
before sending it - sorry for the noise, people.
-- Jesper Juhl From shekhark@juniper.net Wed Dec 29 15:49:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 15:49:35 -0800 (PST) Received: from borg.juniper.net (borg.juniper.net [207.17.137.119]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTNn6qC016878 for ; Wed, 29 Dec 2004 15:49:26 -0800 Received: from unknown (HELO gamma.jnpr.net) (172.24.245.25) by borg.juniper.net with ESMTP; 29 Dec 2004 15:50:34 -0800 X-BrightmailFiltered: true X-Ironport-AV: i="3.88,91,1102320000"; d="scan'208"; a="91636736:sNHT21777524" Received: from gluon.jnpr.net ([172.24.15.23]) by gamma.jnpr.net with Microsoft SMTPSVC(6.0.3790.211); Wed, 29 Dec 2004 15:50:34 -0800 X-MimeOLE: Produced By Microsoft Exchange V6.5.6944.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" Subject: RE: 2.6 IPSec Throughput puzzle Date: Wed, 29 Dec 2004 15:50:34 -0800 Message-ID: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: 2.6 IPSec Throughput puzzle Thread-Index: AcTtn7qNa8vYjV+2SGe58fWBNe0LfwAYHiHQ From: "Shekhar Kshirsagar" To: "Networking Team" Cc: "bert hubert" X-OriginalArrivalTime: 29 Dec 2004 23:50:34.0953 (UTC) FILETIME=[3083AB90:01C4EE01] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by oss.sgi.com id iBTNn6qC016878 X-archive-position: 13175 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: shekhark@juniper.net Precedence: bulk X-list: netdev I played with oprofile for a while, and it seems that in case of null encryption, scatterwalk related code takes most of the cpu cycles. 

Tunnel mode ESP with null-encryption/sha1 (throughput 51MBits/sec),
following are the top contenders:

samples  %        symbol name
5794     18.9192  crypt
4812     15.7127  scatterwalk_done
3354     10.9518  sha1_transform
2530      8.2612  page_address
2469      8.0620  scatterwalk_copychunks
2458      8.0261  scatterwalk_map
1440      4.7020  kmap_atomic
1360      4.4408  default_idle
1077      3.5167  scatterwalk_whichbuf
930       3.0367  kunmap_atomic
731       2.3869  handle_IRQ_event
676       2.2073  ecb_process
381       1.2441  ide_intr

Tunnel mode ESP with aes/sha1 (throughput 114MBits/sec),
following are the top contenders:

samples  %        symbol name
9056     29.6122  default_idle
7245     23.6904  sha1_transform
3933     12.8605  aes_enc_blk
931       3.0443  cbc_process
792       2.5898  crypt
716       2.3412  scatterwalk_done
519       1.6971  handle_IRQ_event
433       1.4159  sha1_update
412       1.3472  pskb_expand_head
380       1.2426  csum_partial
373       1.2197  page_address
368       1.2033  scatterwalk_map
345       1.1281  scatterwalk_copychunks

Is there any place where I can find documentation about what exactly
scatterwalk does?

Thanks,
Shekhar

> -----Original Message-----
> From: netdev-bounce@oss.sgi.com [mailto:netdev-bounce@oss.sgi.com] On
> Behalf Of bert hubert
> Sent: Wednesday, December 29, 2004 4:12 AM
> To: Shekhar Kshirsagar
> Cc: Networking Team
> Subject: Re: 2.6 IPSec Throughput puzzle
>
> On Tue, Dec 28, 2004 at 07:17:26PM -0800, Shekhar Kshirsagar wrote:
>
> > I'm really puzzled with the performance results I'm getting. The
> > performance drop with AH seems high, but worst is performance drop with
> > null-esp in transport mode. Another strange observation is that DES
> > throughput is greater than null encryption throughput.
>
> Thanks for doing these benchmarks! I did some myself some time ago, but my
> hardware isn't representative of anything (consisting of a pentium pro 200
> against a P3 1GHz).
>
> > Throughput without IPSec       : 936 MBits/s ( 25% CPU Util)
> > Transport mode AH - SHA1       : 398 MBits/s (100% CPU Util)
> > Transport mode ESP - null/SHA1 :  62 MBits/s (100% CPU Util)
> > Transport mode ESP - des/SHA1  : 111 MBits/s (100% CPU Util)
> > Transport mode ESP - 3des/SHA1 :  54 MBits/s (100% CPU Util)
> > Transport mode ESP - aes/SHA1  : 192 MBits/s (100% CPU Util)
> >
> > Do these numbers sound reasonable?
> > (I don't have any iptable rules)
>
> It is very easy to use oprofile these days, I suggest you profile for a bit,
> should easily tell you what the culprit is. 62MBit/s sounds very low.
>
> Good luck!
>
> --
> http://www.PowerDNS.com Open source, database driven DNS Software
> http://lartc.org Linux Advanced Routing & Traffic Control HOWTO
>

From romieu@fr.zoreil.com Wed Dec 29 15:51:31 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 15:51:38 -0800 (PST)
Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBTNp9dd017404 for ; Wed, 29 Dec 2004 15:51:30 -0800
Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.1/8.12.1) with ESMTP id iBTNq9D7001985; Thu, 30 Dec 2004 00:52:09 +0100
Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.1/8.13.1/Submit) id iBTNq34O001984; Thu, 30 Dec 2004 00:52:03 +0100
Date: Thu, 30 Dec 2004 00:52:03 +0100
From: Francois Romieu
To: Richard Dawe
Cc: netdev@oss.sgi.com
Subject: Re: Acer Aspire 1524WLMi and RealTek 8169 - very slow
Message-ID: <20041229235203.GA5465@electric-eye.fr.zoreil.com>
References: <41A09541.5040405@phekda.gotadsl.co.uk> <41A0F0D5.9050702@phekda.gotadsl.co.uk> <20041121205814.GA22460@electric-eye.fr.zoreil.com> <41A24F35.5080106@phekda.gotadsl.co.uk> <20041122213008.GA9618@electric-eye.fr.zoreil.com> <41D2844E.5070204@phekda.gotadsl.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <41D2844E.5070204@phekda.gotadsl.co.uk>
User-Agent: Mutt/1.4.1i
X-Organisation: Land of Sunshine Inc.
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13176
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: romieu@fr.zoreil.com
Precedence: bulk
X-list: netdev

Richard Dawe :
[...]
> IRQ routing seems to be disabled in 2.6.10. I got a warning about an
> unhandled interrupt for the VIA 8255 (I think that's the right number)
> on shutdown. I'm wondering if there is some bad interaction between the
> VIA chipset and the 8110 chipset.

Possible. I hope the networking does not perform better when you are
listening to music.

> Another datapoint: On a Netgear switch at work, I had to put the chipset
> into 100Mbps half-duplex and set the MTU to 576 bytes, to get any decent
> performance out of it.
>
> Anyway, I need to do more investigation.

Could you send an updated dmesg, lspci -vvx, /proc/interrupts please ?

Did you remove the other OS from the laptop ? I'd be curious to know if
it can teach us something.

--
Ueimor

From herbert@13thfloor.at Wed Dec 29 18:11:30 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 18:11:37 -0800 (PST)
Received: from mail.13thfloor.at (MAIL.13thfloor.at [212.16.62.51]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU2B9JT027052 for ; Wed, 29 Dec 2004 18:11:30 -0800
Received: by mail.13thfloor.at (Postfix, from userid 1001) id 5AC4B170246; Thu, 30 Dec 2004 03:12:39 +0100 (CET)
Date: Thu, 30 Dec 2004 03:12:39 +0100
From: Herbert Poetzl
To: Jesper Juhl
Cc: Arnaldo Carvalho de Melo , =?iso-8859-1?Q?J=F6rn?= Engel , Networking Team , linux-net , "David S.
Miller" , Alexey Kuznetsov , linux-kernel
Subject: Re: Patch: add loglevel to printk's in net/ipv4/route.c
Message-ID: <20041230021239.GA31474@mail.13thfloor.at>
Mail-Followup-To: Jesper Juhl , Arnaldo Carvalho de Melo , =?iso-8859-1?Q?J=F6rn?= Engel , Networking Team , linux-net , "David S. Miller" , Alexey Kuznetsov , linux-kernel
References: <41D2104D.3010406@conectiva.com.br> <41D2120E.8030008@conectiva.com.br> <20041229021256.GD29323@wohnheim.fh-wedel.de> <41D2152F.8080501@conectiva.com.br>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.4.1i
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13177
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: herbert@13thfloor.at
Precedence: bulk
X-list: netdev

On Thu, Dec 30, 2004 at 12:39:10AM +0100, Jesper Juhl wrote:
> On Wed, 29 Dec 2004, Jesper Juhl wrote:
>
> > On Wed, 29 Dec 2004, Arnaldo Carvalho de Melo wrote:
> >
> > >
> > >
> > > Jörn Engel wrote:
> > > > On Wed, 29 December 2004 00:10:22 -0200, Arnaldo Carvalho de Melo wrote:
> > > >
> > > > > > It doesn't make much difference, it's mostly for
> > > > > > completeness/correctness.
> > > > >
> > > > > No, it does a helluva difference, give it a try :-)
> > > >
> > > >
> > > > hint: look for "\n"
> > >
> > > hint2: Or the _lack_ of "\n" 8)
> > >
> >
> > Ok, obviously something's wrong, but it's currently 03:44 here, so I'll
> > take a look at it tomorrow (or quite possibly the day after since I have
> > things to do).
> > Thank you for commenting, I'll dig into it at the first opportunity I have.
> > >
>
> Ok, this is a bit embarrassing. Looking at the patch now after getting some
> sleep it's quite obvious that it is wrong.
I should have slept on it > before sending it - sorry for the noise people. nothing to be sorry about, and thanks for all the work you are doing for the linux kernel ... best, Herbert > -- > Jesper Juhl > > > > From kaber@trash.net Wed Dec 29 19:38:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:00 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3cVcN030192 for ; Wed, 29 Dec 2004 19:38:52 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrAi-0002IS-K7; Thu, 30 Dec 2004 04:40:22 +0100 Message-ID: <41D37864.9050609@trash.net> Date: Thu, 30 Dec 2004 04:39:16 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 1/17]: act_api.c whitespace cleanup Content-Type: multipart/mixed; boundary="------------030808000601080109070105" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13179 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------030808000601080109070105 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Whitespace cleanup, break lines at 80 characters. --------------030808000601080109070105 Content-Type: text/x-patch; name="01.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="01.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 00:37:52+01:00 kaber@coreworks.de # [PKT_SCHED]: act_api.c whitespace cleanup # # Signed-off-by: Patrick McHardy # # net/sched/act_api.c # 2004/12/30 00:37:45+01:00 kaber@coreworks.de +124 -168 # [PKT_SCHED]: act_api.c whitespace cleanup # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/act_api.c b/net/sched/act_api.c --- a/net/sched/act_api.c 2004-12-30 04:00:57 +01:00 +++ b/net/sched/act_api.c 2004-12-30 04:00:57 +01:00 @@ -35,14 +35,14 @@ #include #if 1 /* control */ -#define DPRINTK(format,args...) printk(KERN_DEBUG format,##args) +#define DPRINTK(format, args...) printk(KERN_DEBUG format, ##args) #else -#define DPRINTK(format,args...) +#define DPRINTK(format, args...) #endif #if 0 /* data */ -#define D2PRINTK(format,args...) printk(KERN_DEBUG format,##args) +#define D2PRINTK(format, args...) printk(KERN_DEBUG format, ##args) #else -#define D2PRINTK(format,args...) +#define D2PRINTK(format, args...) #endif static struct tc_action_ops *act_base = NULL; @@ -53,18 +53,16 @@ struct tc_action_ops *a, **ap; write_lock(&act_mod_lock); - for (ap = &act_base; (a=*ap)!=NULL; ap = &a->next) { + for (ap = &act_base; (a = *ap) != NULL; ap = &a->next) { if (act->type == a->type || (strcmp(act->kind, a->kind) == 0)) { write_unlock(&act_mod_lock); return -EEXIST; } } - - act->next = NULL; + act->next = NULL; *ap = act; write_unlock(&act_mod_lock); - return 0; } @@ -74,10 +72,9 @@ int err = -ENOENT; write_lock(&act_mod_lock); - for (ap = &act_base; (a=*ap)!=NULL; ap = &a->next) + for (ap = &act_base; (a = *ap) != NULL; ap = &a->next) if(a == act) break; - if (a) { *ap = a->next; a->next = NULL; @@ -90,47 +87,42 @@ /* lookup by name */ static struct tc_action_ops *tc_lookup_action_n(char *kind) { - struct tc_action_ops *a = NULL; if (kind) { read_lock(&act_mod_lock); for (a = act_base; a; a = a->next) { - if (strcmp(kind,a->kind) == 0) { + if (strcmp(kind, a->kind) == 0) { if (!try_module_get(a->owner)) { read_unlock(&act_mod_lock); return 
NULL; - } + } break; } } read_unlock(&act_mod_lock); } - return a; } /* lookup by rtattr */ static struct tc_action_ops *tc_lookup_action(struct rtattr *kind) { - struct tc_action_ops *a = NULL; if (kind) { read_lock(&act_mod_lock); for (a = act_base; a; a = a->next) { - - if (strcmp((char*)RTA_DATA(kind),a->kind) == 0){ + if (strcmp((char*)RTA_DATA(kind), a->kind) == 0){ if (!try_module_get(a->owner)) { read_unlock(&act_mod_lock); return NULL; - } + } break; } } read_unlock(&act_mod_lock); } - return a; } @@ -147,33 +139,34 @@ if (!try_module_get(a->owner)) { read_unlock(&act_mod_lock); return NULL; - } + } break; } } read_unlock(&act_mod_lock); } - return a; } #endif -int tcf_action_exec(struct sk_buff *skb,struct tc_action *act, struct tcf_result *res) +int tcf_action_exec(struct sk_buff *skb, struct tc_action *act, + struct tcf_result *res) { - struct tc_action *a; - int ret = -1; + int ret = -1; if (skb->tc_verd & TC_NCLS) { skb->tc_verd = CLR_TC_NCLS(skb->tc_verd); - D2PRINTK("(%p)tcf_action_exec: cleared TC_NCLS in %s out %s\n",skb,skb->input_dev?skb->input_dev->name:"xxx",skb->dev->name); + D2PRINTK("(%p)tcf_action_exec: cleared TC_NCLS in %s out %s\n", + skb, skb->input_dev ? skb->input_dev->name : "xxx", + skb->dev->name); ret = TC_ACT_OK; goto exec_done; } while ((a = act) != NULL) { repeat: if (a->ops && a->ops->act) { - ret = a->ops->act(&skb,a); + ret = a->ops->act(&skb, a); if (TC_MUNGED & skb->tc_verd) { /* copied already, allow trampling */ skb->tc_verd = SET_TC_OK2MUNGE(skb->tc_verd); @@ -195,7 +188,6 @@ res->class = 0; skb->tc_classid = 0; } - return ret; } @@ -205,54 +197,50 @@ for (a = act; act; a = act) { if (a && a->ops && a->ops->cleanup) { - DPRINTK("tcf_action_destroy destroying %p next %p\n", a,a->next?a->next:NULL); + DPRINTK("tcf_action_destroy destroying %p next %p\n", + a, a->next ? 
a->next : NULL); act = act->next; - if (ACT_P_DELETED == a->ops->cleanup(a, bind)) { + if (ACT_P_DELETED == a->ops->cleanup(a, bind)) module_put(a->ops->owner); - } - a->ops = NULL; + a->ops = NULL; kfree(a); } else { /*FIXME: Remove later - catch insertion bugs*/ - printk("tcf_action_destroy: BUG? destroying NULL ops \n"); + printk("tcf_action_destroy: BUG? destroying NULL ops\n"); if (a) { act = act->next; kfree(a); } else { - printk("tcf_action_destroy: BUG? destroying NULL action! \n"); + printk("tcf_action_destroy: BUG? destroying NULL action!\n"); break; } } } } -int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +int +tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { int err = -EINVAL; - - if ( (NULL == a) || (NULL == a->ops) - || (NULL == a->ops->dump) ) + if ((NULL == a) || (NULL == a->ops) || (NULL == a->ops->dump)) return err; return a->ops->dump(skb, a, bind, ref); - } - -int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +int +tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { int err = -EINVAL; - unsigned char *b = skb->tail; + unsigned char *b = skb->tail; struct rtattr *r; - - if ( (NULL == a) || (NULL == a->ops) - || (NULL == a->ops->dump) || (NULL == a->ops->kind)) + if ((NULL == a) || (NULL == a->ops) || (NULL == a->ops->dump) || + (NULL == a->ops->kind)) return err; - RTA_PUT(skb, TCA_KIND, IFNAMSIZ, a->ops->kind); - if (tcf_action_copy_stats(skb,a)) + if (tcf_action_copy_stats(skb, a)) goto rtattr_failure; r = (struct rtattr*) skb->tail; RTA_PUT(skb, TCA_OPTIONS, 0, NULL); @@ -261,18 +249,17 @@ return err; } - rtattr_failure: skb_trim(skb, b - skb->data); return -1; - } -int tcf_action_dump(struct sk_buff *skb, struct tc_action *act, int bind, int ref) +int +tcf_action_dump(struct sk_buff *skb, struct tc_action *act, int bind, int ref) { struct tc_action *a; int err = -EINVAL; - unsigned char *b = skb->tail; + 
unsigned char *b = skb->tail; struct rtattr *r ; while ((a = act) != NULL) { @@ -280,9 +267,8 @@ act = a->next; RTA_PUT(skb, a->order, 0, NULL); err = tcf_action_dump_1(skb, a, bind, ref); - if (0 > err) + if (0 > err) goto rtattr_failure; - r->rta_len = skb->tail - (u8*)r; } @@ -291,7 +277,6 @@ rtattr_failure: skb_trim(skb, b - skb->data); return -err; - } struct tc_action *tcf_action_init_1(struct rtattr *rta, struct rtattr *est, @@ -306,16 +291,16 @@ *err = -EINVAL; if (NULL == name) { - if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta))<0) + if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), + RTA_PAYLOAD(rta)) < 0) goto err_out; kind = tb[TCA_ACT_KIND-1]; if (NULL != kind) { sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { - printk(" Action %s bad\n", (char*)RTA_DATA(kind)); + printk("Action %s bad\n", (char*)RTA_DATA(kind)); goto err_out; } - } else { printk("Action bad kind\n"); goto err_out; @@ -323,19 +308,19 @@ a_o = tc_lookup_action(kind); } else { sprintf(act_name, "%s", name); - DPRINTK("tcf_action_init_1: finding %s\n",act_name); + DPRINTK("tcf_action_init_1: finding %s\n", act_name); a_o = tc_lookup_action_n(name); } #ifdef CONFIG_KMOD if (NULL == a_o) { - DPRINTK("tcf_action_init_1: trying to load module %s\n",act_name); - request_module (act_name); + DPRINTK("tcf_action_init_1: trying to load module %s\n", act_name); + request_module(act_name); a_o = tc_lookup_action_n(act_name); } #endif if (NULL == a_o) { - printk("failed to find %s\n",act_name); + printk("failed to find %s\n", act_name); goto err_out; } @@ -363,12 +348,12 @@ /* module count goes up only when brand new policy is created if it exists and is only bound to in a_o->init() then - ACT_P_CREATED is not returned (a zero is). - */ + ACT_P_CREATED is not returned (a zero is). 
+ */ if (*err != ACT_P_CREATED) module_put(a_o->owner); a->ops = a_o; - DPRINTK("tcf_action_init_1: successfull %s \n",act_name); + DPRINTK("tcf_action_init_1: successfull %s \n", act_name); *err = 0; return a; @@ -419,7 +404,7 @@ return NULL; } -int tcf_action_copy_stats (struct sk_buff *skb,struct tc_action *a) +int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a) { int err; struct gnet_dump d; @@ -462,14 +447,13 @@ return -1; } - static int -tca_get_fill(struct sk_buff *skb, struct tc_action *a, - u32 pid, u32 seq, unsigned flags, int event, int bind, int ref) +tca_get_fill(struct sk_buff *skb, struct tc_action *a, u32 pid, u32 seq, + unsigned flags, int event, int bind, int ref) { struct tcamsg *t; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; struct rtattr *x; nlh = NLMSG_PUT(skb, pid, seq, event, sizeof(*t)); @@ -480,9 +464,8 @@ x = (struct rtattr*) skb->tail; RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); - if (0 > tcf_action_dump(skb, a, bind, ref)) { + if (0 > tcf_action_dump(skb, a, bind, ref)) goto rtattr_failure; - } x->rta_len = skb->tail - (u8*)x; @@ -495,58 +478,53 @@ return -1; } -static int act_get_notify(u32 pid, struct nlmsghdr *n, - struct tc_action *a, int event) +static int +act_get_notify(u32 pid, struct nlmsghdr *n, struct tc_action *a, int event) { struct sk_buff *skb; - int err = 0; skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); if (!skb) return -ENOBUFS; - - if (tca_get_fill(skb, a, pid, n->nlmsg_seq, 0, event, 0, 0) <= 0) { + if (tca_get_fill(skb, a, pid, n->nlmsg_seq, 0, event, 0, 0) <= 0) { kfree_skb(skb); return -EINVAL; } - - err = netlink_unicast(rtnl,skb, pid, MSG_DONTWAIT); + err = netlink_unicast(rtnl, skb, pid, MSG_DONTWAIT); if (err > 0) err = 0; return err; } -static int tcf_action_get_1(struct rtattr *rta, struct tc_action *a, struct nlmsghdr *n, u32 pid) +static int tcf_action_get_1(struct rtattr *rta, struct tc_action *a, + struct nlmsghdr *n, u32 pid) { 
struct tc_action_ops *a_o; char act_name[4 + IFNAMSIZ + 1]; struct rtattr *tb[TCA_ACT_MAX+1]; struct rtattr *kind = NULL; int index; - int err = -EINVAL; if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta))<0) goto err_out; - - kind = tb[TCA_ACT_KIND-1]; if (NULL != kind) { sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { - printk("tcf_action_get_1: action %s bad\n", (char*)RTA_DATA(kind)); + printk("tcf_action_get_1: action %s bad\n", + (char*)RTA_DATA(kind)); goto err_out; } - } else { printk("tcf_action_get_1: action bad kind\n"); goto err_out; } - if (tb[TCA_ACT_INDEX - 1]) { + if (tb[TCA_ACT_INDEX - 1]) index = *(int *)RTA_DATA(tb[TCA_ACT_INDEX - 1]); - } else { + else { printk("tcf_action_get_1: index not received\n"); goto err_out; } @@ -557,16 +535,13 @@ request_module (act_name); a_o = tc_lookup_action_n(act_name); } - #endif if (NULL == a_o) { - printk("failed to find %s\n",act_name); + printk("failed to find %s\n", act_name); goto err_out; } - - if (NULL == a) { + if (NULL == a) goto err_mod; - } a->ops = a_o; @@ -584,7 +559,7 @@ return err; } -static void cleanup_a (struct tc_action *act) +static void cleanup_a(struct tc_action *act) { struct tc_action *a; @@ -594,9 +569,8 @@ a->ops = NULL; a->priv = NULL; kfree(a); - } else { + } else printk("cleanup_a: BUG? 
empty action\n"); - } } } @@ -608,10 +582,10 @@ if (NULL != kind) { sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { - printk("get_ao: action %s bad\n", (char*)RTA_DATA(kind)); + printk("get_ao: action %s bad\n", + (char*)RTA_DATA(kind)); return NULL; } - } else { printk("get_ao: action bad kind\n"); return NULL; @@ -620,14 +594,14 @@ a_o = tc_lookup_action(kind); #ifdef CONFIG_KMOD if (NULL == a_o) { - DPRINTK("get_ao: trying to load module %s\n",act_name); - request_module (act_name); + DPRINTK("get_ao: trying to load module %s\n", act_name); + request_module(act_name); a_o = tc_lookup_action_n(act_name); } #endif if (NULL == a_o) { - printk("get_ao: failed to find %s\n",act_name); + printk("get_ao: failed to find %s\n", act_name); return NULL; } @@ -639,16 +613,13 @@ { struct tc_action *act = NULL; - act = kmalloc(sizeof(*act),GFP_KERNEL); + act = kmalloc(sizeof(*act), GFP_KERNEL); if (NULL == act) { /* grrr .. */ - printk("create_a: failed to alloc! 
\n"); + printk("create_a: failed to alloc!\n"); return NULL; } - - memset(act, 0,sizeof(*act)); - + memset(act, 0, sizeof(*act)); act->order = i; - return act; } @@ -679,16 +650,14 @@ b = (unsigned char *)skb->tail; - if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta))<0) { + if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) goto err_out; - } kind = tb[TCA_ACT_KIND-1]; - if (NULL == get_ao(kind, a)) { + if (NULL == get_ao(kind, a)) goto err_out; - } - nlh = NLMSG_PUT(skb, pid, n->nlmsg_seq, RTM_DELACTION, sizeof (*t)); + nlh = NLMSG_PUT(skb, pid, n->nlmsg_seq, RTM_DELACTION, sizeof(*t)); t = NLMSG_DATA(nlh); t->tca_family = AF_UNSPEC; @@ -696,9 +665,8 @@ RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); err = a->ops->walk(skb, &dcb, RTM_DELACTION, a); - if (0 > err ) { + if (0 > err) goto rtattr_failure; - } x->rta_len = skb->tail - (u8 *) x; @@ -722,61 +690,55 @@ return err; } -static int tca_action_gd(struct rtattr *rta, struct nlmsghdr *n, u32 pid, int event ) +static int +tca_action_gd(struct rtattr *rta, struct nlmsghdr *n, u32 pid, int event) { - int s = 0; int i, ret = 0; struct tc_action *act = NULL; struct rtattr *tb[TCA_ACT_MAX_PRIO+1]; struct tc_action *a = NULL, *a_s = NULL; - if (event != RTM_GETACTION && event != RTM_DELACTION) + if (event != RTM_GETACTION && event != RTM_DELACTION) ret = -EINVAL; - if (rtattr_parse(tb, TCA_ACT_MAX_PRIO, RTA_DATA(rta), RTA_PAYLOAD(rta))<0) { + if (rtattr_parse(tb, TCA_ACT_MAX_PRIO, RTA_DATA(rta), + RTA_PAYLOAD(rta)) < 0) { ret = -EINVAL; goto nlmsg_failure; } if (event == RTM_DELACTION && n->nlmsg_flags&NLM_F_ROOT) { - if (NULL != tb[0] && NULL == tb[1]) { - return tca_action_flush(tb[0],n,pid); - } + if (NULL != tb[0] && NULL == tb[1]) + return tca_action_flush(tb[0], n, pid); } - for (i=0; i < TCA_ACT_MAX_PRIO ; i++) { - + for (i=0; i < TCA_ACT_MAX_PRIO; i++) { if (NULL == tb[i]) break; - act = create_a(i+1); if (NULL != a && a != act) { a->next = act; a = act; - } else { + } else a = act; - } 
if (!s) { s = 1; a_s = a; } - ret = tcf_action_get_1(tb[i],act,n,pid); + ret = tcf_action_get_1(tb[i], act, n, pid); if (ret < 0) { - printk("tcf_action_get: failed to get! \n"); + printk("tcf_action_get: failed to get!\n"); ret = -EINVAL; goto rtattr_failure; } - } - - if (RTM_GETACTION == event) { + if (RTM_GETACTION == event) ret = act_get_notify(pid, n, a_s, event); - } else { /* delete */ - + else { /* delete */ struct sk_buff *skb; skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); @@ -785,7 +747,7 @@ goto nlmsg_failure; } - if (tca_get_fill(skb, a_s, pid, n->nlmsg_seq, 0, event, 0 , 1) <= 0) { + if (tca_get_fill(skb, a_s, pid, n->nlmsg_seq, 0, event, 0, 1) <= 0) { kfree_skb(skb); ret = -EINVAL; goto nlmsg_failure; @@ -805,16 +767,14 @@ return ret; } - -static int tcf_add_notify(struct tc_action *a, u32 pid, u32 seq, int event, unsigned flags) +static int tcf_add_notify(struct tc_action *a, u32 pid, u32 seq, int event, + unsigned flags) { struct tcamsg *t; - struct nlmsghdr *nlh; + struct nlmsghdr *nlh; struct sk_buff *skb; struct rtattr *x; unsigned char *b; - - int err = 0; skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); @@ -831,9 +791,8 @@ x = (struct rtattr*) skb->tail; RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); - if (0 > tcf_action_dump(skb, a, 0, 0)) { + if (0 > tcf_action_dump(skb, a, 0, 0)) goto rtattr_failure; - } x->rta_len = skb->tail - (u8*)x; @@ -853,7 +812,8 @@ } -static int tcf_action_add(struct rtattr *rta, struct nlmsghdr *n, u32 pid, int ovr ) +static int +tcf_action_add(struct rtattr *rta, struct nlmsghdr *n, u32 pid, int ovr) { int ret = 0; struct tc_action *act = NULL; @@ -867,16 +827,15 @@ /* dump then free all the actions after update; inserted policy * stays intact * */ - ret = tcf_add_notify(act, pid, seq, RTM_NEWACTION, n->nlmsg_flags); + ret = tcf_add_notify(act, pid, seq, RTM_NEWACTION, n->nlmsg_flags); for (a = act; act; a = act) { if (a) { act = act->next; a->ops = NULL; a->priv = NULL; kfree(a); - } else { + } else printk("tcf_action_add: BUG? 
empty action\n"); - } } done: return ret; @@ -886,37 +845,35 @@ { struct rtattr **tca = arg; u32 pid = skb ? NETLINK_CB(skb).pid : 0; - int ret = 0, ovr = 0; if (NULL == tca[TCA_ACT_TAB-1]) { - printk("tc_ctl_action: received NO action attribs\n"); - return -EINVAL; + printk("tc_ctl_action: received NO action attribs\n"); + return -EINVAL; } /* n->nlmsg_flags&NLM_F_CREATE * */ switch (n->nlmsg_type) { - case RTM_NEWACTION: + case RTM_NEWACTION: /* we are going to assume all other flags * imply create only if it doesnt exist * Note that CREATE | EXCL implies that * but since we want avoid ambiguity (eg when flags * is zero) then just set this */ - if (n->nlmsg_flags&NLM_F_REPLACE) { + if (n->nlmsg_flags&NLM_F_REPLACE) ovr = 1; - } - ret = tcf_action_add(tca[TCA_ACT_TAB-1], n, pid, ovr); + ret = tcf_action_add(tca[TCA_ACT_TAB-1], n, pid, ovr); break; case RTM_DELACTION: - ret = tca_action_gd(tca[TCA_ACT_TAB-1], n, pid,RTM_DELACTION); + ret = tca_action_gd(tca[TCA_ACT_TAB-1], n, pid, RTM_DELACTION); break; case RTM_GETACTION: - ret = tca_action_gd(tca[TCA_ACT_TAB-1], n, pid,RTM_GETACTION); + ret = tca_action_gd(tca[TCA_ACT_TAB-1], n, pid, RTM_GETACTION); break; default: - printk(" Unknown cmd was detected\n"); + printk("Unknown cmd was detected\n"); break; } @@ -930,8 +887,7 @@ struct rtattr *tb[TCA_ACT_MAX_PRIO + 1]; struct rtattr *rta[TCAA_MAX + 1]; struct rtattr *kind = NULL; - int min_len = NLMSG_LENGTH(sizeof (struct tcamsg)); - + int min_len = NLMSG_LENGTH(sizeof(struct tcamsg)); int attrlen = n->nlmsg_len - NLMSG_ALIGN(min_len); struct rtattr *attr = (void *) n + NLMSG_ALIGN(min_len); @@ -942,12 +898,14 @@ return NULL; } - if (rtattr_parse(tb, TCA_ACT_MAX_PRIO, RTA_DATA(tb1), NLMSG_ALIGN(RTA_PAYLOAD(tb1))) < 0) + if (rtattr_parse(tb, TCA_ACT_MAX_PRIO, RTA_DATA(tb1), + NLMSG_ALIGN(RTA_PAYLOAD(tb1))) < 0) return NULL; - if (NULL == tb[0]) + if (NULL == tb[0]) return NULL; - if (rtattr_parse(tb2, TCA_ACT_MAX, RTA_DATA(tb[0]), RTA_PAYLOAD(tb[0]))<0) + if 
(rtattr_parse(tb2, TCA_ACT_MAX, RTA_DATA(tb[0]), + RTA_PAYLOAD(tb[0])) < 0) return NULL; kind = tb2[TCA_ACT_KIND-1]; @@ -963,30 +921,30 @@ struct tc_action_ops *a_o; struct tc_action a; int ret = 0; - struct tcamsg *t = (struct tcamsg *) NLMSG_DATA(cb->nlh); char *kind = find_dump_kind(cb->nlh); + if (NULL == kind) { printk("tc_dump_action: action bad kind\n"); return 0; } a_o = tc_lookup_action_n(kind); - if (NULL == a_o) { printk("failed to find %s\n", kind); return 0; } - memset(&a,0,sizeof(struct tc_action)); + memset(&a, 0, sizeof(struct tc_action)); a.ops = a_o; if (NULL == a_o->walk) { - printk("tc_dump_action: %s !capable of dumping table\n",kind); + printk("tc_dump_action: %s !capable of dumping table\n", kind); goto rtattr_failure; } - nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, cb->nlh->nlmsg_type, sizeof (*t)); + nlh = NLMSG_PUT(skb, NETLINK_CB(cb->skb).pid, cb->nlh->nlmsg_seq, + cb->nlh->nlmsg_type, sizeof(*t)); t = NLMSG_DATA(nlh); t->tca_family = AF_UNSPEC; @@ -994,19 +952,17 @@ RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); ret = a_o->walk(skb, cb, RTM_GETACTION, &a); - if (0 > ret ) { + if (0 > ret) goto rtattr_failure; - } if (ret > 0) { x->rta_len = skb->tail - (u8 *) x; ret = skb->len; - } else { + } else skb_trim(skb, (u8*)x - skb->data); - } nlh->nlmsg_len = skb->tail - b; - if (NETLINK_CB(cb->skb).pid && ret) + if (NETLINK_CB(cb->skb).pid && ret) nlh->nlmsg_flags |= NLM_F_MULTI; module_put(a_o->owner); return skb->len; --------------030808000601080109070105-- From kaber@trash.net Wed Dec 29 19:38:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:06 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3cbNv030196 for ; Wed, 29 Dec 2004 19:38:57 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrAo-0002IT-Fo; Thu, 30 Dec 2004 04:40:27 +0100 Message-ID: <41D37869.5040204@trash.net> 
Date: Thu, 30 Dec 2004 04:39:21 +0100
From: Patrick McHardy
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5
X-Accept-Language: en
MIME-Version: 1.0
To: jamal
CC: Maillist netdev
Subject: [PATCH PKT_SCHED 2/17]: Consistent comparision style in act_api.c
Content-Type: multipart/mixed; boundary="------------000603080402090308020108"
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13180
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.
--------------000603080402090308020108
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Use a consistent and more eye-friendly comparison style in act_api.c: if (x != NULL) instead of if (NULL != x).

--------------000603080402090308020108
Content-Type: text/x-patch; name="02.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="02.diff"

# This is a BitKeeper generated diff -Nru style patch.
#
# ChangeSet
#   2004/12/30 00:52:10+01:00 kaber@coreworks.de
#   [PKT_SCHED]: Consistent comparision style in act_api.c
#
#   Signed-off-by: Patrick McHardy
#
# net/sched/act_api.c
#   2004/12/30 00:52:03+01:00 kaber@coreworks.de +37 -40
#   [PKT_SCHED]: Consistent comparision style in act_api.c
#
#   Signed-off-by: Patrick McHardy
#
diff -Nru a/net/sched/act_api.c b/net/sched/act_api.c
--- a/net/sched/act_api.c 2004-12-30 04:01:01 +01:00
+++ b/net/sched/act_api.c 2004-12-30 04:01:01 +01:00
@@ -200,7 +200,7 @@
 			DPRINTK("tcf_action_destroy destroying %p next %p\n",
 				a, a->next ?
a->next : NULL); act = act->next; - if (ACT_P_DELETED == a->ops->cleanup(a, bind)) + if (a->ops->cleanup(a, bind) == ACT_P_DELETED) module_put(a->ops->owner); a->ops = NULL; @@ -223,7 +223,7 @@ { int err = -EINVAL; - if ((NULL == a) || (NULL == a->ops) || (NULL == a->ops->dump)) + if ((a == NULL) || (a->ops == NULL) || (a->ops->dump == NULL)) return err; return a->ops->dump(skb, a, bind, ref); } @@ -235,8 +235,8 @@ unsigned char *b = skb->tail; struct rtattr *r; - if ((NULL == a) || (NULL == a->ops) || (NULL == a->ops->dump) || - (NULL == a->ops->kind)) + if ((a == NULL) || (a->ops == NULL) || (a->ops->dump == NULL) || + (a->ops->kind == NULL)) return err; RTA_PUT(skb, TCA_KIND, IFNAMSIZ, a->ops->kind); @@ -267,7 +267,7 @@ act = a->next; RTA_PUT(skb, a->order, 0, NULL); err = tcf_action_dump_1(skb, a, bind, ref); - if (0 > err) + if (err < 0) goto rtattr_failure; r->rta_len = skb->tail - (u8*)r; } @@ -290,12 +290,12 @@ *err = -EINVAL; - if (NULL == name) { + if (name == NULL) { if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) goto err_out; kind = tb[TCA_ACT_KIND-1]; - if (NULL != kind) { + if (kind != NULL) { sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { printk("Action %s bad\n", (char*)RTA_DATA(kind)); @@ -312,14 +312,14 @@ a_o = tc_lookup_action_n(name); } #ifdef CONFIG_KMOD - if (NULL == a_o) { + if (a_o == NULL) { DPRINTK("tcf_action_init_1: trying to load module %s\n", act_name); request_module(act_name); a_o = tc_lookup_action_n(act_name); } #endif - if (NULL == a_o) { + if (a_o == NULL) { printk("failed to find %s\n", act_name); goto err_out; } @@ -332,7 +332,7 @@ memset(a, 0, sizeof(*a)); /* backward compatibility for policer */ - if (NULL == name) { + if (name == NULL) { *err = a_o->init(tb[TCA_ACT_OPTIONS-1], est, a, ovr, bind); if (*err < 0) { *err = -EINVAL; @@ -414,7 +414,7 @@ /* place holder */ #endif - if (NULL == h) + if (h == NULL) goto errout; if (a->type == TCA_OLD_COMPAT) @@ -427,7 
+427,7 @@ if (err < 0) goto errout; - if (NULL != a->ops && NULL != a->ops->get_stats) + if (a->ops != NULL && a->ops->get_stats != NULL) if (a->ops->get_stats(skb, a) < 0) goto errout; @@ -464,7 +464,7 @@ x = (struct rtattr*) skb->tail; RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); - if (0 > tcf_action_dump(skb, a, bind, ref)) + if (tcf_action_dump(skb, a, bind, ref) < 0) goto rtattr_failure; x->rta_len = skb->tail - (u8*)x; @@ -510,7 +510,7 @@ if (rtattr_parse(tb, TCA_ACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta))<0) goto err_out; kind = tb[TCA_ACT_KIND-1]; - if (NULL != kind) { + if (kind != NULL) { sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { printk("tcf_action_get_1: action %s bad\n", @@ -531,21 +531,21 @@ a_o = tc_lookup_action(kind); #ifdef CONFIG_KMOD - if (NULL == a_o) { + if (a_o == NULL) { request_module (act_name); a_o = tc_lookup_action_n(act_name); } #endif - if (NULL == a_o) { + if (a_o == NULL) { printk("failed to find %s\n", act_name); goto err_out; } - if (NULL == a) + if (a == NULL) goto err_mod; a->ops = a_o; - if (NULL == a_o->lookup || 0 == a_o->lookup(a, index)) { + if (a_o->lookup == NULL || a_o->lookup(a, index) == 0) { a->ops = NULL; err = -EINVAL; goto err_mod; @@ -579,7 +579,7 @@ char act_name[4 + IFNAMSIZ + 1]; struct tc_action_ops *a_o = NULL; - if (NULL != kind) { + if (kind != NULL) { sprintf(act_name, "%s", (char*)RTA_DATA(kind)); if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { printk("get_ao: action %s bad\n", @@ -593,14 +593,13 @@ a_o = tc_lookup_action(kind); #ifdef CONFIG_KMOD - if (NULL == a_o) { + if (a_o == NULL) { DPRINTK("get_ao: trying to load module %s\n", act_name); request_module(act_name); a_o = tc_lookup_action_n(act_name); } #endif - - if (NULL == a_o) { + if (a_o == NULL) { printk("get_ao: failed to find %s\n", act_name); return NULL; } @@ -614,7 +613,7 @@ struct tc_action *act = NULL; act = kmalloc(sizeof(*act), GFP_KERNEL); - if (NULL == act) { /* grrr .. 
*/ + if (act == NULL) { printk("create_a: failed to alloc!\n"); return NULL; } @@ -636,7 +635,7 @@ struct tc_action *a = create_a(0); int err = -EINVAL; - if (NULL == a) { + if (a == NULL) { printk("tca_action_flush: couldnt create tc_action\n"); return err; } @@ -654,7 +653,7 @@ goto err_out; kind = tb[TCA_ACT_KIND-1]; - if (NULL == get_ao(kind, a)) + if (get_ao(kind, a) == NULL) goto err_out; nlh = NLMSG_PUT(skb, pid, n->nlmsg_seq, RTM_DELACTION, sizeof(*t)); @@ -665,7 +664,7 @@ RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); err = a->ops->walk(skb, &dcb, RTM_DELACTION, a); - if (0 > err) + if (err < 0) goto rtattr_failure; x->rta_len = skb->tail - (u8 *) x; @@ -680,7 +679,6 @@ return err; - rtattr_failure: module_put(a->ops->owner); nlmsg_failure: @@ -709,15 +707,15 @@ } if (event == RTM_DELACTION && n->nlmsg_flags&NLM_F_ROOT) { - if (NULL != tb[0] && NULL == tb[1]) + if (tb[0] != NULL && tb[1] == NULL) return tca_action_flush(tb[0], n, pid); } for (i=0; i < TCA_ACT_MAX_PRIO; i++) { - if (NULL == tb[i]) + if (tb[i] == NULL) break; act = create_a(i+1); - if (NULL != a && a != act) { + if (a != NULL && a != act) { a->next = act; a = act; } else @@ -736,7 +734,7 @@ } } - if (RTM_GETACTION == event) + if (event == RTM_GETACTION) ret = act_get_notify(pid, n, a_s, event); else { /* delete */ struct sk_buff *skb; @@ -791,7 +789,7 @@ x = (struct rtattr*) skb->tail; RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); - if (0 > tcf_action_dump(skb, a, 0, 0)) + if (tcf_action_dump(skb, a, 0, 0) < 0) goto rtattr_failure; x->rta_len = skb->tail - (u8*)x; @@ -847,7 +845,7 @@ u32 pid = skb ? 
NETLINK_CB(skb).pid : 0; int ret = 0, ovr = 0; - if (NULL == tca[TCA_ACT_TAB-1]) { + if (tca[TCA_ACT_TAB-1] == NULL) { printk("tc_ctl_action: received NO action attribs\n"); return -EINVAL; } @@ -894,14 +892,13 @@ if (rtattr_parse(rta, TCAA_MAX, attr, attrlen) < 0) return NULL; tb1 = rta[TCA_ACT_TAB - 1]; - if (NULL == tb1) { + if (tb1 == NULL) return NULL; - } if (rtattr_parse(tb, TCA_ACT_MAX_PRIO, RTA_DATA(tb1), NLMSG_ALIGN(RTA_PAYLOAD(tb1))) < 0) return NULL; - if (NULL == tb[0]) + if (tb[0] == NULL) return NULL; if (rtattr_parse(tb2, TCA_ACT_MAX, RTA_DATA(tb[0]), @@ -924,13 +921,13 @@ struct tcamsg *t = (struct tcamsg *) NLMSG_DATA(cb->nlh); char *kind = find_dump_kind(cb->nlh); - if (NULL == kind) { + if (kind == NULL) { printk("tc_dump_action: action bad kind\n"); return 0; } a_o = tc_lookup_action_n(kind); - if (NULL == a_o) { + if (a_o == NULL) { printk("failed to find %s\n", kind); return 0; } @@ -938,7 +935,7 @@ memset(&a, 0, sizeof(struct tc_action)); a.ops = a_o; - if (NULL == a_o->walk) { + if (a_o->walk == NULL) { printk("tc_dump_action: %s !capable of dumping table\n", kind); goto rtattr_failure; } @@ -952,7 +949,7 @@ RTA_PUT(skb, TCA_ACT_TAB, 0, NULL); ret = a_o->walk(skb, cb, RTM_GETACTION, &a); - if (0 > ret) + if (ret < 0) goto rtattr_failure; if (ret > 0) { --------------000603080402090308020108-- From kaber@trash.net Wed Dec 29 19:38:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:38:54 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3cQ03030191 for ; Wed, 29 Dec 2004 19:38:47 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrAc-0002IR-QZ; Thu, 30 Dec 2004 04:40:15 +0100 Message-ID: <41D3785F.3040909@trash.net> Date: Thu, 30 Dec 2004 04:39:11 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en 
MIME-Version: 1.0
To: jamal
CC: Maillist netdev
Subject: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13178
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev

Hi Jamal,

here is what I've got so far, I'll continue tomorrow. Only compile-tested so far. Please review and comment.

Patrick McHardy:
  o [PKT_SCHED]: Disable broken override bits in pedit action
  o [PKT_SCHED]: Return proper error codes in tcf_pedit_init
  o [PKT_SCHED]: Remove checks for impossible conditions in pedit action
  o [PKT_SCHED]: Clean up pedit action
  o [PKT_SCHED]: Clean up tcf_ipt_init
  o [PKT_SCHED]: Fix missing unlock in ipt action error path
  o [PKT_SCHED]: Remove checks for impossible conditions in ipt action
  o [PKT_SCHED]: Clean up ipt action
  o [PKT_SCHED]: Return -EOPNOTSUPP if gact probability is requested but not compiled in
  o [PKT_SCHED]: Return proper error codes in tcf_gact_init
  o [PKT_SCHED]: Remove checks for impossible conditions in gact action
  o [PKT_SCHED: Clean up gact action
  o [PKT_SCHED]: Clean up act_api.c action init path, propagate errors properly
  o [PKT_SCHED]: Check TCA_ACT_KIND payload size _before_ copying it
  o [PKT_SCHED]: Remove checks for impossible conditions in act_api.c
  o [PKT_SCHED]: Consistent comparision style in act_api.c
  o [PKT_SCHED]: act_api.c whitespace cleanup

Regards
Patrick

From kaber@trash.net Wed Dec 29 19:39:08 2004
Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:15 -0800 (PST)
Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3ck4i030201 for ; Wed, 29 Dec 2004 19:39:07 -0800
Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by
kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrAy-0002IV-Oy; Thu, 30 Dec 2004 04:40:37 +0100
Message-ID: <41D37875.5020103@trash.net>
Date: Thu, 30 Dec 2004 04:39:33 +0100
From: Patrick McHardy
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5
X-Accept-Language: en
MIME-Version: 1.0
To: jamal
CC: Maillist netdev
Subject: [PATCH PKT_SCHED 4/17]: Check TCA_ACT_KIND payload size _before_ copying it
Content-Type: multipart/mixed; boundary="------------070503010905070807000800"
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13182
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: kaber@trash.net
Precedence: bulk
X-list: netdev

This is a multi-part message in MIME format.
--------------070503010905070807000800
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Fix payload size checks like this one:

- sprintf(act_name, "%s", (char*)RTA_DATA(kind));
- if (RTA_PAYLOAD(kind) >= IFNAMSIZ) {
- printk("Action %s bad\n", (char*)RTA_DATA(kind))

--------------070503010905070807000800
Content-Type: text/x-patch; name="04.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="04.diff"

# This is a BitKeeper generated diff -Nru style patch.
# # ChangeSet # 2004/12/30 01:45:41+01:00 kaber@coreworks.de # [PKT_SCHED]: Check TCA_ACT_KIND payload size _before_ copying it # # Signed-off-by: Patrick McHardy # # net/sched/act_api.c # 2004/12/30 01:45:35+01:00 kaber@coreworks.de +6 -14 # [PKT_SCHED]: Check TCA_ACT_KIND payload size _before_ copying it # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/act_api.c b/net/sched/act_api.c --- a/net/sched/act_api.c 2004-12-30 04:01:10 +01:00 +++ b/net/sched/act_api.c 2004-12-30 04:01:10 +01:00 @@ -288,11 +288,9 @@ goto err_out; kind = tb[TCA_ACT_KIND-1]; if (kind != NULL) { - sprintf(act_name, "%s", (char*)RTA_DATA(kind)); - if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { - printk("Action %s bad\n", (char*)RTA_DATA(kind)); + if (RTA_PAYLOAD(kind) >= IFNAMSIZ) goto err_out; - } + sprintf(act_name, "%s", (char*)RTA_DATA(kind)); } else { printk("Action bad kind\n"); goto err_out; @@ -503,12 +501,9 @@ goto err_out; kind = tb[TCA_ACT_KIND-1]; if (kind != NULL) { - sprintf(act_name, "%s", (char*)RTA_DATA(kind)); - if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { - printk("tcf_action_get_1: action %s bad\n", - (char*)RTA_DATA(kind)); + if (RTA_PAYLOAD(kind) >= IFNAMSIZ) goto err_out; - } + sprintf(act_name, "%s", (char*)RTA_DATA(kind)); } else { printk("tcf_action_get_1: action bad kind\n"); goto err_out; @@ -567,12 +562,9 @@ struct tc_action_ops *a_o = NULL; if (kind != NULL) { - sprintf(act_name, "%s", (char*)RTA_DATA(kind)); - if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { - printk("get_ao: action %s bad\n", - (char*)RTA_DATA(kind)); + if (RTA_PAYLOAD(kind) >= IFNAMSIZ) return NULL; - } + sprintf(act_name, "%s", (char*)RTA_DATA(kind)); } else { printk("get_ao: action bad kind\n"); return NULL; --------------070503010905070807000800-- From kaber@trash.net Wed Dec 29 19:39:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:08 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3cdOs030197 for ; Wed, 29 Dec 2004 
19:39:00 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrAr-0002IU-Pb; Thu, 30 Dec 2004 04:40:30 +0100 Message-ID: <41D3786E.3070800@trash.net> Date: Thu, 30 Dec 2004 04:39:26 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 3/17]: Remove checks for impossible conditions in act_api.c Content-Type: multipart/mixed; boundary="------------060704050804080901090705" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13181 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060704050804080901090705 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Remove checks for impossible conditions, also remove some useless NULL-ptr assignments and make loops iterating over actions clearer. --------------060704050804080901090705 Content-Type: text/x-patch; name="03.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="03.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 01:36:55+01:00 kaber@coreworks.de # [PKT_SCHED]: Remove checks for impossible conditions in act_api.c # # Also remove some useless NULL-ptr assignments and make loops iterating # over actions clearer. # # Signed-off-by: Patrick McHardy # # net/sched/act_api.c # 2004/12/30 01:36:48+01:00 kaber@coreworks.de +14 -32 # [PKT_SCHED]: Remove checks for impossible conditions in act_api.c # # Also remove some useless NULL-ptr assignments and make loops iterating # over actions clearer. 
# # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/act_api.c b/net/sched/act_api.c --- a/net/sched/act_api.c 2004-12-30 04:01:06 +01:00 +++ b/net/sched/act_api.c 2004-12-30 04:01:06 +01:00 @@ -195,25 +195,18 @@ { struct tc_action *a; - for (a = act; act; a = act) { - if (a && a->ops && a->ops->cleanup) { + for (a = act; a; a = act) { + if (a->ops && a->ops->cleanup) { DPRINTK("tcf_action_destroy destroying %p next %p\n", - a, a->next ? a->next : NULL); + a, a->next); act = act->next; if (a->ops->cleanup(a, bind) == ACT_P_DELETED) module_put(a->ops->owner); - - a->ops = NULL; kfree(a); } else { /*FIXME: Remove later - catch insertion bugs*/ printk("tcf_action_destroy: BUG? destroying NULL ops\n"); - if (a) { - act = act->next; - kfree(a); - } else { - printk("tcf_action_destroy: BUG? destroying NULL action!\n"); - break; - } + act = act->next; + kfree(a); } } } @@ -223,7 +216,7 @@ { int err = -EINVAL; - if ((a == NULL) || (a->ops == NULL) || (a->ops->dump == NULL)) + if (a->ops == NULL || a->ops->dump == NULL) return err; return a->ops->dump(skb, a, bind, ref); } @@ -235,8 +228,7 @@ unsigned char *b = skb->tail; struct rtattr *r; - if ((a == NULL) || (a->ops == NULL) || (a->ops->dump == NULL) || - (a->ops->kind == NULL)) + if (a->ops == NULL || a->ops->dump == NULL || a->ops->kind == NULL) return err; RTA_PUT(skb, TCA_KIND, IFNAMSIZ, a->ops->kind); @@ -563,14 +555,9 @@ { struct tc_action *a; - for (a = act; act; a = act) { - if (a) { - act = act->next; - a->ops = NULL; - a->priv = NULL; - kfree(a); - } else - printk("cleanup_a: BUG? 
empty action\n"); + for (a = act; a; a = act) { + act = a->next; + kfree(a); } } @@ -715,7 +702,7 @@ if (tb[i] == NULL) break; act = create_a(i+1); - if (a != NULL && a != act) { + if (a != NULL) { a->next = act; a = act; } else @@ -826,14 +813,9 @@ * stays intact * */ ret = tcf_add_notify(act, pid, seq, RTM_NEWACTION, n->nlmsg_flags); - for (a = act; act; a = act) { - if (a) { - act = act->next; - a->ops = NULL; - a->priv = NULL; - kfree(a); - } else - printk("tcf_action_add: BUG? empty action\n"); + for (a = act; a; a = act) { + act = a->next; + kfree(a); } done: return ret; --------------060704050804080901090705-- From kaber@trash.net Wed Dec 29 19:39:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:26 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3cw16030218 for ; Wed, 29 Dec 2004 19:39:18 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBA-0002IZ-F1; Thu, 30 Dec 2004 04:40:48 +0100 Message-ID: <41D37880.7030709@trash.net> Date: Thu, 30 Dec 2004 04:39:44 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 5/17]: Clean up act_api.c action init path, propagate errors properly Content-Type: multipart/mixed; boundary="------------060902040500040601030304" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13183 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. 
--------------060902040500040601030304 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Clean up act_api.c action init path, propagate errors properly. Also replace an unrelated printk for an impossible condition by BUG(). --------------060902040500040601030304 Content-Type: text/x-patch; name="05.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="05.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 02:01:06+01:00 kaber@coreworks.de # [PKT_SCHED]: Clean up act_api.c action init path, propagate errors properly # # Also replace an unrelated printk for an impossible condition by BUG(). # # Signed-off-by: Patrick McHardy # # net/sched/act_api.c # 2004/12/30 02:01:00+01:00 kaber@coreworks.de +11 -25 # [PKT_SCHED]: Clean up act_api.c action init path, propagate errors properly # # Also replace an unrelated printk for an impossible condition by BUG(). # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/act_api.c b/net/sched/act_api.c --- a/net/sched/act_api.c 2004-12-30 04:01:15 +01:00 +++ b/net/sched/act_api.c 2004-12-30 04:01:15 +01:00 @@ -287,14 +287,11 @@ RTA_PAYLOAD(rta)) < 0) goto err_out; kind = tb[TCA_ACT_KIND-1]; - if (kind != NULL) { - if (RTA_PAYLOAD(kind) >= IFNAMSIZ) - goto err_out; - sprintf(act_name, "%s", (char*)RTA_DATA(kind)); - } else { - printk("Action bad kind\n"); + if (kind == NULL) goto err_out; - } + if (RTA_PAYLOAD(kind) >= IFNAMSIZ) + goto err_out; + sprintf(act_name, "%s", (char*)RTA_DATA(kind)); a_o = tc_lookup_action(kind); } else { sprintf(act_name, "%s", name); @@ -310,7 +307,7 @@ #endif if (a_o == NULL) { - printk("failed to find %s\n", act_name); + *err = -ENOENT; goto err_out; } @@ -322,19 +319,12 @@ memset(a, 0, sizeof(*a)); /* backward compatibility for policer */ - if (name == NULL) { + if (name == NULL) *err = a_o->init(tb[TCA_ACT_OPTIONS-1], est, a, ovr, bind); - if (*err < 0) { - *err = -EINVAL; - goto err_free; - } - } 
else { + else *err = a_o->init(rta, est, a, ovr, bind); - if (*err < 0) { - *err = -EINVAL; - goto err_free; - } - } + if (*err < 0) + goto err_free; /* module count goes up only when brand new policy is created if it exists and is only bound to in a_o->init() then @@ -372,11 +362,8 @@ for (i=0; i < TCA_ACT_MAX_PRIO; i++) { if (tb[i]) { act = tcf_action_init_1(tb[i], est, name, ovr, bind, err); - if (act == NULL) { - printk("Error processing action order %d\n", i); + if (act == NULL) goto bad_ret; - } - act->order = i+1; if (a == NULL) a = act; @@ -845,8 +832,7 @@ ret = tca_action_gd(tca[TCA_ACT_TAB-1], n, pid, RTM_GETACTION); break; default: - printk("Unknown cmd was detected\n"); - break; + BUG(); } return ret; --------------060902040500040601030304-- From kaber@trash.net Wed Dec 29 19:39:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:34 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3d5xb030257 for ; Wed, 29 Dec 2004 19:39:25 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBH-0002Id-9L; Thu, 30 Dec 2004 04:40:55 +0100 Message-ID: <41D37887.2010504@trash.net> Date: Thu, 30 Dec 2004 04:39:51 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 6/17]: Clean up gact action Content-Type: multipart/mixed; boundary="------------020505020004020905020104" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13184 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. 
--------------020505020004020905020104 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Whitespace cleanup, consistent comparision style. --------------020505020004020905020104 Content-Type: text/x-patch; name="06.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="06.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 02:15:50+01:00 kaber@coreworks.de # [PKT_SCHED: Clean up gact action # # - Whitespace cleanup # - Consistent comparision style # # Signed-off-by: Patrick McHardy # # net/sched/gact.c # 2004/12/30 02:15:44+01:00 kaber@coreworks.de +29 -38 # [PKT_SCHED: Clean up gact action # # - Whitespace cleanup # - Consistent comparision style # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/gact.c b/net/sched/gact.c --- a/net/sched/gact.c 2004-12-30 04:01:19 +01:00 +++ b/net/sched/gact.c 2004-12-30 04:01:19 +01:00 @@ -52,27 +52,27 @@ #include #ifdef CONFIG_GACT_PROB -typedef int (*g_rand)(struct tcf_gact *p); -static int -gact_net_rand(struct tcf_gact *p) { +static int gact_net_rand(struct tcf_gact *p) +{ if (net_random()%p->pval) return p->action; return p->paction; } -static int -gact_determ(struct tcf_gact *p) { +static int gact_determ(struct tcf_gact *p) +{ if (p->bstats.packets%p->pval) return p->action; return p->paction; } - -g_rand gact_rand[MAX_RAND]= { NULL,gact_net_rand, gact_determ}; - +typedef int (*g_rand)(struct tcf_gact *p); +static g_rand gact_rand[MAX_RAND] = { NULL, gact_net_rand, gact_determ }; #endif + static int -tcf_gact_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a,int ovr,int bind) +tcf_gact_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, + int ovr, int bind) { struct rtattr *tb[TCA_GACT_MAX]; struct tc_gact *parm = NULL; @@ -81,31 +81,26 @@ #endif struct tcf_gact *p = NULL; int ret = 0; - int size = sizeof (*p); if (rtattr_parse(tb, TCA_GACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return 
-1; - if (NULL == a || NULL == tb[TCA_GACT_PARMS - 1]) { + if (a == NULL || tb[TCA_GACT_PARMS - 1] == NULL) { printk("BUG: tcf_gact_init called with NULL params\n"); return -1; } parm = RTA_DATA(tb[TCA_GACT_PARMS - 1]); #ifdef CONFIG_GACT_PROB - if (NULL != tb[TCA_GACT_PROB - 1]) { + if (tb[TCA_GACT_PROB - 1] != NULL) p_parm = RTA_DATA(tb[TCA_GACT_PROB - 1]); - } #endif - p = tcf_hash_check(parm, a, ovr, bind); - - if (NULL == p) { - p = tcf_hash_create(parm,est,a,size,ovr, bind); - - if (NULL == p) { + if (p == NULL) { + p = tcf_hash_create(parm, est, a, sizeof(*p), ovr, bind); + if (p == NULL) return -1; - } else { + else { p->refcnt = 1; ret = 1; goto override; @@ -116,7 +111,7 @@ override: p->action = parm->action; #ifdef CONFIG_GACT_PROB - if (NULL != p_parm) { + if (p_parm != NULL) { p->paction = p_parm->paction; p->pval = p_parm->pval; p->ptype = p_parm->ptype; @@ -125,16 +120,15 @@ } #endif } - return ret; } static int tcf_gact_cleanup(struct tc_action *a, int bind) { - struct tcf_gact *p; - p = PRIV(a,gact); - if (NULL != p) + struct tcf_gact *p = PRIV(a, gact); + + if (p != NULL) return tcf_hash_release(p, bind); return 0; } @@ -142,13 +136,11 @@ static int tcf_gact(struct sk_buff **pskb, struct tc_action *a) { - struct tcf_gact *p; + struct tcf_gact *p = PRIV(a, gact); struct sk_buff *skb = *pskb; int action = TC_ACT_SHOT; - p = PRIV(a,gact); - - if (NULL == p) { + if (p == NULL) { if (net_ratelimit()) printk("BUG: tcf_gact called with NULL params\n"); return -1; @@ -156,7 +148,7 @@ spin_lock(&p->lock); #ifdef CONFIG_GACT_PROB - if (p->ptype && NULL != gact_rand[p->ptype]) + if (p->ptype && gact_rand[p->ptype] != NULL) action = gact_rand[p->ptype](p); else action = p->action; @@ -165,7 +157,7 @@ #endif p->bstats.bytes += skb->len; p->bstats.packets++; - if (TC_ACT_SHOT == action) + if (action == TC_ACT_SHOT) p->qstats.drops++; p->tm.lastuse = jiffies; spin_unlock(&p->lock); @@ -181,11 +173,10 @@ #ifdef CONFIG_GACT_PROB struct tc_gact_p p_opt; #endif - 
struct tcf_gact *p; + struct tcf_gact *p = PRIV(a, gact); struct tcf_t t; - p = PRIV(a,gact); - if (NULL == p) { + if (p == NULL) { printk("BUG: tcf_gact_dump called with NULL params\n"); goto rtattr_failure; } @@ -194,19 +185,19 @@ opt.refcnt = p->refcnt - ref; opt.bindcnt = p->bindcnt - bind; opt.action = p->action; - RTA_PUT(skb, TCA_GACT_PARMS, sizeof (opt), &opt); + RTA_PUT(skb, TCA_GACT_PARMS, sizeof(opt), &opt); #ifdef CONFIG_GACT_PROB if (p->ptype) { p_opt.paction = p->paction; p_opt.pval = p->pval; p_opt.ptype = p->ptype; - RTA_PUT(skb, TCA_GACT_PROB, sizeof (p_opt), &p_opt); - } + RTA_PUT(skb, TCA_GACT_PROB, sizeof(p_opt), &p_opt); + } #endif t.install = jiffies_to_clock_t(jiffies - p->tm.install); t.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); t.expires = jiffies_to_clock_t(p->tm.expires); - RTA_PUT(skb, TCA_GACT_TM, sizeof (t), &t); + RTA_PUT(skb, TCA_GACT_TM, sizeof(t), &t); return skb->len; rtattr_failure: --------------020505020004020905020104-- From kaber@trash.net Wed Dec 29 19:39:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:42 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dDll030373 for ; Wed, 29 Dec 2004 19:39:34 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBP-0002Ik-Oe; Thu, 30 Dec 2004 04:41:04 +0100 Message-ID: <41D3788F.7030501@trash.net> Date: Thu, 30 Dec 2004 04:39:59 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 7/17]: Remove checks for impossible conditions in gact action Content-Type: multipart/mixed; boundary="------------060603020807000000040607" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13185 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060603020807000000040607 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit a->priv can only be NULL in tcf_gact_cleanup, everything else _is_ a bug, so let's just crash so we get a backtrace. --------------060603020807000000040607 Content-Type: text/x-patch; name="07.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="07.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 02:25:13+01:00 kaber@coreworks.de # [PKT_SCHED]: Remove checks for impossible conditions in gact action # # a->priv can only be NULL in tcf_gact_cleanup, everything else _is_ a # bug, so let's just crash so we get a backtrace. # # Signed-off-by: Patrick McHardy # # # net/sched/gact.c # 2004/12/30 02:25:07+01:00 kaber@coreworks.de +1 -12 # [PKT_SCHED]: Remove checks for impossible conditions in gact action # # a->priv can only be NULL in tcf_gact_cleanup, everything else _is_ a # bug, so let's just crash so we get a backtrace. 
# # Signed-off-by: Patrick McHardy # # diff -Nru a/net/sched/gact.c b/net/sched/gact.c --- a/net/sched/gact.c 2004-12-30 04:01:24 +01:00 +++ b/net/sched/gact.c 2004-12-30 04:01:24 +01:00 @@ -85,7 +85,7 @@ if (rtattr_parse(tb, TCA_GACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return -1; - if (a == NULL || tb[TCA_GACT_PARMS - 1] == NULL) { + if (tb[TCA_GACT_PARMS - 1] == NULL) { printk("BUG: tcf_gact_init called with NULL params\n"); return -1; } @@ -140,12 +140,6 @@ struct sk_buff *skb = *pskb; int action = TC_ACT_SHOT; - if (p == NULL) { - if (net_ratelimit()) - printk("BUG: tcf_gact called with NULL params\n"); - return -1; - } - spin_lock(&p->lock); #ifdef CONFIG_GACT_PROB if (p->ptype && gact_rand[p->ptype] != NULL) @@ -175,11 +169,6 @@ #endif struct tcf_gact *p = PRIV(a, gact); struct tcf_t t; - - if (p == NULL) { - printk("BUG: tcf_gact_dump called with NULL params\n"); - goto rtattr_failure; - } opt.index = p->index; opt.refcnt = p->refcnt - ref; --------------060603020807000000040607-- From kaber@trash.net Wed Dec 29 19:39:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:56 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dPwQ030628 for ; Wed, 29 Dec 2004 19:39:46 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBb-0002Ip-QG; Thu, 30 Dec 2004 04:41:16 +0100 Message-ID: <41D3789C.3040705@trash.net> Date: Thu, 30 Dec 2004 04:40:12 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 9/17]: Return -EOPNOTSUPP if gact probability is requested but not compiled in Content-Type: multipart/mixed; boundary="------------050707060508060702040603" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 
X-Virus-Status: Clean X-archive-position: 13187 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050707060508060702040603 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Return -EOPNOTSUPP if gact probability is requested but not compiled in. --------------050707060508060702040603 Content-Type: text/x-patch; name="09.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="09.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 02:34:11+01:00 kaber@coreworks.de # [PKT_SCHED]: Return -EOPNOTSUPP if gact probability is requested but not compiled in # # Signed-off-by: Patrick McHardy # # net/sched/gact.c # 2004/12/30 02:34:04+01:00 kaber@coreworks.de +3 -1 # [PKT_SCHED]: Return -EOPNOTSUPP if gact probability is requested but not compiled in # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/gact.c b/net/sched/gact.c --- a/net/sched/gact.c 2004-12-30 04:01:33 +01:00 +++ b/net/sched/gact.c 2004-12-30 04:01:33 +01:00 @@ -88,9 +88,11 @@ return -EINVAL; parm = RTA_DATA(tb[TCA_GACT_PARMS - 1]); -#ifdef CONFIG_GACT_PROB if (tb[TCA_GACT_PROB - 1] != NULL) +#ifdef CONFIG_GACT_PROB p_parm = RTA_DATA(tb[TCA_GACT_PROB - 1]); +#else + return -EOPNOTSUPP; #endif p = tcf_hash_check(parm, a, ovr, bind); if (p == NULL) { --------------050707060508060702040603-- From kaber@trash.net Wed Dec 29 19:39:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:39:48 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dJNh030502 for ; Wed, 29 Dec 2004 19:39:39 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBV-0002Io-Lf; Thu, 30 Dec 2004 04:41:09 +0100 
Message-ID: <41D37895.4010801@trash.net> Date: Thu, 30 Dec 2004 04:40:05 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 8/17]: Return proper error codes in tcf_gact_init Content-Type: multipart/mixed; boundary="------------050808070300080804000405" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13186 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------050808070300080804000405 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Return proper error codes in tcf_gact_init. --------------050808070300080804000405 Content-Type: text/x-patch; name="08.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="08.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 02:29:53+01:00 kaber@coreworks.de # [PKT_SCHED]: Return proper error codes in tcf_gact_init # # Signed-off-by: Patrick McHardy # # net/sched/gact.c # 2004/12/30 02:29:47+01:00 kaber@coreworks.de +4 -7 # [PKT_SCHED]: Return proper error codes in tcf_gact_init # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/gact.c b/net/sched/gact.c --- a/net/sched/gact.c 2004-12-30 04:01:28 +01:00 +++ b/net/sched/gact.c 2004-12-30 04:01:28 +01:00 @@ -83,12 +83,9 @@ int ret = 0; if (rtattr_parse(tb, TCA_GACT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) - return -1; - - if (tb[TCA_GACT_PARMS - 1] == NULL) { - printk("BUG: tcf_gact_init called with NULL params\n"); - return -1; - } + return -EINVAL; + if (tb[TCA_GACT_PARMS - 1] == NULL) + return -EINVAL; parm = RTA_DATA(tb[TCA_GACT_PARMS - 1]); #ifdef CONFIG_GACT_PROB @@ -99,7 +96,7 @@ if (p == NULL) { p = tcf_hash_create(parm, est, a, sizeof(*p), ovr, bind); if (p == NULL) - return -1; + return -ENOMEM; else { p->refcnt = 1; ret = 1; --------------050808070300080804000405-- From kaber@trash.net Wed Dec 29 19:39:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:40:06 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dYgC030847 for ; Wed, 29 Dec 2004 19:39:55 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBk-0002It-Ml; Thu, 30 Dec 2004 04:41:25 +0100 Message-ID: <41D378A5.9090702@trash.net> Date: Thu, 30 Dec 2004 04:40:21 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 10/17]: Clean up ipt action Content-Type: multipart/mixed; boundary="------------060706090705090305050802" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 
X-Virus-Status: Clean X-archive-position: 13188 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060706090705090305050802 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Whitespace cleanup, consistent comparison style, break lines at 80 characters. --------------060706090705090305050802 Content-Type: text/x-patch; name="10.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="10.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 02:45:21+01:00 kaber@coreworks.de # [PKT_SCHED]: Clean up ipt action # # - Whitespace cleanup # - Consistent comparison style # - Break lines at 80 characters # # Signed-off-by: Patrick McHardy # # net/sched/ipt.c # 2004/12/30 02:45:15+01:00 kaber@coreworks.de +18 -28 # [PKT_SCHED]: Clean up ipt action # # - Whitespace cleanup # - Consistent comparison style # - Break lines at 80 characters # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/ipt.c b/net/sched/ipt.c --- a/net/sched/ipt.c 2004-12-30 04:01:37 +01:00 +++ b/net/sched/ipt.c 2004-12-30 04:01:37 +01:00 @@ -93,7 +93,8 @@ } static int -tcf_ipt_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, int ovr, int bind) +tcf_ipt_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, + int ovr, int bind) { struct ipt_entry_target *t; unsigned h; @@ -103,12 +104,9 @@ u32 index = 0; u32 hook = 0; - if (NULL == a || NULL == rta || - (rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < - 0)) { + if (a == NULL || rta == NULL || + rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return -1; - } - if (tb[TCA_IPT_INDEX - 1]) { index = *(u32 *) RTA_DATA(tb[TCA_IPT_INDEX - 1]); @@ -129,15 +127,13 @@ return ret; } - if (NULL == tb[TCA_IPT_TARG - 1] || NULL == 
tb[TCA_IPT_HOOK - 1]) { + if (tb[TCA_IPT_TARG - 1] == NULL || tb[TCA_IPT_HOOK - 1] == NULL) return -1; - } - p = kmalloc(sizeof (*p), GFP_KERNEL); + p = kmalloc(sizeof(*p), GFP_KERNEL); if (p == NULL) return -1; - - memset(p, 0, sizeof (*p)); + memset(p, 0, sizeof(*p)); p->refcnt = 1; ret = 1; spin_lock_init(&p->lock); @@ -192,7 +188,7 @@ } } - if (0 > init_targ(p)) { + if (init_targ(p) < 0) { if (ovr) { printk("ipt policy messed up 2 \n"); spin_unlock(&p->lock); @@ -225,7 +221,7 @@ p->next = tcf_ipt_ht[h]; tcf_ipt_ht[h] = p; write_unlock_bh(&ipt_lock); - a->priv = (void *) p; + a->priv = p; return ret; } @@ -233,8 +229,8 @@ static int tcf_ipt_cleanup(struct tc_action *a, int bind) { - struct tcf_ipt *p; - p = PRIV(a,ipt); + struct tcf_ipt *p = PRIV(a, ipt); + if (NULL != p) return tcf_hash_release(p, bind); return 0; @@ -244,14 +240,11 @@ tcf_ipt(struct sk_buff **pskb, struct tc_action *a) { int ret = 0, result = 0; - struct tcf_ipt *p; + struct tcf_ipt *p = PRIV(a, ipt); struct sk_buff *skb = *pskb; - p = PRIV(a,ipt); - - if (NULL == p || NULL == skb) { + if (p == NULL || skb == NULL) return -1; - } spin_lock(&p->lock); @@ -260,16 +253,15 @@ p->bstats.packets++; if (skb_cloned(skb) ) { - if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) { + if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) return -1; - } } /* yes, we have to worry about both in and out dev worry later - danger - this API seems to have changed from earlier kernels */ ret = p->t->u.kernel.target->target(&skb, skb->dev, NULL, - p->hook, p->t->data, (void *)NULL); + p->hook, p->t->data, NULL); switch (ret) { case NF_ACCEPT: result = TC_ACT_OK; @@ -299,11 +291,9 @@ struct tcf_t tm; struct tc_cnt c; unsigned char *b = skb->tail; + struct tcf_ipt *p = PRIV(a, ipt); - struct tcf_ipt *p; - - p = PRIV(a,ipt); - if (NULL == p) { + if (p == NULL) { printk("BUG: tcf_ipt_dump called with NULL params\n"); goto rtattr_failure; } @@ -314,7 +304,7 @@ t = kmalloc(p->t->u.user.target_size, GFP_ATOMIC); - if (NULL == t) + if 
(t == NULL) goto rtattr_failure; c.bindcnt = p->bindcnt - bind; --------------060706090705090305050802-- From kaber@trash.net Wed Dec 29 19:39:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:40:09 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dfKK030985 for ; Wed, 29 Dec 2004 19:39:57 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBr-0002Ix-KM; Thu, 30 Dec 2004 04:41:32 +0100 Message-ID: <41D378AB.70204@trash.net> Date: Thu, 30 Dec 2004 04:40:27 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 11/17]: Remove checks for impossible conditions in ipt action Content-Type: multipart/mixed; boundary="------------000506070304060700050205" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13189 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------000506070304060700050205 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Remove checks for impossible conditions in ipt action, same as for gact. --------------000506070304060700050205 Content-Type: text/x-patch; name="11.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="11.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 02:52:02+01:00 kaber@coreworks.de # [PKT_SCHED]: Remove checks for impossible conditions in ipt action # # Signed-off-by: Patrick McHardy # # net/sched/ipt.c # 2004/12/30 02:51:56+01:00 kaber@coreworks.de +1 -9 # [PKT_SCHED]: Remove checks for impossible conditions in ipt action # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/ipt.c b/net/sched/ipt.c --- a/net/sched/ipt.c 2004-12-30 04:01:42 +01:00 +++ b/net/sched/ipt.c 2004-12-30 04:01:42 +01:00 @@ -104,8 +104,7 @@ u32 index = 0; u32 hook = 0; - if (a == NULL || rta == NULL || - rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) + if (rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return -1; if (tb[TCA_IPT_INDEX - 1]) { @@ -243,9 +242,6 @@ struct tcf_ipt *p = PRIV(a, ipt); struct sk_buff *skb = *pskb; - if (p == NULL || skb == NULL) - return -1; - spin_lock(&p->lock); p->tm.lastuse = jiffies; @@ -293,10 +289,6 @@ unsigned char *b = skb->tail; struct tcf_ipt *p = PRIV(a, ipt); - if (p == NULL) { - printk("BUG: tcf_ipt_dump called with NULL params\n"); - goto rtattr_failure; - } /* for simple targets kernel size == user size ** user name = target name ** for foolproof you need to not assume this --------------000506070304060700050205-- From kaber@trash.net Wed Dec 29 19:40:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:40:18 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dlpb031118 for ; Wed, 29 Dec 2004 19:40:07 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrBx-0002J1-8V; Thu, 30 Dec 2004 04:41:37 +0100 Message-ID: <41D378B1.40202@trash.net> Date: Thu, 30 Dec 2004 04:40:33 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH 
PKT_SCHED 12/17]: Fix missing unlock in ipt action error path Content-Type: multipart/mixed; boundary="------------090101090800040307030702" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13190 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------090101090800040307030702 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Fix missing unlock in ipt action error path. --------------090101090800040307030702 Content-Type: text/x-patch; name="12.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="12.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 03:02:56+01:00 kaber@coreworks.de # [PKT_SCHED]: Fix missing unlock in ipt action error path # # Signed-off-by: Patrick McHardy # # net/sched/ipt.c # 2004/12/30 02:57:03+01:00 kaber@coreworks.de +6 -4 # [PKT_SCHED]: Fix missing unlock in ipt action error path # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/ipt.c b/net/sched/ipt.c --- a/net/sched/ipt.c 2004-12-30 04:01:46 +01:00 +++ b/net/sched/ipt.c 2004-12-30 04:01:46 +01:00 @@ -248,9 +248,11 @@ p->bstats.bytes += skb->len; p->bstats.packets++; - if (skb_cloned(skb) ) { - if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) - return -1; + if (skb_cloned(skb)) { + if (unlikely(pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) { + result = -1; + goto out_unlock; + } } /* yes, we have to worry about both in and out dev worry later - danger - this API seems to have changed @@ -275,9 +277,9 @@ result = TC_POLICE_OK; break; } +out_unlock: spin_unlock(&p->lock); return result; - } static int --------------090101090800040307030702-- From kaber@trash.net Wed Dec 29 19:40:14 2004 Received: with ECARTIS (v1.0.0; list 
netdev); Wed, 29 Dec 2004 19:40:20 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3dqYL031240 for ; Wed, 29 Dec 2004 19:40:12 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrC2-0002J5-SH; Thu, 30 Dec 2004 04:41:43 +0100 Message-ID: <41D378B7.9020407@trash.net> Date: Thu, 30 Dec 2004 04:40:39 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 13/17]: Clean up tcf_ipt_init Content-Type: multipart/mixed; boundary="------------090008040607010802060306" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13191 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------090008040607010802060306 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Clean up tcf_ipt_init: - Return proper error codes - Check size of TCA_IPT_INDEX attribute - Remove useless cast - Rip out totally broken override bits - Consolidate error path The override part leaks memory, does not uninit the old iptables target, needs GFP_ATOMIC allocations because a lock is held and fails anyway. I think sharing code with normal initialization obfuscates both parts, so I ripped it out for now. --------------090008040607010802060306 Content-Type: text/x-patch; name="13.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="13.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 03:29:42+01:00 kaber@coreworks.de # [PKT_SCHED]: Clean up tcf_ipt_init # # - Return proper error codes # - Check size of TCA_IPT_INDEX attribute # - Remove useless cast # - Rip out totally broken override bits # - Consolidate error path # # Signed-off-by: Patrick McHardy # # net/sched/ipt.c # 2004/12/30 03:29:35+01:00 kaber@coreworks.de +30 -56 # [PKT_SCHED]: Clean up tcf_ipt_init # # - Return proper error codes # - Check size of TCA_IPT_INDEX attribute # - Remove useless cast # - Rip out totally broken override bits # - Consolidate error path # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/ipt.c b/net/sched/ipt.c --- a/net/sched/ipt.c 2004-12-30 04:01:51 +01:00 +++ b/net/sched/ipt.c 2004-12-30 04:01:51 +01:00 @@ -60,12 +60,10 @@ struct ipt_target *target; int ret = 0; struct ipt_entry_target *t = p->t; - target = __ipt_find_target_lock(t->u.user.name, &ret); - if (!target) { - printk("init_targ: Failed to find %s\n", t->u.user.name); - return -1; - } + target = __ipt_find_target_lock(t->u.user.name, &ret); + if (!target) + return -ENOENT; DPRINTK("init_targ: found %s\n", target->name); /* we really need proper ref counting @@ -105,33 +103,31 @@ u32 hook = 0; if (rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) - return -1; + return -EINVAL; - if (tb[TCA_IPT_INDEX - 1]) { + if (tb[TCA_IPT_INDEX - 1] && + RTA_PAYLOAD(tb[TCA_IPT_INDEX - 1]) >= sizeof(index)) index = *(u32 *) RTA_DATA(tb[TCA_IPT_INDEX - 1]); - DPRINTK("ipt index %d\n", index); - } if (index && (p = tcf_hash_lookup(index)) != NULL) { - a->priv = (void *) p; + a->priv = p; spin_lock(&p->lock); if (bind) { p->bindcnt += 1; p->refcnt += 1; } - if (ovr) { - goto override; - } + if (ovr) + ret = -EOPNOTSUPP; spin_unlock(&p->lock); return ret; } if (tb[TCA_IPT_TARG - 1] == NULL || tb[TCA_IPT_HOOK - 1] == NULL) - return -1; + return -EINVAL; p = kmalloc(sizeof(*p), GFP_KERNEL); if (p == NULL) - return -1; + return -ENOMEM; memset(p, 0, sizeof(*p)); 
p->refcnt = 1; ret = 1; @@ -140,45 +136,30 @@ if (bind) p->bindcnt = 1; -override: + /* override: */ hook = *(u32 *) RTA_DATA(tb[TCA_IPT_HOOK - 1]); t = (struct ipt_entry_target *) RTA_DATA(tb[TCA_IPT_TARG - 1]); + ret = -ENOMEM; p->t = kmalloc(t->u.target_size, GFP_KERNEL); - if (p->t == NULL) { - if (ovr) { - printk("ipt policy messed up \n"); - spin_unlock(&p->lock); - return -1; - } - kfree(p); - return -1; - } - + if (p->t == NULL) + goto err1; memcpy(p->t, RTA_DATA(tb[TCA_IPT_TARG - 1]), t->u.target_size); + DPRINTK(" target NAME %s size %d data[0] %x data[1] %x\n", t->u.user.name, t->u.target_size, t->data[0], t->data[1]); p->tname = kmalloc(IFNAMSIZ, GFP_KERNEL); - - if (p->tname == NULL) { - if (ovr) { - printk("ipt policy messed up 2 \n"); - spin_unlock(&p->lock); - return -1; - } - kfree(p->t); - kfree(p); - return -1; - } else { + if (p->tname == NULL) + goto err2; + else { int csize = IFNAMSIZ - 1; memset(p->tname, 0, IFNAMSIZ); if (tb[TCA_IPT_TABLE - 1]) { - if (strlen((char *) RTA_DATA(tb[TCA_IPT_TABLE - 1])) < - csize) - csize = strlen(RTA_DATA(tb[TCA_IPT_TABLE - 1])); + if (RTA_PAYLOAD(tb[TCA_IPT_TABLE - 1]) < csize) + csize = RTA_PAYLOAD(tb[TCA_IPT_TABLE - 1]); strncpy(p->tname, RTA_DATA(tb[TCA_IPT_TABLE - 1]), csize); DPRINTK("table name %s\n", p->tname); @@ -187,22 +168,8 @@ } } - if (init_targ(p) < 0) { - if (ovr) { - printk("ipt policy messed up 2 \n"); - spin_unlock(&p->lock); - return -1; - } - kfree(p->tname); - kfree(p->t); - kfree(p); - return -1; - } - - if (ovr) { - spin_unlock(&p->lock); - return -1; - } + if ((ret = init_targ(p)) < 0) + goto err3; p->index = index ? 
: tcf_hash_new_index(); @@ -223,6 +190,13 @@ a->priv = p; return ret; +err3: + kfree(p->tname); +err2: + kfree(p->t); +err1: + kfree(p); + return -1; } static int --------------090008040607010802060306-- From kaber@trash.net Wed Dec 29 19:40:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:40:42 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3e9L2031592 for ; Wed, 29 Dec 2004 19:40:30 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrCJ-0002JC-MW; Thu, 30 Dec 2004 04:42:00 +0100 Message-ID: <41D378C8.9070808@trash.net> Date: Thu, 30 Dec 2004 04:40:56 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 14/17]: Clean up pedit action Content-Type: multipart/mixed; boundary="------------010800010105060800030209" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13192 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------010800010105060800030209 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Clean up pedit action, same as for gact and ipt. --------------010800010105060800030209 Content-Type: text/x-patch; name="14.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="14.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 03:44:02+01:00 kaber@coreworks.de # [PKT_SCHED]: Clean up pedit action # # Signed-off-by: Patrick McHardy # # net/sched/pedit.c # 2004/12/30 03:43:55+01:00 kaber@coreworks.de +25 -41 # [PKT_SCHED]: Clean up pedit action # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/pedit.c b/net/sched/pedit.c --- a/net/sched/pedit.c 2004-12-30 04:01:56 +01:00 +++ b/net/sched/pedit.c 2004-12-30 04:01:56 +01:00 @@ -54,36 +54,30 @@ static int -tcf_pedit_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a,int ovr, int bind) +tcf_pedit_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, + int ovr, int bind) { struct rtattr *tb[TCA_PEDIT_MAX]; struct tc_pedit *parm; int size = 0; int ret = 0; - struct tcf_pedit *p = NULL; + struct tcf_pedit *p; if (rtattr_parse(tb, TCA_PEDIT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return -1; - - if (NULL == a || NULL == tb[TCA_PEDIT_PARMS - 1]) { + if (a == NULL || tb[TCA_PEDIT_PARMS - 1] == NULL) { printk("BUG: tcf_pedit_init called with NULL params\n"); return -1; } parm = RTA_DATA(tb[TCA_PEDIT_PARMS - 1]); - p = tcf_hash_check(parm, a, ovr, bind); - - if (NULL == p) { /* new */ - + if (p == NULL) { /* new */ if (!parm->nkeys) return -1; - - size = sizeof (*p)+ (parm->nkeys*sizeof(struct tc_pedit_key)); - - p = tcf_hash_create(parm,est,a,size,ovr,bind); - - if (NULL == p) + size = sizeof(*p) + parm->nkeys * sizeof(struct tc_pedit_key); + p = tcf_hash_create(parm, est, a, size, ovr, bind); + if (p == NULL) return -1; ret = 1; goto override; @@ -94,7 +88,8 @@ p->flags = parm->flags; p->nkeys = parm->nkeys; p->action = parm->action; - memcpy(p->keys,parm->keys,parm->nkeys*(sizeof(struct tc_pedit_key))); + memcpy(p->keys, parm->keys, + parm->nkeys * sizeof(struct tc_pedit_key)); } return ret; @@ -103,27 +98,22 @@ static int tcf_pedit_cleanup(struct tc_action *a, int bind) { - struct tcf_pedit *p; - p = PRIV(a,pedit); + struct tcf_pedit *p = PRIV(a,pedit); + if (NULL != p) - return 
tcf_hash_release(p, bind); + return tcf_hash_release(p, bind); return 0; } -/* -** -*/ static int tcf_pedit(struct sk_buff **pskb, struct tc_action *a) { - struct tcf_pedit *p; + struct tcf_pedit *p = PRIV(a, pedit); struct sk_buff *skb = *pskb; int i, munged = 0; u8 *pptr; - p = PRIV(a,pedit); - - if (NULL == p) { + if (p == NULL) { printk("BUG: tcf_pedit called with NULL params\n"); return -1; /* change to something symbolic */ } @@ -141,11 +131,11 @@ p->tm.lastuse = jiffies; - if (0 < p->nkeys) { + if (p->nkeys > 0) { struct tc_pedit_key *tkey = p->keys; for (i = p->nkeys; i > 0; i--, tkey++) { - u32 *ptr ; + u32 *ptr; int offset = tkey->off; if (tkey->offmask) { @@ -168,7 +158,6 @@ goto bad; } - ptr = (u32 *)(pptr+offset); /* just do it, baby */ *ptr = ((*ptr & tkey->mask) ^ tkey->val); @@ -196,29 +185,24 @@ { unsigned char *b = skb->tail; struct tc_pedit *opt; - struct tcf_pedit *p; + struct tcf_pedit *p = PRIV(a, pedit); struct tcf_t t; int s; - - p = PRIV(a,pedit); - - if (NULL == p) { + if (p == NULL) { printk("BUG: tcf_pedit_dump called with NULL params\n"); goto rtattr_failure; } - s = sizeof (*opt)+(p->nkeys*sizeof(struct tc_pedit_key)); + s = sizeof(*opt) + p->nkeys * sizeof(struct tc_pedit_key); - /* netlink spinlocks held above us - must use ATOMIC - * */ + /* netlink spinlocks held above us - must use ATOMIC */ opt = kmalloc(s, GFP_ATOMIC); if (opt == NULL) return -ENOBUFS; - memset(opt, 0, s); - memcpy(opt->keys,p->keys,p->nkeys*(sizeof(struct tc_pedit_key))); + memcpy(opt->keys, p->keys, p->nkeys * sizeof(struct tc_pedit_key)); opt->index = p->index; opt->nkeys = p->nkeys; opt->flags = p->flags; @@ -239,15 +223,15 @@ (unsigned int)key->off, (unsigned int)key->val, (unsigned int)key->mask); - } - } + } + } #endif RTA_PUT(skb, TCA_PEDIT_PARMS, s, opt); t.install = jiffies_to_clock_t(jiffies - p->tm.install); t.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); t.expires = jiffies_to_clock_t(p->tm.expires); - RTA_PUT(skb, TCA_PEDIT_TM, sizeof (t), 
&t); + RTA_PUT(skb, TCA_PEDIT_TM, sizeof(t), &t); return skb->len; rtattr_failure: --------------010800010105060800030209-- From kaber@trash.net Wed Dec 29 19:40:40 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:40:48 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3eFZv031695 for ; Wed, 29 Dec 2004 19:40:36 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrCP-0002JG-8G; Thu, 30 Dec 2004 04:42:05 +0100 Message-ID: <41D378CD.9090407@trash.net> Date: Thu, 30 Dec 2004 04:41:01 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 15/17]: Remove checks for impossible conditions in pedit action Content-Type: multipart/mixed; boundary="------------060903040607030803060503" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13193 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060903040607030803060503 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Remove checks for impossible conditions in pedit action. --------------060903040607030803060503 Content-Type: text/x-patch; name="15.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="15.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 03:47:30+01:00 kaber@coreworks.de # [PKT_SCHED]: Remove checks for impossible conditions in pedit action # # Signed-off-by: Patrick McHardy # # net/sched/pedit.c # 2004/12/30 03:47:24+01:00 kaber@coreworks.de +1 -11 # [PKT_SCHED]: Remove checks for impossible conditions in pedit action # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/pedit.c b/net/sched/pedit.c --- a/net/sched/pedit.c 2004-12-30 04:02:01 +01:00 +++ b/net/sched/pedit.c 2004-12-30 04:02:01 +01:00 @@ -65,7 +65,7 @@ if (rtattr_parse(tb, TCA_PEDIT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) return -1; - if (a == NULL || tb[TCA_PEDIT_PARMS - 1] == NULL) { + if (tb[TCA_PEDIT_PARMS - 1] == NULL) { printk("BUG: tcf_pedit_init called with NULL params\n"); return -1; } @@ -113,11 +113,6 @@ int i, munged = 0; u8 *pptr; - if (p == NULL) { - printk("BUG: tcf_pedit called with NULL params\n"); - return -1; /* change to something symbolic */ - } - if (!(skb->tc_verd & TC_OK2MUNGE)) { /* should we set skb->cloned? 
*/ if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) { @@ -189,11 +184,6 @@ struct tcf_t t; int s; - if (p == NULL) { - printk("BUG: tcf_pedit_dump called with NULL params\n"); - goto rtattr_failure; - } - s = sizeof(*opt) + p->nkeys * sizeof(struct tc_pedit_key); /* netlink spinlocks held above us - must use ATOMIC */ --------------060903040607030803060503-- From kaber@trash.net Wed Dec 29 19:40:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:40:53 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3eMih031866 for ; Wed, 29 Dec 2004 19:40:43 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrCW-0002JK-Ni; Thu, 30 Dec 2004 04:42:12 +0100 Message-ID: <41D378D5.9040406@trash.net> Date: Thu, 30 Dec 2004 04:41:09 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH PKT_SCHED 16/17]: Return proper error codes in tcf_pedit_init Content-Type: multipart/mixed; boundary="------------040500050206060209080109" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13194 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------040500050206060209080109 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Return proper error codes in tcf_pedit_init. --------------040500050206060209080109 Content-Type: text/x-patch; name="16.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="16.diff" # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 03:52:40+01:00 kaber@coreworks.de # [PKT_SCHED]: Return proper error codes in tcf_pedit_init # # Signed-off-by: Patrick McHardy # # net/sched/pedit.c # 2004/12/30 03:52:34+01:00 kaber@coreworks.de +7 -7 # [PKT_SCHED]: Return proper error codes in tcf_pedit_init # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/pedit.c b/net/sched/pedit.c --- a/net/sched/pedit.c 2004-12-30 04:02:05 +01:00 +++ b/net/sched/pedit.c 2004-12-30 04:02:05 +01:00 @@ -64,26 +64,26 @@ struct tcf_pedit *p; if (rtattr_parse(tb, TCA_PEDIT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) - return -1; - if (tb[TCA_PEDIT_PARMS - 1] == NULL) { - printk("BUG: tcf_pedit_init called with NULL params\n"); - return -1; - } + return -EINVAL; + if (tb[TCA_PEDIT_PARMS - 1] == NULL) + return -EINVAL; parm = RTA_DATA(tb[TCA_PEDIT_PARMS - 1]); p = tcf_hash_check(parm, a, ovr, bind); if (p == NULL) { /* new */ if (!parm->nkeys) - return -1; + return -EINVAL; size = sizeof(*p) + parm->nkeys * sizeof(struct tc_pedit_key); p = tcf_hash_create(parm, est, a, size, ovr, bind); if (p == NULL) - return -1; + return -ENOMEM; ret = 1; goto override; } + ret = -EEXIST; if (ovr) { + ret = 0; override: p->flags = parm->flags; p->nkeys = parm->nkeys; --------------040500050206060209080109-- From kaber@trash.net Wed Dec 29 19:41:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 19:41:37 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU3f5bg032647 for ; Wed, 29 Dec 2004 19:41:26 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjrDD-0002Je-2l; Thu, 30 Dec 2004 04:42:55 +0100 Message-ID: <41D378FF.3080205@trash.net> Date: Thu, 30 Dec 2004 04:41:51 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: jamal CC: Maillist netdev Subject: [PATCH 
PKT_SCHED 17/17]: Disable broken override bits in pedit action Content-Type: multipart/mixed; boundary="------------020300080003090501060808" X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13195 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------020300080003090501060808 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit Disable broken override bits in pedit action. It misses locking and needs to allocate new memory if nkeys increases. Also disable it for now. --------------020300080003090501060808 Content-Type: text/x-patch; name="17.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="17.diff" # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 03:56:27+01:00 kaber@coreworks.de # [PKT_SCHED]: Disable broken override bits in pedit action # # Signed-off-by: Patrick McHardy # # net/sched/pedit.c # 2004/12/30 03:56:20+01:00 kaber@coreworks.de +2 -0 # [PKT_SCHED]: Disable broken override bits in pedit action # # Signed-off-by: Patrick McHardy # diff -Nru a/net/sched/pedit.c b/net/sched/pedit.c --- a/net/sched/pedit.c 2004-12-30 04:02:10 +01:00 +++ b/net/sched/pedit.c 2004-12-30 04:02:10 +01:00 @@ -83,6 +83,8 @@ ret = -EEXIST; if (ovr) { + /* FIXME: no locking, larger memory area might be required */ + return -EOPNOTSUPP; ret = 0; override: p->flags = parm->flags; --------------020300080003090501060808-- From hadi@cyberus.ca Wed Dec 29 20:54:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Wed, 29 Dec 2004 20:54:53 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU4sQ7W011849 for ; Wed, 29 Dec 2004 20:54:46 -0800 Received: 
from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CjsM6-0004qG-30 for netdev@oss.sgi.com; Wed, 29 Dec 2004 23:56:10 -0500
Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CjsM2-0000Wa-RR; Wed, 29 Dec 2004 23:56:07 -0500
Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes
From: jamal
Reply-To: hadi@cyberus.ca
To: Patrick McHardy
Cc: Maillist netdev
In-Reply-To: <41D3785F.3040909@trash.net>
References: <41D3785F.3040909@trash.net>
Content-Type: text/plain
Organization: jamalopolous
Message-Id: <1104382562.1048.39.camel@jzny.localdomain>
Mime-Version: 1.0
X-Mailer: Ximian Evolution 1.2.2
Date: 29 Dec 2004 23:56:02 -0500
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1
X-Virus-Status: Clean
X-archive-position: 13196
X-ecartis-version: Ecartis v1.0.0
Sender: netdev-bounce@oss.sgi.com
Errors-to: netdev-bounce@oss.sgi.com
X-original-sender: hadi@cyberus.ca
Precedence: bulk
X-list: netdev

Patrick,

Thanks for this cleanup.
Questions/comments:

1) compiler or style issue? You have a few fixes from
a)
if (..){
single statement here;
}
to:
if (..)
single statement here;

I always add an extra pair of braces for lazy reasons (in the back of my mind: in case i want to add another statement ;->).

b) Other things which i have seen compilers whine about in the past, of the form:
a missing cast
- a->priv = (void *) p;
+ a->priv = p;
or uninitialized vars:
- struct tcf_pedit *p = NULL;
+ struct tcf_pedit *p;

2) You are not the first to not like the
if (constant != variable_here)
Should be noted that i am dyxlesic and this has saved me a few times (I would say the most common errata for me, weird as that may sound).
Don't have a problem with the changes you made (don't need the protection at this stage;->).
3) Is there any reason why in some cases you fixed things to be:
return_type functionname()
eg
-static int
-gact_net_rand(struct tcf_gact *p) {
+static int gact_net_rand(struct tcf_gact *p)

and in some cases you leave them to be of the form:
return_type
functionname()

4) Some of those messages are actually still useful and it doesn't really hurt to leave them around for a little while longer;
- if (tb[TCA_PEDIT_PARMS - 1] == NULL) {
- printk("BUG: tcf_pedit_init called with NULL params\n");

I realize the fixes you have to return -ENOMEM/ENOENT etc are an improvement, but a little ascii puking won't hurt somebody writing a user space app until we get better netlink error propagation in place.

I will look closely at one or two of those fixes in the morning; the majority look good on a first quick scan (most things were needed during development or are artifacts of that period and are safe to get rid of now).

Again thanks for the good work.

cheers,
jamal

On Wed, 2004-12-29 at 22:39, Patrick McHardy wrote:
> Hi Jamal,
>
> here is what I got so far, I'll continue tomorrow. Only compile
> tested yet. Please review and comment.
> > Patrick McHardy: > o [PKT_SCHED]: Disable broken override bits in pedit action > o [PKT_SCHED]: Return proper error codes in tcf_pedit_init > o [PKT_SCHED]: Remove checks for impossible conditions in pedit action > o [PKT_SCHED]: Clean up pedit action > o [PKT_SCHED]: Clean up tcf_ipt_init > o [PKT_SCHED]: Fix missing unlock in ipt action error path > o [PKT_SCHED]: Remove checks for impossible conditions in ipt action > o [PKT_SCHED]: Clean up ipt action > o [PKT_SCHED]: Return -EOPNOTSUPP if gact probability is requested > but not compiled in > o [PKT_SCHED]: Return proper error codes in tcf_gact_init > o [PKT_SCHED]: Remove checks for impossible conditions in gact action > o [PKT_SCHED: Clean up gact action > o [PKT_SCHED]: Clean up act_api.c action init path, propagate errors > properly > o [PKT_SCHED]: Check TCA_ACT_KIND payload size _before_ copying it > o [PKT_SCHED]: Remove checks for impossible conditions in act_api.c > o [PKT_SCHED]: Consistent comparision style in act_api.c > o [PKT_SCHED]: act_api.c whitespace cleanup > > > Regards > Patrick > > From akpm@osdl.org Thu Dec 30 00:19:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:20:02 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU8JUid023005 for ; Thu, 30 Dec 2004 00:19:50 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iBU8LI616937; Thu, 30 Dec 2004 00:21:18 -0800 Date: Thu, 30 Dec 2004 00:21:15 -0800 From: Andrew Morton To: Arnaldo Carvalho de Melo Cc: davem@davemloft.net, netdev@oss.sgi.com, James.Bottomley@HansenPartnership.com Subject: Re: [IPV6] fix inet6_sk for non IPV6 builds Message-Id: <20041230002115.67c0dbbf.akpm@osdl.org> In-Reply-To: <41D3306F.7080605@conectiva.com.br> References: <41D3306F.7080605@conectiva.com.br> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; 
charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13197 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Arnaldo Carvalho de Melo wrote: > > Please apply this patch, the problem was noted by James Bottomley, that > doesn't enables CONFIG_IPV6. > > Signed-off-by: Arnaldo Carvalho de Melo > > - Arnaldo > > > [inet6_sk.patch text/plain (787 bytes)] > ===== include/linux/ipv6.h 1.23 vs edited ===== > --- 1.23/include/linux/ipv6.h 2004-12-27 23:56:33 -02:00 > +++ edited/include/linux/ipv6.h 2004-12-29 20:22:45 -02:00 > @@ -273,6 +273,7 @@ > struct ipv6_pinfo inet6; > }; > > +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk) > { > return inet_sk(__sk)->pinet6; > @@ -283,7 +284,6 @@ > return &((struct raw6_sock *)__sk)->raw6; > } > > -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) > #define __ipv6_only_sock(sk) (inet6_sk(sk)->ipv6only) > #define ipv6_only_sock(sk) ((sk)->sk_family == PF_INET6 && __ipv6_only_sock(sk)) > #else This breaks selinux: security/selinux/avc.c: In function `avc_audit': security/selinux/avc.c:640: warning: implicit declaration of function `inet6_sk' security/selinux/avc.c:640: warning: initialization makes pointer from integer w This is hastily tested: --- 25/include/linux/ipv6.h~fix-inet6_sk-for-non-ipv6-builds 2004-12-30 00:19:06.550680488 -0800 +++ 25-akpm/include/linux/ipv6.h 2004-12-30 00:20:30.386935440 -0800 @@ -273,6 +273,7 @@ struct tcp6_sock { struct ipv6_pinfo inet6; }; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk) { return inet_sk(__sk)->pinet6; @@ -283,10 +284,20 @@ static inline struct raw6_opt * raw6_sk( return 
&((struct raw6_sock *)__sk)->raw6; } -#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) #define __ipv6_only_sock(sk) (inet6_sk(sk)->ipv6only) #define ipv6_only_sock(sk) ((sk)->sk_family == PF_INET6 && __ipv6_only_sock(sk)) #else + +static inline struct ipv6_pinfo * inet6_sk(const struct sock *__sk) +{ + return NULL; +} + +static inline struct raw6_opt * raw6_sk(const struct sock *__sk) +{ + return NULL; +} + #define __ipv6_only_sock(sk) 0 #define ipv6_only_sock(sk) 0 #endif _ From dave@thedillows.org Thu Dec 30 00:46:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:52 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kPFa024579 for ; Thu, 30 Dec 2004 00:46:46 -0800 Received: (qmail 24004 invoked by uid 0); 30 Dec 2004 08:47:27 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp1.knology.net with SMTP; 30 Dec 2004 08:47:27 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mcpN009843; Thu, 30 Dec 2004 03:48:38 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mcaR009842; Thu, 30 Dec 2004 03:48:38 -0500 Date: Thu, 30 Dec 2004 03:48:38 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 22/22] Add some documentation for the IPSEC crypto offload Message-Id: <20041230035000.31@ori.thedillows.org> References: <20041230035000.30@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13217 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 01:06:37-05:00 dave@thedillows.org # Add some documentation for the IPSEC crypto offload. # # Signed-off-by: David Dillow # # Documentation/networking/netdevices.txt # 2004/12/30 01:06:19-05:00 dave@thedillows.org +16 -0 # Add some documentation for the IPSEC crypto offload. # # Signed-off-by: David Dillow # diff -Nru a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt --- a/Documentation/networking/netdevices.txt 2004-12-30 01:07:40 -05:00 +++ b/Documentation/networking/netdevices.txt 2004-12-30 01:07:40 -05:00 @@ -73,3 +73,19 @@ dev_close code and comments in net/core/dev.c for more info. Context: softirq +dev->xfrm_state_add: + Synchronization: None, but can be called inside dev_base_lock rwlock + Context: nominally process, but don't sleep inside an rwlock + Notes: Only called for inbound xfrm_state(s). Can be invoked during + xfrm_accel_add() call. + +dev->xfrm_state_del: + Synchronization: None, but can be called inside dev->xmit_lock spinlock. + Context: BHs disabled/softirq + Notes: Called for all offloaded xfrm_state(s). Can be invoked during + xfrm_accel_flush() call. + +dev->xfrm_bundle_add: + Synchronization: None + Context: softirq/process + Notes: Called for newly created outbound xfrm bundles. 
From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kMnn024561 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 17275 invoked by uid 0); 30 Dec 2004 08:47:21 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp5.knology.net with SMTP; 30 Dec 2004 08:47:21 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mZeE009771; Thu, 30 Dec 2004 03:48:35 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mZra009770; Thu, 30 Dec 2004 03:48:35 -0500 Date: Thu, 30 Dec 2004 03:48:35 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 4/22] xfrm: Try to offload inbound xfrm_states Message-Id: <20041230035000.13@ori.thedillows.org> References: <20041230035000.12@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13203 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:33:11-05:00 dave@thedillows.org # Plumb in offloading of inbound xfrm_states. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_state.c # 2004/12/30 00:32:54-05:00 dave@thedillows.org +28 -1 # Try to offload an inbound xfrm_state when it is added or updated. # Since it could potentially come in from any interface, try to # offload it on all devices that support it. 
# # Signed-off-by: David Dillow # diff -Nru a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c --- a/net/xfrm/xfrm_state.c 2004-12-30 01:11:30 -05:00 +++ b/net/xfrm/xfrm_state.c 2004-12-30 01:11:30 -05:00 @@ -398,6 +398,26 @@ spin_unlock_bh(&xfrm_state_lock); } +static void xfrm_state_inbound_accel(struct xfrm_state *x) +{ + /* Only called for an inbound xfrm_state. Since it could + * possibly arrive on any interface, try to offload it + * on all devices that are capable. + */ + struct net_device *dev; + + rtnl_lock(); + read_lock(&dev_base_lock); + dev = dev_base; + while (dev) { + if (netif_running(dev) && (dev->features & NETIF_F_IPSEC)) + dev->xfrm_state_add(dev, x); + dev = dev->next; + } + read_unlock(&dev_base_lock); + rtnl_unlock(); +} + static struct xfrm_state *__xfrm_find_acq_byseq(u32 seq); int xfrm_state_add(struct xfrm_state *x) @@ -444,6 +464,9 @@ spin_unlock_bh(&xfrm_state_lock); xfrm_state_put_afinfo(afinfo); + if (!err && x->dir == XFRM_STATE_DIR_IN) + xfrm_state_inbound_accel(x); + if (x1) { xfrm_state_delete(x1); xfrm_state_put(x1); @@ -455,7 +478,7 @@ int xfrm_state_update(struct xfrm_state *x) { struct xfrm_state_afinfo *afinfo; - struct xfrm_state *x1; + struct xfrm_state *x1, *accel = NULL; int err; afinfo = xfrm_state_get_afinfo(x->props.family); @@ -479,6 +502,7 @@ if (x1->km.state == XFRM_STATE_ACQ) { __xfrm_state_insert(x); + accel = x; x = NULL; } err = 0; @@ -489,6 +513,9 @@ if (err) return err; + + if (accel && accel->dir == XFRM_STATE_DIR_IN) + xfrm_state_inbound_accel(accel); if (!x) { xfrm_state_delete(x1); From dave@thedillows.org Thu Dec 30 00:46:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kOTu024574 for ; Thu, 30 Dec 2004 00:46:45 -0800 Received: (qmail 23998 invoked by uid 0); 30 Dec 2004 08:47:26 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) 
(69.1.45.93) by smtp1.knology.net with SMTP; 30 Dec 2004 08:47:26 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mbUS009823; Thu, 30 Dec 2004 03:48:37 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mbiM009822; Thu, 30 Dec 2004 03:48:37 -0500 Date: Thu, 30 Dec 2004 03:48:37 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 17/22] typhoon: split out setting of offloaded tasks Message-Id: <20041230035000.26@ori.thedillows.org> References: <20041230035000.25@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13208 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:57:35-05:00 dave@thedillows.org # Move the setting of the currently offloaded tasks to its own # function, as we'll be making use of it to change the crypto # offload status when adding/removing xfrms. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 00:57:17-05:00 dave@thedillows.org +26 -15 # Move the setting of the currently offloaded tasks to its own # function, as we'll be making use of it to change the crypto # offload status when adding/removing xfrms. 
# # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:08:45 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:08:45 -05:00 @@ -304,6 +304,7 @@ u16 xcvr_select; u16 wol_events; u32 offload; + spinlock_t offload_lock; u16 tx_sa_max; u16 rx_sa_max; @@ -725,11 +726,28 @@ return err; } +static int +typhoon_set_offload(struct typhoon *tp) +{ + /* Caller should hold tp->offload_lock, or otherwise guarantee + * exclusitivity to this routine. + */ + struct cmd_desc xp_cmd; + + smp_rmb(); + if(tp->card_state != Running) + return 0; + + INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_SET_OFFLOAD_TASKS); + xp_cmd.parm2 = tp->offload; + xp_cmd.parm3 = tp->offload; + return typhoon_issue_command(tp, 1, &xp_cmd, 0, NULL); +} + static void typhoon_vlan_rx_register(struct net_device *dev, struct vlan_group *grp) { struct typhoon *tp = netdev_priv(dev); - struct cmd_desc xp_cmd; int err; spin_lock_bh(&tp->state_lock); @@ -737,25 +755,16 @@ /* We've either been turned on for the first time, or we've * been turned off. Update the 3XP. */ + spin_lock_bh(&tp->offload_lock); if(grp) tp->offload |= TYPHOON_OFFLOAD_VLAN; else tp->offload &= ~TYPHOON_OFFLOAD_VLAN; + err = typhoon_set_offload(tp); + spin_unlock_bh(&tp->offload_lock); - /* If the interface is up, the runtime is running -- and we - * must be up for the vlan core to call us. - * - * Do the command outside of the spin lock, as it is slow. 
- */ - INIT_COMMAND_WITH_RESPONSE(&xp_cmd, - TYPHOON_CMD_SET_OFFLOAD_TASKS); - xp_cmd.parm2 = tp->offload; - xp_cmd.parm3 = tp->offload; - spin_unlock_bh(&tp->state_lock); - err = typhoon_issue_command(tp, 1, &xp_cmd, 0, NULL); if(err < 0) printk("%s: vlan offload error %d\n", tp->name, -err); - spin_lock_bh(&tp->state_lock); } /* now make the change visible */ @@ -1486,6 +1495,7 @@ spin_lock_init(&tp->command_lock); spin_lock_init(&tp->state_lock); + spin_lock_init(&tp->offload_lock); } static void @@ -2218,12 +2228,13 @@ if(err < 0) goto error_out; + /* tp->card_state != Running, so nothing will change this out + * from under us. + */ INIT_COMMAND_NO_RESPONSE(&xp_cmd, TYPHOON_CMD_SET_OFFLOAD_TASKS); - spin_lock_bh(&tp->state_lock); xp_cmd.parm2 = tp->offload; xp_cmd.parm3 = tp->offload; err = typhoon_issue_command(tp, 1, &xp_cmd, 0, NULL); - spin_unlock_bh(&tp->state_lock); if(err < 0) goto error_out; From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:54 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kMaD024562 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 4257 invoked by uid 0); 30 Dec 2004 08:48:58 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp6.knology.net with SMTP; 30 Dec 2004 08:48:58 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mZCF009775; Thu, 30 Dec 2004 03:48:35 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mZvr009774; Thu, 30 Dec 2004 03:48:35 -0500 Date: Thu, 30 Dec 2004 03:48:35 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 5/22] xfrm: Attempt to offload bundled xfrm_states for outbound xfrms Message-Id: <20041230035000.14@ori.thedillows.org> References: 
<20041230035000.13@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13218 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:34:46-05:00 dave@thedillows.org # Plumb in offloading new bundles for outgoing packets. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_policy.c # 2004/12/30 00:34:28-05:00 dave@thedillows.org +28 -0 # When we create a new bundle for an outbound flow, try to # offload as much as the destination driver will allow. # # Don't forget to clean up.... # # Signed-off-by: David Dillow # # include/net/xfrm.h # 2004/12/30 00:34:28-05:00 dave@thedillows.org +6 -0 # A convenience structure for offloading bundles. # # The dst->child field gives us a singly linked list # from upper protocols to outer transforms. Drivers, however, # will likely have a limited number of offloads they can # perform on a particular packet, so they need to offload # the bundle from the outside in. This list makes it easier # for them. # # Signed-off-by: David Dillow # # include/net/dst.h # 2004/12/30 00:34:28-05:00 dave@thedillows.org +1 -0 # Add a field to store the offload information for this part # of the outgoing bundle (non-NULL if this dst is offloaded.) 
# # Signed-off-by: David Dillow # diff -Nru a/include/net/dst.h b/include/net/dst.h --- a/include/net/dst.h 2004-12-30 01:11:18 -05:00 +++ b/include/net/dst.h 2004-12-30 01:11:18 -05:00 @@ -65,6 +65,7 @@ struct neighbour *neighbour; struct hh_cache *hh; struct xfrm_state *xfrm; + struct xfrm_offload *xfrm_offload; int (*input)(struct sk_buff*); int (*output)(struct sk_buff*); diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h --- a/include/net/xfrm.h 2004-12-30 01:11:18 -05:00 +++ b/include/net/xfrm.h 2004-12-30 01:11:18 -05:00 @@ -178,6 +178,12 @@ atomic_t refcnt; }; +struct xfrm_bundle_list +{ + struct list_head node; + struct dst_entry * dst; +}; + struct xfrm_type; struct xfrm_dst; struct xfrm_policy_afinfo { diff -Nru a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c --- a/net/xfrm/xfrm_policy.c 2004-12-30 01:11:18 -05:00 +++ b/net/xfrm/xfrm_policy.c 2004-12-30 01:11:18 -05:00 @@ -705,6 +705,31 @@ }; } +static void xfrm_accel_bundle(struct dst_entry *dst) +{ + struct xfrm_bundle_list bundle, *xbl, *tmp; + struct net_device *dev = dst->dev; + INIT_LIST_HEAD(&bundle.node); + + if (dev && netif_running(dev) && (dev->features & NETIF_F_IPSEC)) { + while (dst) { + xbl = kmalloc(sizeof(*xbl), GFP_ATOMIC); + if (!xbl) + goto out; + + xbl->dst = dst; + list_add_tail(&xbl->node, &bundle.node); + dst = dst->child; + } + + dev->xfrm_bundle_add(dev, &bundle); + } + +out: + list_for_each_entry_safe(xbl, tmp, &bundle.node, node) + kfree(xbl); +} + static int stale_bundle(struct dst_entry *dst); /* Main function: finds/creates a bundle for given flow. 
@@ -833,6 +858,7 @@ policy->bundles = dst; dst_hold(dst); write_unlock_bh(&policy->lock); + xfrm_accel_bundle(dst); } *dst_p = dst; dst_release(dst_orig); @@ -1023,8 +1049,10 @@ { if (!dst->xfrm) return; + xfrm_offload_release(dst->xfrm_offload); xfrm_state_put(dst->xfrm); dst->xfrm = NULL; + dst->xfrm_offload = NULL; } static void xfrm_link_failure(struct sk_buff *skb) From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kNwq024567 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 4264 invoked by uid 0); 30 Dec 2004 08:48:58 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp6.knology.net with SMTP; 30 Dec 2004 08:48:58 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8ma61009791; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8macn009790; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 9/22] AH: Split header initialization from zeroing of mutable fields Message-Id: <20041230035000.18@ori.thedillows.org> References: <20041230035000.17@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13204 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 00:42:33-05:00 dave@thedillows.org # Seperate AH header initialization from the zeroing of mutable # IP header fields in preparation for offloading the crypto # processing of the packet. # # Signed-off-by: David Dillow # # net/ipv4/ah4.c # 2004/12/30 00:42:15-05:00 dave@thedillows.org +18 -12 # Seperate AH header initialization from the zeroing of mutable # IP header fields in preparation for offloading the crypto # processing of the packet. # # Signed-off-by: David Dillow # diff -Nru a/net/ipv4/ah4.c b/net/ipv4/ah4.c --- a/net/ipv4/ah4.c 2004-12-30 01:10:27 -05:00 +++ b/net/ipv4/ah4.c 2004-12-30 01:10:27 -05:00 @@ -69,6 +69,20 @@ top_iph = skb->nh.iph; iph = &tmp_iph.iph; + ah = (struct ip_auth_hdr *)((char *)top_iph+top_iph->ihl*4); + ah->nexthdr = top_iph->protocol; + + top_iph->tot_len = htons(skb->len); + top_iph->protocol = IPPROTO_AH; + + ahp = x->data; + ah->hdrlen = (XFRM_ALIGN8(sizeof(struct ip_auth_hdr) + + ahp->icv_trunc_len) >> 2) - 2; + + ah->reserved = 0; + ah->spi = x->id.spi; + ah->seq_no = htonl(x->replay.oseq + 1); + iph->tos = top_iph->tos; iph->ttl = top_iph->ttl; iph->frag_off = top_iph->frag_off; @@ -81,23 +95,11 @@ goto error; } - ah = (struct ip_auth_hdr *)((char *)top_iph+top_iph->ihl*4); - ah->nexthdr = top_iph->protocol; - top_iph->tos = 0; - top_iph->tot_len = htons(skb->len); top_iph->frag_off = 0; top_iph->ttl = 0; - top_iph->protocol = IPPROTO_AH; top_iph->check = 0; - ahp = x->data; - ah->hdrlen = (XFRM_ALIGN8(sizeof(struct ip_auth_hdr) + - ahp->icv_trunc_len) >> 2) - 2; - - ah->reserved = 0; - ah->spi = x->id.spi; - ah->seq_no = htonl(++x->replay.oseq); ahp->icv(ahp, skb, ah->auth_data); top_iph->tos = iph->tos; @@ -108,6 +110,10 @@ memcpy(top_iph+1, iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); } + /* Delay incrementing the replay sequence until we know we're going + * to send this packet to prevent gaps. 
+ */ + x->replay.oseq++; ip_send_check(top_iph); err = 0; From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kMg7024560 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 28257 invoked by uid 0); 30 Dec 2004 08:48:57 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp2.knology.net with SMTP; 30 Dec 2004 08:48:57 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mZdl009767; Thu, 30 Dec 2004 03:48:35 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mZLw009766; Thu, 30 Dec 2004 03:48:35 -0500 Date: Thu, 30 Dec 2004 03:48:35 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 3/22] xfrm: Add offload management routines Message-Id: <20041230035000.12@ori.thedillows.org> References: <20041230035000.11@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13213 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:31:03-05:00 dave@thedillows.org # Add offload management to xfrm_state. # # xfrm_offload_alloc() creates a new xfrm_offload, with a private # part to be used by the driver (ala net_device->priv) # The returned offload may be kfree'd if it has not been # added to a xfrm_state using xfrm_state_offload_add(). # xfrm_offload_priv() returns a pointer to the private area of # the xfrm_offload. This will be 8-byte aligned. 
# xfrm_offload_hold()/xfrm_offload_release() do the reference # counting of the xfrm_offload # xfrm_offload_get() looks up the xfrm_offload from a given device # The caller should call xfrm_offload_release() when it is # finished with this offload. # xfrm_state_offload_add() adds a new offload to the xfrm_state, # replacing any existing offload for the device that may # exist for this xfrm_state. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_state.c # 2004/12/30 00:30:46-05:00 dave@thedillows.org +28 -0 # Clean up any offloads on destruction of an xfrm_state, and # allow the addition of new xfrm_offloads to a xfrm_state. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_export.c # 2004/12/30 00:30:46-05:00 dave@thedillows.org +2 -0 # Export xfrm_state_offload_add() to drivers. # # Signed-off-by: David Dillow # # include/net/xfrm.h # 2004/12/30 00:30:46-05:00 dave@thedillows.org +79 -0 # Add offload management to xfrm_state. # # Add xfrm_offload_alloc(), xfrm_offload_priv(), xfrm_offload_hold() # xfrm_offload_release(), and xfrm_offload_get(). # # Signed-off-by: David Dillow # diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h --- a/include/net/xfrm.h 2004-12-30 01:11:43 -05:00 +++ b/include/net/xfrm.h 2004-12-30 01:11:43 -05:00 @@ -81,6 +81,8 @@ metrics. Plus, it will be made via sk->sk_dst_cache. Solved. */ +struct xfrm_offload; + /* Full description of state of transformer. 
*/ struct xfrm_state { @@ -149,6 +151,9 @@ /* Intended direction of this state, used for offloading */ int dir; + + /* List of offload cookies, per device */ + struct list_head offloads; }; enum { @@ -166,6 +171,13 @@ XFRM_STATE_DIR_OUT, }; +struct xfrm_offload +{ + struct list_head bydev; + struct net_device * dev; + atomic_t refcnt; +}; + struct xfrm_type; struct xfrm_dst; struct xfrm_policy_afinfo { @@ -911,5 +923,72 @@ (struct in6_addr *)b); } } + +#define XFRM_OFFLOAD_ALIGN 8 +#define XFRM_OFFLOAD_ALIGN_CONST (XFRM_OFFLOAD_ALIGN - 1) + +static inline struct xfrm_offload * +xfrm_offload_alloc(int sizeof_priv, struct net_device *dev) +{ + struct xfrm_offload *xol; + int alloc_size; + + alloc_size = (sizeof(*xol) + XFRM_OFFLOAD_ALIGN_CONST) + & ~XFRM_OFFLOAD_ALIGN_CONST; + alloc_size += sizeof_priv; + xol = kmalloc(alloc_size, GFP_ATOMIC); + if (xol) { + memset(xol, 0, alloc_size); + INIT_LIST_HEAD(&xol->bydev); + atomic_set(&xol->refcnt, 1); + xol->dev = dev; + } + + return xol; +} + +static inline void *xfrm_offload_priv(struct xfrm_offload *xol) +{ + return (char *)xol + ((sizeof(*xol) + XFRM_OFFLOAD_ALIGN_CONST) + & ~XFRM_OFFLOAD_ALIGN_CONST); +} + +static inline void xfrm_offload_hold(struct xfrm_offload *xol) +{ + atomic_inc(&xol->refcnt); +} + +static inline void xfrm_offload_release(struct xfrm_offload *xol) +{ + if (xol) { + WARN_ON(atomic_read(&xol->refcnt) < 1); + if (atomic_dec_and_test(&xol->refcnt)) { + xol->dev->xfrm_state_del(xol->dev, xol); + dev_put(xol->dev); + kfree(xol); + } + } +} + +static inline struct xfrm_offload *xfrm_offload_get(struct xfrm_state *x, + struct net_device *dev) +{ + struct xfrm_offload *xol, *ret = NULL; + + spin_lock(&x->lock); + list_for_each_entry(xol, &x->offloads, bydev) { + if (xol->dev == dev) { + xfrm_offload_hold(xol); + ret = xol; + break; + } + } + spin_unlock(&x->lock); + + return ret; +} + +extern void xfrm_state_offload_add(struct xfrm_state *x, + struct xfrm_offload *xol); #endif /* _NET_XFRM_H */ diff 
-Nru a/net/xfrm/xfrm_export.c b/net/xfrm/xfrm_export.c --- a/net/xfrm/xfrm_export.c 2004-12-30 01:11:43 -05:00 +++ b/net/xfrm/xfrm_export.c 2004-12-30 01:11:43 -05:00 @@ -61,3 +61,5 @@ EXPORT_SYMBOL_GPL(xfrm_ealg_get_byname); EXPORT_SYMBOL_GPL(xfrm_calg_get_byname); EXPORT_SYMBOL_GPL(skb_icv_walk); + +EXPORT_SYMBOL_GPL(xfrm_state_offload_add); diff -Nru a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c --- a/net/xfrm/xfrm_state.c 2004-12-30 01:11:43 -05:00 +++ b/net/xfrm/xfrm_state.c 2004-12-30 01:11:43 -05:00 @@ -53,6 +53,12 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x) { + if (!list_empty(&x->offloads)) { + struct xfrm_offload *xol, *next; + + list_for_each_entry_safe(xol, next, &x->offloads, bydev) + xfrm_offload_release(xol); + } if (del_timer(&x->timer)) BUG(); if (x->aalg) @@ -178,6 +184,7 @@ atomic_set(&x->tunnel_users, 0); INIT_LIST_HEAD(&x->bydst); INIT_LIST_HEAD(&x->byspi); + INIT_LIST_HEAD(&x->offloads); init_timer(&x->timer); x->timer.function = xfrm_timer_handler; x->timer.data = (unsigned long)x; @@ -941,6 +948,27 @@ xfrm_state_put(t); x->tunnel = NULL; } +} + +void xfrm_state_offload_add(struct xfrm_state *x, struct xfrm_offload *xol) +{ + struct xfrm_offload *entry, *old = NULL; + + spin_lock(&x->lock); + list_for_each_entry(entry, &x->offloads, bydev) { + if (entry->dev == xol->dev) { + list_del(&entry->bydev); + old = entry; + break; + } + } + + dev_hold(xol->dev); + list_add_tail(&xol->bydev, &x->offloads); + spin_unlock(&x->lock); + + if(old) + xfrm_offload_release(old); } void __init xfrm_state_init(void) From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kN0V024563 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 19770 invoked by uid 0); 30 Dec 2004 08:53:05 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) 
(69.1.45.93) by smtp8.knology.net with SMTP; 30 Dec 2004 08:53:05 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mZ3j009779; Thu, 30 Dec 2004 03:48:35 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mZ2P009778; Thu, 30 Dec 2004 03:48:35 -0500 Date: Thu, 30 Dec 2004 03:48:35 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 6/22] xfrm: add a parameter to xfrm_prune_bundles() Message-Id: <20041230035000.15@ori.thedillows.org> References: <20041230035000.14@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13199 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:35:34-05:00 dave@thedillows.org # Add a parameter to the decision function(s) used by # xfrm_prune_bundles(). This will allow us to have more # fine grained selection of bundles pruned (like, say, # per device.) # # Signed-off-by: David Dillow # # net/xfrm/xfrm_policy.c # 2004/12/30 00:35:16-05:00 dave@thedillows.org +10 -9 # Add a parameter to the decision function(s) used by # xfrm_prune_bundles(). This will allow us to have more # fine grained selection of bundles pruned (like, say, # per device.) # # Signed-off-by: David Dillow # diff -Nru a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c --- a/net/xfrm/xfrm_policy.c 2004-12-30 01:11:05 -05:00 +++ b/net/xfrm/xfrm_policy.c 2004-12-30 01:11:05 -05:00 @@ -730,7 +730,7 @@ kfree(xbl); } -static int stale_bundle(struct dst_entry *dst); +static int stale_bundle(struct dst_entry *dst, void *unused); /* Main function: finds/creates a bundle for given flow. 
* @@ -841,7 +841,7 @@ } write_lock_bh(&policy->lock); - if (unlikely(policy->dead || stale_bundle(dst))) { + if (unlikely(policy->dead || stale_bundle(dst, NULL))) { /* Wow! While we worked on resolving, this * policy has gone. Retry. It is not paranoia, * we just cannot enlist new bundle to dead object. @@ -1022,14 +1022,14 @@ static struct dst_entry *xfrm_dst_check(struct dst_entry *dst, u32 cookie) { - if (!stale_bundle(dst)) + if (!stale_bundle(dst, NULL)) return dst; dst_release(dst); return NULL; } -static int stale_bundle(struct dst_entry *dst) +static int stale_bundle(struct dst_entry *dst, void *unused) { struct dst_entry *child = dst; @@ -1072,7 +1072,8 @@ return dst; } -static void xfrm_prune_bundles(int (*func)(struct dst_entry *)) +static void xfrm_prune_bundles(int (*func)(struct dst_entry *, void *), + void *data) { int i; struct xfrm_policy *pol; @@ -1084,7 +1085,7 @@ write_lock(&pol->lock); dstp = &pol->bundles; while ((dst=*dstp) != NULL) { - if (func(dst)) { + if (func(dst, data)) { *dstp = dst->next; dst->next = gc_list; gc_list = dst; @@ -1104,19 +1105,19 @@ } } -static int unused_bundle(struct dst_entry *dst) +static int unused_bundle(struct dst_entry *dst, void *unused) { return !atomic_read(&dst->__refcnt); } static void __xfrm_garbage_collect(void) { - xfrm_prune_bundles(unused_bundle); + xfrm_prune_bundles(unused_bundle, NULL); } int xfrm_flush_bundles(void) { - xfrm_prune_bundles(stale_bundle); + xfrm_prune_bundles(stale_bundle, NULL); return 0; } From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kMZg024558 for ; Thu, 30 Dec 2004 00:46:42 -0800 Received: (qmail 19764 invoked by uid 0); 30 Dec 2004 08:53:04 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp8.knology.net with SMTP; 30 Dec 2004 
08:53:04 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mY8d009759; Thu, 30 Dec 2004 03:48:34 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mYTm009758; Thu, 30 Dec 2004 03:48:34 -0500 Date: Thu, 30 Dec 2004 03:48:34 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 1/22] xfrm: Add direction information to xfrm_state Message-Id: <20041230035000.10@ori.thedillows.org> References: <20041230035000.01@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13207 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:27:15-05:00 dave@thedillows.org # Add direction information to xfrm_state. This will be needed to # offload xfrm processing to the NIC. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_state.c # 2004/12/30 00:25:42-05:00 dave@thedillows.org +5 -0 # Add direction information to xfrm_state. This will be needed to # offload xfrm processing to the NIC. # # Signed-off-by: David Dillow # # net/ipv6/xfrm6_state.c # 2004/12/30 00:25:42-05:00 dave@thedillows.org +9 -0 # Place holder for adding IPv6 direction mapping routine. # # net/ipv4/xfrm4_state.c # 2004/12/30 00:25:42-05:00 dave@thedillows.org +10 -0 # Add direction information to xfrm_state. This will be needed to # offload xfrm processing to the NIC. # # Signed-off-by: David Dillow # # include/net/xfrm.h # 2004/12/30 00:25:42-05:00 dave@thedillows.org +10 -0 # Add direction information to xfrm_state. This will be needed to # offload xfrm processing to the NIC. 
# # Signed-off-by: David Dillow # diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h --- a/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 +++ b/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 @@ -146,6 +146,9 @@ /* Private data of this transformer, format is opaque, * interpreted by xfrm_type methods. */ void *data; + + /* Intended direction of this state, used for offloading */ + int dir; }; enum { @@ -157,6 +160,12 @@ XFRM_STATE_DEAD }; +enum { + XFRM_STATE_DIR_UNKNOWN, + XFRM_STATE_DIR_IN, + XFRM_STATE_DIR_OUT, +}; + struct xfrm_type; struct xfrm_dst; struct xfrm_policy_afinfo { @@ -194,6 +203,7 @@ struct xfrm_state *(*find_acq)(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create); + void (*map_direction)(struct xfrm_state *xfrm); }; extern int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo); diff -Nru a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c --- a/net/ipv4/xfrm4_state.c 2004-12-30 01:12:08 -05:00 +++ b/net/ipv4/xfrm4_state.c 2004-12-30 01:12:08 -05:00 @@ -106,12 +106,22 @@ return x0; } +static void +__xfrm4_map_direction(struct xfrm_state *x) +{ + if (inet_addr_type(x->id.daddr.a4) == RTN_LOCAL) + x->dir = XFRM_STATE_DIR_IN; + else + x->dir = XFRM_STATE_DIR_OUT; +} + static struct xfrm_state_afinfo xfrm4_state_afinfo = { .family = AF_INET, .lock = RW_LOCK_UNLOCKED, .init_tempsel = __xfrm4_init_tempsel, .state_lookup = __xfrm4_state_lookup, .find_acq = __xfrm4_find_acq, + .map_direction = __xfrm4_map_direction, }; void __init xfrm4_state_init(void) diff -Nru a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c --- a/net/ipv6/xfrm6_state.c 2004-12-30 01:12:08 -05:00 +++ b/net/ipv6/xfrm6_state.c 2004-12-30 01:12:08 -05:00 @@ -116,12 +116,21 @@ return x0; } +static void +__xfrm6_map_direction(struct xfrm_state *x) +{ + /* XXX This needs to be implemented by someone who knows + * IPv6 better than I.
+ */ +} + static struct xfrm_state_afinfo xfrm6_state_afinfo = { .family = AF_INET6, .lock = RW_LOCK_UNLOCKED, .init_tempsel = __xfrm6_init_tempsel, .state_lookup = __xfrm6_state_lookup, .find_acq = __xfrm6_find_acq, + .map_direction = __xfrm6_map_direction, }; void __init xfrm6_state_init(void) diff -Nru a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c --- a/net/xfrm/xfrm_state.c 2004-12-30 01:12:08 -05:00 +++ b/net/xfrm/xfrm_state.c 2004-12-30 01:12:08 -05:00 @@ -186,6 +186,7 @@ x->lft.soft_packet_limit = XFRM_INF; x->lft.hard_byte_limit = XFRM_INF; x->lft.hard_packet_limit = XFRM_INF; + x->dir = XFRM_STATE_DIR_UNKNOWN; spin_lock_init(&x->lock); } return x; @@ -404,6 +405,8 @@ if (unlikely(afinfo == NULL)) return -EAFNOSUPPORT; + afinfo->map_direction(x); + spin_lock_bh(&xfrm_state_lock); x1 = afinfo->state_lookup(&x->id.daddr, x->id.spi, x->id.proto); @@ -451,6 +454,8 @@ afinfo = xfrm_state_get_afinfo(x->props.family); if (unlikely(afinfo == NULL)) return -EAFNOSUPPORT; + + afinfo->map_direction(x); spin_lock_bh(&xfrm_state_lock); x1 = afinfo->state_lookup(&x->id.daddr, x->id.spi, x->id.proto); From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kNOO024568 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 19776 invoked by uid 0); 30 Dec 2004 08:53:06 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp8.knology.net with SMTP; 30 Dec 2004 08:53:06 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8ma1w009799; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8ma3u009798; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: 
linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 11/22] AH, ESP: Add offloading of inbound packets Message-Id: <20041230035000.20@ori.thedillows.org> References: <20041230035000.19@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13210 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:47:54-05:00 dave@thedillows.org # Add crypto offload for inbound IPv4 AH and ESP xfrms. # # Signed-off-by: David Dillow # # net/ipv4/esp4.c # 2004/12/30 00:47:36-05:00 dave@thedillows.org +30 -16 # Add crypto offload for inbound IPv4 ESP xfrms. # # Signed-off-by: David Dillow # # net/ipv4/ah4.c # 2004/12/30 00:47:36-05:00 dave@thedillows.org +13 -4 # Add crypto offload for inbound IPv4 AH xfrms.
# # Signed-off-by: David Dillow # diff -Nru a/net/ipv4/ah4.c b/net/ipv4/ah4.c --- a/net/ipv4/ah4.c 2004-12-30 01:10:02 -05:00 +++ b/net/ipv4/ah4.c 2004-12-30 01:10:02 -05:00 @@ -138,6 +138,7 @@ struct iphdr *iph; struct ip_auth_hdr *ah; struct ah_data *ahp; + int offload; char work_buf[60]; if (!pskb_may_pull(skb, sizeof(struct ip_auth_hdr))) @@ -164,6 +165,7 @@ ah = (struct ip_auth_hdr*)skb->data; iph = skb->nh.iph; + offload = skb_pop_xfrm_result(skb); memcpy(work_buf, iph, iph->ihl*4); @@ -181,10 +183,17 @@ memcpy(auth_data, ah->auth_data, ahp->icv_trunc_len); skb_push(skb, skb->data - skb->nh.raw); - ahp->icv(ahp, skb, ah->auth_data); - if (memcmp(ah->auth_data, auth_data, ahp->icv_trunc_len)) { - x->stats.integrity_failed++; - goto out; + if (offload & XFRM_OFFLOAD_AUTH) { + if (unlikely(offload & XFRM_OFFLOAD_AUTH_FAIL)) { + x->stats.integrity_failed++; + goto out; + } + } else { + ahp->icv(ahp, skb, ah->auth_data); + if (memcmp(ah->auth_data, auth_data, ahp->icv_trunc_len)) { + x->stats.integrity_failed++; + goto out; + } } } ((struct iphdr*)work_buf)->protocol = ah->nexthdr; diff -Nru a/net/ipv4/esp4.c b/net/ipv4/esp4.c --- a/net/ipv4/esp4.c 2004-12-30 01:10:02 -05:00 +++ b/net/ipv4/esp4.c 2004-12-30 01:10:02 -05:00 @@ -164,6 +164,7 @@ int elen = skb->len - sizeof(struct ip_esp_hdr) - esp->conf.ivlen - alen; int nfrags; int encap_len = 0; + int offload; if (!pskb_may_pull(skb, sizeof(struct ip_esp_hdr))) goto out; @@ -171,22 +172,32 @@ if (elen <= 0 || (elen & (blksize-1))) goto out; + offload = skb_pop_xfrm_result(skb); + /* If integrity check is required, do this. 
*/ if (esp->auth.icv_full_len) { - u8 sum[esp->auth.icv_full_len]; - u8 sum1[alen]; + if (unlikely(offload & XFRM_OFFLOAD_AUTH_FAIL)) { + x->stats.integrity_failed++; + goto out; + } + + if (!(offload & XFRM_OFFLOAD_AUTH)) { + u8 sum[esp->auth.icv_full_len]; + u8 sum1[alen]; - esp->auth.icv(esp, skb, 0, skb->len-alen, sum); + esp->auth.icv(esp, skb, 0, skb->len-alen, sum); - if (skb_copy_bits(skb, skb->len-alen, sum1, alen)) - BUG(); + if (skb_copy_bits(skb, skb->len-alen, sum1, alen)) + BUG(); - if (unlikely(memcmp(sum, sum1, alen))) { - x->stats.integrity_failed++; - goto out; + if (unlikely(memcmp(sum, sum1, alen))) { + x->stats.integrity_failed++; + goto out; + } } } + /* XXX I think this can be moved to the !offload case */ if ((nfrags = skb_cow_data(skb, 0, &trailer)) < 0) goto out; @@ -195,15 +206,12 @@ esph = (struct ip_esp_hdr*)skb->data; iph = skb->nh.iph; - /* Get ivec. This can be wrong, check against another impls. */ - if (esp->conf.ivlen) - crypto_cipher_set_iv(esp->conf.tfm, esph->enc_data, crypto_tfm_alg_ivsize(esp->conf.tfm)); - - { - u8 nexthdr[2]; + if (!(offload & XFRM_OFFLOAD_CONF)) { struct scatterlist *sg = &esp->sgbuf[0]; - u8 workbuf[60]; - int padlen; + + /* Get ivec. This can be wrong, check against another impls. 
*/ + if (esp->conf.ivlen) + crypto_cipher_set_iv(esp->conf.tfm, esph->enc_data, crypto_tfm_alg_ivsize(esp->conf.tfm)); if (unlikely(nfrags > ESP_NUM_FAST_SG)) { sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC); @@ -214,6 +222,12 @@ crypto_cipher_decrypt(esp->conf.tfm, sg, sg, elen); if (unlikely(sg != &esp->sgbuf[0])) kfree(sg); + } + + { + u8 nexthdr[2]; + u8 workbuf[60]; + int padlen; if (skb_copy_bits(skb, skb->len-alen-2, nexthdr, 2)) BUG(); From dave@thedillows.org Thu Dec 30 00:46:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kON6024571 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 4272 invoked by uid 0); 30 Dec 2004 08:48:59 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp6.knology.net with SMTP; 30 Dec 2004 08:48:59 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8maxQ009811; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mamn009810; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 14/22] typhoon: add inbound offload result processing Message-Id: <20041230035000.23@ori.thedillows.org> References: <20041230035000.22@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13211 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 00:54:09-05:00 dave@thedillows.org # Add inbound packet crypto result processing to the Typhoon driver. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 00:53:50-05:00 dave@thedillows.org +42 -0 # Add inbound packet crypto result processing to the Typhoon driver. # # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:09:23 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:09:23 -05:00 @@ -131,6 +131,7 @@ #include #include #include +#include #include "typhoon.h" #include "typhoon-firmware.h" @@ -1681,6 +1682,43 @@ return 0; } +static inline void +typhoon_ipsec_rx(struct sk_buff *skb, u16 results) +{ +#define CHECK_OFFLOAD(good, bad) \ + do { if(results & (good|bad)) { \ + unsigned int tmp = XFRM_OFFLOAD_CONF | XFRM_OFFLOAD_AUTH; \ + tmp |= (results & good) ? XFRM_OFFLOAD_AUTH_OK : \ + XFRM_OFFLOAD_AUTH_FAIL; \ + if(skb_put_xfrm_result(skb, tmp, i)) \ + return; \ + i++; \ + } } while(0) + + /* We have no way to determine what the order of the SAs was on + * the wire, just the 1st AH seen, the 1st ESP seen, etc. + * + * We just walk the stack, and pretend that AH SAs get decrypted + * so that if we get the order wrong, the worst case scenario is + * that we indicate the failure on the wrong SA, since we'll need + * to match all SAs against the policy. + * + * We get an "ESP good" indication for null auth hash on ESP. + */ + /* XXX think more about security indications -- can I craft a + * packet to do bad things -- maybe a NULL auth ESP packet, + * and a failed AH packet?
+ */ + int i = 0; + + CHECK_OFFLOAD(TYPHOON_RX_AH1_GOOD, TYPHOON_RX_AH1_FAIL); + CHECK_OFFLOAD(TYPHOON_RX_ESP1_GOOD, TYPHOON_RX_ESP1_FAIL); + CHECK_OFFLOAD(TYPHOON_RX_AH2_GOOD, TYPHOON_RX_AH2_FAIL); + CHECK_OFFLOAD(TYPHOON_RX_ESP2_GOOD, TYPHOON_RX_ESP2_FAIL); + +#undef CHECK_OFFLOAD +} + static int typhoon_rx(struct typhoon *tp, struct basic_ring *rxRing, volatile u32 * ready, volatile u32 * cleared, int budget) @@ -1745,6 +1783,10 @@ new_skb->ip_summed = CHECKSUM_UNNECESSARY; } else new_skb->ip_summed = CHECKSUM_NONE; + + if((rx->rxStatus & TYPHOON_RX_IPSEC) && + !(rx->rxStatus & TYPHOON_RX_IP_FRAG)) + typhoon_ipsec_rx(new_skb, rx->ipsecResults); spin_lock(&tp->state_lock); if(tp->vlgrp != NULL && rx->rxStatus & TYPHOON_RX_VLAN) From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:50 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kMZO024559 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 23985 invoked by uid 0); 30 Dec 2004 08:47:24 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp1.knology.net with SMTP; 30 Dec 2004 08:47:24 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mYPD009763; Thu, 30 Dec 2004 03:48:34 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mYE2009762; Thu, 30 Dec 2004 03:48:34 -0500 Date: Thu, 30 Dec 2004 03:48:34 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 2/22] xfrm: Add xfrm offload management calls to struct netdevice Message-Id: <20041230035000.11@ori.thedillows.org> References: <20041230035000.10@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 
13216 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:28:25-05:00 dave@thedillows.org # Add the xfrm offload management calls to struct netdevice. # # xfrm_state_add() is called for inbound xfrm states # xfrm_bundle_add() is called for outbound xfrm bundles # xfrm_state_del() is called for all offloaded xfrms, # inbound or outbound. # # If a driver adds NETIF_F_IPSEC to its features, it must # provide all three callbacks. # # Signed-off-by: David Dillow # # include/linux/netdevice.h # 2004/12/30 00:28:07-05:00 dave@thedillows.org +11 -0 # Add the xfrm offload management calls to struct netdevice. # # xfrm_state_add() is called for inbound xfrm states # xfrm_bundle_add() is called for outbound xfrm bundles # xfrm_state_del() is called for all offloaded xfrms, # inbound or outbound. # # If a driver adds NETIF_F_IPSEC to its features, it must # provide all three callbacks. # # Signed-off-by: David Dillow # diff -Nru a/include/linux/netdevice.h b/include/linux/netdevice.h --- a/include/linux/netdevice.h 2004-12-30 01:11:56 -05:00 +++ b/include/linux/netdevice.h 2004-12-30 01:11:56 -05:00 @@ -250,6 +250,9 @@ }; #define NETDEV_BOOT_SETUP_MAX 8 +struct xfrm_state; +struct xfrm_offload; +struct xfrm_bundle_list; /* * The DEVICE structure. @@ -415,6 +418,7 @@ #define NETIF_F_VLAN_CHALLENGED 1024 /* Device cannot handle VLAN packets */ #define NETIF_F_TSO 2048 /* Can offload TCP/IP segmentation */ #define NETIF_F_LLTX 4096 /* LockLess TX */ +#define NETIF_F_IPSEC 8192 /* Can offload IPSEC crypto */ /* Called after device is detached from network. 
*/ void (*uninit)(struct net_device *dev); @@ -464,6 +468,13 @@ unsigned short vid); void (*vlan_rx_kill_vid)(struct net_device *dev, unsigned short vid); + + void (*xfrm_state_add)(struct net_device *dev, + struct xfrm_state *x); + void (*xfrm_bundle_add)(struct net_device *dev, + struct xfrm_bundle_list *bl); + void (*xfrm_state_del)(struct net_device *dev, + struct xfrm_offload *xol); int (*hard_header_parse)(struct sk_buff *skb, unsigned char *haddr); From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kNeg024564 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 28263 invoked by uid 0); 30 Dec 2004 08:48:58 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp2.knology.net with SMTP; 30 Dec 2004 08:48:58 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mZwK009783; Thu, 30 Dec 2004 03:48:35 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mZIA009782; Thu, 30 Dec 2004 03:48:35 -0500 Date: Thu, 30 Dec 2004 03:48:35 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 7/22] xfrm: Allow device drivers to force recalculation of offloads Message-Id: <20041230035000.16@ori.thedillows.org> References: <20041230035000.15@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13206 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 00:37:44-05:00 dave@thedillows.org # Give device drivers a method to allow the use of crypto # offload features for existing xfrm_states and bundles, as # well as dynamically remove crypto offload capabilities. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_state.c # 2004/12/30 00:37:26-05:00 dave@thedillows.org +39 -0 # When we've been informed of a new device that can offload # xfrm crypto operations, go ahead and offload existing inbound # xfrm_states to it. # # When we're removing crypto offload capabilities, remove every # offload instance for that device. # # Signed-off-by: David Dillow # # net/xfrm/xfrm_policy.c # 2004/12/30 00:37:26-05:00 dave@thedillows.org +17 -0 # When adding/removing xfrm offload capable device, give the xfrm_state # engine a chance to make the changes it needs, then flush any existing # bundles that use the device so that future flows get a chance to use # the offload features (for add), or resume using the software crypto # routines (for remove). # # Signed-off-by: David Dillow # # net/xfrm/xfrm_export.c # 2004/12/30 00:37:26-05:00 dave@thedillows.org +2 -0 # Export the driver-visible API. # # Signed-off-by: David Dillow # # include/net/xfrm.h # 2004/12/30 00:37:26-05:00 dave@thedillows.org +4 -0 # Prototypes for the new routines. 
# # Signed-off-by: David Dillow # diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h --- a/include/net/xfrm.h 2004-12-30 01:10:52 -05:00 +++ b/include/net/xfrm.h 2004-12-30 01:10:52 -05:00 @@ -833,6 +833,8 @@ extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq); extern void xfrm_state_delete(struct xfrm_state *x); extern void xfrm_state_flush(u8 proto); +extern void xfrm_state_accel_add(struct net_device *dev); +extern void xfrm_state_accel_flush(struct net_device *dev); extern int xfrm_replay_check(struct xfrm_state *x, u32 seq); extern void xfrm_replay_advance(struct xfrm_state *x, u32 seq); extern int xfrm_state_check(struct xfrm_state *x, struct sk_buff *skb); @@ -888,6 +890,8 @@ extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); extern struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struct flowi *fl); extern int xfrm_flush_bundles(void); +extern void xfrm_accel_add(struct net_device *dev); +extern void xfrm_accel_flush(struct net_device *dev); extern wait_queue_head_t km_waitq; extern void km_state_expired(struct xfrm_state *x, int hard); diff -Nru a/net/xfrm/xfrm_export.c b/net/xfrm/xfrm_export.c --- a/net/xfrm/xfrm_export.c 2004-12-30 01:10:52 -05:00 +++ b/net/xfrm/xfrm_export.c 2004-12-30 01:10:52 -05:00 @@ -63,3 +63,5 @@ EXPORT_SYMBOL_GPL(skb_icv_walk); EXPORT_SYMBOL_GPL(xfrm_state_offload_add); +EXPORT_SYMBOL_GPL(xfrm_accel_add); +EXPORT_SYMBOL_GPL(xfrm_accel_flush); diff -Nru a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c --- a/net/xfrm/xfrm_policy.c 2004-12-30 01:10:52 -05:00 +++ b/net/xfrm/xfrm_policy.c 2004-12-30 01:10:52 -05:00 @@ -1121,6 +1121,23 @@ return 0; } +static int bundle_uses_dev(struct dst_entry *dst, void *dev) +{ + return (dst->dev == dev); +} + +void xfrm_accel_add(struct net_device *dev) +{ + xfrm_state_accel_add(dev); + xfrm_prune_bundles(bundle_uses_dev, dev); +} + +void xfrm_accel_flush(struct net_device *dev) +{ + xfrm_state_accel_flush(dev); + 
xfrm_prune_bundles(bundle_uses_dev, dev); +} + /* Well... that's _TASK_. We need to scan through transformation * list and figure out what mss tcp should generate in order to * final datagram fit to mtu. Mama mia... :-) diff -Nru a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c --- a/net/xfrm/xfrm_state.c 2004-12-30 01:10:52 -05:00 +++ b/net/xfrm/xfrm_state.c 2004-12-30 01:10:52 -05:00 @@ -998,6 +998,45 @@ xfrm_offload_release(old); } +static int try_new_accel(struct xfrm_state *x, int unused, void *data) +{ + struct net_device *dev = data; + + if (x->dir == XFRM_STATE_DIR_IN) + dev->xfrm_state_add(dev, x); + return 0; +} + +static int remove_stale_accel(struct xfrm_state *x, int unused, void *dev) +{ + struct xfrm_offload *xol, *entry = NULL; + + spin_lock(&x->lock); + list_for_each_entry(xol, &x->offloads, bydev) { + if (xol->dev == dev) { + list_del(&xol->bydev); + entry = xol; + break; + } + } + spin_unlock(&x->lock); + + if (entry) + xfrm_offload_release(entry); + + return 0; +} + +void xfrm_state_accel_add(struct net_device *dev) +{ + xfrm_state_walk(IPSEC_PROTO_ANY, try_new_accel, dev); +} + +void xfrm_state_accel_flush(struct net_device *dev) +{ + xfrm_state_walk(IPSEC_PROTO_ANY, remove_stale_accel, dev); +} + void __init xfrm_state_init(void) { int i; From dave@thedillows.org Thu Dec 30 00:46:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kP0F024575 for ; Thu, 30 Dec 2004 00:46:45 -0800 Received: (qmail 17290 invoked by uid 0); 30 Dec 2004 08:47:23 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp5.knology.net with SMTP; 30 Dec 2004 08:47:23 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mbIr009831; Thu, 30 Dec 2004 03:48:37 -0500 Received: (from root@localhost) by ori.thedillows.org 
(8.13.1/8.13.1/Submit) id iBU8mb3t009830; Thu, 30 Dec 2004 03:48:37 -0500 Date: Thu, 30 Dec 2004 03:48:37 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 19/22] typhoon: add loading of xfrm_states to hardware Message-Id: <20041230035000.28@ori.thedillows.org> References: <20041230035000.27@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13212 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 01:02:32-05:00 dave@thedillows.org # Teach the Typhoon driver how to add and remove xfrm_states to # the 3XP for later packet processing. # # When the first xfrm_state is added, we turn on IPSEC offloads # for the 3XP, and we turn it off when the last one is removed. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 01:02:14-05:00 dave@thedillows.org +167 -0 # Teach the Typhoon driver how to add and remove xfrm_states to # the 3XP for later packet processing. # # When the first xfrm_state is added, we turn on IPSEC offloads # for the 3XP, and we turn it off when the last one is removed. 
# # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:08:19 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:08:19 -05:00 @@ -2420,6 +2420,173 @@ #undef REQUIRED #undef UNSUPPORTED +static struct xfrm_offload * +typhoon_offload_ipsec(struct typhoon *tp, struct xfrm_state *x) +{ + struct cmd_desc xp_cmd[5]; + struct resp_desc xp_resp; + struct sa_descriptor *sa = (struct sa_descriptor *)xp_cmd; + struct xfrm_offload *xol; + struct typhoon_xfrm_offload *txo; + u16 *dir_sa_avail = &tp->rx_sa_avail; + u16 cookie; + int keylen, err; + + if(!typhoon_validate_xfrm(tp, x)) + goto error; + + memset(xp_cmd, 0, 5 * sizeof(xp_cmd[0])); + INIT_COMMAND_WITH_RESPONSE(xp_cmd, TYPHOON_CMD_CREATE_SA); + sa->numDesc = 4; + + sa->mode = TYPHOON_SA_MODE_AH; + if(x->type->proto == IPPROTO_ESP) + sa->mode = TYPHOON_SA_MODE_ESP; + + if(x->dir == XFRM_STATE_DIR_OUT) { + sa->direction = TYPHOON_SA_DIR_TX; + dir_sa_avail = &tp->tx_sa_avail; + } + + spin_lock_bh(&tp->offload_lock); + if(!*dir_sa_avail) { + spin_unlock_bh(&tp->offload_lock); + goto error; + } + (*dir_sa_avail)--; + if(!tp->sa_count++) { + tp->offload |= TYPHOON_OFFLOAD_IPSEC; + err = typhoon_set_offload(tp); + if(err < 0) { + spin_unlock_bh(&tp->offload_lock); + printk(KERN_ERR "%s: unable to enable IPSEC " + "offload (%d)\n", tp->name, -err); + goto error_counted; + } + } + spin_unlock_bh(&tp->offload_lock); + + if(x->props.aalgo != SADB_X_AALG_NULL && x->aalg) { + keylen = (x->aalg->alg_key_len + 7) / 8; + + sa->hashFlags = TYPHOON_SA_HASH_SHA1; + if(x->props.aalgo == SADB_AALG_MD5HMAC) + sa->hashFlags = TYPHOON_SA_HASH_MD5; + sa->hashFlags |= TYPHOON_SA_HASH_ENABLE; + + memcpy(sa->integKey, x->aalg->alg_key, keylen); + } + + if(x->props.ealgo != SADB_EALG_NULL && x->ealg) { + keylen = (x->ealg->alg_key_len + 7) / 8; + + sa->encryptionFlags = TYPHOON_SA_ENCRYPT_ENABLE | + TYPHOON_SA_ENCRYPT_CBC; + if(x->props.ealgo == SADB_EALG_DESCBC) + 
sa->encryptionFlags |= TYPHOON_SA_ENCRYPT_DES; + else if(x->ealg->alg_key_len == 192) + sa->encryptionFlags |= TYPHOON_SA_ENCRYPT_3DES_3KEY; + else { + sa->encryptionFlags |= TYPHOON_SA_ENCRYPT_3DES_2KEY; + memcpy(&sa->confKey[16], x->ealg->alg_key, 8); + } + + memcpy(sa->confKey, x->ealg->alg_key, keylen); + } + + /* The 3XP expects the SPI to be in host order, little endian. + * It expects the address to be in network order. + */ + sa->SPI = cpu_to_le32(ntohl(x->id.spi)); + sa->destAddr = x->id.daddr.a4; + sa->destMask = (u32) ~0UL; + + err = typhoon_issue_command(tp, 5, xp_cmd, 1, &xp_resp); + cookie = le16_to_cpu(xp_resp.parm1); + if(err < 0 || !cookie || cookie == 0xffff) + goto error_counted; + + xol = xfrm_offload_alloc(sizeof(*txo), tp->dev); + if(!xol) + goto error_cookie; + + txo = xfrm_offload_priv(xol); + txo->sa_cookie = cookie; + txo->tunnel = !!x->props.mode; + txo->ah = (x->id.proto == IPPROTO_AH); + txo->inbound = (x->dir == XFRM_STATE_DIR_IN); + + xfrm_state_offload_add(x, xol); + + return xol; + +error_cookie: + INIT_COMMAND_NO_RESPONSE(xp_cmd, TYPHOON_CMD_DELETE_SA); + xp_cmd[0].parm1 = xp_resp.parm1; + typhoon_issue_command(tp, 1, xp_cmd, 0, NULL); + +error_counted: + spin_lock_bh(&tp->offload_lock); + (*dir_sa_avail)++; + tp->sa_count--; + if(!tp->sa_count) { + tp->offload &= ~TYPHOON_OFFLOAD_IPSEC; + err = typhoon_set_offload(tp); + if(err < 0) + printk(KERN_ERR "%s: unable to disable IPSEC " + "offload (%d)\n", tp->name, -err); + } + spin_unlock_bh(&tp->offload_lock); + +error: + return NULL; +} + +static void +typhoon_xfrm_state_add(struct net_device *dev, struct xfrm_state *x) +{ + struct typhoon *tp = netdev_priv(dev); + + smp_rmb(); + if(tp->card_state == Running) + typhoon_offload_ipsec(tp, x); +} + +static void +typhoon_xfrm_state_del(struct net_device *dev, struct xfrm_offload *xol) +{ + struct typhoon *tp = netdev_priv(dev); + struct typhoon_xfrm_offload *txo = xfrm_offload_priv(xol); + struct cmd_desc xp_cmd; + int err; + + smp_rmb(); 
+ if(tp->card_state != Running) + return; + + INIT_COMMAND_NO_RESPONSE(&xp_cmd, TYPHOON_CMD_DELETE_SA); + xp_cmd.parm1 = cpu_to_le16(txo->sa_cookie); + if(typhoon_issue_command(tp, 1, &xp_cmd, 0, NULL) < 0) { + printk(KERN_ERR "%s: unable to remove offloaded SA 0x%04x\n", + tp->name, txo->sa_cookie); + } + + spin_lock_bh(&tp->offload_lock); + if(txo->inbound) + tp->rx_sa_avail++; + else + tp->tx_sa_avail++; + tp->sa_count--; + if(!tp->sa_count) { + tp->offload &= ~TYPHOON_OFFLOAD_IPSEC; + err = typhoon_set_offload(tp); + if(err < 0) + printk(KERN_ERR "%s: unable to disable IPSEC " + "offload (%d)\n", tp->name, -err); + } + spin_unlock_bh(&tp->offload_lock); +} + static void typhoon_tx_timeout(struct net_device *dev) { From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kOLb024569 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 23992 invoked by uid 0); 30 Dec 2004 08:47:26 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp1.knology.net with SMTP; 30 Dec 2004 08:47:26 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8maQu009803; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8ma8S009802; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 12/22] ethtool: Add support for crypto offload Message-Id: <20041230035000.21@ori.thedillows.org> References: <20041230035000.20@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13201 X-ecartis-version: Ecartis 
v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:51:19-05:00 dave@thedillows.org # Add support for querying and changing the status of the # IPSEC crypto offload feature of a NIC. # # Signed-off-by: David Dillow # # net/core/ethtool.c # 2004/12/30 00:51:00-05:00 dave@thedillows.org +54 -0 # Add support for querying and changing the status of the IPSEC # crypto offload feature of a NIC. # # Turn on/off the feature flag before informing the xfrm engine # of the change so that existing xfrms get the new settings. # # Signed-off-by: David Dillow # # include/linux/ethtool.h # 2004/12/30 00:51:00-05:00 dave@thedillows.org +8 -0 # Add support for querying and changing the status of the # IPSEC crypto offload feature of a NIC. # # Signed-off-by: David Dillow # diff -Nru a/include/linux/ethtool.h b/include/linux/ethtool.h --- a/include/linux/ethtool.h 2004-12-30 01:09:49 -05:00 +++ b/include/linux/ethtool.h 2004-12-30 01:09:49 -05:00 @@ -260,6 +260,8 @@ int ethtool_op_set_sg(struct net_device *dev, u32 data); u32 ethtool_op_get_tso(struct net_device *dev); int ethtool_op_set_tso(struct net_device *dev, u32 data); +u32 ethtool_op_get_ipsec(struct net_device *dev); +int ethtool_op_set_ipsec(struct net_device *dev, u32 data); /** * &ethtool_ops - Alter and report network device settings @@ -293,6 +295,8 @@ * get_strings: Return a set of strings that describe the requested objects * phys_id: Identify the device * get_stats: Return statistics about the device + * get_ipsec: Report whether IPSEC crypto offload is enabled + * set_ipsec: Turn IPSEC crypto offload on or off * * Description: * @@ -345,6 +349,8 @@ int (*set_sg)(struct net_device *, u32); u32 (*get_tso)(struct net_device *); int (*set_tso)(struct net_device *, u32); + u32 (*get_ipsec)(struct net_device *); + int (*set_ipsec)(struct 
net_device *, u32); int (*self_test_count)(struct net_device *); void (*self_test)(struct net_device *, struct ethtool_test *, u64 *); void (*get_strings)(struct net_device *, u32 stringset, u8 *); @@ -388,6 +394,8 @@ #define ETHTOOL_GSTATS 0x0000001d /* get NIC-specific statistics */ #define ETHTOOL_GTSO 0x0000001e /* Get TSO enable (ethtool_value) */ #define ETHTOOL_STSO 0x0000001f /* Set TSO enable (ethtool_value) */ +#define ETHTOOL_GIPSEC 0x00000020 /* Get IPSEC enable (ethtool_value) */ +#define ETHTOOL_SIPSEC 0x00000021 /* Set IPSEC enable (ethtool_value) */ /* compatibility with older code */ #define SPARC_ETH_GSET ETHTOOL_GSET diff -Nru a/net/core/ethtool.c b/net/core/ethtool.c --- a/net/core/ethtool.c 2004-12-30 01:09:49 -05:00 +++ b/net/core/ethtool.c 2004-12-30 01:09:49 -05:00 @@ -14,6 +14,7 @@ #include #include #include +#include #include /* @@ -72,6 +73,24 @@ return 0; } +u32 ethtool_op_get_ipsec(struct net_device *dev) +{ + return (dev->features & NETIF_F_IPSEC) != 0; +} + +int ethtool_op_set_ipsec(struct net_device *dev, u32 data) +{ + if (data) { + dev->features |= NETIF_F_IPSEC; + xfrm_accel_add(dev); + } else { + dev->features &= ~NETIF_F_IPSEC; + xfrm_accel_flush(dev); + } + + return 0; +} + /* Handlers for each ethtool command */ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr) @@ -548,6 +567,33 @@ return dev->ethtool_ops->set_tso(dev, edata.data); } +static int ethtool_get_ipsec(struct net_device *dev, char __user *useraddr) +{ + struct ethtool_value edata = { ETHTOOL_GIPSEC }; + + if (!dev->ethtool_ops->get_ipsec) + return -EOPNOTSUPP; + + edata.data = dev->ethtool_ops->get_ipsec(dev); + + if (copy_to_user(useraddr, &edata, sizeof(edata))) + return -EFAULT; + return 0; +} + +static int ethtool_set_ipsec(struct net_device *dev, char __user *useraddr) +{ + struct ethtool_value edata; + + if (!dev->ethtool_ops->set_ipsec) + return -EOPNOTSUPP; + + if (copy_from_user(&edata, useraddr, sizeof(edata))) + return 
-EFAULT; + + return dev->ethtool_ops->set_ipsec(dev, edata.data); +} + static int ethtool_self_test(struct net_device *dev, char __user *useraddr) { struct ethtool_test test; @@ -783,6 +829,12 @@ case ETHTOOL_STSO: rc = ethtool_set_tso(dev, useraddr); break; + case ETHTOOL_GIPSEC: + rc = ethtool_get_ipsec(dev, useraddr); + break; + case ETHTOOL_SIPSEC: + rc = ethtool_set_ipsec(dev, useraddr); + break; case ETHTOOL_TEST: rc = ethtool_self_test(dev, useraddr); break; @@ -813,7 +865,9 @@ EXPORT_SYMBOL(ethtool_op_get_link); EXPORT_SYMBOL(ethtool_op_get_sg); EXPORT_SYMBOL(ethtool_op_get_tso); +EXPORT_SYMBOL(ethtool_op_get_ipsec); EXPORT_SYMBOL(ethtool_op_get_tx_csum); EXPORT_SYMBOL(ethtool_op_set_sg); EXPORT_SYMBOL(ethtool_op_set_tso); +EXPORT_SYMBOL(ethtool_op_set_ipsec); EXPORT_SYMBOL(ethtool_op_set_tx_csum); From dave@thedillows.org Thu Dec 30 00:46:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kOk9024573 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 19785 invoked by uid 0); 30 Dec 2004 08:53:07 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp8.knology.net with SMTP; 30 Dec 2004 08:53:07 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mbnl009819; Thu, 30 Dec 2004 03:48:37 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mb7S009818; Thu, 30 Dec 2004 03:48:37 -0500 Date: Thu, 30 Dec 2004 03:48:37 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 16/22] typhoon: collect crypto offload capabilities Message-Id: <20041230035000.25@ori.thedillows.org> References: <20041230035000.24@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter 
version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13202 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:56:40-05:00 dave@thedillows.org # Collect some information about the Typhoon's offload capabilities, # and store it for future use. # # Signed-off-by: David Dillow # # drivers/net/typhoon.h # 2004/12/30 00:56:22-05:00 dave@thedillows.org +14 -0 # Add the reply message format for the crypto capability query command. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 00:56:22-05:00 dave@thedillows.org +56 -0 # Collect some information about the Typhoon's offload capabilities, # and store it for future use. # # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:08:58 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:08:58 -05:00 @@ -305,6 +305,12 @@ u16 wol_events; u32 offload; + u16 tx_sa_max; + u16 rx_sa_max; + u16 tx_sa_avail; + u16 rx_sa_avail; + int sa_count; + /* unused stuff (future use) */ int capabilities; struct transmit_ring txHiRing; @@ -2105,6 +2111,53 @@ return 0; } +static inline int +typhoon_ipsec_init(struct typhoon *tp) +{ + struct cmd_desc xp_cmd; + struct resp_desc xp_resp; + struct ipsec_info_desc *info = (struct ipsec_info_desc *) &xp_resp; + u16 last_tx, last_rx, last_cap; + int err; + + last_tx = tp->tx_sa_max; + last_rx = tp->rx_sa_max; + last_cap = tp->capabilities; + + INIT_COMMAND_WITH_RESPONSE(&xp_cmd, TYPHOON_CMD_READ_IPSEC_INFO); + err = typhoon_issue_command(tp, 1, &xp_cmd, 1, &xp_resp); + if(err < 0) + goto out; + + /* We're not up yet, so no need to lock this -- we cannot modify + * these fields yet. 
+ */ + tp->tx_sa_avail = tp->tx_sa_max = le16_to_cpu(info->tx_sa_max); + tp->rx_sa_avail = tp->rx_sa_max = le16_to_cpu(info->rx_sa_max); + tp->sa_count = 0; + + /* Typhoon2 was originally going to have variable crypto capabilities, + * subject to registration with 3Com. It appears they have decided + * to just enable 3DES as well. + */ + if(tp->capabilities & TYPHOON_CRYPTO_VARIABLE) { + tp->capabilities &= ~TYPHOON_CRYPTO_VARIABLE; + tp->capabilities |= TYPHOON_CRYPTO_DES | TYPHOON_CRYPTO_3DES; + } + + if(last_tx != tp->tx_sa_max || last_rx != tp->rx_sa_max || + last_cap != tp->capabilities) { + printk(KERN_INFO "%s: IPSEC offload %s%s %d Tx %d Rx\n", + tp->name, + tp->capabilities & TYPHOON_CRYPTO_DES ? "DES " : "", + tp->capabilities & TYPHOON_CRYPTO_3DES ? "3DES" : "", + tp->tx_sa_max, tp->rx_sa_max); + } + +out: + return err; +} + static int typhoon_start_runtime(struct typhoon *tp) { @@ -2127,6 +2180,9 @@ err = -EIO; goto error_out; } + + if(typhoon_ipsec_init(tp)) + goto error_out; INIT_COMMAND_NO_RESPONSE(&xp_cmd, TYPHOON_CMD_SET_MAX_PKT_SIZE); xp_cmd.parm1 = cpu_to_le16(PKT_BUF_SZ); diff -Nru a/drivers/net/typhoon.h b/drivers/net/typhoon.h --- a/drivers/net/typhoon.h 2004-12-30 01:08:58 -05:00 +++ b/drivers/net/typhoon.h 2004-12-30 01:08:58 -05:00 @@ -487,6 +487,20 @@ u32 unused2; } __attribute__ ((packed)); +/* TYPHOON_CMD_READ_IPSEC_INFO response descriptor + */ +struct ipsec_info_desc { + u8 flags; + u8 numDesc; + u16 cmd; + u16 seqNo; + u16 des_enabled; + u16 tx_sa_max; + u16 rx_sa_max; + u16 tx_sa_count; + u16 rx_sa_count; +} __attribute__ ((packed)); + /* TYPHOON_CMD_SET_OFFLOAD_TASKS bits (cmd.parm2 (Tx) & cmd.parm3 (Rx)) * This is all for IPv4. 
*/
From dave@thedillows.org Thu Dec 30 00:46:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:50 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kMbp024557 for ; Thu, 30 Dec 2004 00:46:42 -0800 Received: (qmail 31795 invoked by uid 0); 30 Dec 2004 08:54:53 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp7.knology.net with SMTP; 30 Dec 2004 08:54:53 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mY1P009755; Thu, 30 Dec 2004 03:48:34 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mYWM009754; Thu, 30 Dec 2004 03:48:34 -0500 Date: Thu, 30 Dec 2004 03:48:34 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 0/22] Add hardware assist for IPSEC crypto Message-Id: <20041230035000.01@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13215 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev

The following patch set adds hardware offload of the crypto operations
for IPv4 IPSEC processing. It gives a noticeable speedup on my (admittedly
older) hardware, but given the recent numbers posted, can be a speedup even
for more recent hardware.

There are a few known issues with the current patchset, but I think it is
ready for wider review.

* Only the 3Com 3CR990 family of NICs is supported. I don't have
  hardware or documentation for the Intel cards.
* Doesn't do IPv6. Need someone to implement map_direction(), and
  AH/ESP handling, as well as come up with a card that supports it.
* The use of GFP_ATOMIC in xfrm_offload_alloc() is probably not a
  good idea.
* linux/skbuff.h cannot include net/xfrm.h currently, so there are
  redundant defines (requires some header cleanup, which I'm not very
  inclined to tackle at the moment.)
* TCP Segmentation offload seems broken by firmware 03.001.008. It could
  be my changes to support the offload, but that seems unlikely. I will
  have to investigate this.
* Latency suffers somewhat on smaller packets; it may be advisable to
  have a minimum packet size to offload.
* No real feedback on which xfrm_states have been offloaded or not.

The patch set will be sent as follow-ups to this post, or is available via:

	bk pull http://typhoon.bkbits.net/ipsec-2.6

It will update the following files:

 Documentation/networking/netdevices.txt |   16
 drivers/net/typhoon.c                   |  687 +++++++++++++++++++++++++++++++-
 drivers/net/typhoon.h                   |   38 +
 include/linux/ethtool.h                 |    8
 include/linux/netdevice.h               |   11
 include/linux/skbuff.h                  |   55 ++
 include/net/dst.h                       |    1
 include/net/xfrm.h                      |  108 +++++
 net/core/ethtool.c                      |   54 ++
 net/core/skbuff.c                       |   31 +
 net/ipv4/ah4.c                          |   99 ++--
 net/ipv4/esp4.c                         |  102 +++-
 net/ipv4/xfrm4_state.c                  |   10
 net/ipv6/xfrm6_state.c                  |    9
 net/xfrm/xfrm_export.c                  |    4
 net/xfrm/xfrm_policy.c                  |   64 ++
 net/xfrm/xfrm_state.c                   |  101 ++++
 17 files changed, 1284 insertions(+), 114 deletions(-)

If you work from the mailed patches, you will want the netdev-2.6 updates
to the typhoon driver, as the 3CR990B series needs the newest firmware to
correctly offload IPSEC processing. That patch is available from

	http://www.thedillows.org/typhoon-netdev-2.6.patch.bz2

The following results were generated using a dual processor PIII 1GHz/512MB
with a 3CR990SVR97 (ori) and an Athlon 550 MHz/256MB with a 3CR990B (tank).
Latency testing was performed with lmbench's lat_tcp, and bandwidth testing
was performed with Andrew Morton's zcc/zcs/cyclesoak. I ran the tests
multiple times, and picked the median results to report. There was not much
deviation in the results (+/- 1.5 us, +/- 50 KBytes/s, +/- 1.5% CPU usage).

TCP Latency tests (1 byte msg)
Config                          Latency
No IPSEC                         196 us
AH/SHA1 (sw)                     256 us
AH/SHA1 (hw)                     317 us
ESP/3DES,SHA1 (sw)               333 us
ESP/3DES,SHA1 (hw)               347 us
ESP-AH/3DES,SHA1-SHA1 (sw)       387 us
ESP-AH/3DES,SHA1-SHA1 (hw)       467 us

TCP Latency tests (1024 byte msg)
Config                          Latency
No IPSEC                         625 us
AH/SHA1 (sw)                     771 us
AH/SHA1 (hw)                     858 us
ESP/3DES,SHA1 (sw)              1999 us
ESP/3DES,SHA1 (hw)               902 us
ESP-AH/3DES,SHA1-SHA1 (sw)      2140 us
ESP-AH/3DES,SHA1-SHA1 (hw)      1131 us

Bandwidth tests
Config (sender -> receiver)              Bandwidth     ori CPU  tank CPU
No IPSEC (tank->ori)                     11494 KB/s    11.9%    18.7%
No IPSEC (ori->tank)                     11492 KB/s     9.5%    34.3%
AH/SHA1 (sw) (tank->ori)                 11303 KB/s    29.2%    79.3%
AH/SHA1 (sw) (ori->tank)                 11302 KB/s    28.6%    91.1%
ESP/3DES,SHA1 (sw) (tank->ori)            2130 KB/s    29.6%    100%
ESP/3DES,SHA1 (sw) (ori->tank)            2263 KB/s    29.3%    99.7%
ESP-AH/3DES,SHA1-SHA1 (sw) (tank->ori)    1906 KB/s    29.1%    100%
ESP-AH/3DES,SHA1-SHA1 (sw) (ori->tank)    2051 KB/s    29.3%    99.7%
AH/SHA1 (hw) (tank->ori)                 11303 KB/s    14.0%    30.2%
AH/SHA1 (hw) (ori->tank)                 11301 KB/s    14.1%    39.8%
ESP/3DES,SHA1 (hw) (tank->ori)           11221 KB/s    15.4%    44.9%
ESP/3DES,SHA1 (hw) (ori->tank)           11220 KB/s    21.5%    48.1%
ESP-AH/3DES,SHA1-SHA1 (hw) (tank->ori)    5920 KB/s    10.8%    35.9%
ESP-AH/3DES,SHA1-SHA1 (hw) (ori->tank)    7189 KB/s    14.3%    35.4%

The last line seems suspicious, and should probably be retested.
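
For readers comparing the figures above, here is a minimal sketch of the
arithmetic behind the hardware/software comparison. It is not part of the
patch set; the helper names bw_speedup() and latency_delta_us() are made up
for illustration, and the inputs are simply the medians reported in the
tables.

```c
#include <assert.h>

/* Ratio of offloaded (hw) to software (sw) throughput, both in KB/s.
 * A value above 1.0 means the offloaded path moved more data.
 * Hypothetical helper, not from the patch set.
 */
static double bw_speedup(double hw_kbs, double sw_kbs)
{
	assert(sw_kbs > 0.0);
	return hw_kbs / sw_kbs;
}

/* Latency difference in microseconds between the offloaded and the
 * software path. Positive means offload is slower, negative means
 * offload is faster. Hypothetical helper, not from the patch set.
 */
static int latency_delta_us(int hw_us, int sw_us)
{
	return hw_us - sw_us;
}
```

With the ESP/3DES,SHA1 rows, latency_delta_us(347, 333) gives 14 us of added
latency for 1 byte messages, latency_delta_us(902, 1999) gives -1097 us for
1024 byte messages, and bw_speedup(11221, 2130) comes out a little under 5.3
for tank->ori. That matches the known issue noted earlier: offload costs a
little on tiny packets and pays off once there is real payload to encrypt.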
From dave@thedillows.org Thu Dec 30 00:46:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:55 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kP5F024577 for ; Thu, 30 Dec 2004 00:46:45 -0800 Received: (qmail 28275 invoked by uid 0); 30 Dec 2004 08:49:00 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp2.knology.net with SMTP; 30 Dec 2004 08:49:00 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mbN8009827; Thu, 30 Dec 2004 03:48:37 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mbGL009826; Thu, 30 Dec 2004 03:48:37 -0500 Date: Thu, 30 Dec 2004 03:48:37 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 18/22] typhoon: add validation of offloaded xfrm_states Message-Id: <20041230035000.27@ori.thedillows.org> References: <20041230035000.26@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13220 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 01:00:43-05:00 dave@thedillows.org # Add routines to validate that the xfrm_state passed to them is # one that we can offload to the 3XP. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 01:00:25-05:00 dave@thedillows.org +90 -0 # Add routines to validate that the xfrm_state passed to them is # one that we can offload to the 3XP. 
# # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:08:32 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:08:32 -05:00 @@ -2330,6 +2330,96 @@ return 0; } +#define UNSUPPORTED goto unsupported +#define REQUIRED(x) if(!(x)) goto unsupported + +static inline int +typhoon_validate_ealgo(struct typhoon *tp, struct xfrm_state *x) +{ + switch(x->props.ealgo) { + case SADB_EALG_NULL: + break; + case SADB_EALG_DESCBC: + REQUIRED(x->ealg); + REQUIRED(tp->capabilities & TYPHOON_CRYPTO_DES); + REQUIRED(x->ealg->alg_key_len == 64); + break; + case SADB_EALG_3DESCBC: + REQUIRED(x->ealg); + REQUIRED(tp->capabilities & TYPHOON_CRYPTO_3DES); + REQUIRED(x->ealg->alg_key_len == 128 || + x->ealg->alg_key_len == 192); + break; + default: + UNSUPPORTED; + } + + return 1; + +unsupported: + return 0; +} + +static inline int +typhoon_validate_aalgo(struct typhoon *tp, struct xfrm_state *x) +{ + switch(x->props.aalgo) { + case SADB_X_AALG_NULL: + break; + case SADB_AALG_MD5HMAC: + REQUIRED(x->aalg); + REQUIRED(x->aalg->alg_key_len == 128); + break; + case SADB_AALG_SHA1HMAC: + REQUIRED(x->aalg); + REQUIRED(x->aalg->alg_key_len == 160); + break; + default: + UNSUPPORTED; + } + + return 1; + +unsupported: + return 0; +} + +static inline int +typhoon_validate_xfrm(struct typhoon *tp, struct xfrm_state *x) +{ + u8 ealgo, aalgo, need_auth = 1; + + REQUIRED(x->props.family == AF_INET); + REQUIRED(x->dir == XFRM_STATE_DIR_OUT || x->dir == XFRM_STATE_DIR_IN); + REQUIRED(!x->encap); + + aalgo = x->props.aalgo; + ealgo = x->props.ealgo; + + switch(x->type->proto) { + case IPPROTO_ESP: + need_auth = 0; + REQUIRED(aalgo != SADB_X_AALG_NULL || ealgo != SADB_EALG_NULL); + REQUIRED(typhoon_validate_ealgo(tp, x)); + /* fall through to validate auth algorithm */ + case IPPROTO_AH: + REQUIRED(typhoon_validate_aalgo(tp, x)); + if(need_auth) + REQUIRED(aalgo != SADB_X_AALG_NULL); + break; + default: + UNSUPPORTED; + } + + return 
1; + +unsupported: + return 0; +} + +#undef REQUIRED +#undef UNSUPPORTED + static void typhoon_tx_timeout(struct net_device *dev) { From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kNWf024566 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 31805 invoked by uid 0); 30 Dec 2004 08:54:55 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp7.knology.net with SMTP; 30 Dec 2004 08:54:55 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8masC009795; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8manA009794; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 10/22] AH, ESP: Add offloading of outbound packets Message-Id: <20041230035000.19@ori.thedillows.org> References: <20041230035000.18@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13209 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:44:50-05:00 dave@thedillows.org # Add crypto processing for outbound AH and ESP xfrms (IPv4). # # Signed-off-by: David Dillow # # net/ipv4/esp4.c # 2004/12/30 00:44:32-05:00 dave@thedillows.org +35 -21 # Add crypto offload for outbound ESP (IPv4) xfrms. 
Note that we always # generate a random IV, as we are not guaranteed to have any state in # the software crypto engine (we may have always been offloaded), and # we cannot rely on secure IV generation by the NIC driver/hw. # # Signed-off-by: David Dillow # # net/ipv4/ah4.c # 2004/12/30 00:44:32-05:00 dave@thedillows.org +31 -21 # Add crypto offload for outbound AH (IPv4) xfrms. Note that the NIC # driver/hw is responsible for zeroing the mutable IP header fields. # # Signed-off-by: David Dillow # diff -Nru a/net/ipv4/ah4.c b/net/ipv4/ah4.c --- a/net/ipv4/ah4.c 2004-12-30 01:10:14 -05:00 +++ b/net/ipv4/ah4.c 2004-12-30 01:10:14 -05:00 @@ -83,31 +83,41 @@ ah->spi = x->id.spi; ah->seq_no = htonl(x->replay.oseq + 1); - iph->tos = top_iph->tos; - iph->ttl = top_iph->ttl; - iph->frag_off = top_iph->frag_off; - - if (top_iph->ihl != 5) { - iph->daddr = top_iph->daddr; - memcpy(iph+1, top_iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); - err = ip_clear_mutable_options(top_iph, &top_iph->daddr); - if (err) + if (dst->xfrm_offload) { + err = -ENOMEM; + xfrm_offload_hold(dst->xfrm_offload); + if (skb_push_xfrm_offload(skb, dst->xfrm_offload)) { + xfrm_offload_release(dst->xfrm_offload); goto error; - } + } + } else { + /* Not offloaded, manually calculate the auth hash */ + iph->tos = top_iph->tos; + iph->ttl = top_iph->ttl; + iph->frag_off = top_iph->frag_off; + + if (top_iph->ihl != 5) { + iph->daddr = top_iph->daddr; + memcpy(iph+1, top_iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); + err = ip_clear_mutable_options(top_iph, &top_iph->daddr); + if (err) + goto error; + } - top_iph->tos = 0; - top_iph->frag_off = 0; - top_iph->ttl = 0; - top_iph->check = 0; + top_iph->tos = 0; + top_iph->frag_off = 0; + top_iph->ttl = 0; + top_iph->check = 0; - ahp->icv(ahp, skb, ah->auth_data); + ahp->icv(ahp, skb, ah->auth_data); - top_iph->tos = iph->tos; - top_iph->ttl = iph->ttl; - top_iph->frag_off = iph->frag_off; - if (top_iph->ihl != 5) { - top_iph->daddr = iph->daddr; - 
memcpy(top_iph+1, iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); + top_iph->tos = iph->tos; + top_iph->ttl = iph->ttl; + top_iph->frag_off = iph->frag_off; + if (top_iph->ihl != 5) { + top_iph->daddr = iph->daddr; + memcpy(top_iph+1, iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); + } } /* Delay incrementing the replay sequence until we know we're going diff -Nru a/net/ipv4/esp4.c b/net/ipv4/esp4.c --- a/net/ipv4/esp4.c 2004-12-30 01:10:14 -05:00 +++ b/net/ipv4/esp4.c 2004-12-30 01:10:14 -05:00 @@ -98,33 +98,47 @@ esph->spi = x->id.spi; esph->seq_no = htonl(++x->replay.oseq); - if (esp->conf.ivlen) - crypto_cipher_set_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + if (dst->xfrm_offload) { + xfrm_offload_hold(dst->xfrm_offload); + if (skb_push_xfrm_offload(skb, dst->xfrm_offload)) { + xfrm_offload_release(dst->xfrm_offload); + goto error; + } + + if (esp->conf.ivlen) + get_random_bytes(esph->enc_data, esp->conf.ivlen); + } else { + if (esp->conf.ivlen) + crypto_cipher_set_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + + do { + struct scatterlist *sg = &esp->sgbuf[0]; - do { - struct scatterlist *sg = &esp->sgbuf[0]; + if (unlikely(nfrags > ESP_NUM_FAST_SG)) { + sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC); + if (!sg) + goto error; + } + skb_to_sgvec(skb, sg, esph->enc_data+esp->conf.ivlen-skb->data, clen); + crypto_cipher_encrypt(tfm, sg, sg, clen); + if (unlikely(sg != &esp->sgbuf[0])) + kfree(sg); + } while (0); - if (unlikely(nfrags > ESP_NUM_FAST_SG)) { - sg = kmalloc(sizeof(struct scatterlist)*nfrags, GFP_ATOMIC); - if (!sg) - goto error; + if (esp->conf.ivlen) { + memcpy(esph->enc_data, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + crypto_cipher_get_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); + } + + if (esp->auth.icv_full_len) { + esp->auth.icv(esp, skb, (u8*)esph-skb->data, + sizeof(struct ip_esp_hdr) + esp->conf.ivlen+clen, trailer->tail); } - skb_to_sgvec(skb, sg, esph->enc_data+esp->conf.ivlen-skb->data, clen); - 
crypto_cipher_encrypt(tfm, sg, sg, clen); - if (unlikely(sg != &esp->sgbuf[0])) - kfree(sg); - } while (0); - - if (esp->conf.ivlen) { - memcpy(esph->enc_data, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); - crypto_cipher_get_iv(tfm, esp->conf.ivec, crypto_tfm_alg_ivsize(tfm)); } - if (esp->auth.icv_full_len) { - esp->auth.icv(esp, skb, (u8*)esph-skb->data, - sizeof(struct ip_esp_hdr) + esp->conf.ivlen+clen, trailer->tail); + /* Need to account for auth data, offloading or not... */ + if (esp->auth.icv_full_len) pskb_put(skb, trailer, alen); - } ip_send_check(top_iph); From dave@thedillows.org Thu Dec 30 00:46:45 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kOXJ024572 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 31811 invoked by uid 0); 30 Dec 2004 08:54:55 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp7.knology.net with SMTP; 30 Dec 2004 08:54:55 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mah8009815; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8maDt009814; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 15/22] typhoon: add outbound offload processing Message-Id: <20041230035000.24@ori.thedillows.org> References: <20041230035000.23@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13205 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # 
This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:55:05-05:00 dave@thedillows.org # Add outbound xfrm crypto offload processing to the packet path. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 00:54:46-05:00 dave@thedillows.org +130 -0 # Add outbound xfrm crypto offload processing to the packet path. # # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:09:10 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:09:10 -05:00 @@ -352,6 +352,15 @@ #define TSO_OFFLOAD_ON 0 #endif +#define IPSEC_NUM_DESCRIPTORS 1 + +struct typhoon_xfrm_offload { + u16 sa_cookie; + u16 tunnel:1, + ah:1, + inbound:1; +}; + static inline void typhoon_inc_index(u32 *index, const int count, const int num_entries) { @@ -779,12 +788,115 @@ tcpd->status = 0; } +static inline int +typhoon_ipsec_fill(struct typhoon *tp, struct sk_buff *skb, + struct transmit_ring *txRing) +{ + struct xfrm_offload *xol; + struct typhoon_xfrm_offload *txo; + struct ipsec_desc *ipsec; + int last_was_esp = 0; + int i, entry; + u32 sa[3]; + + ipsec = (struct ipsec_desc *) (txRing->ringBase + txRing->lastWrite); + typhoon_inc_tx_index(&txRing->lastWrite, 1); + + ipsec->flags = TYPHOON_OPT_DESC | TYPHOON_OPT_IPSEC; + ipsec->numDesc = 1; + ipsec->ipsecFlags = TYPHOON_IPSEC_USE_IV; + ipsec->reserved = 0; + sa[0] = sa[1] = sa[2] = 0; + + /* Fill the offload descriptor with the cookies to indicate + * which key set to use when. While we're looping through the + * offloaded xfrms, if the last xfrm was ESP, and we're doing + * AH now, then we can move the ESP part to the top of the + * descriptor. Otherwise, we'll need to move to the next one. + * We overrun into sa[2] to prevent needing to check the entry + * limit in the middle of things.
+ */ + entry = i = 0; + xol = skb_get_xfrm_offload(skb, i++); + while(xol && entry < 2) { + xfrm_offload_hold(xol); + txo = xfrm_offload_priv(xol); + if(sa[entry] && txo->tunnel) + entry++; + if(sa[entry] & 0xffff) { + if(last_was_esp && txo->ah) + sa[entry] <<= 16; + else + entry++; + } + + sa[entry] |= txo->sa_cookie; + last_was_esp = !txo->ah; + + xol = skb_get_xfrm_offload(skb, i++); + } + + /* Make sure we used all of the xfrms that were offloaded. + */ + if(unlikely(entry == 2 && xol)) { + if(net_ratelimit()) + printk(KERN_ERR "%s: failing to offload IPSEC packet " + "with too many xfrms!\n", tp->name); + goto bad_packet; + } + + ipsec->sa[0] = cpu_to_le16(sa[0] & 0xffff); + ipsec->sa[1] = cpu_to_le16(sa[0] >> 16); + ipsec->sa[2] = cpu_to_le16(sa[1] & 0xffff); + ipsec->sa[3] = cpu_to_le16(sa[1] >> 16); + + /* The current 3XP firmware seems to hang if we try to feed it + * the same (non-zero) SA twice on the same packet. So, detect + * and drop those packets as it is likely a stack bug, or + * misconfiguration of policy. + * + * I.e., we should never hit this. + */ + if(unlikely(ipsec->sa[2])) { + if(unlikely(ipsec->sa[2] == ipsec->sa[3])) + goto avoiding_sa_hang; + if(unlikely(ipsec->sa[2] == ipsec->sa[0] || + ipsec->sa[2] == ipsec->sa[1])) + goto avoiding_sa_hang; + if(unlikely(ipsec->sa[3] && (ipsec->sa[3] == ipsec->sa[0] || + ipsec->sa[3] == ipsec->sa[1]))) + goto avoiding_sa_hang; + } + + if(unlikely(ipsec->sa[1] && ipsec->sa[0] == ipsec->sa[1])) + goto avoiding_sa_hang; + + return 0; + +avoiding_sa_hang: + if(net_ratelimit()) + printk(KERN_ERR "%s: failing attempted IPSEC offload with " + "duplicate SAs %08x %08x\n", tp->name, + sa[0], sa[1]); + +bad_packet: + /* Any xfrm_offloads we've attached to this skb will be + * released for us when typhoon_start_tx() calls dev_kfree_skb_any() + * on it. + * + * Return an error to indicate this packet cannot be offloaded as + * specified and should never make it to the wire. 
+ */ + return -EINVAL; +} + static int typhoon_start_tx(struct sk_buff *skb, struct net_device *dev) { struct typhoon *tp = netdev_priv(dev); struct transmit_ring *txRing; struct tx_desc *txd, *first_txd; + u32 origLastWrite; dma_addr_t skb_dma; int numDesc; @@ -811,6 +923,9 @@ if(skb_tso_size(skb)) numDesc++; + if(skb_has_xfrm_offload(skb)) + numDesc++; + /* When checking for free space in the ring, we need to also * account for the initial Tx descriptor, and we always must leave * at least one descriptor unused in the ring so that it doesn't @@ -823,6 +938,7 @@ while(unlikely(typhoon_num_free_tx(txRing) < (numDesc + 2))) smp_rmb(); + origLastWrite = txRing->lastWrite; first_txd = (struct tx_desc *) (txRing->ringBase + txRing->lastWrite); typhoon_inc_tx_index(&txRing->lastWrite, 1); @@ -855,6 +971,14 @@ typhoon_tso_fill(skb, txRing, tp->txlo_dma_addr); } + if(skb_has_xfrm_offload(skb)) { + first_txd->processFlags |= TYPHOON_TX_PF_IPSEC; + first_txd->numDesc++; + + if(typhoon_ipsec_fill(tp, skb, txRing)) + goto error; + } + txd = (struct tx_desc *) (txRing->ringBase + txRing->lastWrite); typhoon_inc_tx_index(&txRing->lastWrite, 1); @@ -915,6 +1039,7 @@ * Tx header. 
*/ numDesc = MAX_SKB_FRAGS + TSO_NUM_DESCRIPTORS + 1; + numDesc += IPSEC_NUM_DESCRIPTORS; if(typhoon_num_free_tx(txRing) < (numDesc + 2)) { netif_stop_queue(dev); @@ -927,6 +1052,11 @@ netif_wake_queue(dev); } + return 0; + +error: + txRing->lastWrite = origLastWrite; + dev_kfree_skb_any(skb); return 0; } From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:50 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kO0V024570 for ; Thu, 30 Dec 2004 00:46:44 -0800 Received: (qmail 28269 invoked by uid 0); 30 Dec 2004 08:48:59 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp2.knology.net with SMTP; 30 Dec 2004 08:48:59 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mawd009807; Thu, 30 Dec 2004 03:48:36 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8ma02009806; Thu, 30 Dec 2004 03:48:36 -0500 Date: Thu, 30 Dec 2004 03:48:36 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 13/22] typhoon: Make the ipsec descriptor match actual usage Message-Id: <20041230035000.22@ori.thedillows.org> References: <20041230035000.21@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13214 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:53:15-05:00 dave@thedillows.org # Make the crypto structures better match actual usage. 
# # Signed-off-by: David Dillow # # drivers/net/typhoon.h # 2004/12/30 00:52:57-05:00 dave@thedillows.org +13 -11 # Make the crypto structures better match actual usage. # # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.h b/drivers/net/typhoon.h --- a/drivers/net/typhoon.h 2004-12-30 01:09:36 -05:00 +++ b/drivers/net/typhoon.h 2004-12-30 01:09:36 -05:00 @@ -210,7 +210,10 @@ * flags: descriptor type * numDesc: must be 1 * ipsecFlags: bit 0: 0 -- generate IV, 1 -- use supplied IV - * sa1, sa2: Security Association IDs for this packet + * sa[0]: the inner AH header offload cookie (or ESP if no AH) + * sa[1]: the inner ESP header offload cookie (or 0 if no AH) + * sa[2]: the outer AH header offload cookie (or ESP if no AH) + * sa[3]: the outer ESP header offload cookie (or 0 if no AH) * reserved: set to 0 */ struct ipsec_desc { @@ -219,8 +222,7 @@ u16 ipsecFlags; #define TYPHOON_IPSEC_GEN_IV __constant_cpu_to_le16(0x0000) #define TYPHOON_IPSEC_USE_IV __constant_cpu_to_le16(0x0001) - u32 sa1; - u32 sa2; + u16 sa[4]; u32 reserved; } __attribute__ ((packed)); @@ -268,14 +270,14 @@ #define TYPHOON_RX_FILTER_MASK __constant_cpu_to_le16(0x7fff) #define TYPHOON_RX_FILTERED __constant_cpu_to_le16(0x8000) u16 ipsecResults; -#define TYPHOON_RX_OUTER_AH_GOOD __constant_cpu_to_le16(0x0001) -#define TYPHOON_RX_OUTER_ESP_GOOD __constant_cpu_to_le16(0x0002) -#define TYPHOON_RX_INNER_AH_GOOD __constant_cpu_to_le16(0x0004) -#define TYPHOON_RX_INNER_ESP_GOOD __constant_cpu_to_le16(0x0008) -#define TYPHOON_RX_OUTER_AH_FAIL __constant_cpu_to_le16(0x0010) -#define TYPHOON_RX_OUTER_ESP_FAIL __constant_cpu_to_le16(0x0020) -#define TYPHOON_RX_INNER_AH_FAIL __constant_cpu_to_le16(0x0040) -#define TYPHOON_RX_INNER_ESP_FAIL __constant_cpu_to_le16(0x0080) +#define TYPHOON_RX_AH1_GOOD __constant_cpu_to_le16(0x0001) +#define TYPHOON_RX_ESP1_GOOD __constant_cpu_to_le16(0x0002) +#define TYPHOON_RX_AH2_GOOD __constant_cpu_to_le16(0x0004) +#define TYPHOON_RX_ESP2_GOOD 
__constant_cpu_to_le16(0x0008) +#define TYPHOON_RX_AH1_FAIL __constant_cpu_to_le16(0x0010) +#define TYPHOON_RX_ESP1_FAIL __constant_cpu_to_le16(0x0020) +#define TYPHOON_RX_AH2_FAIL __constant_cpu_to_le16(0x0040) +#define TYPHOON_RX_ESP2_FAIL __constant_cpu_to_le16(0x0080) #define TYPHOON_RX_UNKNOWN_SA __constant_cpu_to_le16(0x0100) #define TYPHOON_RX_ESP_FORMAT_ERR __constant_cpu_to_le16(0x0200) u32 vlanTag; From dave@thedillows.org Thu Dec 30 00:46:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:54 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kPYj024578 for ; Thu, 30 Dec 2004 00:46:45 -0800 Received: (qmail 19791 invoked by uid 0); 30 Dec 2004 08:53:07 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp8.knology.net with SMTP; 30 Dec 2004 08:53:07 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mb3J009839; Thu, 30 Dec 2004 03:48:37 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mbhi009838; Thu, 30 Dec 2004 03:48:37 -0500 Date: Thu, 30 Dec 2004 03:48:37 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 21/22] typhoon: add callbacks to support crypto offload Message-Id: <20041230035000.30@ori.thedillows.org> References: <20041230035000.29@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13219 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. 
# # ChangeSet # 2004/12/30 01:05:45-05:00 dave@thedillows.org # Export the xfrm offload callbacks, and let the world know we # support IPSEC offload. # # While we're at it, allow this to be controlled by ethtool. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 01:05:27-05:00 dave@thedillows.org +23 -4 # Export the xfrm offload callbacks, and let the world know we # support IPSEC offload. # # While we're at it, allow this to be controlled by ethtool. # # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:07:53 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:07:53 -05:00 @@ -33,9 +33,12 @@ *) Waiting for a command response takes 8ms due to non-preemptable polling. Only significant for getting stats and creating SAs, but an ugly wart never the less. + *) Inbound IPSEC packets of the form outer ESP transport, inner + ESP tunnel seem to fail the hash on the inner ESP + *) Inbound IPSEC packets of the form outer AH transport, inner + AH tunnel seem to fail the hash on the outer AH TODO: - *) Doesn't do IPSEC offloading. Yet. Keep yer pants on, it's coming.
*) Add more support for ethtool (especially for NIC stats) *) Allow disabling of RX checksum offloading *) Fix MAC changing to work while the interface is up @@ -100,8 +103,8 @@ #define PKT_BUF_SZ 1536 #define DRV_MODULE_NAME "typhoon" -#define DRV_MODULE_VERSION "1.5.6" -#define DRV_MODULE_RELDATE "04/12/17" +#define DRV_MODULE_VERSION "1.5.6-ipsec" +#define DRV_MODULE_RELDATE "04/12/19" #define PFX DRV_MODULE_NAME ": " #define ERR_PFX KERN_ERR PFX @@ -1406,6 +1409,8 @@ .get_tso = ethtool_op_get_tso, .set_tso = ethtool_op_set_tso, .get_ringparam = typhoon_get_ringparam, + .get_ipsec = ethtool_op_get_ipsec, + .set_ipsec = ethtool_op_set_ipsec, }; static int @@ -2253,6 +2258,9 @@ tp->card_state = Running; smp_wmb(); + if(dev->features & NETIF_F_IPSEC) + xfrm_accel_add(dev); + iowrite32(TYPHOON_INTR_ENABLE_ALL, ioaddr + TYPHOON_REG_INTR_ENABLE); iowrite32(TYPHOON_INTR_NONE, ioaddr + TYPHOON_REG_INTR_MASK); typhoon_post_pci_writes(ioaddr); @@ -2327,6 +2335,14 @@ typhoon_clean_tx(tp, &tp->txLoRing, &indexes->txLoCleared); } + if(tp->dev->features & NETIF_F_IPSEC) + xfrm_accel_flush(tp->dev); + + /* tp->card_state != Running, so nothing will change this out + * from under us. 
+ */ + tp->offload &= ~TYPHOON_OFFLOAD_IPSEC; + return 0; } @@ -3183,6 +3199,9 @@ dev->set_mac_address = typhoon_set_mac_address; dev->vlan_rx_register = typhoon_vlan_rx_register; dev->vlan_rx_kill_vid = typhoon_vlan_rx_kill_vid; + dev->xfrm_state_add = typhoon_xfrm_state_add; + dev->xfrm_state_del = typhoon_xfrm_state_del; + dev->xfrm_bundle_add = typhoon_xfrm_bundle_add; SET_ETHTOOL_OPS(dev, &typhoon_ethtool_ops); /* We can handle scatter gather, up to 16 entries, and @@ -3190,7 +3209,7 @@ */ dev->features |= NETIF_F_SG | NETIF_F_IP_CSUM; dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX; - dev->features |= NETIF_F_TSO; + dev->features |= NETIF_F_TSO | NETIF_F_IPSEC; if(register_netdev(dev) < 0) goto error_out_reset; From dave@thedillows.org Thu Dec 30 00:46:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kPQr024576 for ; Thu, 30 Dec 2004 00:46:45 -0800 Received: (qmail 4278 invoked by uid 0); 30 Dec 2004 08:49:00 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp6.knology.net with SMTP; 30 Dec 2004 08:49:00 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mbMe009835; Thu, 30 Dec 2004 03:48:37 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mbdF009834; Thu, 30 Dec 2004 03:48:37 -0500 Date: Thu, 30 Dec 2004 03:48:37 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 20/22] typhoon: add management of outbound bundles Message-Id: <20041230035000.29@ori.thedillows.org> References: <20041230035000.28@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13200 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 01:04:39-05:00 dave@thedillows.org # Add the offloading of outbound bundles. # # This is a tricky business, because there are restrictions on # the types and order of the xfrms we can offload. Some combinations # also yield incorrect results, so we have to reduce the amount of # offloading we do in those cases. # # Signed-off-by: David Dillow # # drivers/net/typhoon.c # 2004/12/30 01:04:20-05:00 dave@thedillows.org +134 -0 # Add the offloading of outbound bundles. # # This is a tricky business, because there are restrictions on # the types and order of the xfrms we can offload. Some combinations # also yield incorrect results, so we have to reduce the amount of # offloading we do in those cases. # # Signed-off-by: David Dillow # diff -Nru a/drivers/net/typhoon.c b/drivers/net/typhoon.c --- a/drivers/net/typhoon.c 2004-12-30 01:08:06 -05:00 +++ b/drivers/net/typhoon.c 2004-12-30 01:08:06 -05:00 @@ -2587,6 +2587,140 @@ spin_unlock_bh(&tp->offload_lock); } +static inline int +typhoon_max_offload(struct xfrm_bundle_list *xbl) +{ + /* Pre-scan the bundle to avoid offloading problematic sequences. + * Only reduces the offload level to keep as much advantage as + * possible. + * + * For 03.001.002 -- still problematic for 03.001.008, but need + * re-verify symptoms. + * + * inner AH tunnel, outer AH transport + * --> 3XP seems to put the inner hash at the wrong location + * inner AH tunnel, outer ESP tunnel + * --> 3XP corrupts outer hash, maybe wrong place? 
+ * inner AH transport, outer ESP tunnel + * --> 3XP seems to encrypt the wrong portion of the packet + * inner ESP transport, outer AH tunnel + * --> 3XP lockup, requires reset + */ + struct xfrm_bundle_list *bundle; + struct dst_entry *dst; + struct xfrm_state *x; + int last_was_ah = 0, last_was_tunnel = 0; + int max_level = 2; + int proto; + + list_for_each_entry_reverse(bundle, &xbl->node, node) { + dst = bundle->dst; + x = dst->xfrm; + + proto = x ? x->type->proto : IPPROTO_IP; + + if(proto == IPPROTO_AH && x->props.mode && + (last_was_ah ^ last_was_tunnel)) + goto problem_offload; + + if(proto == IPPROTO_AH && !x->props.mode && + (!last_was_ah && last_was_tunnel)) + goto problem_offload; + + if(proto == IPPROTO_ESP && last_was_ah && last_was_tunnel) + goto problem_offload; + + last_was_ah = (proto == IPPROTO_AH) ? 1 : 0; + last_was_tunnel = x ? x->props.mode : 0; + continue; + +problem_offload: + max_level--; + break; + } + + return max_level; +} + +static void +typhoon_xfrm_bundle_add(struct net_device *dev, struct xfrm_bundle_list *xbl) +{ + /* Walk from the outermost dst back up the chain, offloading + * until we hit something we cannot deal with. + */ + struct typhoon *tp = netdev_priv(dev); + struct xfrm_bundle_list *bundle; + struct dst_entry *dst; + struct xfrm_state *x; + struct xfrm_offload *xol; + struct typhoon_xfrm_offload *txo; + int proto; + int level = 0, max_level; + int last = -1; + + smp_rmb(); + if(tp->card_state != Running) + return; + + max_level = typhoon_max_offload(xbl); + + list_for_each_entry_reverse(bundle, &xbl->node, node) { + dst = bundle->dst; + x = dst->xfrm; + + /* Only support IPv4 */ + if(dst->ops->family != AF_INET) + goto cannot_offload; + + proto = x ? 
x->type->proto : IPPROTO_IP; + + switch(proto) { + case IPPROTO_IP: + case IPPROTO_IPIP: + if(last == IPPROTO_IP || last == IPPROTO_IPIP) + goto cannot_offload; + if(level) + level++; + last = proto; + continue; + case IPPROTO_ESP: + if(last != IPPROTO_AH) + level++; + break; + case IPPROTO_AH: + level++; + break; + default: + goto cannot_offload; + } + + last = proto; + if(level > max_level) + goto cannot_offload; + + if(dst->xfrm_offload) + continue; + + xol = xfrm_offload_get(x, dev); + if(!xol) { + xol = typhoon_offload_ipsec(tp, x); + if(xol) + xfrm_offload_hold(xol); + } + + if(!xol) + goto cannot_offload; + + dst->xfrm_offload = xol; + txo = xfrm_offload_priv(xol); + if(txo->tunnel) + last = IPPROTO_IPIP; + } + +cannot_offload: + return; +} + static void typhoon_tx_timeout(struct net_device *dev) { From dave@thedillows.org Thu Dec 30 00:46:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:47:49 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8kNBA024565 for ; Thu, 30 Dec 2004 00:46:43 -0800 Received: (qmail 17281 invoked by uid 0); 30 Dec 2004 08:47:22 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp5.knology.net with SMTP; 30 Dec 2004 08:47:22 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8mZko009787; Thu, 30 Dec 2004 03:48:35 -0500 Received: (from root@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8mZ6u009786; Thu, 30 Dec 2004 03:48:35 -0500 Date: Thu, 30 Dec 2004 03:48:35 -0500 To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, dave@thedillows.org From: David Dillow Subject: [RFC 2.6.10 8/22] skbuff: Add routines to manage applied offloads per skb Message-Id: <20041230035000.17@ori.thedillows.org> References: <20041230035000.16@ori.thedillows.org> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 
clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13198 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev # This is a BitKeeper generated diff -Nru style patch. # # ChangeSet # 2004/12/30 00:40:20-05:00 dave@thedillows.org # Add fields to sk_buff to track offloaded xfrm_states for this # packet. On Tx, these will be pointers to struct xfrm_offload. # On Rx, these will be a 4-bit bitfield indicating what operations # were performed, and the result of those operations. # # skb_push_xfrm_offload() records an offloaded xfrm on Tx. It will # return an error code if it is unable to record the offload. # skb_get_xfrm_offload() returns the xfrm_offload struct at the # given position on the stack. It will return NULL if there # are no more offloads available. # skb_has_xfrm_offload() returns true if the sk_buff has offload # information available. # skb_put_xfrm_result() records an offload result on Rx at the given # index. It will return an error code if it is unable to # record the result. # skb_pop_xfrm_result() pops the current offload result from the # stack. If there are no more results, it will return # XFRM_OFFLOAD_NONE. # # Signed-off-by: David Dillow # # net/core/skbuff.c # 2004/12/30 00:40:02-05:00 dave@thedillows.org +31 -0 # When an sk_buff is cloned, we must gain a reference to each # xfrm_offload that it references. # # When an sk_buff is freed, we must release our references # to the xfrm_offloads attached to it. # # Signed-off-by: David Dillow # # include/net/xfrm.h # 2004/12/30 00:40:02-05:00 dave@thedillows.org +9 -0 # Add the values for the result bitfield. # # Signed-off-by: David Dillow # # include/linux/skbuff.h # 2004/12/30 00:40:02-05:00 dave@thedillows.org +55 -0 # Add the fields and functions to track offloads and results, as # well as the current position in the stack.
# # Signed-off-by: David Dillow # diff -Nru a/include/linux/skbuff.h b/include/linux/skbuff.h --- a/include/linux/skbuff.h 2004-12-30 01:10:39 -05:00 +++ b/include/linux/skbuff.h 2004-12-30 01:10:39 -05:00 @@ -146,6 +146,14 @@ skb_frag_t frags[MAX_SKB_FRAGS]; }; +/* XXX UGH. We cannot include in this file without some + * header file surgery, so define our own max xfrm depth. This should + * be kept >= XFRM_MAX_DEPTH until we fix the includes, and it can + * go away. + */ +#define SKB_XFRM_MAX_DEPTH 4 +struct xfrm_offload; + /** * struct sk_buff - socket buffer * @next: Next buffer in list @@ -187,6 +195,8 @@ * @nf_bridge: Saved data about a bridged frame - see br_netfilter.c * @private: Data which is private to the HIPPI implementation * @tc_index: Traffic control index + * @xfrm_offload: Tx offload info, Rx offload results + * @xfrm_offload_idx: The number of cookies/results stored currently */ struct sk_buff { @@ -272,6 +282,8 @@ #endif + int xfrm_offload_idx; + struct xfrm_offload * xfrm_offload[SKB_XFRM_MAX_DEPTH]; /* These elements must be at the end, see alloc_skb() for details. 
*/ unsigned int truesize; @@ -1178,6 +1190,49 @@ #else /* CONFIG_NETFILTER */ static inline void nf_reset(struct sk_buff *skb) {} #endif /* CONFIG_NETFILTER */ + +static inline int skb_push_xfrm_offload(struct sk_buff *skb, + struct xfrm_offload *xol) +{ + if (likely(skb->xfrm_offload_idx < SKB_XFRM_MAX_DEPTH)) { + skb->xfrm_offload[skb->xfrm_offload_idx++] = xol; + return 0; + } + + return -ENOMEM; +} + +static inline struct xfrm_offload * +skb_get_xfrm_offload(const struct sk_buff *skb, int idx) +{ + if (likely(idx < skb->xfrm_offload_idx)) + return skb->xfrm_offload[idx]; + else + return NULL; +} + +static inline int skb_has_xfrm_offload(const struct sk_buff *skb) +{ + return !!skb_get_xfrm_offload(skb, 0); +} + +static inline int skb_put_xfrm_result(struct sk_buff *skb, int result, int idx) +{ + if (likely(idx < SKB_XFRM_MAX_DEPTH)) { + skb->xfrm_offload[idx] = (struct xfrm_offload *) result; + return 0; + } + return -ENOMEM; +} + +static inline int skb_pop_xfrm_result(struct sk_buff *skb) +{ + /* XXX XFRM_OFFLOAD_NONE == 0, but cannot include */ + int res = 0; + if (likely(skb->xfrm_offload_idx < SKB_XFRM_MAX_DEPTH)) + res = (int) skb->xfrm_offload[skb->xfrm_offload_idx++]; + return res; +} #endif /* __KERNEL__ */ #endif /* _LINUX_SKBUFF_H */ diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h --- a/include/net/xfrm.h 2004-12-30 01:10:39 -05:00 +++ b/include/net/xfrm.h 2004-12-30 01:10:39 -05:00 @@ -171,6 +171,15 @@ XFRM_STATE_DIR_OUT, }; +enum { + XFRM_OFFLOAD_NONE = 0, + XFRM_OFFLOAD_CONF = 1, + XFRM_OFFLOAD_AUTH = 2, + XFRM_OFFLOAD_AUTH_OK = 4, + XFRM_OFFLOAD_AUTH_FAIL = 8, + XFRM_OFFLOAD_FIELD = 0x0f +}; + struct xfrm_offload { struct list_head bydev; diff -Nru a/net/core/skbuff.c b/net/core/skbuff.c --- a/net/core/skbuff.c 2004-12-30 01:10:39 -05:00 +++ b/net/core/skbuff.c 2004-12-30 01:10:39 -05:00 @@ -230,6 +230,14 @@ dst_release(skb->dst); #ifdef CONFIG_XFRM + { + int i = 0; + struct xfrm_offload *xol; + while ((xol = skb_get_xfrm_offload(skb, i++)) != 
NULL) { + if ((unsigned long) xol > XFRM_OFFLOAD_FIELD) + xfrm_offload_release(xol); + } + } secpath_put(skb->sp); #endif if(skb->destructor) { @@ -334,6 +342,17 @@ #endif #endif + C(xfrm_offload_idx); + memcpy(n->xfrm_offload, skb->xfrm_offload, + sizeof(struct xfrm_offload *) * SKB_XFRM_MAX_DEPTH); + { + int i = 0; + struct xfrm_offload *xol; + while ((xol = skb_get_xfrm_offload(skb, i++)) != NULL) { + if ((unsigned long) xol > XFRM_OFFLOAD_FIELD) + xfrm_offload_hold(xol); + } + } C(truesize); atomic_set(&n->users, 1); C(head); @@ -396,6 +415,18 @@ atomic_set(&new->users, 1); skb_shinfo(new)->tso_size = skb_shinfo(old)->tso_size; skb_shinfo(new)->tso_segs = skb_shinfo(old)->tso_segs; + + new->xfrm_offload_idx = old->xfrm_offload_idx; + memcpy(new->xfrm_offload, old->xfrm_offload, + sizeof(struct xfrm_offload *) * SKB_XFRM_MAX_DEPTH); + { + int i = 0; + struct xfrm_offload *xol; + while ((xol = skb_get_xfrm_offload(old, i++)) != NULL) { + if ((unsigned long) xol > XFRM_OFFLOAD_FIELD) + xfrm_offload_hold(xol); + } + } } /** From dave@thedillows.org Thu Dec 30 00:53:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 00:53:13 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBU8qfsp001400 for ; Thu, 30 Dec 2004 00:53:03 -0800 Received: (qmail 17601 invoked by uid 0); 30 Dec 2004 08:53:42 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp5.knology.net with SMTP; 30 Dec 2004 08:53:42 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBU8sttu009908; Thu, 30 Dec 2004 03:54:55 -0500 Received: (from il1@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBU8stm0009907; Thu, 30 Dec 2004 03:54:55 -0500 X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f Subject: Ethtool offload patch From: David Dillow To: Netdev Cc: 
dave@thedillows.org Content-Type: multipart/mixed; boundary="=-NAzlHeonTRg6HEV7ZPqL" Date: Thu, 30 Dec 2004 03:54:55 -0500 Message-Id: <1104396895.5845.1.camel@ori.thedillows.org> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13221 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev --=-NAzlHeonTRg6HEV7ZPqL Content-Type: text/plain Content-Transfer-Encoding: 7bit The attached patch allows the ethtool userspace tool to query and control IPSEC crypto offload. -- David Dillow --=-NAzlHeonTRg6HEV7ZPqL Content-Disposition: attachment; filename=ethtool-offload.patch Content-Transfer-Encoding: base64 Content-Type: text/x-patch; name=ethtool-offload.patch; charset=UTF-8 ZGlmZiAtdXJOIGV0aHRvb2wtMi9ldGh0b29sLmMgZXRodG9vbC0yLWlwc2VjL2V0aHRvb2wuYw0K LS0tIGV0aHRvb2wtMi9ldGh0b29sLmMJMjAwNC0wNy0wMiAxMToyODo0OC4wMDAwMDAwMDAgLTA0 MDANCisrKyBldGh0b29sLTItaXBzZWMvZXRodG9vbC5jCTIwMDQtMTEtMjEgMDM6MTU6NDYuMDAw MDAwMDAwIC0wNTAwDQpAQCAtMTE4LDcgKzExOCw4IEBADQogICoJCVsgcnggb258b2ZmIF0gXA0K ICAqCQlbIHR4IG9ufG9mZiBdIFwNCiAgKgkJWyBzZyBvbnxvZmYgXSBcDQotICoJCVsgdHNvIG9u fG9mZiBdDQorICoJCVsgdHNvIG9ufG9mZiBdIFwNCisgKgkJWyBpcHNlYyBvbnxvZmYgXQ0KICAq CWV0aHRvb2wgLXIgREVWTkFNRQ0KICAqCWV0aHRvb2wgLXAgREVWTkFNRSBbICVkIF0NCiAgKgll dGh0b29sIC10IERFVk5BTUUgWyBvbmxpbmV8b2ZmbGluZSBdDQpAQCAtMTkwLDcgKzE5MSw4IEBA DQogCQkiCQlbIHJ4IG9ufG9mZiBdIFxcXG4iDQogCQkiCQlbIHR4IG9ufG9mZiBdIFxcXG4iDQog CQkiCQlbIHNnIG9ufG9mZiBdIFxcXG4iDQotCQkiCQlbIHRzbyBvbnxvZmYgXVxuIg0KKwkJIgkJ WyB0c28gb258b2ZmIF0gXFxcbiINCisJCSIJCVsgaXBzZWMgb258b2ZmIF1cbiINCiAJCSIJZXRo dG9vbCAtciBERVZOQU1FXG4iDQogCQkiCWV0aHRvb2wgLXAgREVWTkFNRSBbICUlZCBdXG4iDQog CQkiCWV0aHRvb2wgLXQgREVWTkFNRSBbb25saW5lfChvZmZsaW5lKV1cbiINCkBAIC0yMzYsNiAr 
MjM4LDcgQEANCiBzdGF0aWMgaW50IG9mZl9jc3VtX3R4X3dhbnRlZCA9IC0xOw0KIHN0YXRpYyBp bnQgb2ZmX3NnX3dhbnRlZCA9IC0xOw0KIHN0YXRpYyBpbnQgb2ZmX3Rzb193YW50ZWQgPSAtMTsN CitzdGF0aWMgaW50IG9mZl9pcHNlY193YW50ZWQgPSAtMTsNCiANCiBzdGF0aWMgc3RydWN0IGV0 aHRvb2xfcGF1c2VwYXJhbSBlcGF1c2U7DQogc3RhdGljIGludCBncGF1c2VfY2hhbmdlZCA9IDA7 DQpAQCAtMzM5LDYgKzM0Miw3IEBADQogCXsgInR4IiwgQ01ETF9CT09MLCAmb2ZmX2NzdW1fdHhf d2FudGVkLCBOVUxMIH0sDQogCXsgInNnIiwgQ01ETF9CT09MLCAmb2ZmX3NnX3dhbnRlZCwgTlVM TCB9LA0KIAl7ICJ0c28iLCBDTURMX0JPT0wsICZvZmZfdHNvX3dhbnRlZCwgTlVMTCB9LA0KKwl7 ICJpcHNlYyIsIENNRExfQk9PTCwgJm9mZl9pcHNlY193YW50ZWQsIE5VTEwgfSwNCiB9Ow0KIA0K IHN0YXRpYyBzdHJ1Y3QgY21kbGluZV9pbmZvIGNtZGxpbmVfcGF1c2VbXSA9IHsNCkBAIC0xMTc1 LDE3ICsxMTc5LDE5IEBADQogCXJldHVybiAwOw0KIH0NCiANCi1zdGF0aWMgaW50IGR1bXBfb2Zm bG9hZCAoaW50IHJ4LCBpbnQgdHgsIGludCBzZywgaW50IHRzbykNCitzdGF0aWMgaW50IGR1bXBf b2ZmbG9hZCAoaW50IHJ4LCBpbnQgdHgsIGludCBzZywgaW50IHRzbywgaW50IGlwc2VjKQ0KIHsN CiAJZnByaW50ZihzdGRvdXQsDQogCQkicngtY2hlY2tzdW1taW5nOiAlc1xuIg0KIAkJInR4LWNo ZWNrc3VtbWluZzogJXNcbiINCiAJCSJzY2F0dGVyLWdhdGhlcjogJXNcbiINCi0JCSJ0Y3Agc2Vn bWVudGF0aW9uIG9mZmxvYWQ6ICVzXG4iLA0KKwkJInRjcCBzZWdtZW50YXRpb24gb2ZmbG9hZDog JXNcbiINCisJCSJJUFNFQyBjcnlwdG8gb2ZmbG9hZDogJXNcbiIsDQogCQlyeCA/ICJvbiIgOiAi b2ZmIiwNCiAJCXR4ID8gIm9uIiA6ICJvZmYiLA0KIAkJc2cgPyAib24iIDogIm9mZiIsDQotCQl0 c28gPyAib24iIDogIm9mZiIpOw0KKwkJdHNvID8gIm9uIiA6ICJvZmYiLA0KKwkJaXBzZWMgPyAi b24iIDogIm9mZiIpOw0KIA0KIAlyZXR1cm4gMDsNCiB9DQpAQCAtMTQ0OSw3ICsxNDU1LDcgQEAN CiBzdGF0aWMgaW50IGRvX2dvZmZsb2FkKGludCBmZCwgc3RydWN0IGlmcmVxICppZnIpDQogew0K IAlzdHJ1Y3QgZXRodG9vbF92YWx1ZSBldmFsOw0KLQlpbnQgZXJyLCBhbGxmYWlsID0gMSwgcngg PSAwLCB0eCA9IDAsIHNnID0gMCwgdHNvID0gMDsNCisJaW50IGVyciwgYWxsZmFpbCA9IDEsIHJ4 ID0gMCwgdHggPSAwLCBzZyA9IDAsIHRzbyA9IDAsIGlwc2VjID0gMDsNCiANCiAJZnByaW50Zihz dGRvdXQsICJPZmZsb2FkIHBhcmFtZXRlcnMgZm9yICVzOlxuIiwgZGV2bmFtZSk7DQogDQpAQCAt MTQ5MywxMiArMTQ5OSwyMiBAQA0KIAkJYWxsZmFpbCA9IDA7DQogCX0NCiANCisJZXZhbC5jbWQg 
PSBFVEhUT09MX0dJUFNFQzsNCisJaWZyLT5pZnJfZGF0YSA9IChjYWRkcl90KSZldmFsOw0KKwll cnIgPSBpb2N0bChmZCwgU0lPQ0VUSFRPT0wsIGlmcik7DQorCWlmIChlcnIpDQorCQlwZXJyb3Io IkNhbm5vdCBnZXQgZGV2aWNlIElQU0VDIG9mZmxvYWQgc2V0dGluZ3MiKTsNCisJZWxzZSB7DQor CQlpcHNlYyA9IGV2YWwuZGF0YTsNCisJCWFsbGZhaWwgPSAwOw0KKwl9DQorDQogCWlmIChhbGxm YWlsKSB7DQogCQlmcHJpbnRmKHN0ZG91dCwgIm5vIG9mZmxvYWQgaW5mbyBhdmFpbGFibGVcbiIp Ow0KIAkJcmV0dXJuIDgzOw0KIAl9DQogDQotCXJldHVybiBkdW1wX29mZmxvYWQocngsIHR4LCBz ZywgdHNvKTsNCisJcmV0dXJuIGR1bXBfb2ZmbG9hZChyeCwgdHgsIHNnLCB0c28sIGlwc2VjKTsN CiB9DQogDQogc3RhdGljIGludCBkb19zb2ZmbG9hZChpbnQgZmQsIHN0cnVjdCBpZnJlcSAqaWZy KQ0KQEAgLTE1NTMsNiArMTU2OSwxOCBAQA0KIAkJCXJldHVybiA4ODsNCiAJCX0NCiAJfQ0KKw0K KwlpZiAob2ZmX2lwc2VjX3dhbnRlZCA+PSAwKSB7DQorCQljaGFuZ2VkID0gMTsNCisJCWV2YWwu Y21kID0gRVRIVE9PTF9TSVBTRUM7DQorCQlldmFsLmRhdGEgPSAob2ZmX2lwc2VjX3dhbnRlZCA9 PSAxKTsNCisJCWlmci0+aWZyX2RhdGEgPSAoY2FkZHJfdCkmZXZhbDsNCisJCWVyciA9IGlvY3Rs KGZkLCBTSU9DRVRIVE9PTCwgaWZyKTsNCisJCWlmIChlcnIpIHsNCisJCQlwZXJyb3IoIkNhbm5v dCBzZXQgZGV2aWNlIElQU0VFQyBvZmZsb2FkIHNldHRpbmdzIik7DQorCQkJcmV0dXJuIDg5Ow0K KwkJfQ0KKwl9DQogCWlmICghY2hhbmdlZCkgew0KIAkJZnByaW50ZihzdGRvdXQsICJubyBvZmZs b2FkIHNldHRpbmdzIGNoYW5nZWRcbiIpOw0KIAl9DQpkaWZmIC11ck4gZXRodG9vbC0yL2V0aHRv b2wtY29weS5oIGV0aHRvb2wtMi1pcHNlYy9ldGh0b29sLWNvcHkuaA0KLS0tIGV0aHRvb2wtMi9l dGh0b29sLWNvcHkuaAkyMDAzLTA3LTE5IDExOjE5OjUyLjAwMDAwMDAwMCAtMDQwMA0KKysrIGV0 aHRvb2wtMi1pcHNlYy9ldGh0b29sLWNvcHkuaAkyMDA0LTExLTIxIDAyOjQ2OjAzLjAwMDAwMDAw MCAtMDUwMA0KQEAgLTI4Myw2ICsyODMsOCBAQA0KICNkZWZpbmUgRVRIVE9PTF9HU1RBVFMJCTB4 MDAwMDAwMWQgLyogZ2V0IE5JQy1zcGVjaWZpYyBzdGF0aXN0aWNzICovDQogI2RlZmluZSBFVEhU T09MX0dUU08JCTB4MDAwMDAwMWUgLyogR2V0IFRTTyBlbmFibGUgKGV0aHRvb2xfdmFsdWUpICov DQogI2RlZmluZSBFVEhUT09MX1NUU08JCTB4MDAwMDAwMWYgLyogU2V0IFRTTyBlbmFibGUgKGV0 aHRvb2xfdmFsdWUpICovDQorI2RlZmluZSBFVEhUT09MX0dJUFNFQwkJMHgwMDAwMDAyMCAvKiBH ZXQgSVBTRUMgZW5hYmxlIChldGh0b29sX3ZhbHVlKSAqLw0KKyNkZWZpbmUgRVRIVE9PTF9TSVBT 
RUMJCTB4MDAwMDAwMjEgLyogU2V0IElQU0VDIGVuYWJsZSAoZXRodG9vbF92YWx1ZSkgKi8NCiAN CiAvKiBjb21wYXRpYmlsaXR5IHdpdGggb2xkZXIgY29kZSAqLw0KICNkZWZpbmUgU1BBUkNfRVRI X0dTRVQJCUVUSFRPT0xfR1NFVA0K --=-NAzlHeonTRg6HEV7ZPqL-- From jbglaw@lug-owl.de Thu Dec 30 01:46:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 01:46:28 -0800 (PST) Received: from lug-owl.de (lug-owl.de [195.71.106.12]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBU9jxCR008528 for ; Thu, 30 Dec 2004 01:46:20 -0800 Received: by lug-owl.de (Postfix, from userid 1001) id A0ACC1901D0; Thu, 30 Dec 2004 10:48:39 +0100 (CET) Date: Thu, 30 Dec 2004 10:48:39 +0100 From: Jan-Benedict Glaw To: David Dillow Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [RFC 2.6.10 1/22] xfrm: Add direction information to xfrm_state Message-ID: <20041230094839.GX2460@lug-owl.de> Mail-Followup-To: David Dillow , netdev@oss.sgi.com, linux-kernel@vger.kernel.org References: <20041230035000.01@ori.thedillows.org> <20041230035000.10@ori.thedillows.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="urKaCNvFwQ8jDQOg" Content-Disposition: inline In-Reply-To: <20041230035000.10@ori.thedillows.org> X-Operating-System: Linux mail 2.6.10-rc2-bk5lug-owl X-gpg-fingerprint: 250D 3BCF 7127 0D8C A444 A961 1DBD 5E75 8399 E1BB X-gpg-key: wwwkeys.de.pgp.net User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13222 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbglaw@lug-owl.de Precedence: bulk X-list: netdev --urKaCNvFwQ8jDQOg Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, 2004-12-30 03:48:34 -0500, David Dillow wrote in message <20041230035000.10@ori.thedillows.org>: > diff -Nru 
a/include/net/xfrm.h b/include/net/xfrm.h > --- a/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 > +++ b/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 > @@ -146,6 +146,9 @@ > /* Private data of this transformer, format is opaque, > * interpreted by xfrm_type methods. */ > void *data; > + > + /* Intended direction of this state, used for offloading */ > + int dir; > }; > =20 > enum { > @@ -157,6 +160,12 @@ > XFRM_STATE_DEAD > }; > =20 > +enum { > + XFRM_STATE_DIR_UNKNOWN, > + XFRM_STATE_DIR_IN, > + XFRM_STATE_DIR_OUT, > +}; Any specific reason to first define such a nice enum and then using int in the struct? MfG, JBG --=20 Jan-Benedict Glaw jbglaw@lug-owl.de . +49-172-7608481 = _ O _ "Eine Freie Meinung in einem Freien Kopf | Gegen Zensur | Gegen Krieg = _ _ O fuer einen Freien Staat voll Freier B=C3=BCrger" | im Internet! | im Ira= k! O O O ret =3D do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA)= ); --urKaCNvFwQ8jDQOg Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFB0873Hb1edYOZ4bsRAtCuAJ9evP8DkQ142XphAFaMDulpbbu15gCgieXS cBCw52xCYS6wmYrHlrMGijM= =44jp -----END PGP SIGNATURE----- --urKaCNvFwQ8jDQOg-- From y030729@njupt.edu.cn Thu Dec 30 02:54:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 02:54:39 -0800 (PST) Received: from njupt.edu.cn (em.njupt.edu.cn [202.119.230.11]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBUAs9gU013615 for ; Thu, 30 Dec 2004 02:54:31 -0800 Received: (eyou send program); Thu, 30 Dec 2004 19:49:56 +0800 Message-ID: <304407396.02136@njupt.edu.cn> Received: from 10.10.136.115 by em.njupt.edu.cn with HTTP; Thu, 30 Dec 2004 19:49:56 +0800 X-WebMAIL-MUA: [10.10.136.115] From: "Zhenyu Wu" To: netdev@oss.sgi.com Date: Thu, 30 Dec 2004 19:49:56 +0800 Reply-To: "Zhenyu Wu" X-Priority: 3 Subject: How can i join this mail list? 
Content-Type: text/plain X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13223 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: y030729@njupt.edu.cn Precedence: bulk X-list: netdev I want to know if I use the mail to join the mail list, how can I do? Just write "subscribe" in the body? From linux_lover2004@yahoo.com Thu Dec 30 03:50:46 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 03:50:52 -0800 (PST) Received: from web52202.mail.yahoo.com (web52202.mail.yahoo.com [206.190.39.84]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBUBoPX7017591 for ; Thu, 30 Dec 2004 03:50:45 -0800 Received: (qmail 6908 invoked by uid 60001); 30 Dec 2004 11:54:01 -0000 Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=06aFVnN/Cmo6S1e2S92Rs3ckK1TFKbpkWut3oTxyDKuPSPQmN174UqDqvgZNUB4BHrl0VZ29nf9s+dBEbl5iVKoPVOOlB0reOIvubM4Ku84GEtVQtLdVLhmquqoTn1K7SPnf/RTOcZ3oRwpjAgCSHJEEb+7HMvUE7TcPNTqnaj0= ; Message-ID: <20041230115401.6906.qmail@web52202.mail.yahoo.com> Received: from [202.56.231.117] by web52202.mail.yahoo.com via HTTP; Thu, 30 Dec 2004 03:54:01 PST Date: Thu, 30 Dec 2004 03:54:01 -0800 (PST) From: linux lover Subject: how to access packet's data part in skbuff? 
To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13224 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux_lover2004@yahoo.com Precedence: bulk X-list: netdev Hello all, I am writing a packet sniffer as a kernel module at the IP layer. As a first step I am trying to read each packet's length and its data part, copying the data into another variable so I can dump its contents, but I am having trouble accessing the packet's data. As far as I have studied, the data of a packet at any layer lies between the data and tail pointers. So if I want to print it, or copy it into an unsigned char string, how do I do that? I tried the following example, which receives only loopback packets and prints the data part at the IP layer. But it does not print anything. Also, why am I getting sb->len as 1 and not the actual size of the packet at the IP layer? 
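For reference, a minimal userspace sketch of dumping the bytes between data and tail by length rather than as a C string. Packet data is binary, so strcpy() and a %s format stop at the first zero byte, and in an IPv4 header the second byte (TOS) is often zero. struct fake_skb and hex_dump_range() are hypothetical names used only for this illustration and are not kernel APIs; in a real module the same loop would presumably run over the actual sk_buff and print with printk.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the two sk_buff pointers discussed
 * above; the real struct sk_buff is defined in <linux/skbuff.h>. */
struct fake_skb {
	const unsigned char *data;	/* first byte of the linear packet data */
	const unsigned char *tail;	/* one past the last linear byte */
};

/* Format the bytes in [data, tail) as "xx " groups into out, which holds
 * outlen bytes. The result is always NUL-terminated when outlen > 0.
 * Returns the number of packet bytes that were formatted. */
static size_t hex_dump_range(const struct fake_skb *skb, char *out, size_t outlen)
{
	size_t n = (size_t)(skb->tail - skb->data);
	size_t i;

	if (outlen == 0)
		return 0;
	out[0] = '\0';

	/* Each byte needs 3 characters plus room for the trailing NUL. */
	for (i = 0; i < n && i * 3 + 4 <= outlen; i++) {
		/* Zero bytes are printed like any other byte; they do not
		 * end the dump the way they would end a C string. */
		snprintf(out + i * 3, outlen - i * 3, "%02x ", skb->data[i]);
	}
	return i;
}
```

Assumption: only the linear part of the buffer is covered here. sb->len can also count data held outside the linear area, so skb_headlen(sb) is likely the safer bound for a loop over sb->data.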
regards, linux_lover #define MODULE #define __KERNEL__ #include #include #include #include #include #include #include #include static struct nf_hook_ops nfho; unsigned int cap_packet(unsigned int hooknum,struct sk_buff **skb,const struct net_device *in, const struct net_device *out,int (*okfn)(struct sk_buff *)) { struct sk_buff *sb = *skb; unsigned char *packet; int buflen=0,i=0; buflen=sb->len; packet=kmalloc(buflen,GFP_USER); memset(packet,'\0',buflen); printk(KERN_DEBUG "Length of sb->data in hook function = %d\n", buflen); while(buflen>=0) { packet[i]=sb->data[i]; i++; buflen--; } packet[i]='\0'; strcpy(packet,sb->data); printk(KERN_DEBUG "packet contents of sb->data in hook function = %s\n", packet); return NF_ACCEPT; } static int __init init(void) { nfho.hook = cap_packet; nfho.hooknum = NF_IP_LOCAL_OUT; nfho.pf = PF_INET; nfho.priority = NF_IP_PRI_FIRST; nf_register_hook(&nfho); return 0; } static void __exit fini(void) { nf_unregister_hook(&nfho); } module_init(init); module_exit(fini); MODULE_LICENSE("GPL"); __________________________________ Do you Yahoo!? The all-new My Yahoo! - What will yours do? http://my.yahoo.com From tgraf@suug.ch Thu Dec 30 04:22:53 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:23:05 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCMW5X023103 for ; Thu, 30 Dec 2004 04:22:53 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id DAC768A; Thu, 30 Dec 2004 13:26:09 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id E2FB41C0EA; Thu, 30 Dec 2004 13:26:52 +0100 (CET) Date: Thu, 30 Dec 2004 13:26:52 +0100 From: Thomas Graf To: "David S. 
Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 0/9] PKT_SCHED: tcf_exts API & make classifier changes consistent upon failure Message-ID: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13225 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Dave, The following patchset adds the tcf_exts API to remove the ifdef clutter, helps make changes consistent upon failures, makes adding action bits to not yet supported classifiers as easy as adding a few lines, and generally saves a lot of code per classifier. The patches that make use of the API add action support to all classifiers and make changes consistent, except for indev failures (indev is to be removed soon) and rsvp, which needs a closer look since I haven't had time to fully understand it yet. I gave it extensive testing for 2 weeks; patches 5-6 are non-trivial though, and it's quite hard to test everything. I will post a request to the lartc mailing list once these changes are in a bk release so everyone can try it out and I have time to fix any remaining issues before the next rc release. I could provide the patchset, but it's quite some work for the testers and I guess more people will try it out if they can use an official release. 
Cheers, Thomas From tgraf@suug.ch Thu Dec 30 04:24:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:24:57 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCOTFq023483 for ; Thu, 30 Dec 2004 04:24:50 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4D4E2F; Thu, 30 Dec 2004 13:28:08 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 9FFE91C0EA; Thu, 30 Dec 2004 13:28:51 +0100 (CET) Date: Thu, 30 Dec 2004 13:28:51 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 1/9] PKT_SCHED: rtattr_parse shortcut for nested TLVs Message-ID: <20041230122851.GN32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13226 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Signed-off-by: Thomas Graf --- linux-2.6.10.orig/include/linux/rtnetlink.h 2004-12-25 23:21:18.000000000 +0100 +++ linux-2.6.10/include/linux/rtnetlink.h 2004-12-26 19:52:21.000000000 +0100 @@ -756,6 +756,9 @@ extern int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len); +#define rtattr_parse_nested(tb, max, rta) \ + rtattr_parse((tb), (max), RTA_DATA((rta)), RTA_PAYLOAD((rta))) + extern struct sock *rtnl; struct rtnetlink_link From tgraf@suug.ch Thu Dec 30 04:26:22 2004 Received: with ECARTIS (v1.0.0; list 
netdev); Thu, 30 Dec 2004 04:26:28 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCQ0bf024138 for ; Thu, 30 Dec 2004 04:26:21 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id D21C8F; Thu, 30 Dec 2004 13:29:39 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 3D02C1C0EA; Thu, 30 Dec 2004 13:30:23 +0100 (CET) Date: Thu, 30 Dec 2004 13:30:23 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041230123023.GO32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13227 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev The tcf_exts API abstracts extensions such as actions/policers into a generic layer and reduces the knowledge inside classifiers to the minimum required. It isolates the validation code into its own function to allow classifiers to validate all input data before making changes and thus avoids the need to undo changes if an extension configuration request cannot be fulfilled. As a nice side effect, using this API removes the existing ifdef clutter. Usage: The classifier holds struct tcf_exts which may be empty if no extensions are compiled in. 
It then calls tcf_exts_validate when a new change request is received and provides a temporary tcf_exts copy to store the change requests. If validation succeeds, the classifier may change its own parameters and at the end call tcf_exts_change to commit the changes and replace the existing extension configuration with the new one. The classifier is responsible for destroying its temporary copy if any of its own validation checks fail. The classifier specific TLV types must be exported to the extensions API via tcf_ext_map. Destroying the extensions is as easy as calling tcf_exts_destroy. The extensions are executed by the classifier by calling tcf_exts_exec, which must be done as the last thing after making sure the filter matches. Note: A classifier might take further actions after the execution of tcf_exts_exec, such as correcting its own cache to avoid caching results which could have been influenced by the extensions. tcf_exts_exec returns a negative error code if the filter must be considered unmatched, 0 on normal execution or a positive classifier return code (TC_ACT_*) which must be returned to the underlying layer as-is. Signed-off-by: Thomas Graf --- linux-2.6.10-bk2.orig/include/net/pkt_cls.h 2004-12-30 01:22:01.000000000 +0100 +++ linux-2.6.10-bk2/include/net/pkt_cls.h 2004-12-30 01:22:39.000000000 +0100 @@ -62,6 +62,93 @@ tp->q->ops->cl_ops->unbind_tcf(tp->q, cl); } +struct tcf_exts +{ +#ifdef CONFIG_NET_CLS_ACT + struct tc_action *action; +#elif defined CONFIG_NET_CLS_POLICE + struct tcf_police *police; +#endif +}; + +/* Map to export classifier specific extension TLV types to the + * generic extensions API. Unsupported extensions must be set to 0. + */ +struct tcf_ext_map +{ + int action; + int police; +}; + +/** + * tcf_exts_is_predicative - check if a predicative extension is present + * @exts: tc filter extensions handle + * + * Returns 1 if a predicative extension is present, i.e. 
an extension which + * might cause further actions and thus overrule the regular tcf_result. + */ +static inline int +tcf_exts_is_predicative(struct tcf_exts *exts) +{ +#ifdef CONFIG_NET_CLS_ACT + return !!exts->action; +#elif defined CONFIG_NET_CLS_POLICE + return !!exts->police; +#else + return 0; +#endif +} + +/** + * tcf_exts_is_available - check if at least one extension is present + * @exts: tc filter extensions handle + * + * Returns 1 if at least one extension is present. + */ +static inline int +tcf_exts_is_available(struct tcf_exts *exts) +{ + /* All non-predicative extensions must be added here. */ + return tcf_exts_is_predicative(exts); +} + +/** + * tcf_exts_exec - execute tc filter extensions + * @skb: socket buffer + * @exts: tc filter extensions handle + * @res: desired result + * + * Executes all configured extensions. Returns 0 on a normal execution, + * a negative number if the filter must be considered unmatched or + * a positive action code (TC_ACT_*) which must be returned to the + * underlying layer. 
+ */ +static inline int +tcf_exts_exec(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_result *res) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) + return tcf_action_exec(skb, exts->action, res); +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) + return tcf_police(skb, exts->police); +#endif + + return 0; +} + +extern int tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, + struct rtattr *rate_tlv, struct tcf_exts *exts, + struct tcf_ext_map *map); +extern void tcf_exts_destroy(struct tcf_proto *tp, struct tcf_exts *exts); +extern void tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, + struct tcf_exts *src); +extern int tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map); +extern int tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map); + #ifdef CONFIG_NET_CLS_ACT static inline int tcf_change_act_police(struct tcf_proto *tp, struct tc_action **action, --- linux-2.6.10-bk2.orig/net/sched/cls_api.c 2004-12-30 01:22:01.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_api.c 2004-12-30 00:47:06.000000000 +0100 @@ -439,6 +439,162 @@ return skb->len; } +void +tcf_exts_destroy(struct tcf_proto *tp, struct tcf_exts *exts) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) { + tcf_action_destroy(exts->action, TCA_ACT_UNBIND); + exts->action = NULL; + } +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) { + tcf_police_release(exts->police, TCA_ACT_UNBIND); + exts->police = NULL; + } +#endif +} + + +int +tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, + struct rtattr *rate_tlv, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ + memset(exts, 0, sizeof(*exts)); + +#ifdef CONFIG_NET_CLS_ACT + int err; + struct tc_action *act; + + if (map->police && tb[map->police-1] && rate_tlv) { + act = tcf_action_init_1(tb[map->police-1], rate_tlv, "police", + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); + if (NULL == act) + return err; + + act->type = 
TCA_OLD_COMPAT; + exts->action = act; + } else if (map->action && tb[map->action-1] && rate_tlv) { + act = tcf_action_init(tb[map->action-1], rate_tlv, NULL, + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); + if (NULL == act) + return err; + + exts->action = act; + } +#elif defined CONFIG_NET_CLS_POLICE + if (map->police && tb[map->police-1] && rate_tlv) { + struct tcf_police *p; + + p = tcf_police_locate(tb[map->police-1], rate_tlv); + if (NULL == p) + return -EINVAL; + + exts->police = p; + } else if (map->action && tb[map->action-1]) + return -EOPNOTSUPP; +#else + if ((map->action && tb[map->action-1]) || + (map->police && tb[map->police-1])) + return -EOPNOTSUPP; +#endif + + return 0; +} + +void +tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, + struct tcf_exts *src) +{ +#ifdef CONFIG_NET_CLS_ACT + if (src->action) { + if (dst->action) { + struct tc_action *act; + + tcf_tree_lock(tp); + act = xchg(&dst->action, src->action); + tcf_tree_unlock(tp); + + tcf_action_destroy(act, TCA_ACT_UNBIND); + } else + dst->action = src->action; + } +#elif defined CONFIG_NET_CLS_POLICE + if (src->police) { + if (dst->police) { + struct tcf_police *p; + + tcf_tree_lock(tp); + p = xchg(&dst->police, src->police); + tcf_tree_unlock(tp); + + tcf_police_release(p, TCA_ACT_UNBIND); + } else + dst->police = src->police; + } +#endif +} + +int +tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ +#ifdef CONFIG_NET_CLS_ACT + if (map->action && exts->action) { + /* + * again for backward compatible mode - we want + * to work with both old and new modes of entering + * tc data even if iproute2 was newer - jhs + */ + struct rtattr * p_rta = (struct rtattr*) skb->tail; + + if (exts->action->type != TCA_OLD_COMPAT) { + RTA_PUT(skb, map->action, 0, NULL); + if (tcf_action_dump(skb, exts->action, 0, 0) < 0) + goto rtattr_failure; + p_rta->rta_len = skb->tail - (u8*)p_rta; + } else if (map->police) { + RTA_PUT(skb, map->police, 0, NULL); + if 
(tcf_action_dump_old(skb, exts->action, 0, 0) < 0) + goto rtattr_failure; + p_rta->rta_len = skb->tail - (u8*)p_rta; + } + } +#elif defined CONFIG_NET_CLS_POLICE + if (map->police && exts->police) { + struct rtattr * p_rta = (struct rtattr*) skb->tail; + + RTA_PUT(skb, map->police, 0, NULL); + + if (tcf_police_dump(skb, exts->police) < 0) + goto rtattr_failure; + + p_rta->rta_len = skb->tail - (u8*)p_rta; + } +#endif + return 0; +rtattr_failure: + return -1; +} + +int +tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) + if (tcf_action_copy_stats(skb, exts->action) < 0) + goto rtattr_failure; +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) + if (tcf_police_dump_stats(skb, exts->police) < 0) + goto rtattr_failure; +#endif + return 0; +rtattr_failure: + return -1; +} static int __init tc_filter_init(void) { @@ -461,3 +617,8 @@ EXPORT_SYMBOL(register_tcf_proto_ops); EXPORT_SYMBOL(unregister_tcf_proto_ops); +EXPORT_SYMBOL(tcf_exts_validate); +EXPORT_SYMBOL(tcf_exts_destroy); +EXPORT_SYMBOL(tcf_exts_change); +EXPORT_SYMBOL(tcf_exts_dump); +EXPORT_SYMBOL(tcf_exts_dump_stats); From tgraf@suug.ch Thu Dec 30 04:27:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:27:13 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCQjsO024333 for ; Thu, 30 Dec 2004 04:27:06 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id D4D94F; Thu, 30 Dec 2004 13:30:24 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 3FAFE1C0EA; Thu, 30 Dec 2004 13:31:08 +0100 (CET) Date: Thu, 30 Dec 2004 13:31:08 +0100 From: Thomas Graf To: "David S. 
Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 3/9] PKT_SCHED: u32: make use of tcf_exts API Message-ID: <20041230123108.GP32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13228 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Transforms u32 to use tcf_exts API. Makes the u32 changing procedure consistent upon failures except for indev failures but indev will be removed very soon. Signed-off-by: Thomas Graf --- linux-2.6.10-bk2.orig/net/sched/cls_u32.c 2004-12-29 20:16:24.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_u32.c 2004-12-29 20:19:49.000000000 +0100 @@ -61,9 +61,9 @@ struct tc_u32_mark { - __u32 val; - __u32 mask; - __u32 success; + u32 val; + u32 mask; + u32 success; }; struct tc_u_knode @@ -71,13 +71,7 @@ struct tc_u_knode *next; u32 handle; struct tc_u_hnode *ht_up; -#ifdef CONFIG_NET_CLS_ACT - struct tc_action *action; -#else -#ifdef CONFIG_NET_CLS_POLICE - struct tcf_police *police; -#endif -#endif + struct tcf_exts exts; #ifdef CONFIG_NET_CLS_IND char indev[IFNAMSIZ]; #endif @@ -112,6 +106,11 @@ u32 hgenerator; }; +static struct tcf_ext_map u32_ext_map = { + .action = TCA_U32_ACT, + .police = TCA_U32_POLICE +}; + static struct tc_u_common *u32_list; static __inline__ unsigned u32_hash_fold(u32 key, struct tc_u32_sel *sel, u8 fshift) @@ -137,7 +136,7 @@ #ifdef CONFIG_CLS_U32_PERF int j; #endif - int i; + int i, r; next_ht: n = ht->ht[sel]; @@ -185,22 +184,13 @@ #ifdef CONFIG_CLS_U32_PERF n->pf->rhit +=1; #endif -#ifdef CONFIG_NET_CLS_ACT - if (n->action) { - int pol_res = 
tcf_action_exec(skb, n->action, res); - if (pol_res >= 0) - return pol_res; - } else -#else -#ifdef CONFIG_NET_CLS_POLICE - if (n->police) { - int pol_res = tcf_police(skb, n->police); - if (pol_res >= 0) - return pol_res; - } else -#endif -#endif - return 0; + r = tcf_exts_exec(skb, &n->exts, res); + if (r < 0) { + n = n->next; + goto next_knode; + } + + return r; } n = n->next; goto next_knode; @@ -359,15 +349,7 @@ static int u32_destroy_key(struct tcf_proto *tp, struct tc_u_knode *n) { tcf_unbind_filter(tp, &n->res); -#ifdef CONFIG_NET_CLS_ACT - if (n->action) { - tcf_action_destroy(n->action, TCA_ACT_UNBIND); - } -#else -#ifdef CONFIG_NET_CLS_POLICE - tcf_police_release(n->police, TCA_ACT_UNBIND); -#endif -#endif + tcf_exts_destroy(tp, &n->exts); if (n->ht_down) n->ht_down->refcnt--; #ifdef CONFIG_CLS_U32_PERF @@ -509,18 +491,26 @@ struct tc_u_knode *n, struct rtattr **tb, struct rtattr *est) { + int err; + struct tcf_exts e; + + err = tcf_exts_validate(tp, tb, est, &e, &u32_ext_map); + if (err < 0) + return err; + + err = -EINVAL; if (tb[TCA_U32_LINK-1]) { u32 handle = *(u32*)RTA_DATA(tb[TCA_U32_LINK-1]); struct tc_u_hnode *ht_down = NULL; if (TC_U32_KEY(handle)) - return -EINVAL; + goto errout; if (handle) { ht_down = u32_lookup_ht(ht->tp_c, handle); if (ht_down == NULL) - return -EINVAL; + goto errout; ht_down->refcnt++; } @@ -535,36 +525,20 @@ n->res.classid = *(u32*)RTA_DATA(tb[TCA_U32_CLASSID-1]); tcf_bind_filter(tp, &n->res, base); } -#ifdef CONFIG_NET_CLS_ACT - if (tb[TCA_U32_POLICE-1]) { - int err = tcf_change_act_police(tp, &n->action, tb[TCA_U32_POLICE-1], est); - if (err < 0) - return err; - } - if (tb[TCA_U32_ACT-1]) { - int err = tcf_change_act(tp, &n->action, tb[TCA_U32_ACT-1], est); - if (err < 0) - return err; - } -#else -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_U32_POLICE-1]) { - int err = tcf_change_police(tp, &n->police, tb[TCA_U32_POLICE-1], est); - if (err < 0) - return err; - } -#endif -#endif #ifdef CONFIG_NET_CLS_IND if 
(tb[TCA_U32_INDEV-1]) { int err = tcf_change_indev(tp, n->indev, tb[TCA_U32_INDEV-1]); if (err < 0) - return err; + goto errout; } #endif + tcf_exts_change(tp, &n->exts, &e); return 0; +errout: + tcf_exts_destroy(tp, &e); + return err; } static int u32_change(struct tcf_proto *tp, unsigned long base, u32 handle, @@ -584,7 +558,7 @@ if (opt == NULL) return handle ? -EINVAL : 0; - if (rtattr_parse(tb, TCA_U32_MAX, RTA_DATA(opt), RTA_PAYLOAD(opt)) < 0) + if (rtattr_parse_nested(tb, TCA_U32_MAX, opt) < 0) return -EINVAL; if ((n = (struct tc_u_knode*)*arg) != NULL) { @@ -657,12 +631,12 @@ memset(n, 0, sizeof(*n) + s->nkeys*sizeof(struct tc_u32_key)); #ifdef CONFIG_CLS_U32_PERF - n->pf = kmalloc(sizeof(struct tc_u32_pcnt) + s->nkeys*sizeof(__u64), GFP_KERNEL); + n->pf = kmalloc(sizeof(struct tc_u32_pcnt) + s->nkeys*sizeof(u64), GFP_KERNEL); if (n->pf == NULL) { kfree(n); return -ENOBUFS; } - memset(n->pf, 0, sizeof(struct tc_u32_pcnt) + s->nkeys*sizeof(__u64)); + memset(n->pf, 0, sizeof(struct tc_u32_pcnt) + s->nkeys*sizeof(u64)); #endif memcpy(&n->sel, s, sizeof(*s) + s->nkeys*sizeof(struct tc_u32_key)); @@ -783,15 +757,8 @@ RTA_PUT(skb, TCA_U32_MARK, sizeof(n->mark), &n->mark); #endif -#ifdef CONFIG_NET_CLS_ACT - if (tcf_dump_act(skb, n->action, TCA_U32_ACT, TCA_U32_POLICE) < 0) - goto rtattr_failure; -#else -#ifdef CONFIG_NET_CLS_POLICE - if (tcf_dump_police(skb, n->police, TCA_U32_POLICE) < 0) + if (tcf_exts_dump(skb, &n->exts, &u32_ext_map) < 0) goto rtattr_failure; -#endif -#endif #ifdef CONFIG_NET_CLS_IND if(strlen(n->indev)) @@ -799,26 +766,15 @@ #endif #ifdef CONFIG_CLS_U32_PERF RTA_PUT(skb, TCA_U32_PCNT, - sizeof(struct tc_u32_pcnt) + n->sel.nkeys*sizeof(__u64), + sizeof(struct tc_u32_pcnt) + n->sel.nkeys*sizeof(u64), n->pf); #endif } rta->rta_len = skb->tail - b; -#ifdef CONFIG_NET_CLS_ACT - if (TC_U32_KEY(n->handle) != 0) { - if (TC_U32_KEY(n->handle) && n->action && n->action->type == TCA_OLD_COMPAT) { - if (tcf_action_copy_stats(skb,n->action)) - goto 
rtattr_failure; - } - } -#else -#ifdef CONFIG_NET_CLS_POLICE - if (TC_U32_KEY(n->handle) && n->police) - if (tcf_police_dump_stats(skb, n->police) < 0) + if (TC_U32_KEY(n->handle)) + if (tcf_exts_dump_stats(skb, &n->exts, &u32_ext_map) < 0) goto rtattr_failure; -#endif -#endif return skb->len; rtattr_failure: From tgraf@suug.ch Thu Dec 30 04:28:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:28:21 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCRroU025075 for ; Thu, 30 Dec 2004 04:28:14 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4F8B4F; Thu, 30 Dec 2004 13:31:33 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id AFE201C0EA; Thu, 30 Dec 2004 13:32:16 +0100 (CET) Date: Thu, 30 Dec 2004 13:32:16 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 4/9] PKT_SCHED: fw: make use of tcf_exts API Message-ID: <20041230123216.GQ32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13229 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Transforms fw to use tcf_exts API. Makes the fw changing procedure consistent upon failures except for indev failures but indev will be removed very soon. 
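Annotation: the "consistent upon failures" property described above comes from a stage-then-commit ordering: validate the new extensions into a temporary, check every other attribute, and only then swap the temporary into the live filter. The following is a minimal user-space sketch of that ordering. All `demo_*` names and types are hypothetical stand-ins for illustration; they are not the kernel's `tcf_exts` API, and the real functions take different arguments.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in; this is NOT the kernel's struct tcf_exts. */
struct demo_exts {
	char *action;		/* owned heap string, or NULL if unset */
};

struct demo_filter {
	unsigned int classid;
	struct demo_exts exts;
};

/* Stage: parse the request into a temporary without touching live state. */
static int demo_exts_validate(const char *action, struct demo_exts *e)
{
	e->action = NULL;
	if (action == NULL)
		return 0;		/* nothing requested, nothing staged */
	if (action[0] == '\0')
		return -EINVAL;		/* an empty action name is rejected */
	e->action = malloc(strlen(action) + 1);
	if (e->action == NULL)
		return -ENOMEM;
	strcpy(e->action, action);
	return 0;
}

/* Release whatever a staged or live extension set owns. */
static void demo_exts_destroy(struct demo_exts *e)
{
	free(e->action);
	e->action = NULL;
}

/* Commit: move the staged set into the live one; this step cannot fail. */
static void demo_exts_change(struct demo_exts *dst, struct demo_exts *src)
{
	struct demo_exts old = *dst;

	assert(dst != src);
	*dst = *src;
	src->action = NULL;		/* ownership moved to dst */
	demo_exts_destroy(&old);
}

/* Same shape as the change path in the patch: validate, check, commit. */
static int demo_change_attrs(struct demo_filter *f, const char *action,
			     long classid)
{
	struct demo_exts e;
	int err;

	err = demo_exts_validate(action, &e);
	if (err < 0)
		return err;

	err = -EINVAL;
	if (classid < 0)
		goto errout;		/* live filter is still untouched */

	f->classid = (unsigned int)classid;
	demo_exts_change(&f->exts, &e);
	return 0;

errout:
	demo_exts_destroy(&e);		/* drop the staged copy only */
	return err;
}
```

Because every check that can fail runs before the commit step, a caller that receives an error can rely on the filter being exactly as it was before the call.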
Signed-off-by: Thomas Graf --- linux-2.6.10-bk2.orig/net/sched/cls_fw.c 2004-12-29 20:16:24.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_fw.c 2004-12-29 20:20:26.000000000 +0100 @@ -59,13 +59,12 @@ #ifdef CONFIG_NET_CLS_IND char indev[IFNAMSIZ]; #endif /* CONFIG_NET_CLS_IND */ -#ifdef CONFIG_NET_CLS_ACT - struct tc_action *action; -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - struct tcf_police *police; -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ + struct tcf_exts exts; +}; + +static struct tcf_ext_map fw_ext_map = { + .action = TCA_FW_ACT, + .police = TCA_FW_POLICE }; static __inline__ int fw_hash(u32 handle) @@ -78,6 +77,7 @@ { struct fw_head *head = (struct fw_head*)tp->root; struct fw_filter *f; + int r; #ifdef CONFIG_NETFILTER u32 id = skb->nfmark; #else @@ -92,20 +92,11 @@ if (!tcf_match_indev(skb, f->indev)) continue; #endif /* CONFIG_NET_CLS_IND */ -#ifdef CONFIG_NET_CLS_ACT - if (f->action) { - int act_res = tcf_action_exec(skb, f->action, res); - if (act_res >= 0) - return act_res; + r = tcf_exts_exec(skb, &f->exts, res); + if (r < 0) continue; - } -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - if (f->police) - return tcf_police(skb, f->police); -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ - return 0; + + return r; } } } else { @@ -144,6 +135,14 @@ return 0; } +static inline void +fw_delete_filter(struct tcf_proto *tp, struct fw_filter *f) +{ + tcf_unbind_filter(tp, &f->res); + tcf_exts_destroy(tp, &f->exts); + kfree(f); +} + static void fw_destroy(struct tcf_proto *tp) { struct fw_head *head = (struct fw_head*)xchg(&tp->root, NULL); @@ -156,18 +155,7 @@ for (h=0; h<256; h++) { while ((f=head->ht[h]) != NULL) { head->ht[h] = f->next; - tcf_unbind_filter(tp, &f->res); -#ifdef CONFIG_NET_CLS_ACT - if (f->action) - tcf_action_destroy(f->action, TCA_ACT_UNBIND); -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - if (f->police) - tcf_police_release(f->police, 
TCA_ACT_UNBIND); -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ - - kfree(f); + fw_delete_filter(tp, f); } } kfree(head); @@ -187,16 +175,7 @@ tcf_tree_lock(tp); *fp = f->next; tcf_tree_unlock(tp); - tcf_unbind_filter(tp, &f->res); -#ifdef CONFIG_NET_CLS_ACT - if (f->action) - tcf_action_destroy(f->action,TCA_ACT_UNBIND); -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - tcf_police_release(f->police,TCA_ACT_UNBIND); -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ - kfree(f); + fw_delete_filter(tp, f); return 0; } } @@ -208,8 +187,14 @@ fw_change_attrs(struct tcf_proto *tp, struct fw_filter *f, struct rtattr **tb, struct rtattr **tca, unsigned long base) { - int err = -EINVAL; + struct tcf_exts e; + int err; + err = tcf_exts_validate(tp, tb, tca[TCA_RATE-1], &e, &fw_ext_map); + if (err < 0) + return err; + + err = -EINVAL; if (tb[TCA_FW_CLASSID-1]) { if (RTA_PAYLOAD(tb[TCA_FW_CLASSID-1]) != sizeof(u32)) goto errout; @@ -225,33 +210,11 @@ } #endif /* CONFIG_NET_CLS_IND */ -#ifdef CONFIG_NET_CLS_ACT - if (tb[TCA_FW_POLICE-1]) { - err = tcf_change_act_police(tp, &f->action, tb[TCA_FW_POLICE-1], - tca[TCA_RATE-1]); - if (err < 0) - goto errout; - } - - if (tb[TCA_FW_ACT-1]) { - err = tcf_change_act(tp, &f->action, tb[TCA_FW_ACT-1], - tca[TCA_RATE-1]); - if (err < 0) - goto errout; - } -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_FW_POLICE-1]) { - err = tcf_change_police(tp, &f->police, tb[TCA_FW_POLICE-1], - tca[TCA_RATE-1]); - if (err < 0) - goto errout; - } -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ + tcf_exts_change(tp, &f->exts, &e); - err = 0; + return 0; errout: + tcf_exts_destroy(tp, &e); return err; } @@ -269,7 +232,7 @@ if (!opt) return handle ? 
-EINVAL : 0; - if (rtattr_parse(tb, TCA_FW_MAX, RTA_DATA(opt), RTA_PAYLOAD(opt)) < 0) + if (rtattr_parse_nested(tb, TCA_FW_MAX, opt) < 0) return -EINVAL; if (f != NULL) { @@ -357,15 +320,7 @@ t->tcm_handle = f->id; - if (!f->res.classid -#ifdef CONFIG_NET_CLS_ACT - && !f->action -#else -#ifdef CONFIG_NET_CLS_POLICE - && !f->police -#endif -#endif - ) + if (!f->res.classid && !tcf_exts_is_available(&f->exts)) return skb->len; rta = (struct rtattr*)b; @@ -377,29 +332,15 @@ if (strlen(f->indev)) RTA_PUT(skb, TCA_FW_INDEV, IFNAMSIZ, f->indev); #endif /* CONFIG_NET_CLS_IND */ -#ifdef CONFIG_NET_CLS_ACT - if (tcf_dump_act(skb, f->action, TCA_FW_ACT, TCA_FW_POLICE) < 0) - goto rtattr_failure; -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - if (tcf_dump_police(skb, f->police, TCA_FW_POLICE) < 0) + + if (tcf_exts_dump(skb, &f->exts, &fw_ext_map) < 0) goto rtattr_failure; -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ rta->rta_len = skb->tail - b; -#ifdef CONFIG_NET_CLS_ACT - if (f->action && f->action->type == TCA_OLD_COMPAT) { - if (tcf_action_copy_stats(skb,f->action)) - goto rtattr_failure; - } -#else /* CONFIG_NET_CLS_ACT */ -#ifdef CONFIG_NET_CLS_POLICE - if (f->police) - if (tcf_police_dump_stats(skb, f->police) < 0) - goto rtattr_failure; -#endif /* CONFIG_NET_CLS_POLICE */ -#endif /* CONFIG_NET_CLS_ACT */ + + if (tcf_exts_dump_stats(skb, &f->exts, &fw_ext_map) < 0) + goto rtattr_failure; + return skb->len; rtattr_failure: From tgraf@suug.ch Thu Dec 30 04:29:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:29:16 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCSm1b025458 for ; Thu, 30 Dec 2004 04:29:08 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 
98AFEF; Thu, 30 Dec 2004 13:32:28 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 021B01C0EA; Thu, 30 Dec 2004 13:33:11 +0100 (CET) Date: Thu, 30 Dec 2004 13:33:11 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 5/9] PKT_SCHED: route: allow changing parameters for existing filters and use tcf_exts API Message-ID: <20041230123311.GR32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13230 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Transforms route to use the tcf_exts API and thus adds support for actions. Replaces the existing change implementation with a new one that supports changing existing filters, so a classifier can be changed without letting a single packet pass by unclassified. Fixes various cases where an error was returned even though the filter had already been changed.
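Annotation: changing an existing filter means its handle may change, so the filter has to be placed again in a bucket chain that is kept in ascending handle order. The sketch below shows the pointer-to-pointer idiom for an ordered insert and for unlinking a node from a singly linked chain. The `demo_*` names are hypothetical and the code is a stand-alone illustration of the list idiom only; it does not reproduce the locking or the bucket lookup of `cls_route.c`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a filter that lives on a bucket chain. */
struct demo_node {
	unsigned int handle;
	struct demo_node *next;
};

/* Insert n so that the chain stays sorted by ascending handle. */
static void demo_insert_sorted(struct demo_node **head, struct demo_node *n)
{
	struct demo_node **fp, *cur;

	assert(head != NULL && n != NULL);

	/* fp always points at the link that would have to point at n. */
	for (fp = head; (cur = *fp) != NULL; fp = &cur->next)
		if (n->handle < cur->handle)
			break;

	n->next = cur;		/* cur is NULL when n becomes the tail */
	*fp = n;
}

/* Unlink n from the chain; returns 1 if it was found, 0 otherwise. */
static int demo_unlink(struct demo_node **head, struct demo_node *n)
{
	struct demo_node **fp;

	assert(head != NULL && n != NULL);

	for (fp = head; *fp != NULL; fp = &(*fp)->next) {
		if (*fp == n) {
			*fp = n->next;
			n->next = NULL;
			return 1;
		}
	}
	return 0;
}
```

Walking with a pointer to the link rather than to the node removes the special case for the head of the chain, which is why the same loop shape serves both insertion and removal.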
Signed-off-by: Thomas Graf --- linux-2.6.10-bk1.orig/include/linux/pkt_cls.h 2004-12-27 21:34:52.000000000 +0100 +++ linux-2.6.10-bk1/include/linux/pkt_cls.h 2004-12-27 21:46:40.000000000 +0100 @@ -280,6 +280,7 @@ TCA_ROUTE4_FROM, TCA_ROUTE4_IIF, TCA_ROUTE4_POLICE, + TCA_ROUTE4_ACT, __TCA_ROUTE4_MAX }; --- linux-2.6.10-bk2.orig/net/sched/cls_route.c 2004-12-29 20:16:24.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_route.c 2004-12-29 20:21:15.000000000 +0100 @@ -59,6 +59,7 @@ struct route4_bucket { + /* 16 FROM buckets + 16 IIF buckets + 1 wildcard bucket */ struct route4_filter *ht[16+16+1]; }; @@ -69,22 +70,25 @@ int iif; struct tcf_result res; -#ifdef CONFIG_NET_CLS_POLICE - struct tcf_police *police; -#endif - + struct tcf_exts exts; u32 handle; struct route4_bucket *bkt; }; #define ROUTE4_FAILURE ((struct route4_filter*)(-1L)) +static struct tcf_ext_map route_ext_map = { + .police = TCA_ROUTE4_POLICE, + .action = TCA_ROUTE4_ACT +}; + static __inline__ int route4_fastmap_hash(u32 id, int iif) { return id&0xF; } -static void route4_reset_fastmap(struct net_device *dev, struct route4_head *head, u32 id) +static inline +void route4_reset_fastmap(struct net_device *dev, struct route4_head *head, u32 id) { spin_lock_bh(&dev->queue_lock); memset(head->fastmap, 0, sizeof(head->fastmap)); @@ -121,19 +125,20 @@ return 32; } -#ifdef CONFIG_NET_CLS_POLICE -#define IF_ROUTE_POLICE \ -if (f->police) { \ - int pol_res = tcf_police(skb, f->police); \ - if (pol_res >= 0) return pol_res; \ - dont_cache = 1; \ - continue; \ -} \ -if (!dont_cache) -#else -#define IF_ROUTE_POLICE -#endif - +#define ROUTE4_APPLY_RESULT() \ + do { \ + *res = f->res; \ + if (tcf_exts_is_available(&f->exts)) { \ + int r = tcf_exts_exec(skb, &f->exts, res); \ + if (r < 0) { \ + dont_cache = 1; \ + continue; \ + } \ + return r; \ + } else if (!dont_cache) \ + route4_set_fastmap(head, id, iif, f); \ + return 0; \ + } while(0) static int route4_classify(struct sk_buff *skb, struct tcf_proto *tp, 
struct tcf_result *res) @@ -142,11 +147,8 @@ struct dst_entry *dst; struct route4_bucket *b; struct route4_filter *f; -#ifdef CONFIG_NET_CLS_POLICE - int dont_cache = 0; -#endif u32 id, h; - int iif; + int iif, dont_cache = 0; if ((dst = skb->dst) == NULL) goto failure; @@ -172,29 +174,16 @@ restart: if ((b = head->table[h]) != NULL) { - f = b->ht[route4_hash_from(id)]; - - for ( ; f; f = f->next) { - if (f->id == id) { - *res = f->res; - IF_ROUTE_POLICE route4_set_fastmap(head, id, iif, f); - return 0; - } - } - - for (f = b->ht[route4_hash_iif(iif)]; f; f = f->next) { - if (f->iif == iif) { - *res = f->res; - IF_ROUTE_POLICE route4_set_fastmap(head, id, iif, f); - return 0; - } - } + for (f = b->ht[route4_hash_from(id)]; f; f = f->next) + if (f->id == id) + ROUTE4_APPLY_RESULT(); + + for (f = b->ht[route4_hash_iif(iif)]; f; f = f->next) + if (f->iif == iif) + ROUTE4_APPLY_RESULT(); - for (f = b->ht[route4_hash_wild()]; f; f = f->next) { - *res = f->res; - IF_ROUTE_POLICE route4_set_fastmap(head, id, iif, f); - return 0; - } + for (f = b->ht[route4_hash_wild()]; f; f = f->next) + ROUTE4_APPLY_RESULT(); } if (h < 256) { @@ -203,9 +192,7 @@ goto restart; } -#ifdef CONFIG_NET_CLS_POLICE if (!dont_cache) -#endif route4_set_fastmap(head, id, iif, ROUTE4_FAILURE); failure: return -1; @@ -220,7 +207,7 @@ return -1; } -static u32 to_hash(u32 id) +static inline u32 to_hash(u32 id) { u32 h = id&0xFF; if (id&0x8000) @@ -228,7 +215,7 @@ return h; } -static u32 from_hash(u32 id) +static inline u32 from_hash(u32 id) { id &= 0xFFFF; if (id == 0xFFFF) @@ -276,6 +263,14 @@ return 0; } +static inline void +route4_delete_filter(struct tcf_proto *tp, struct route4_filter *f) +{ + tcf_unbind_filter(tp, &f->res); + tcf_exts_destroy(tp, &f->exts); + kfree(f); +} + static void route4_destroy(struct tcf_proto *tp) { struct route4_head *head = xchg(&tp->root, NULL); @@ -293,11 +288,7 @@ while ((f = b->ht[h2]) != NULL) { b->ht[h2] = f->next; - tcf_unbind_filter(tp, &f->res); -#ifdef 
CONFIG_NET_CLS_POLICE - tcf_police_release(f->police,TCA_ACT_UNBIND); -#endif - kfree(f); + route4_delete_filter(tp, f); } } kfree(b); @@ -327,11 +318,7 @@ tcf_tree_unlock(tp); route4_reset_fastmap(tp->q->dev, head, f->id); - tcf_unbind_filter(tp, &f->res); -#ifdef CONFIG_NET_CLS_POLICE - tcf_police_release(f->police,TCA_ACT_UNBIND); -#endif - kfree(f); + route4_delete_filter(tp, f); /* Strip tree */ @@ -351,108 +338,63 @@ return 0; } -static int route4_change(struct tcf_proto *tp, unsigned long base, - u32 handle, - struct rtattr **tca, - unsigned long *arg) +static int route4_set_parms(struct tcf_proto *tp, unsigned long base, + struct route4_filter *f, u32 handle, struct route4_head *head, + struct rtattr **tb, struct rtattr *est, int new) { - struct route4_head *head = tp->root; - struct route4_filter *f, *f1, **ins_f; - struct route4_bucket *b; - struct rtattr *opt = tca[TCA_OPTIONS-1]; - struct rtattr *tb[TCA_ROUTE4_MAX]; - unsigned h1, h2; int err; + u32 id = 0, to = 0, nhandle = 0x8000; + struct route4_filter *fp; + unsigned int h1; + struct route4_bucket *b; + struct tcf_exts e; - if (opt == NULL) - return handle ? -EINVAL : 0; - - if (rtattr_parse(tb, TCA_ROUTE4_MAX, RTA_DATA(opt), RTA_PAYLOAD(opt)) < 0) - return -EINVAL; - - if ((f = (struct route4_filter*)*arg) != NULL) { - if (f->handle != handle && handle) - return -EINVAL; - if (tb[TCA_ROUTE4_CLASSID-1]) { - f->res.classid = *(u32*)RTA_DATA(tb[TCA_ROUTE4_CLASSID-1]); - tcf_bind_filter(tp, &f->res, base); - } -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_ROUTE4_POLICE-1]) { - err = tcf_change_police(tp, &f->police, - tb[TCA_ROUTE4_POLICE-1], tca[TCA_RATE-1]); - if (err < 0) - return err; - } -#endif - return 0; - } - - /* Now more serious part... 
*/ - - if (head == NULL) { - head = kmalloc(sizeof(struct route4_head), GFP_KERNEL); - if (head == NULL) - return -ENOBUFS; - memset(head, 0, sizeof(struct route4_head)); - - tcf_tree_lock(tp); - tp->root = head; - tcf_tree_unlock(tp); - } - - f = kmalloc(sizeof(struct route4_filter), GFP_KERNEL); - if (f == NULL) - return -ENOBUFS; - - memset(f, 0, sizeof(*f)); + err = tcf_exts_validate(tp, tb, est, &e, &route_ext_map); + if (err < 0) + return err; err = -EINVAL; - f->handle = 0x8000; + if (tb[TCA_ROUTE4_CLASSID-1]) + if (RTA_PAYLOAD(tb[TCA_ROUTE4_CLASSID-1]) < sizeof(u32)) + goto errout; + if (tb[TCA_ROUTE4_TO-1]) { - if (handle&0x8000) + if (new && handle & 0x8000) goto errout; - if (RTA_PAYLOAD(tb[TCA_ROUTE4_TO-1]) < 4) + if (RTA_PAYLOAD(tb[TCA_ROUTE4_TO-1]) < sizeof(u32)) goto errout; - f->id = *(u32*)RTA_DATA(tb[TCA_ROUTE4_TO-1]); - if (f->id > 0xFF) + to = *(u32*)RTA_DATA(tb[TCA_ROUTE4_TO-1]); + if (to > 0xFF) goto errout; - f->handle = f->id; + nhandle = to; } + if (tb[TCA_ROUTE4_FROM-1]) { - u32 sid; if (tb[TCA_ROUTE4_IIF-1]) goto errout; - if (RTA_PAYLOAD(tb[TCA_ROUTE4_FROM-1]) < 4) + if (RTA_PAYLOAD(tb[TCA_ROUTE4_FROM-1]) < sizeof(u32)) goto errout; - sid = (*(u32*)RTA_DATA(tb[TCA_ROUTE4_FROM-1])); - if (sid > 0xFF) + id = *(u32*)RTA_DATA(tb[TCA_ROUTE4_FROM-1]); + if (id > 0xFF) goto errout; - f->handle |= sid<<16; - f->id |= sid<<16; + nhandle |= id << 16; } else if (tb[TCA_ROUTE4_IIF-1]) { - if (RTA_PAYLOAD(tb[TCA_ROUTE4_IIF-1]) < 4) + if (RTA_PAYLOAD(tb[TCA_ROUTE4_IIF-1]) < sizeof(u32)) goto errout; - f->iif = *(u32*)RTA_DATA(tb[TCA_ROUTE4_IIF-1]); - if (f->iif > 0x7FFF) + id = *(u32*)RTA_DATA(tb[TCA_ROUTE4_IIF-1]); + if (id > 0x7FFF) goto errout; - f->handle |= (f->iif|0x8000)<<16; + nhandle = (id | 0x8000) << 16; } else - f->handle |= 0xFFFF<<16; + nhandle = 0xFFFF << 16; - if (handle) { - f->handle |= handle&0x7F00; - if (f->handle != handle) + if (handle && new) { + nhandle |= handle & 0x7F00; + if (nhandle != handle) goto errout; } - if 
(tb[TCA_ROUTE4_CLASSID-1]) { - if (RTA_PAYLOAD(tb[TCA_ROUTE4_CLASSID-1]) < 4) - goto errout; - f->res.classid = *(u32*)RTA_DATA(tb[TCA_ROUTE4_CLASSID-1]); - } - - h1 = to_hash(f->handle); + h1 = to_hash(nhandle); if ((b = head->table[h1]) == NULL) { err = -ENOBUFS; b = kmalloc(sizeof(struct route4_bucket), GFP_KERNEL); @@ -463,27 +405,119 @@ tcf_tree_lock(tp); head->table[h1] = b; tcf_tree_unlock(tp); + } else { + unsigned int h2 = from_hash(nhandle >> 16); + err = -EEXIST; + for (fp = b->ht[h2]; fp; fp = fp->next) + if (fp->handle == f->handle) + goto errout; } + + tcf_tree_lock(tp); + if (tb[TCA_ROUTE4_TO-1]) + f->id = to; + + if (tb[TCA_ROUTE4_FROM-1]) + f->id = to | id<<16; + else if (tb[TCA_ROUTE4_IIF-1]) + f->iif = id; + + f->handle = nhandle; f->bkt = b; + tcf_tree_unlock(tp); - err = -EEXIST; - h2 = from_hash(f->handle>>16); - for (ins_f = &b->ht[h2]; (f1=*ins_f) != NULL; ins_f = &f1->next) { - if (f->handle < f1->handle) - break; - if (f1->handle == f->handle) + if (tb[TCA_ROUTE4_CLASSID-1]) { + f->res.classid = *(u32*)RTA_DATA(tb[TCA_ROUTE4_CLASSID-1]); + tcf_bind_filter(tp, &f->res, base); + } + + tcf_exts_change(tp, &f->exts, &e); + + return 0; +errout: + tcf_exts_destroy(tp, &e); + return err; +} + +static int route4_change(struct tcf_proto *tp, unsigned long base, + u32 handle, + struct rtattr **tca, + unsigned long *arg) +{ + struct route4_head *head = tp->root; + struct route4_filter *f, *f1, **fp; + struct route4_bucket *b; + struct rtattr *opt = tca[TCA_OPTIONS-1]; + struct rtattr *tb[TCA_ROUTE4_MAX]; + unsigned int h, th; + u32 old_handle = 0; + int err; + + if (opt == NULL) + return handle ? 
-EINVAL : 0; + + if (rtattr_parse_nested(tb, TCA_ROUTE4_MAX, opt) < 0) + return -EINVAL; + + if ((f = (struct route4_filter*)*arg) != NULL) { + if (f->handle != handle && handle) + return -EINVAL; + + if (f->bkt) + old_handle = f->handle; + + err = route4_set_parms(tp, base, f, handle, head, tb, + tca[TCA_RATE-1], 0); + if (err < 0) + return err; + + goto reinsert; + } + + err = -ENOBUFS; + if (head == NULL) { + head = kmalloc(sizeof(struct route4_head), GFP_KERNEL); + if (head == NULL) goto errout; + memset(head, 0, sizeof(struct route4_head)); + + tcf_tree_lock(tp); + tp->root = head; + tcf_tree_unlock(tp); } - tcf_bind_filter(tp, &f->res, base); -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_ROUTE4_POLICE-1]) - tcf_change_police(tp, &f->police, tb[TCA_ROUTE4_POLICE-1], tca[TCA_RATE-1]); -#endif + f = kmalloc(sizeof(struct route4_filter), GFP_KERNEL); + if (f == NULL) + goto errout; + memset(f, 0, sizeof(*f)); + + err = route4_set_parms(tp, base, f, handle, head, tb, + tca[TCA_RATE-1], 1); + if (err < 0) + goto errout; + +reinsert: + h = from_hash(f->handle >> 16); + for (fp = &f->bkt->ht[h]; (f1=*fp) != NULL; fp = &f1->next) + if (f->handle < f1->handle) + break; f->next = f1; tcf_tree_lock(tp); - *ins_f = f; + *fp = f; + + if (old_handle && f->handle != old_handle) { + th = to_hash(old_handle); + h = from_hash(old_handle >> 16); + if ((b = head->table[th]) != NULL) { + for (fp = &b->ht[h]; *fp; fp = &(*fp)->next) { + if (*fp == f) { + *fp = f->next; + break; + } + } + } + } tcf_tree_unlock(tp); route4_reset_fastmap(tp->q->dev, head, f->id); @@ -559,17 +593,15 @@ } if (f->res.classid) RTA_PUT(skb, TCA_ROUTE4_CLASSID, 4, &f->res.classid); -#ifdef CONFIG_NET_CLS_POLICE - if (tcf_dump_police(skb, f->police, TCA_ROUTE4_POLICE) < 0) + + if (tcf_exts_dump(skb, &f->exts, &route_ext_map) < 0) goto rtattr_failure; -#endif rta->rta_len = skb->tail - b; -#ifdef CONFIG_NET_CLS_POLICE - if (f->police) - if (tcf_police_dump_stats(skb, f->police) < 0) - goto rtattr_failure; 
-#endif + + if (tcf_exts_dump_stats(skb, &f->exts, &route_ext_map) < 0) + goto rtattr_failure; + return skb->len; rtattr_failure: From tgraf@suug.ch Thu Dec 30 04:30:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:30:16 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCTlKo025981 for ; Thu, 30 Dec 2004 04:30:08 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 40674F; Thu, 30 Dec 2004 13:33:28 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 9D8C01C0EA; Thu, 30 Dec 2004 13:34:11 +0100 (CET) Date: Thu, 30 Dec 2004 13:34:11 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 6/9] PKT_SCHED: tcindex: allow changing parameters for existing filters and use tcf_exts API Message-ID: <20041230123411.GS32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13231 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Transforms tcindex to use the tcf_exts API and thus adds support for actions. Replaces the existing change implementation with a new one that supports changing existing filters, so a classifier can be changed without letting a single packet pass by unclassified. Fixes various cases where an error was returned even though the filter had already been changed.
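Annotation: part of what the new change path has to get right before committing anything is the hash geometry. The patch checks whether a table of `hash` entries can index every possible key `mask >> shift` directly (a perfect hash), and picks a default size when none was given. The sketch below restates that arithmetic in user space. The `demo_*` names are hypothetical, and the two constants are assumed values chosen for illustration; `cls_tcindex.c` defines its own `PERFECT_HASH_THRESHOLD` and `DEFAULT_HASH_SIZE`.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel source has its own. */
#define DEMO_PERFECT_HASH_THRESHOLD	64
#define DEMO_DEFAULT_HASH_SIZE		64

/* A table of `hash` slots is a perfect hash if every key gets its own slot. */
static int demo_valid_perfect_hash(unsigned int hash, uint16_t mask,
				   unsigned int shift)
{
	return hash > ((unsigned int)mask >> shift);
}

/* Pick a table size: keep an explicit request, otherwise derive one. */
static unsigned int demo_pick_hash_size(unsigned int requested, uint16_t mask,
					unsigned int shift)
{
	unsigned int max_key;

	assert(shift < 16);		/* keys are derived from a 16-bit index */
	max_key = (unsigned int)mask >> shift;

	if (requested)
		return requested;
	if (max_key < DEMO_PERFECT_HASH_THRESHOLD)
		return max_key + 1;	/* small key space: one slot per key */
	return DEMO_DEFAULT_HASH_SIZE;	/* large key space: chained buckets */
}
```

With a mask of 0x00f0 and a shift of 4, the largest key is 15, so the derived size is 16 and the table is a perfect hash; with a mask of 0xffff and no shift, the derived size falls back to the default and lookups go through chained buckets instead.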
Signed-off-by: Thomas Graf --- linux-2.6.10-bk1.orig/include/linux/pkt_cls.h 2004-12-27 22:28:09.000000000 +0100 +++ linux-2.6.10-bk1/include/linux/pkt_cls.h 2004-12-27 22:28:30.000000000 +0100 @@ -312,6 +312,7 @@ TCA_TCINDEX_FALL_THROUGH, TCA_TCINDEX_CLASSID, TCA_TCINDEX_POLICE, + TCA_TCINDEX_ACT, __TCA_TCINDEX_MAX }; --- linux-2.6.10-bk2.orig/net/sched/cls_tcindex.c 2004-12-29 20:41:32.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_tcindex.c 2004-12-29 22:52:46.000000000 +0100 @@ -49,12 +49,12 @@ struct tcindex_filter_result { - struct tcf_police *police; - struct tcf_result res; + struct tcf_exts exts; + struct tcf_result res; }; struct tcindex_filter { - __u16 key; + u16 key; struct tcindex_filter_result result; struct tcindex_filter *next; }; @@ -64,60 +64,64 @@ struct tcindex_filter_result *perfect; /* perfect hash; NULL if none */ struct tcindex_filter **h; /* imperfect hash; only used if !perfect; NULL if unused */ - __u16 mask; /* AND key with mask */ + u16 mask; /* AND key with mask */ int shift; /* shift ANDed key to the right */ int hash; /* hash table size; 0 if undefined */ int alloc_hash; /* allocated size */ int fall_through; /* 0: only classify if explicit match */ }; +static struct tcf_ext_map tcindex_ext_map = { + .police = TCA_TCINDEX_POLICE, + .action = TCA_TCINDEX_ACT +}; + +static inline int +tcindex_filter_is_set(struct tcindex_filter_result *r) +{ + return tcf_exts_is_predicative(&r->exts) || r->res.classid; +} -static struct tcindex_filter_result *lookup(struct tcindex_data *p,__u16 key) +static struct tcindex_filter_result * +tcindex_lookup(struct tcindex_data *p, u16 key) { struct tcindex_filter *f; if (p->perfect) - return p->perfect[key].res.class ? p->perfect+key : NULL; - if (!p->h) - return NULL; - for (f = p->h[key % p->hash]; f; f = f->next) { - if (f->key == key) - return &f->result; + return tcindex_filter_is_set(p->perfect + key) ? 
+ p->perfect + key : NULL; + else if (p->h) { + for (f = p->h[key % p->hash]; f; f = f->next) + if (f->key == key) + return &f->result; } + return NULL; } static int tcindex_classify(struct sk_buff *skb, struct tcf_proto *tp, - struct tcf_result *res) + struct tcf_result *res) { struct tcindex_data *p = PRIV(tp); struct tcindex_filter_result *f; + int key = (skb->tc_index & p->mask) >> p->shift; D2PRINTK("tcindex_classify(skb %p,tp %p,res %p),p %p\n",skb,tp,res,p); - f = lookup(p,(skb->tc_index & p->mask) >> p->shift); + f = tcindex_lookup(p, key); if (!f) { if (!p->fall_through) return -1; - res->classid = TC_H_MAKE(TC_H_MAJ(tp->q->handle), - (skb->tc_index& p->mask) >> p->shift); + res->classid = TC_H_MAKE(TC_H_MAJ(tp->q->handle), key); res->class = 0; D2PRINTK("alg 0x%x\n",res->classid); return 0; } *res = f->res; D2PRINTK("map 0x%x\n",res->classid); -#ifdef CONFIG_NET_CLS_POLICE - if (f->police) { - int result; - - result = tcf_police(skb,f->police); - D2PRINTK("police %d\n",res); - return result; - } -#endif - return 0; + + return tcf_exts_exec(skb, &f->exts, res); } @@ -129,8 +133,8 @@ DPRINTK("tcindex_get(tp %p,handle 0x%08x)\n",tp,handle); if (p->perfect && handle >= p->alloc_hash) return 0; - r = lookup(PRIV(tp),handle); - return r && r->res.class ? (unsigned long) r : 0; + r = tcindex_lookup(p, handle); + return r && tcindex_filter_is_set(r) ? 
(unsigned long) r : 0UL; } @@ -149,13 +153,12 @@ if (!p) return -ENOMEM; - tp->root = p; - p->perfect = NULL; - p->h = NULL; - p->hash = 0; + memset(p, 0, sizeof(*p)); p->mask = 0xffff; - p->shift = 0; + p->hash = DEFAULT_HASH_SIZE; p->fall_through = 1; + + tp->root = p; return 0; } @@ -190,9 +193,7 @@ tcf_tree_unlock(tp); } tcf_unbind_filter(tp, &r->res); -#ifdef CONFIG_NET_CLS_POLICE - tcf_police_release(r->police, TCA_ACT_UNBIND); -#endif + tcf_exts_destroy(tp, &r->exts); if (f) kfree(f); return 0; @@ -203,148 +204,184 @@ return __tcindex_delete(tp, arg, 1); } -/* - * There are no parameters for tcindex_init, so we overload tcindex_change - */ +static inline int +valid_perfect_hash(struct tcindex_data *p) +{ + return p->hash > (p->mask >> p->shift); +} + +static int +tcindex_set_parms(struct tcf_proto *tp, unsigned long base, u32 handle, + struct tcindex_data *p, struct tcindex_filter_result *r, + struct rtattr **tb, struct rtattr *est) +{ + int err, balloc = 0; + struct tcindex_filter_result new_filter_result, *old_r = r; + struct tcindex_filter_result cr; + struct tcindex_data cp; + struct tcindex_filter *f = NULL; /* make gcc behave */ + struct tcf_exts e; + + err = tcf_exts_validate(tp, tb, est, &e, &tcindex_ext_map); + if (err < 0) + return err; + + memcpy(&cp, p, sizeof(cp)); + memset(&new_filter_result, 0, sizeof(new_filter_result)); + + if (old_r) + memcpy(&cr, r, sizeof(cr)); + else + memset(&cr, 0, sizeof(cr)); + + err = -EINVAL; + if (tb[TCA_TCINDEX_HASH-1]) { + if (RTA_PAYLOAD(tb[TCA_TCINDEX_HASH-1]) < sizeof(u32)) + goto errout; + cp.hash = *(u32 *) RTA_DATA(tb[TCA_TCINDEX_HASH-1]); + } + + if (tb[TCA_TCINDEX_MASK-1]) { + if (RTA_PAYLOAD(tb[TCA_TCINDEX_MASK-1]) < sizeof(u16)) + goto errout; + cp.mask = *(u16 *) RTA_DATA(tb[TCA_TCINDEX_MASK-1]); + } + + if (tb[TCA_TCINDEX_SHIFT-1]) { + if (RTA_PAYLOAD(tb[TCA_TCINDEX_SHIFT-1]) < sizeof(u16)) + goto errout; + cp.shift = *(u16 *) RTA_DATA(tb[TCA_TCINDEX_SHIFT-1]); + } + + err = -EBUSY; + /* Hash already 
allocated, make sure that we still meet the + * requirements for the allocated hash. + */ + if (cp.perfect) { + if (!valid_perfect_hash(&cp) || + cp.hash > cp.alloc_hash) + goto errout; + } else if (cp.h && cp.hash != cp.alloc_hash) + goto errout; + + err = -EINVAL; + if (tb[TCA_TCINDEX_FALL_THROUGH-1]) { + if (RTA_PAYLOAD(tb[TCA_TCINDEX_FALL_THROUGH-1]) < sizeof(u32)) + goto errout; + cp.fall_through = + *(u32 *) RTA_DATA(tb[TCA_TCINDEX_FALL_THROUGH-1]); + } + + if (!cp.hash) { + /* Hash not specified, use perfect hash if the upper limit + * of the hashing index is below the threshold. + */ + if ((cp.mask >> cp.shift) < PERFECT_HASH_THRESHOLD) + cp.hash = (cp.mask >> cp.shift)+1; + else + cp.hash = DEFAULT_HASH_SIZE; + } + + if (!cp.perfect && !cp.h) + cp.alloc_hash = cp.hash; + + /* Note: this could be as restrictive as if (handle & ~(mask >> shift)) + * but then, we'd fail handles that may become valid after some future + * mask change. While this is extremely unlikely to ever matter, + * the check below is safer (and also more backwards-compatible). + */ + if (cp.perfect || valid_perfect_hash(&cp)) + if (handle >= cp.alloc_hash) + goto errout; + + + err = -ENOMEM; + if (!cp.perfect && !cp.h) { + if (valid_perfect_hash(&cp)) { + cp.perfect = kmalloc(cp.hash * sizeof(*r), GFP_KERNEL); + if (!cp.perfect) + goto errout; + memset(cp.perfect, 0, cp.hash * sizeof(*r)); + balloc = 1; + } else { + cp.h = kmalloc(cp.hash * sizeof(f), GFP_KERNEL); + if (!cp.h) + goto errout; + memset(cp.h, 0, cp.hash * sizeof(f)); + balloc = 2; + } + } + + if (cp.perfect) + r = cp.perfect + handle; + else + r = tcindex_lookup(&cp, handle) ? 
: &new_filter_result; + + if (r == &new_filter_result) { + f = kmalloc(sizeof(*f), GFP_KERNEL); + if (!f) + goto errout_alloc; + memset(f, 0, sizeof(*f)); + } + + if (tb[TCA_TCINDEX_CLASSID-1]) { + cr.res.classid = *(u32 *) RTA_DATA(tb[TCA_TCINDEX_CLASSID-1]); + tcf_bind_filter(tp, &cr.res, base); + } + + tcf_exts_change(tp, &cr.exts, &e); + + tcf_tree_lock(tp); + if (old_r && old_r != r) + memset(old_r, 0, sizeof(*old_r)); + + memcpy(p, &cp, sizeof(cp)); + memcpy(r, &cr, sizeof(cr)); + + if (r == &new_filter_result) { + struct tcindex_filter **fp; + + f->key = handle; + f->result = new_filter_result; + f->next = NULL; + for (fp = p->h+(handle % p->hash); *fp; fp = &(*fp)->next) + /* nothing */; + *fp = f; + } + tcf_tree_unlock(tp); + + return 0; +errout_alloc: + if (balloc == 1) + kfree(cp.perfect); + else if (balloc == 2) + kfree(cp.h); +errout: + tcf_exts_destroy(tp, &e); + return err; +} -static int tcindex_change(struct tcf_proto *tp,unsigned long base,u32 handle, - struct rtattr **tca,unsigned long *arg) -{ - struct tcindex_filter_result new_filter_result = { - NULL, /* no policing */ - { 0,0 }, /* no classification */ - }; +static int +tcindex_change(struct tcf_proto *tp, unsigned long base, u32 handle, + struct rtattr **tca, unsigned long *arg) +{ struct rtattr *opt = tca[TCA_OPTIONS-1]; struct rtattr *tb[TCA_TCINDEX_MAX]; struct tcindex_data *p = PRIV(tp); - struct tcindex_filter *f; struct tcindex_filter_result *r = (struct tcindex_filter_result *) *arg; - struct tcindex_filter **walk; - int hash,shift; - __u16 mask; DPRINTK("tcindex_change(tp %p,handle 0x%08x,tca %p,arg %p),opt %p," - "p %p,r %p\n",tp,handle,tca,arg,opt,p,r); - if (arg) - DPRINTK("*arg = 0x%lx\n",*arg); + "p %p,r %p,*arg 0x%lx\n", + tp, handle, tca, arg, opt, p, r, arg ? 
*arg : 0L); + if (!opt) return 0; - if (rtattr_parse(tb,TCA_TCINDEX_MAX,RTA_DATA(opt),RTA_PAYLOAD(opt)) < 0) - return -EINVAL; - if (!tb[TCA_TCINDEX_HASH-1]) { - hash = p->hash; - } else { - if (RTA_PAYLOAD(tb[TCA_TCINDEX_HASH-1]) < sizeof(int)) - return -EINVAL; - hash = *(int *) RTA_DATA(tb[TCA_TCINDEX_HASH-1]); - } - if (!tb[TCA_TCINDEX_MASK-1]) { - mask = p->mask; - } else { - if (RTA_PAYLOAD(tb[TCA_TCINDEX_MASK-1]) < sizeof(__u16)) - return -EINVAL; - mask = *(__u16 *) RTA_DATA(tb[TCA_TCINDEX_MASK-1]); - } - if (!tb[TCA_TCINDEX_SHIFT-1]) - shift = p->shift; - else { - if (RTA_PAYLOAD(tb[TCA_TCINDEX_SHIFT-1]) < sizeof(__u16)) - return -EINVAL; - shift = *(int *) RTA_DATA(tb[TCA_TCINDEX_SHIFT-1]); - } - if (p->perfect && hash <= (mask >> shift)) - return -EBUSY; - if (p->perfect && hash > p->alloc_hash) - return -EBUSY; - if (p->h && hash != p->alloc_hash) - return -EBUSY; - p->hash = hash; - p->mask = mask; - p->shift = shift; - if (tb[TCA_TCINDEX_FALL_THROUGH-1]) { - if (RTA_PAYLOAD(tb[TCA_TCINDEX_FALL_THROUGH-1]) < sizeof(int)) - return -EINVAL; - p->fall_through = - *(int *) RTA_DATA(tb[TCA_TCINDEX_FALL_THROUGH-1]); - } - DPRINTK("classid/police %p/%p\n",tb[TCA_TCINDEX_CLASSID-1], - tb[TCA_TCINDEX_POLICE-1]); - if (!tb[TCA_TCINDEX_CLASSID-1] && !tb[TCA_TCINDEX_POLICE-1]) - return 0; - if (!hash) { - if ((mask >> shift) < PERFECT_HASH_THRESHOLD) { - p->hash = (mask >> shift)+1; - } else { - p->hash = DEFAULT_HASH_SIZE; - } - } - if (!p->perfect && !p->h) { - p->alloc_hash = p->hash; - DPRINTK("hash %d mask %d\n",p->hash,p->mask); - if (p->hash > (mask >> shift)) { - p->perfect = kmalloc(p->hash* - sizeof(struct tcindex_filter_result),GFP_KERNEL); - if (!p->perfect) - return -ENOMEM; - memset(p->perfect, 0, - p->hash * sizeof(struct tcindex_filter_result)); - } else { - p->h = kmalloc(p->hash*sizeof(struct tcindex_filter *), - GFP_KERNEL); - if (!p->h) - return -ENOMEM; - memset(p->h, 0, p->hash*sizeof(struct tcindex_filter *)); - } - } - /* - * Note: this 
could be as restrictive as - * if (handle & ~(mask >> shift)) - * but then, we'd fail handles that may become valid after some - * future mask change. While this is extremely unlikely to ever - * matter, the check below is safer (and also more - * backwards-compatible). - */ - if (p->perfect && handle >= p->alloc_hash) + + if (rtattr_parse_nested(tb, TCA_TCINDEX_MAX, opt) < 0) return -EINVAL; - if (p->perfect) { - r = p->perfect+handle; - } else { - r = lookup(p,handle); - DPRINTK("r=%p\n",r); - if (!r) - r = &new_filter_result; - } - DPRINTK("r=%p\n",r); - if (tb[TCA_TCINDEX_CLASSID-1]) { - r->res.classid = *(__u32 *) RTA_DATA(tb[TCA_TCINDEX_CLASSID-1]); - tcf_bind_filter(tp, &r->res, base); - if (!r->res.class) { - r->res.classid = 0; - return -ENOENT; - } - } -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_TCINDEX_POLICE-1]) { - int err = tcf_change_police(tp, &r->police, tb[TCA_TCINDEX_POLICE-1], NULL); - if (err < 0) - return err; - } -#endif - if (r != &new_filter_result) - return 0; - f = kmalloc(sizeof(struct tcindex_filter),GFP_KERNEL); - if (!f) - return -ENOMEM; - f->key = handle; - f->result = new_filter_result; - f->next = NULL; - for (walk = p->h+(handle % p->hash); *walk; walk = &(*walk)->next) - /* nothing */; - wmb(); - *walk = f; - return 0; + return tcindex_set_parms(tp, base, handle, p, r, tb, tca[TCA_RATE-1]); } @@ -434,6 +471,7 @@ RTA_PUT(skb,TCA_TCINDEX_SHIFT,sizeof(p->shift),&p->shift); RTA_PUT(skb,TCA_TCINDEX_FALL_THROUGH,sizeof(p->fall_through), &p->fall_through); + rta->rta_len = skb->tail-b; } else { if (p->perfect) { t->tcm_handle = r-p->perfect; @@ -453,12 +491,15 @@ DPRINTK("handle = %d\n",t->tcm_handle); if (r->res.class) RTA_PUT(skb, TCA_TCINDEX_CLASSID, 4, &r->res.classid); -#ifdef CONFIG_NET_CLS_POLICE - if (tcf_dump_police(skb, r->police, TCA_TCINDEX_POLICE) < 0) + + if (tcf_exts_dump(skb, &r->exts, &tcindex_ext_map) < 0) + goto rtattr_failure; + rta->rta_len = skb->tail-b; + + if (tcf_exts_dump_stats(skb, &r->exts, &tcindex_ext_map) 
< 0) goto rtattr_failure; -#endif } - rta->rta_len = skb->tail-b; + return skb->len; rtattr_failure: From tgraf@suug.ch Thu Dec 30 04:30:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:31:03 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCUYpv026256 for ; Thu, 30 Dec 2004 04:30:55 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id A008CF; Thu, 30 Dec 2004 13:34:15 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 094531C0EA; Thu, 30 Dec 2004 13:34:59 +0100 (CET) Date: Thu, 30 Dec 2004 13:34:58 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 7/9] PKT_SCHED: rsvp: use tcf_exts API Message-ID: <20041230123458.GT32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13232 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Transforms tcindex to use tcf_exts API and thus adds support for actions. Needs more work to allow changing parameters for existing filters. 
Signed-off-by: Thomas Graf --- linux-2.6.10-bk1.orig/include/linux/pkt_cls.h 2004-12-27 22:38:14.000000000 +0100 +++ linux-2.6.10-bk1/include/linux/pkt_cls.h 2004-12-27 22:39:29.000000000 +0100 @@ -249,6 +249,7 @@ TCA_RSVP_SRC, TCA_RSVP_PINFO, TCA_RSVP_POLICE, + TCA_RSVP_ACT, __TCA_RSVP_MAX }; --- linux-2.6.10-bk2.orig/net/sched/cls_rsvp.h 2004-12-29 20:16:24.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_rsvp.h 2004-12-29 20:24:09.000000000 +0100 @@ -95,9 +95,7 @@ u8 tunnelhdr; struct tcf_result res; -#ifdef CONFIG_NET_CLS_POLICE - struct tcf_police *police; -#endif + struct tcf_exts exts; u32 handle; struct rsvp_session *sess; @@ -120,18 +118,20 @@ return h & 0xF; } -#ifdef CONFIG_NET_CLS_POLICE -#define RSVP_POLICE() \ -if (f->police) { \ - int pol_res = tcf_police(skb, f->police); \ - if (pol_res < 0) continue; \ - if (pol_res) return pol_res; \ -} -#else -#define RSVP_POLICE() -#endif - +static struct tcf_ext_map rsvp_ext_map = { + .police = TCA_RSVP_POLICE, + .action = TCA_RSVP_ACT +}; +#define RSVP_APPLY_RESULT() \ + do { \ + int r = tcf_exts_exec(skb, &f->exts, res); \ + if (r < 0) \ + continue; \ + else if (r > 0) \ + return r; \ + } while(0) + static int rsvp_classify(struct sk_buff *skb, struct tcf_proto *tp, struct tcf_result *res) { @@ -189,8 +189,7 @@ #endif ) { *res = f->res; - - RSVP_POLICE(); + RSVP_APPLY_RESULT(); matched: if (f->tunnelhdr == 0) @@ -205,7 +204,7 @@ /* And wildcard bucket... 
*/ for (f = s->ht[16]; f; f = f->next) { *res = f->res; - RSVP_POLICE(); + RSVP_APPLY_RESULT(); goto matched; } return -1; @@ -251,6 +250,14 @@ return -ENOBUFS; } +static inline void +rsvp_delete_filter(struct tcf_proto *tp, struct rsvp_filter *f) +{ + tcf_unbind_filter(tp, &f->res); + tcf_exts_destroy(tp, &f->exts); + kfree(f); +} + static void rsvp_destroy(struct tcf_proto *tp) { struct rsvp_head *data = xchg(&tp->root, NULL); @@ -273,11 +280,7 @@ while ((f = s->ht[h2]) != NULL) { s->ht[h2] = f->next; - tcf_unbind_filter(tp, &f->res); -#ifdef CONFIG_NET_CLS_POLICE - tcf_police_release(f->police,TCA_ACT_UNBIND); -#endif - kfree(f); + rsvp_delete_filter(tp, f); } } kfree(s); @@ -299,12 +302,7 @@ tcf_tree_lock(tp); *fp = f->next; tcf_tree_unlock(tp); - tcf_unbind_filter(tp, &f->res); -#ifdef CONFIG_NET_CLS_POLICE - tcf_police_release(f->police,TCA_ACT_UNBIND); -#endif - - kfree(f); + rsvp_delete_filter(tp, f); /* Strip tree */ @@ -412,6 +410,7 @@ struct tc_rsvp_pinfo *pinfo = NULL; struct rtattr *opt = tca[TCA_OPTIONS-1]; struct rtattr *tb[TCA_RSVP_MAX]; + struct tcf_exts e; unsigned h1, h2; u32 *dst; int err; @@ -419,38 +418,38 @@ if (opt == NULL) return handle ? -EINVAL : 0; - if (rtattr_parse(tb, TCA_RSVP_MAX, RTA_DATA(opt), RTA_PAYLOAD(opt)) < 0) + if (rtattr_parse_nested(tb, TCA_RSVP_MAX, opt) < 0) return -EINVAL; + err = tcf_exts_validate(tp, tb, tca[TCA_RATE-1], &e, &rsvp_ext_map); + if (err < 0) + return err; + if ((f = (struct rsvp_filter*)*arg) != NULL) { /* Node exists: adjust only classid */ if (f->handle != handle && handle) - return -EINVAL; + goto errout2; if (tb[TCA_RSVP_CLASSID-1]) { f->res.classid = *(u32*)RTA_DATA(tb[TCA_RSVP_CLASSID-1]); tcf_bind_filter(tp, &f->res, base); } -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_RSVP_POLICE-1]) { - err = tcf_change_police(tp, &f->police, - tb[TCA_RSVP_POLICE-1], tca[TCA_RATE-1]); - if (err < 0) - return err; - } -#endif + + tcf_exts_change(tp, &f->exts, &e); return 0; } /* Now more serious part... 
*/ + err = -EINVAL; if (handle) - return -EINVAL; + goto errout2; if (tb[TCA_RSVP_DST-1] == NULL) - return -EINVAL; + goto errout2; + err = -ENOBUFS; f = kmalloc(sizeof(struct rsvp_filter), GFP_KERNEL); if (f == NULL) - return -ENOBUFS; + goto errout2; memset(f, 0, sizeof(*f)); h2 = 16; @@ -516,10 +515,8 @@ f->sess = s; if (f->tunnelhdr == 0) tcf_bind_filter(tp, &f->res, base); -#ifdef CONFIG_NET_CLS_POLICE - if (tb[TCA_RSVP_POLICE-1]) - tcf_change_police(tp, &f->police, tb[TCA_RSVP_POLICE-1], tca[TCA_RATE-1]); -#endif + + tcf_exts_change(tp, &f->exts, &e); for (fp = &s->ht[h2]; *fp; fp = &(*fp)->next) if (((*fp)->spi.mask&f->spi.mask) != f->spi.mask) @@ -560,6 +557,8 @@ errout: if (f) kfree(f); +errout2: + tcf_exts_destroy(tp, &e); return err; } @@ -624,17 +623,14 @@ RTA_PUT(skb, TCA_RSVP_CLASSID, 4, &f->res.classid); if (((f->handle>>8)&0xFF) != 16) RTA_PUT(skb, TCA_RSVP_SRC, sizeof(f->src), f->src); -#ifdef CONFIG_NET_CLS_POLICE - if (tcf_dump_police(skb, f->police, TCA_RSVP_POLICE) < 0) + + if (tcf_exts_dump(skb, &f->exts, &rsvp_ext_map) < 0) goto rtattr_failure; -#endif rta->rta_len = skb->tail - b; -#ifdef CONFIG_NET_CLS_POLICE - if (f->police) - if (tcf_police_dump_stats(skb, f->police) < 0) - goto rtattr_failure; -#endif + + if (tcf_exts_dump_stats(skb, &f->exts, &rsvp_ext_map) < 0) + goto rtattr_failure; return skb->len; rtattr_failure: From tgraf@suug.ch Thu Dec 30 04:31:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:31:39 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCVHRr026844 for ; Thu, 30 Dec 2004 04:31:33 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 86694F; Thu, 30 Dec 2004 13:34:58 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id DC7A31C0EB; 
Thu, 30 Dec 2004 13:35:41 +0100 (CET) Date: Thu, 30 Dec 2004 13:35:41 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 8/9] PKT_SCHED: Remove old action/police helpers Message-ID: <20041230123541.GU32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13233 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Signed-off-by: Thomas Graf --- linux-2.6.10-bk1.orig/include/net/pkt_cls.h 2004-12-27 21:35:31.000000000 +0100 +++ linux-2.6.10-bk1/include/net/pkt_cls.h 2004-12-27 22:46:36.000000000 +0100 @@ -149,88 +149,6 @@ extern int tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, struct tcf_ext_map *map); -#ifdef CONFIG_NET_CLS_ACT -static inline int -tcf_change_act_police(struct tcf_proto *tp, struct tc_action **action, - struct rtattr *act_police_tlv, struct rtattr *rate_tlv) -{ - int ret; - struct tc_action *act; - - act = tcf_action_init_1(act_police_tlv, rate_tlv, "police", - TCA_ACT_NOREPLACE, TCA_ACT_BIND, &ret); - if (act == NULL) - return ret; - - act->type = TCA_OLD_COMPAT; - - if (*action) { - tcf_tree_lock(tp); - act = xchg(action, act); - tcf_tree_unlock(tp); - - tcf_action_destroy(act, TCA_ACT_UNBIND); - } else - *action = act; - - return 0; -} - -static inline int -tcf_change_act(struct tcf_proto *tp, struct tc_action **action, - struct rtattr *act_tlv, struct rtattr *rate_tlv) -{ - int ret; - struct tc_action *act; - - act = tcf_action_init(act_tlv, rate_tlv, NULL, - TCA_ACT_NOREPLACE, TCA_ACT_BIND, &ret); - if (act == NULL) - return ret; - - if (*action) 
{ - tcf_tree_lock(tp); - act = xchg(action, act); - tcf_tree_unlock(tp); - - tcf_action_destroy(act, TCA_ACT_UNBIND); - } else - *action = act; - - return 0; -} - -static inline int -tcf_dump_act(struct sk_buff *skb, struct tc_action *action, - int act_type, int compat_type) -{ - /* - * again for backward compatible mode - we want - * to work with both old and new modes of entering - * tc data even if iproute2 was newer - jhs - */ - if (action) { - struct rtattr * p_rta = (struct rtattr*) skb->tail; - - if (action->type != TCA_OLD_COMPAT) { - RTA_PUT(skb, act_type, 0, NULL); - if (tcf_action_dump(skb, action, 0, 0) < 0) - goto rtattr_failure; - } else { - RTA_PUT(skb, compat_type, 0, NULL); - if (tcf_action_dump_old(skb, action, 0, 0) < 0) - goto rtattr_failure; - } - - p_rta->rta_len = skb->tail - (u8*)p_rta; - } - return 0; - -rtattr_failure: - return -1; -} -#endif /* CONFIG_NET_CLS_ACT */ - #ifdef CONFIG_NET_CLS_IND static inline int tcf_change_indev(struct tcf_proto *tp, char *indev, struct rtattr *indev_tlv) @@ -260,44 +178,4 @@ } #endif /* CONFIG_NET_CLS_IND */ -#ifdef CONFIG_NET_CLS_POLICE -static inline int -tcf_change_police(struct tcf_proto *tp, struct tcf_police **police, - struct rtattr *police_tlv, struct rtattr *rate_tlv) -{ - struct tcf_police *p = tcf_police_locate(police_tlv, rate_tlv); - - if (*police) { - tcf_tree_lock(tp); - p = xchg(police, p); - tcf_tree_unlock(tp); - - tcf_police_release(p, TCA_ACT_UNBIND); - } else - *police = p; - - return 0; -} - -static inline int -tcf_dump_police(struct sk_buff *skb, struct tcf_police *police, - int police_type) -{ - if (police) { - struct rtattr * p_rta = (struct rtattr*) skb->tail; - - RTA_PUT(skb, police_type, 0, NULL); - - if (tcf_police_dump(skb, police) < 0) - goto rtattr_failure; - - p_rta->rta_len = skb->tail - (u8*)p_rta; - } - return 0; - -rtattr_failure: - return -1; -} -#endif /* CONFIG_NET_CLS_POLICE */ - #endif From kaber@trash.net Thu Dec 30 04:32:03 2004 Received: with ECARTIS (v1.0.0; 
list netdev); Thu, 30 Dec 2004 04:32:10 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCVgHe027059 for ; Thu, 30 Dec 2004 04:32:03 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CjzX4-00034X-HE; Thu, 30 Dec 2004 13:35:58 +0100 Message-ID: <41D3F5EC.9050808@trash.net> Date: Thu, 30 Dec 2004 13:34:52 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Maillist netdev Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> In-Reply-To: <1104382562.1048.39.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13234 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > Patrick, > Thanks for this cleanup. > > Questions/comments: > > 1)compiler or style issue? > > You have a few of fixes from > > a) > if (..){ > single statement here; > } > > to: > if (..) > single statement here; > > I always add an extra pair of brace > for lazy reasons (in the back of my mind: incase i want to add another > statement ;->). Just cleanup, I prefer not to waste too many lines. Saving space increases readability. > > b)Other things which i have seen compilers whine about in the past of > the form: > > a missing cast > - a->priv = (void *) p; > + a->priv = p; No need to cast a pointer to void *, except if a->priv was of some different type. 
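The point about the cast can be checked outside the kernel. The following is a minimal sketch, not the pedit code itself: struct pedit_params, struct action and both helpers are hypothetical stand-ins. It relies only on the C rule that any object pointer converts to and from void * implicitly, so the cast is needed only when the destination is some other, non-void pointer type.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the structures in the quoted hunk. */
struct pedit_params {
	int nkeys;
};

struct action {
	void *priv;	/* opaque per-action data */
};

/* Any object pointer converts to void * implicitly, so no cast. */
static void action_set_priv(struct action *a, struct pedit_params *p)
{
	a->priv = p;
}

/* void * converts back to an object pointer implicitly as well. */
static int action_nkeys(const struct action *a)
{
	const struct pedit_params *p = a->priv;

	return p ? p->nkeys : -1;
}
```

So the type of p does not matter here; what matters is that the left-hand side is a void *.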
> > or unitialized vars: > - struct tcf_pedit *p = NULL; > + struct tcf_pedit *p; p is assigned another value before the first use, so initializing to NULL is not neccessary. > 2) You are not the first to not like the > if (constant != variable_here) > > Should be noted that i am dyxlesic and this has saved > me a few times (I would say the most common errata for me, weird as that > may sound). Dont have a problem with the changes you made > (dont need the protection at this stage;->). The compiler warns about assignments in comparisions nowadays. > 3) Is there any reason in which some cases you fixed things to be: > > return_type > functionname() eg > -static int > -gact_net_rand(struct tcf_gact *p) { > +static int gact_net_rand(struct tcf_gact *p) > > and in some cases you leave them to be of the form: > return_type functionname() Just saving empty lines. I didn't try to be consistent with this, In case I changed it the other way around it's usually to keep all arguments on one line without exceeding 80 chars. > > 4) Some of those messages are actually still useful and dont really > harm to leave around for a little while longer; > - if (tb[TCA_PEDIT_PARMS - 1] == NULL) { > - printk("BUG: tcf_pedit_init called with NULL params\n"); > > I realize the fixes you have to return -ENOMEN/NOENT etc are an > improvement but a little ascii puking wont harm for somebody writting > a user space app until we get better netlink error propagation > in place. Agreed for some messages, but those should be DEBUGs. Anyway, I didn't want to judge for every message and possible convert it, so I deleted all printks that got replaced by error codes. > I will look closely at one or two of those fixes in the morning; > majority look good on first quick scan (most things were needed during > development or are artifacts of that period and are safe to rid of now). Thanks, I'll continue later today and send another batch tonight. 
Regards Patrick From tgraf@suug.ch Thu Dec 30 04:32:13 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 04:32:20 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUCVqJY027309 for ; Thu, 30 Dec 2004 04:32:13 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 009BBF; Thu, 30 Dec 2004 13:35:34 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 5BED11C0EB; Thu, 30 Dec 2004 13:36:17 +0100 (CET) Date: Thu, 30 Dec 2004 13:36:17 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [PATCH 9/9] PKT_SCHED: Actions are now available for all classifiers Message-ID: <20041230123617.GV32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230122652.GM32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13235 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Signed-off-by: Thomas Graf --- linux-2.6.10-bk1.orig/net/sched/Kconfig 2004-12-27 14:32:33.000000000 +0100 +++ linux-2.6.10-bk1/net/sched/Kconfig 2004-12-27 22:47:22.000000000 +0100 @@ -381,7 +381,6 @@ ---help--- This option requires you have a new iproute2. It enables tc extensions which can be used with tc classifiers. - Only the u32 and fw classifiers are supported at the moment. You MUST NOT turn this on if you dont have an update iproute2. 
config NET_ACT_POLICE From hadi@cyberus.ca Thu Dec 30 05:13:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 05:13:29 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUDCwqf030250 for ; Thu, 30 Dec 2004 05:13:19 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1Ck0B4-0008LF-Hl for netdev@oss.sgi.com; Thu, 30 Dec 2004 08:17:18 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ck0B2-0006dj-IL; Thu, 30 Dec 2004 08:17:16 -0500 Subject: Re: [PATCH PKT_SCHED 17/17]: Disable broken override bits in pedit action From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Maillist netdev In-Reply-To: <41D378FF.3080205@trash.net> References: <41D378FF.3080205@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104412634.1048.106.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 08:17:14 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13236 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Wed, 2004-12-29 at 22:41, Patrick McHardy wrote: > Disable broken override bits in pedit action. It misses > locking and needs to allocate new memory if nkeys increases. > Also disable it for now. > There are a couple of these that you have (ipt being other). Could you just add a check for size before returning -EOPNOTSUPP? Example: if (ovr) if (p->nkeys != parm->nkeys) return -EOPNOTSUPP; This way if they are of the same size then things should work as is and my testcases dont break. 
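The size check suggested above can be sketched as a standalone function. This is only an illustration under assumptions: struct pedit, struct key, MAX_KEYS and pedit_override() are hypothetical names, the key array has a fixed capacity, and the locking that the FIXME in the patch mentions is left out entirely.

```c
#include <errno.h>
#include <string.h>

#define MAX_KEYS 8	/* hypothetical fixed capacity for this sketch */

struct key {
	unsigned int off, val, mask;
};

struct pedit {
	unsigned int nkeys;	/* assumed <= MAX_KEYS for a valid object */
	unsigned int flags;
	struct key keys[MAX_KEYS];
};

/*
 * Override in place only when the number of keys is unchanged.
 * A different count would need a new allocation, so it is refused.
 */
static int pedit_override(struct pedit *p, const struct pedit *parm)
{
	if (p->nkeys != parm->nkeys)
		return -EOPNOTSUPP;

	p->flags = parm->flags;
	memcpy(p->keys, parm->keys, p->nkeys * sizeof(p->keys[0]));
	return 0;
}
```

A same-size override updates the keys and flags; a request with a different key count leaves the existing action untouched and returns -EOPNOTSUPP.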
Now if you feel more gracious, go ahead and fix them ;-> > ret = -EEXIST; > if (ovr) { > + /* FIXME: no locking, larger memory area might be required */ > + return -EOPNOTSUPP; > ret = 0; > override: > p->flags = parm->flags; cheers, jamal From kaber@trash.net Thu Dec 30 05:21:51 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 05:21:58 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUDLUWr031128 for ; Thu, 30 Dec 2004 05:21:51 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ck0Jj-00038P-6Z; Thu, 30 Dec 2004 14:26:15 +0100 Message-ID: <41D401B4.7010104@trash.net> Date: Thu, 30 Dec 2004 14:25:08 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Maillist netdev Subject: Re: [PATCH PKT_SCHED 17/17]: Disable broken override bits in pedit action References: <41D378FF.3080205@trash.net> <1104412634.1048.106.camel@jzny.localdomain> In-Reply-To: <1104412634.1048.106.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13237 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > This way if they are of the same size then things should work as is > and my testcases dont break. Now if you feel more gracious, go ahead and > fix them ;-> Yes, this patch was silly, I should have just fixed it. I'll revert the patch and fix it. 
Regards Patrick From hadi@cyberus.ca Thu Dec 30 05:25:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 05:26:04 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUDPaRB031796 for ; Thu, 30 Dec 2004 05:25:56 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1Ck0NR-0006GL-6P for netdev@oss.sgi.com; Thu, 30 Dec 2004 08:30:05 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ck0NO-0007vg-9c; Thu, 30 Dec 2004 08:30:02 -0500 Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Maillist netdev In-Reply-To: <41D3F5EC.9050808@trash.net> References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104413400.1047.123.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 08:30:00 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13238 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 07:34, Patrick McHardy wrote: > jamal wrote: > Patrick, > > Thanks for this cleanup. > > > > Questions/comments: > > > > 1)compiler or style issue? > > > > You have a few of fixes from > > > > a) > > if (..){ > > single statement here; > > } > > > > to: > > if (..) > > single statement here; > > > > I always add an extra pair of brace > > for lazy reasons (in the back of my mind: incase i want to add another > > statement ;->). 
> > Just cleanup, I prefer not to waste too many lines. Saving > space increases readability. > I am going to try and control my fingers ;-> They have a brain of their own. > > b)Other things which i have seen compilers whine about in the past of > > the form: > > > > a missing cast > > - a->priv = (void *) p; > > + a->priv = p; > > No need to cast a pointer to void *, except if a->priv > was of some different type. > So as long as lvalue was void you dont cast? p is certainly not void. > > 4) Some of those messages are actually still useful and dont really > > harm to leave around for a little while longer; > > - if (tb[TCA_PEDIT_PARMS - 1] == NULL) { > > - printk("BUG: tcf_pedit_init called with NULL params\n"); > > > > I realize the fixes you have to return -ENOMEN/NOENT etc are an > > improvement but a little ascii puking wont harm for somebody writting > > a user space app until we get better netlink error propagation > > in place. > > Agreed for some messages, but those should be DEBUGs. Anyway, > I didn't want to judge for every message and possible convert > it, so I deleted all printks that got replaced by error codes. > the printks are meant to help a little more (and are mostly on the slow path); when the error propagation for netlink works well, those sorts of ascii messages will probably be transported back to user space. On any newer patches I suggest to just keep them. Heres something else: Re: [PATCH PKT_SCHED 15/17]: Remove checks for impossible conditions in pedit action, you say: >Remove checks for impossible conditions in pedit action. ________________________________________________________________________ [..] - if (p == NULL) { > - printk("BUG: tcf_pedit_dump called with NULL params\n"); > - goto rtattr_failure; > - } > - You have these type changes all over. These are certainly artifacts of the development time, I may have have caught a bug or two via these checks at the time. It is highly likely those bugs are fixed in the code merged. 
If they happen, however, they are a BUG and the possibility of a bug is
still there ;-> i.e. the word "impossible" is too strong a description.
Having said that: Is it better to have an oops catch this or have
something print on the console or syslog indicating a bug?
This is more a philosophical question and an answer could be "good
practice is to let oops catch it".
I am actually indifferent if those checks go - however if I had caught
them myself I would have put unlikely() around them.

> > I will look closely at one or two of those fixes in the morning;
> > majority look good on first quick scan (most things were needed during
> > development or are artifacts of that period and are safe to rid of now).
>
> Thanks, I'll continue later today and send another batch
> tonight.

I will wait for you to finish before I start working on the eactions.

So a general comment to all the patches. All look good - I would
prefer a check against size instead of EOPNOTSUPP for the two I pointed
at. And going forward, prefer you leave the printks I had for errors
but fix the return codes to be more meaningful.
So only those two I pointed at with EOPNOTSUPP I am not ACKing (my basic
tests will break) - rest Dave can push in.

Again, thanks.
cheers, jamal From tgraf@suug.ch Thu Dec 30 05:29:24 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 05:29:31 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUDT3B3032411 for ; Thu, 30 Dec 2004 05:29:24 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id ED79EF; Thu, 30 Dec 2004 14:33:18 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 335971C0EA; Thu, 30 Dec 2004 14:34:01 +0100 (CET) Date: Thu, 30 Dec 2004 14:34:01 +0100 From: Thomas Graf To: Patrick McHardy Cc: jamal , Maillist netdev Subject: Re: [PATCH PKT_SCHED 4/17]: Check TCA_ACT_KIND payload size _before_ copying it Message-ID: <20041230133401.GW32419@postel.suug.ch> References: <41D37875.5020103@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41D37875.5020103@trash.net> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13239 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41D37875.5020103@trash.net> 2004-12-30 04:39 > - sprintf(act_name, "%s", (char*)RTA_DATA(kind)); > - if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { > - printk("Action %s bad\n", (char*)RTA_DATA(kind)); > + if (RTA_PAYLOAD(kind) >= IFNAMSIZ) The check should be RTA_PAYLOAD(kind) > IFNAMSIZ, == is ok if the terminating NUL is provided. > goto err_out; > - } > + sprintf(act_name, "%s", (char*)RTA_DATA(kind)); > } else { This will cause horrible crashes if no NUL is provided to terminate the name. 
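The hazard is that the attribute payload is just a run of bytes with a length, so "%s" keeps reading until it happens to find a zero byte. A userspace sketch of a copy that is bounded by the payload length instead (NAMSIZ stands in for IFNAMSIZ, and copy_kind() is a hypothetical helper, not a kernel function):

```c
#include <string.h>

#define NAMSIZ 16	/* stands in for IFNAMSIZ in this sketch */

/*
 * Copy a payload of len bytes that may lack a terminating NUL
 * into a fixed-size name buffer. Returns 0, or -1 if too long.
 */
static int copy_kind(char dst[NAMSIZ], const void *payload, size_t len)
{
	if (len > NAMSIZ)
		return -1;

	memset(dst, 0, NAMSIZ);		/* terminates short, unterminated input */
	memcpy(dst, payload, len);	/* never reads past the payload */
	dst[NAMSIZ - 1] = '\0';		/* terminates a full-length input */
	return 0;
}
```

Nothing here calls strlen() or a "%s" format on the input, so an unterminated payload cannot cause an overread.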
So I think this should be: if (RTA_PAYLOAD(kind) > IFNAMSIZ) goto err_out; memset(act_name, ...); memcpy(act_name, RTA_DATA(kind), RTA_PAYLOAD(kind)); act_name[IFNAMSIZ - 1] = '\0'; The memset is required to ensure 0 termination if kind is not and shorter than IFNAMSIZ. memcpy instead of str* to avoid using any form of str(n)len on a possibly not terminated string and setting IFNAMSIZ - 1 to NUL to ensure proper handling of a IFNAMSIZ long not terminated string. I know it's unlikely but this might just save us some troubles later. From tgraf@suug.ch Thu Dec 30 05:35:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 05:35:56 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUDZST3000716 for ; Thu, 30 Dec 2004 05:35:48 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 440A5F; Thu, 30 Dec 2004 14:39:47 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 49D8C1C0EA; Thu, 30 Dec 2004 14:40:29 +0100 (CET) Date: Thu, 30 Dec 2004 14:40:29 +0100 From: Thomas Graf To: Patrick McHardy Cc: jamal , Maillist netdev Subject: Re: [PATCH PKT_SCHED 11/17]: Remove checks for impossible conditions in ipt action Message-ID: <20041230134029.GX32419@postel.suug.ch> References: <41D378AB.70204@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41D378AB.70204@trash.net> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13240 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41D378AB.70204@trash.net> 2004-12-30 
04:40 > - if (a == NULL || rta == NULL || > - rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) > + if (rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) > return -1; You might want to use rtattr_parse_nested here (see patch 1 of my latest patchset) if (rtattr_parse_nested(tb, TCA_IPT_MAX, rta) < 0) Purely cosmetic though. It gives a slightly better hint on what is being done. From hadi@cyberus.ca Thu Dec 30 05:47:36 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 05:47:43 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUDlGOS001799 for ; Thu, 30 Dec 2004 05:47:36 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1Ck0ib-0001g7-O6 for netdev@oss.sgi.com; Thu, 30 Dec 2004 08:51:57 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1Ck0iZ-00027x-ST; Thu, 30 Dec 2004 08:51:56 -0500 Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , Patrick McHardy , netdev@oss.sgi.com In-Reply-To: <20041230123023.GO32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104414713.1047.130.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 08:51:54 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13241 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev Thomas, Havent looked at the whole set - will later today; quick question: On Thu, 2004-12-30 at 07:30, Thomas Graf wrote: > +struct tcf_exts > +{ > +#ifdef CONFIG_NET_CLS_ACT > + struct tc_action *action; > +#elif defined CONFIG_NET_CLS_POLICE > + struct tcf_police *police; > +#endif > +}; Things like above: In current code you can have CONFIG_NET_CLS_ACT and not use new style policer, rather use old one i.e CONFIG_NET_CLS_POLICE. You seem to indicate presence of CONFIG_NET_CLS_ACT implies absence of NET_CLS_POLICE. A fix for example maybe to s/elif/ifdef [1] Anyways, back later to peek some more. cheers, jamal [1]Look at Kconfig rules: ---- config NET_CLS_POLICE ... depends on NET_CLS && NET_QOS && NET_ACT_POLICE!=y && NET_ACT_POLICE!=m ---- Anyways, back later to peek some more. 
cheers, jamal From walter.liu@126.com Thu Dec 30 06:01:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 06:02:01 -0800 (PST) Received: from 126.com ([220.181.31.170]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBUE1Cqr002969 for ; Thu, 30 Dec 2004 06:01:33 -0800 Received: from [61.145.139.200] (unknown [61.145.139.200]) by smtp1 (Coremail) with SMTP id IkAmJBYL1EE8dE07.1 for ; Thu, 30 Dec 2004 22:05:15 +0800 (CST) X-Originating-IP: [61.145.139.200] Message-ID: <41D40B0C.2060204@126.com> Date: Thu, 30 Dec 2004 22:05:00 +0800 From: Walter Liu User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.7.3) Gecko/20040910 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Zhenyu Wu CC: netdev@oss.sgi.com Subject: Re: How can i join this mail list? References: <304407396.02136@njupt.edu.cn> In-Reply-To: <304407396.02136@njupt.edu.cn> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13242 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: Walter.liu@126.com Precedence: bulk X-list: netdev Zhenyu Wu wrote: >I want to know if I use the mail to join the mail list, how can I do? Just write >"subscribe" in the body? > "subscribe netdev", this is the email body, you must send it to majordomo2oss.sgi.com. 
Regards LWT From tgraf@suug.ch Thu Dec 30 06:04:35 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 06:04:42 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUE4Eja003659 for ; Thu, 30 Dec 2004 06:04:35 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id CD434F; Thu, 30 Dec 2004 15:08:46 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id DC6691C0EA; Thu, 30 Dec 2004 15:09:29 +0100 (CET) Date: Thu, 30 Dec 2004 15:09:29 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , Patrick McHardy , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041230140929.GY32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <1104414713.1047.130.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104414713.1047.130.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13243 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104414713.1047.130.camel@jzny.localdomain> 2004-12-30 08:51 > In current code you can have CONFIG_NET_CLS_ACT and not use new > style policer, rather use old one i.e CONFIG_NET_CLS_POLICE. You seem to > indicate presence of CONFIG_NET_CLS_ACT implies absence of > NET_CLS_POLICE. Is this wrong? 
Current code: (u32) 2004/06/15 hadi | #ifdef CONFIG_NET_CLS_ACT 2004/06/15 hadi | struct tc_action *action; 2004/06/15 hadi | #else 2002/02/05 torvalds | #ifdef CONFIG_NET_CLS_POLICE 2002/02/05 torvalds | struct tcf_police *police; 2002/02/05 torvalds | #endif 2004/06/15 hadi | #endif > config NET_CLS_POLICE > ... > depends on NET_CLS && NET_QOS && NET_ACT_POLICE!=y && > NET_ACT_POLICE!=m Hmm... doesn't make too much sense for me. What's the advantage of allowing this mix? From kaber@trash.net Thu Dec 30 06:12:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 06:13:07 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUECcdD004603 for ; Thu, 30 Dec 2004 06:12:59 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ck17c-0003D4-Tx; Thu, 30 Dec 2004 15:17:49 +0100 Message-ID: <41D40DCA.7090706@trash.net> Date: Thu, 30 Dec 2004 15:16:42 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Maillist netdev Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> <1104413400.1047.123.camel@jzny.localdomain> In-Reply-To: <1104413400.1047.123.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13244 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: >>>b)Other things which i have seen compilers whine about in the past of >>>the form: >>> >>>a missing cast 
>>>- a->priv = (void *) p; >>>+ a->priv = p; >> >>No need to cast a pointer to void *, except if a->priv >>was of some different type. >> > So as long as the lvalue is void * you don't cast? p is certainly not void. Exactly. >>>4) Some of those messages are actually still useful and don't really >>>harm to leave around for a little while longer; >>>- if (tb[TCA_PEDIT_PARMS - 1] == NULL) { >>>- printk("BUG: tcf_pedit_init called with NULL params\n"); >>> >>>I realize the fixes you have to return -ENOMEM/-ENOENT etc are an >>>improvement but a little ascii puking won't harm for somebody writing >>>a user space app until we get better netlink error propagation >>>in place. >> >>Agreed for some messages, but those should be DEBUGs. Anyway, >>I didn't want to judge for every message and possibly convert >>it, so I deleted all printks that got replaced by error codes. >> > > the printks are meant to help a little more (and are mostly on the slow > path); when the error propagation for netlink works well, those sorts of > ascii messages will probably be transported back to user space. On any > newer patches I suggest to just keep them. Ok. > Here's something else: > Re: [PATCH PKT_SCHED 15/17]: Remove checks for impossible conditions in > pedit action, you say: > > >>Remove checks for impossible conditions in pedit action. > > > ________________________________________________________________________ > [..] > - if (p == NULL) { > >>- printk("BUG: tcf_pedit_dump called with NULL params\n"); >>- goto rtattr_failure; >>- } >>- > > > You have these types of changes all over. These are certainly artifacts of the > development time, I may have caught a bug or two via these checks at > the time. It is highly likely those bugs are fixed in the code merged. Yes, I checked all paths before removing them. > If they happen, however, they are a BUG and the possibility of a bug is > still there ;-> i.e. the word "impossible" is too strong a description. 
> Having said that: > Is it better to have an oops catch this or have something print on the > console or syslog indicating a bug? This is more a philosophical question > and an answer could be "good practice is to let oops catch it". I am > actually indifferent if those checks go - however if I had caught them > myself I would have put unlikely() around them. I prefer an Oops because it gives a backtrace, without requiring additional checks in the code. The other reason I deleted them was that not all of them printed something on the console, so some bugs were just quietly ignored. And I didn't want to add more printks :) > I will wait for you to finish before I start working on the eactions. > > So a general comment to all the patches. All look good - I would prefer > a check against size instead of EOPNOTSUPP for the two I pointed at. > And going forward, prefer you leave the printks I had for errors but fix > the return codes to be more meaningful. So only those two I pointed at > with EOPNOTSUPP I am not ACKing (my basic tests will break) - rest Dave > can push in. Thanks. Dave is on holidays until next week, I'll fix them up by then. 
Regards Patrick From kaber@trash.net Thu Dec 30 06:17:05 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 06:17:13 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUEGijf005246 for ; Thu, 30 Dec 2004 06:17:05 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ck1Be-0003DE-EO; Thu, 30 Dec 2004 15:21:58 +0100 Message-ID: <41D40EC2.3070906@trash.net> Date: Thu, 30 Dec 2004 15:20:50 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: jamal , Maillist netdev Subject: Re: [PATCH PKT_SCHED 4/17]: Check TCA_ACT_KIND payload size _before_ copying it References: <41D37875.5020103@trash.net> <20041230133401.GW32419@postel.suug.ch> In-Reply-To: <20041230133401.GW32419@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13245 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * Patrick McHardy <41D37875.5020103@trash.net> 2004-12-30 04:39 > >>- sprintf(act_name, "%s", (char*)RTA_DATA(kind)); >>- if (RTA_PAYLOAD(kind) >= IFNAMSIZ) { >>- printk("Action %s bad\n", (char*)RTA_DATA(kind)); >>+ if (RTA_PAYLOAD(kind) >= IFNAMSIZ) > > > The check should be RTA_PAYLOAD(kind) > IFNAMSIZ, == is ok > if the terminating NUL is provided. Thanks. > > >> goto err_out; >>- } >>+ sprintf(act_name, "%s", (char*)RTA_DATA(kind)); >> } else { > > > This will cause horrible crashes if no NUL is provided to terminate > the name. 
> > So I think this should be: > > if (RTA_PAYLOAD(kind) > IFNAMSIZ) > goto err_out; > memset(act_name, ...); > memcpy(act_name, RTA_DATA(kind), RTA_PAYLOAD(kind)); > act_name[IFNAMSIZ - 1] = '\0'; > > The memset is required to ensure 0 termination if kind is not and > shorter than IFNAMSIZ. memcpy instead of str* to avoid using > any form of str(n)len on a possibly not terminated string > and setting IFNAMSIZ - 1 to NUL to ensure proper handling of > a IFNAMSIZ long not terminated string. > > I know it's unlikely but this might just save us some troubles later. Agreed. I saved this change for later because there are more places in net/sched that need to be fixed. I guess I'll just add a rtattr_strncpy function. Regards Patrick From kaber@trash.net Thu Dec 30 06:21:22 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 06:21:29 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUEL1RU005966 for ; Thu, 30 Dec 2004 06:21:22 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ck1Fs-0003Dz-3K; Thu, 30 Dec 2004 15:26:20 +0100 Message-ID: <41D40FC9.6060902@trash.net> Date: Thu, 30 Dec 2004 15:25:13 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: jamal , Maillist netdev Subject: Re: [PATCH PKT_SCHED 11/17]: Remove checks for impossible conditions in ipt action References: <41D378AB.70204@trash.net> <20041230134029.GX32419@postel.suug.ch> In-Reply-To: <20041230134029.GX32419@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13246 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * Patrick McHardy <41D378AB.70204@trash.net> 2004-12-30 04:40 > >>- if (a == NULL || rta == NULL || >>- rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) >>+ if (rtattr_parse(tb, TCA_IPT_MAX, RTA_DATA(rta), RTA_PAYLOAD(rta)) < 0) >> return -1; > > > You might want to use rtattr_parse_nested here (see patch 1 of my latest > patchset) > > if (rtattr_parse_nested(tb, TCA_IPT_MAX, rta) < 0) > > Purely cosmetic though. It gives a slightly better hint on what is being > done. We can do this once your changes are merged (I'll review them later today). For now I prefer to leave it this way so I can work on a vanilla tree. Regards Patrick From wichert@levante.wiggy.net Thu Dec 30 08:01:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 08:01:21 -0800 (PST) Received: from mx1.wiggy.net (Debian-exim@levante.wiggy.net [195.85.225.139]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUG0sIk013948 for ; Thu, 30 Dec 2004 08:01:15 -0800 Received: from wichert by mx1.wiggy.net with local (Exim 4.34) id 1Ck2p1-0001eM-U7; Thu, 30 Dec 2004 17:06:44 +0100 Date: Thu, 30 Dec 2004 17:06:43 +0100 From: Wichert Akkerman To: netdev@oss.sgi.com Cc: tgraf@suug.ch Subject: ing_filter debug messages Message-ID: <20041230160643.GD24603@wiggy.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.6+20040907i X-SA-Exim-Connect-IP: X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13247 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wichert@wiggy.net Precedence: bulk X-list: netdev After upgrading a machine to (unpatched mainline) 2.6.10 my kernel log is filled with ing_filter (debug?) 
messages: Dec 30 16:24:58 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 16:24:58 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 16:37:08 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 16:37:08 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 16:49:08 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 16:49:08 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 17:01:08 thunder kernel: ing_filter: fixed eth1 out eth1 Dec 30 17:01:08 thunder kernel: ing_filter: fixed eth1 out eth1 The messages always come in pairs. eth1 is the external interface, which has a standard wondershaper configuration attached to it. Relevant bits of .config are below. Wichert. # # QoS and/or fair queueing # CONFIG_NET_SCHED=y CONFIG_NET_SCH_CLK_JIFFIES=y # CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set # CONFIG_NET_SCH_CLK_CPU is not set CONFIG_NET_SCH_CBQ=y CONFIG_NET_SCH_HTB=y # CONFIG_NET_SCH_HFSC is not set # CONFIG_NET_SCH_PRIO is not set # CONFIG_NET_SCH_RED is not set CONFIG_NET_SCH_SFQ=y # CONFIG_NET_SCH_TEQL is not set CONFIG_NET_SCH_TBF=y # CONFIG_NET_SCH_GRED is not set # CONFIG_NET_SCH_DSMARK is not set # CONFIG_NET_SCH_NETEM is not set CONFIG_NET_SCH_INGRESS=y CONFIG_NET_QOS=y CONFIG_NET_ESTIMATOR=y CONFIG_NET_CLS=y CONFIG_NET_CLS_TCINDEX=y CONFIG_NET_CLS_ROUTE4=y CONFIG_NET_CLS_ROUTE=y CONFIG_NET_CLS_FW=y CONFIG_NET_CLS_U32=y # CONFIG_CLS_U32_PERF is not set # CONFIG_NET_CLS_IND is not set CONFIG_NET_CLS_RSVP=y CONFIG_NET_CLS_RSVP6=y CONFIG_NET_CLS_ACT=y CONFIG_NET_ACT_POLICE=y CONFIG_NET_ACT_GACT=y CONFIG_GACT_PROB=y CONFIG_NET_ACT_MIRRED=y CONFIG_NET_ACT_IPT=y CONFIG_NET_ACT_PEDIT=y -- Wichert Akkerman It is simple to make things. http://www.wiggy.net/ It is hard to make things simple. 
From kaber@trash.net Thu Dec 30 08:05:42 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 08:05:49 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUG5Lt8014564 for ; Thu, 30 Dec 2004 08:05:42 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1Ck2tZ-0003jg-Vw; Thu, 30 Dec 2004 17:11:26 +0100 Message-ID: <41D4286B.1060106@trash.net> Date: Thu, 30 Dec 2004 17:10:19 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Maillist netdev Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> In-Reply-To: <41D3F5EC.9050808@trash.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13248 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Patrick McHardy wrote: > Thanks, I'll continue later today and send another batch > tonight. Thinking about it, I think it's better to reorganize the patches by subject. While doing this I'm going to add back the useful printks in the init paths as DPRINTKs. 
I'm going to post the entire reorganized batch next week when I return home, working with bitkeeper on my crappy notebook is just too painful :) Regards Patrick From dave@thedillows.org Thu Dec 30 08:11:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 08:11:33 -0800 (PST) Received: from iasrv1.idleaire.net (NS1.idleaire.net [65.220.16.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUGB5oo015325 for ; Thu, 30 Dec 2004 08:11:26 -0800 Received: by iasrv1.idleaire.net (Postfix, from userid 300) id F1F42236E88; Thu, 30 Dec 2004 11:16:53 -0500 (EST) Received: from corp4.idleaire.com (corp4.idleaire.com [10.1.66.36]) by iasrv1.idleaire.net (Postfix) with ESMTP id 9E03B236E5B; Thu, 30 Dec 2004 11:16:53 -0500 (EST) Received: from knox.voodoobox.net ([10.1.66.124]) by corp4.idleaire.com with Microsoft SMTPSVC(5.0.2195.6713); Thu, 30 Dec 2004 11:16:53 -0500 Subject: Re: [RFC 2.6.10 1/22] xfrm: Add direction information to xfrm_state From: Dave Dillow To: Jan-Benedict Glaw Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org In-Reply-To: <20041230094839.GX2460@lug-owl.de> References: <20041230035000.01@ori.thedillows.org> <20041230035000.10@ori.thedillows.org> <20041230094839.GX2460@lug-owl.de> Content-Type: text/plain Message-Id: <1104423409.23254.9.camel@dillow.idleaire.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Thu, 30 Dec 2004 11:16:49 -0500 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 30 Dec 2004 16:16:53.0581 (UTC) FILETIME=[F9B9AFD0:01C4EE8A] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13249 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 04:48, Jan-Benedict Glaw wrote: > On Thu, 2004-12-30 03:48:34 -0500, David Dillow > wrote in message 
<20041230035000.10@ori.thedillows.org>: > > diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h > > --- a/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 > > +++ b/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 > > @@ -146,6 +146,9 @@ > > /* Private data of this transformer, format is opaque, > > * interpreted by xfrm_type methods. */ > > void *data; > > + > > + /* Intended direction of this state, used for offloading */ > > + int dir; > > }; > > > > enum { > > @@ -157,6 +160,12 @@ > > XFRM_STATE_DEAD > > }; > > > > +enum { > > + XFRM_STATE_DIR_UNKNOWN, > > + XFRM_STATE_DIR_IN, > > + XFRM_STATE_DIR_OUT, > > +}; > > Any specific reason to first define such a nice enum and then using int > in the struct? Just following the current style in net/xfrm.h, see xfrm_state.km.state and XFRM_STATE_*. Though, I probably should have used a u8; easily changed if it is an issue. -- Dave Dillow From dave@thedillows.org Thu Dec 30 08:15:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 08:16:02 -0800 (PST) Received: from iasrv1.idleaire.net (NS1.idleaire.net [65.220.16.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUGFZ0E015986 for ; Thu, 30 Dec 2004 08:15:55 -0800 Received: by iasrv1.idleaire.net (Postfix, from userid 300) id 933FA236E76; Thu, 30 Dec 2004 11:21:27 -0500 (EST) Received: from corp4.idleaire.com (corp4.idleaire.com [10.1.66.36]) by iasrv1.idleaire.net (Postfix) with ESMTP id 74B86236E5B; Thu, 30 Dec 2004 11:21:27 -0500 (EST) Received: from knox.voodoobox.net ([10.1.66.124]) by corp4.idleaire.com with Microsoft SMTPSVC(5.0.2195.6713); Thu, 30 Dec 2004 11:21:27 -0500 Subject: Re: [RFC 2.6.10 1/22] xfrm: Add direction information to xfrm_state From: Dave Dillow To: Ingo Oeser Cc: linux-kernel@vger.kernel.org, netdev@oss.sgi.com In-Reply-To: <200412301436.06653.ioe-lkml@axxeo.de> References: <20041230035000.01@ori.thedillows.org> <20041230035000.10@ori.thedillows.org> <200412301436.06653.ioe-lkml@axxeo.de> Content-Type: text/plain Message-Id: 
<1104423687.23254.15.camel@dillow.idleaire.com> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.4.5 Date: Thu, 30 Dec 2004 11:21:27 -0500 Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 30 Dec 2004 16:21:27.0401 (UTC) FILETIME=[9CEF4D90:01C4EE8B] X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13250 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev [readding netdev to the cc list] On Thu, 2004-12-30 at 08:36, Ingo Oeser wrote: > Hi David, > > I'm happy to see a framework and example driver for this. Thanks, I'm just glad it works. > David Dillow schrieb: > > diff -Nru a/include/net/xfrm.h b/include/net/xfrm.h > > --- a/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 > > +++ b/include/net/xfrm.h 2004-12-30 01:12:08 -05:00 > > @@ -194,6 +203,7 @@ > > struct xfrm_state *(*find_acq)(u8 mode, u32 reqid, u8 proto, > > xfrm_address_t *daddr, xfrm_address_t *saddr, > > int create); > > + void (*map_direction)(struct xfrm_state *xfrm); > > }; > > > > Please don't build modifiers, but build functions instead. > > e.g. > > xfrm->direction = map_direction(xfrm) > > That way you don't hide the assignment and thus code becomes much clearer and > can be called multiple times without risk. I'll make the change for the next iteration. 
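As a rough illustration of the point agreed on above (this is not code from the patch set), the two styles can be put side by side. Everything here is a cut-down, hypothetical stand-in: `struct state` replaces struct xfrm_state, and the `is_local_dst` field is an invented criterion, since the thread does not say how the direction is actually derived.

```c
#include <assert.h>

/* Direction values, named as in the quoted patch. */
enum {
	XFRM_STATE_DIR_UNKNOWN,
	XFRM_STATE_DIR_IN,
	XFRM_STATE_DIR_OUT,
};

/* Hypothetical, cut-down stand-in for struct xfrm_state. */
struct state {
	int is_local_dst;	/* invented input, for illustration only */
	int dir;
};

/* Modifier style: the assignment to x->dir is hidden inside the callee. */
static void map_direction_modifier(struct state *x)
{
	x->dir = x->is_local_dst ? XFRM_STATE_DIR_IN : XFRM_STATE_DIR_OUT;
}

/* Function style: a pure mapping, the caller does the assignment. */
static int map_direction(const struct state *x)
{
	return x->is_local_dst ? XFRM_STATE_DIR_IN : XFRM_STATE_DIR_OUT;
}

/* The assignment is visible at the call site and is safe to repeat. */
static void example(void)
{
	struct state s = { 1, XFRM_STATE_DIR_UNKNOWN };

	s.dir = map_direction(&s);
	s.dir = map_direction(&s);
	assert(s.dir == XFRM_STATE_DIR_IN);
}
```

With the function style the callee can take a const pointer, so a reader of the call site sees exactly which field changes.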
-- Dave Dillow From tgraf@suug.ch Thu Dec 30 08:28:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 08:28:07 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUGRd5e020390 for ; Thu, 30 Dec 2004 08:27:59 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 13C9EF; Thu, 30 Dec 2004 17:33:17 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id C31F81C0EA; Thu, 30 Dec 2004 17:33:59 +0100 (CET) Date: Thu, 30 Dec 2004 17:33:59 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [RESEND 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041230163359.GA32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230123023.GO32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13251 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Resend: Added unused __attribute__ to the rtattr_failure error labels to suppress the warnings if no extensions are compiled in. It's not pretty but moving it into the ifdefs is even worse. Missed this latest change in the patch. Sorry. The tcf_exts API abstracts extensions such as actions/policers into a generic layer and reduces the knowledge inside classifiers to the minimum required. 
It isolates the validation code into its own function so that classifiers can validate all input data before making changes, which avoids the need to undo changes if an extension configuration request cannot be fulfilled. As a nice side effect, using this API removes the existing ifdef clutter. Usage: The classifier holds a struct tcf_exts, which may be empty if no extensions are compiled in. When a new change request is received, it calls tcf_exts_validate and provides a temporary tcf_exts copy to store the requested changes. If validation succeeds, the classifier may change its own parameters and, at the end, call tcf_exts_change to commit the changes and replace the existing extension configuration with the new one. The classifier is responsible for destroying its temporary copy if any of its own validation checks fail. The classifier-specific TLV types must be exported to the extensions API via tcf_ext_map. Destroying the extensions is as easy as calling tcf_exts_destroy. The classifier executes the extensions by calling tcf_exts_exec, which must be done last, after making sure the filter matches. Note: A classifier might take further actions after the call to tcf_exts_exec, such as correcting its own cache to avoid caching results which could have been influenced by the extensions. tcf_exts_exec returns a negative error code if the filter must be considered unmatched, 0 on normal execution, or a positive classifier return code (TC_ACT_*) which must be returned to the underlying layer as-is. 
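The validate-then-commit flow described under "Usage" can be sketched in plain userspace C. This is only a minimal model under stated assumptions: `struct exts`, `struct classifier`, `exts_validate`, `exts_change`, `exts_destroy` and `classifier_change` are hypothetical stand-ins for struct tcf_exts and the tcf_exts_* calls, with an int in place of a real action, and with no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct tcf_exts. */
struct exts {
	int *action;		/* NULL means "no extension configured" */
};

/* Hypothetical classifier with one parameter of its own. */
struct classifier {
	int prio;
	struct exts exts;	/* committed extension configuration */
};

/* Like tcf_exts_validate: fill a temporary copy, touch nothing else. */
static int exts_validate(int requested_action, struct exts *tmp)
{
	memset(tmp, 0, sizeof(*tmp));
	if (requested_action < 0)
		return -EINVAL;
	tmp->action = malloc(sizeof(*tmp->action));
	if (tmp->action == NULL)
		return -ENOMEM;
	*tmp->action = requested_action;
	return 0;
}

/* Like tcf_exts_destroy: release whatever the handle owns. */
static void exts_destroy(struct exts *e)
{
	free(e->action);
	e->action = NULL;
}

/* Like tcf_exts_change: commit src into dst and drop the old configuration. */
static void exts_change(struct exts *dst, struct exts *src)
{
	assert(dst != NULL && src != NULL);
	if (src->action) {
		int *old = dst->action;

		dst->action = src->action;
		src->action = NULL;
		free(old);
	}
}

/* Change handler: validate everything first, then commit, so no undo is needed. */
static int classifier_change(struct classifier *c, int prio, int action)
{
	struct exts tmp;
	int err;

	err = exts_validate(action, &tmp);
	if (err < 0)
		return err;

	if (prio < 0) {			/* the classifier's own check failed, */
		exts_destroy(&tmp);	/* so it drops its temporary copy */
		return -EINVAL;
	}

	c->prio = prio;
	exts_change(&c->exts, &tmp);
	return 0;
}
```

Because nothing in the classifier is modified until every check has passed, a failed request leaves both `prio` and the committed extensions exactly as they were.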
Signed-off-by: Thomas Graf --- linux-2.6.10-bk2.orig/include/net/pkt_cls.h 2004-12-30 01:22:01.000000000 +0100 +++ linux-2.6.10-bk2/include/net/pkt_cls.h 2004-12-30 01:22:39.000000000 +0100 @@ -62,6 +62,93 @@ tp->q->ops->cl_ops->unbind_tcf(tp->q, cl); } +struct tcf_exts +{ +#ifdef CONFIG_NET_CLS_ACT + struct tc_action *action; +#elif defined CONFIG_NET_CLS_POLICE + struct tcf_police *police; +#endif +}; + +/* Map to export classifier specific extension TLV types to the + * generic extensions API. Unsupported extensions must be set to 0. + */ +struct tcf_ext_map +{ + int action; + int police; +}; + +/** + * tcf_exts_is_predicative - check if a predicative extension is present + * @exts: tc filter extensions handle + * + * Returns 1 if a predicative extension is present, i.e. an extension which + * might cause further actions and thus overrule the regular tcf_result. + */ +static inline int +tcf_exts_is_predicative(struct tcf_exts *exts) +{ +#ifdef CONFIG_NET_CLS_ACT + return !!exts->action; +#elif defined CONFIG_NET_CLS_POLICE + return !!exts->police; +#else + return 0; +#endif +} + +/** + * tcf_exts_is_available - check if at least one extension is present + * @exts: tc filter extensions handle + * + * Returns 1 if at least one extension is present. + */ +static inline int +tcf_exts_is_available(struct tcf_exts *exts) +{ + /* All non-predicative extensions must be added here. */ + return tcf_exts_is_predicative(exts); +} + +/** + * tcf_exts_exec - execute tc filter extensions + * @skb: socket buffer + * @exts: tc filter extensions handle + * @res: desired result + * + * Executes all configured extensions. Returns 0 on a normal execution, + * a negative number if the filter must be considered unmatched or + * a positive action code (TC_ACT_*) which must be returned to the + * underlying layer. 
+ */ +static inline int +tcf_exts_exec(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_result *res) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) + return tcf_action_exec(skb, exts->action, res); +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) + return tcf_police(skb, exts->police); +#endif + + return 0; +} + +extern int tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, + struct rtattr *rate_tlv, struct tcf_exts *exts, + struct tcf_ext_map *map); +extern void tcf_exts_destroy(struct tcf_proto *tp, struct tcf_exts *exts); +extern void tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, + struct tcf_exts *src); +extern int tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map); +extern int tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map); + #ifdef CONFIG_NET_CLS_ACT static inline int tcf_change_act_police(struct tcf_proto *tp, struct tc_action **action, --- linux-2.6.10-bk2.orig/net/sched/cls_api.c 2004-12-24 22:34:26.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_api.c 2004-12-30 17:03:52.000000000 +0100 @@ -439,6 +439,162 @@ return skb->len; } +void +tcf_exts_destroy(struct tcf_proto *tp, struct tcf_exts *exts) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) { + tcf_action_destroy(exts->action, TCA_ACT_UNBIND); + exts->action = NULL; + } +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) { + tcf_police_release(exts->police, TCA_ACT_UNBIND); + exts->police = NULL; + } +#endif +} + + +int +tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, + struct rtattr *rate_tlv, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ + memset(exts, 0, sizeof(*exts)); + +#ifdef CONFIG_NET_CLS_ACT + int err; + struct tc_action *act; + + if (map->police && tb[map->police-1] && rate_tlv) { + act = tcf_action_init_1(tb[map->police-1], rate_tlv, "police", + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); + if (NULL == act) + return err; + + act->type = 
TCA_OLD_COMPAT; + exts->action = act; + } else if (map->action && tb[map->action-1] && rate_tlv) { + act = tcf_action_init(tb[map->action-1], rate_tlv, NULL, + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); + if (NULL == act) + return err; + + exts->action = act; + } +#elif defined CONFIG_NET_CLS_POLICE + if (map->police && tb[map->police-1] && rate_tlv) { + struct tcf_police *p; + + p = tcf_police_locate(tb[map->police-1], rate_tlv); + if (NULL == p) + return -EINVAL; + + exts->police = p; + } else if (map->action && tb[map->action-1]) + return -EOPNOTSUPP; +#else + if ((map->action && tb[map->action-1]) || + (map->police && tb[map->police-1])) + return -EOPNOTSUPP; +#endif + + return 0; +} + +void +tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, + struct tcf_exts *src) +{ +#ifdef CONFIG_NET_CLS_ACT + if (src->action) { + if (dst->action) { + struct tc_action *act; + + tcf_tree_lock(tp); + act = xchg(&dst->action, src->action); + tcf_tree_unlock(tp); + + tcf_action_destroy(act, TCA_ACT_UNBIND); + } else + dst->action = src->action; + } +#elif defined CONFIG_NET_CLS_POLICE + if (src->police) { + if (dst->police) { + struct tcf_police *p; + + tcf_tree_lock(tp); + p = xchg(&dst->police, src->police); + tcf_tree_unlock(tp); + + tcf_police_release(p, TCA_ACT_UNBIND); + } else + dst->police = src->police; + } +#endif +} + +int +tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ +#ifdef CONFIG_NET_CLS_ACT + if (map->action && exts->action) { + /* + * again for backward compatible mode - we want + * to work with both old and new modes of entering + * tc data even if iproute2 was newer - jhs + */ + struct rtattr * p_rta = (struct rtattr*) skb->tail; + + if (exts->action->type != TCA_OLD_COMPAT) { + RTA_PUT(skb, map->action, 0, NULL); + if (tcf_action_dump(skb, exts->action, 0, 0) < 0) + goto rtattr_failure; + p_rta->rta_len = skb->tail - (u8*)p_rta; + } else if (map->police) { + RTA_PUT(skb, map->police, 0, NULL); + if 
(tcf_action_dump_old(skb, exts->action, 0, 0) < 0) + goto rtattr_failure; + p_rta->rta_len = skb->tail - (u8*)p_rta; + } + } +#elif defined CONFIG_NET_CLS_POLICE + if (map->police && exts->police) { + struct rtattr * p_rta = (struct rtattr*) skb->tail; + + RTA_PUT(skb, map->police, 0, NULL); + + if (tcf_police_dump(skb, exts->police) < 0) + goto rtattr_failure; + + p_rta->rta_len = skb->tail - (u8*)p_rta; + } +#endif + return 0; +rtattr_failure: __attribute__ ((unused)) + return -1; +} + +int +tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) + if (tcf_action_copy_stats(skb, exts->action) < 0) + goto rtattr_failure; +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) + if (tcf_police_dump_stats(skb, exts->police) < 0) + goto rtattr_failure; +#endif + return 0; +rtattr_failure: __attribute__ ((unused)) + return -1; +} static int __init tc_filter_init(void) { @@ -461,3 +617,8 @@ EXPORT_SYMBOL(register_tcf_proto_ops); EXPORT_SYMBOL(unregister_tcf_proto_ops); +EXPORT_SYMBOL(tcf_exts_validate); +EXPORT_SYMBOL(tcf_exts_destroy); +EXPORT_SYMBOL(tcf_exts_change); +EXPORT_SYMBOL(tcf_exts_dump); +EXPORT_SYMBOL(tcf_exts_dump_stats); From jbglaw@lug-owl.de Thu Dec 30 08:30:34 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 08:30:43 -0800 (PST) Received: from lug-owl.de (lug-owl.de [195.71.106.12]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUGUDEb020840 for ; Thu, 30 Dec 2004 08:30:34 -0800 Received: by lug-owl.de (Postfix, from userid 1001) id 47DF21901F6; Thu, 30 Dec 2004 17:36:17 +0100 (CET) Date: Thu, 30 Dec 2004 17:36:17 +0100 From: Jan-Benedict Glaw To: Dave Dillow Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [RFC 2.6.10 1/22] xfrm: Add direction information to xfrm_state Message-ID: <20041230163617.GB2460@lug-owl.de> Mail-Followup-To: Dave Dillow , netdev@oss.sgi.com, linux-kernel@vger.kernel.org References: 
<20041230035000.01@ori.thedillows.org> <20041230035000.10@ori.thedillows.org> <20041230094839.GX2460@lug-owl.de> <1104423409.23254.9.camel@dillow.idleaire.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5Wo9T3dGHPT+4OQz" Content-Disposition: inline In-Reply-To: <1104423409.23254.9.camel@dillow.idleaire.com> X-Operating-System: Linux mail 2.6.10-rc2-bk5lug-owl X-gpg-fingerprint: 250D 3BCF 7127 0D8C A444 A961 1DBD 5E75 8399 E1BB X-gpg-key: wwwkeys.de.pgp.net User-Agent: Mutt/1.5.6+20040907i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13252 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: jbglaw@lug-owl.de Precedence: bulk X-list: netdev --5Wo9T3dGHPT+4OQz Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, 2004-12-30 11:16:49 -0500, Dave Dillow wrote in message <1104423409.23254.9.camel@dillow.idleaire.com>: > On Thu, 2004-12-30 at 04:48, Jan-Benedict Glaw wrote: > > On Thu, 2004-12-30 03:48:34 -0500, David Dillow > > wrote in message <20041230035000.10@ori.thedillows.org>: > > > +enum { > > > + XFRM_STATE_DIR_UNKNOWN, > > > + XFRM_STATE_DIR_IN, > > > + XFRM_STATE_DIR_OUT, > > > +}; > > > > Any specific reason to first define such a nice enum and then using int > > in the struct? > > Just following the current style in net/xfrm.h, see xfrm_state.km.state > and XFRM_STATE_*. Hmmm... Maybe I'd prepare patches then :) MfG, JBG -- Jan-Benedict Glaw jbglaw@lug-owl.de . +49-172-7608481 _ O _ "Eine Freie Meinung in einem Freien Kopf | Gegen Zensur | Gegen Krieg _ _ O fuer einen Freien Staat voll Freier Bürger" | im Internet! | im Irak!
O O O ret = do_actions((curr | FREE_SPEECH) & ~(NEW_COPYRIGHT_LAW | DRM | TCPA)); --5Wo9T3dGHPT+4OQz Content-Type: application/pgp-signature; name="signature.asc" Content-Description: Digital signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.5 (GNU/Linux) iD8DBQFB1C6BHb1edYOZ4bsRAraeAJ4kdi0KY9MbAavAWo1ZpQ8F7bnn3wCbBvjS UD3Tby1J+v8hAYqL5uWkgrE= =Vrl2 -----END PGP SIGNATURE----- --5Wo9T3dGHPT+4OQz-- From tgraf@suug.ch Thu Dec 30 09:36:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 09:36:46 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUHaHNA024493 for ; Thu, 30 Dec 2004 09:36:37 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 90CACF; Thu, 30 Dec 2004 18:42:31 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id BECC51C0EA; Thu, 30 Dec 2004 18:43:13 +0100 (CET) Date: Thu, 30 Dec 2004 18:43:13 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
Message-ID: <20041230174313.GB32419@postel.suug.ch> References: <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104335620.1025.22.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13253 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104335620.1025.22.camel@jzny.localdomain> 2004-12-29 10:53 > > > Also in the case of u32 (to take this opportunity) i would like to stash > > > state inot a 16 bit ID to help in pretty printing the matches. > > > > Not sure what you mean. Which "state"? > > > > One example of what state you could store: > In the case where i enter something readable in english, the display > back is raw; > example: > match ip src 10.0.0.210/32 > gets displayed as: match 0a0000d2/ffffffff at 12 > And a lot of times its tricky to find exactly what "at 12" means. > > If i store some ID that would tell me "IP" when i dump then i can pretty > print it in english in user space using ip_print(). Understood, we could store a map in userspace mapping those IDs to pretty english match descriptions. I think avoiding to hardcode those ids but rather just hold it for userspace is the best thing. OTOH, if we give unique ids to matches we can use it instead of a separate ID. 
Unique here means that parent classid + filter handle + match handle must be unique per interface. Thoughts? > Sounds good to me since we have a new sel. > It may endup being tricky to be both fast and backward compat; but we'll > see what fun awaits when you start coding. I started developing concrete thoughts. The current u32 match could be made a generic match just like meta, which would give us a u32 w/o hashtables on an always-true classifier. The problem arises with exactly those hashtables. u32 uses the same selector for this and furthermore even defines stuff like hoff and hmask for this purpose. I have to read up again to understand the hashing in full detail, but this is the only issue I see that we might face. What I have in mind is something like u32 but much simpler, w/o the overhead of creating additional filters for hashing etc. Basically this can be the always-true classifier which just implements the generic matches tree. The existing u32, with the generic matches added, would remain for the cases where hashing is required. So far I plan to write these 2 additional generic matches. It's pretty simple because I can just take over the code from EGP. KMP: Knuth-Morris-Pratt text search algorithm. NByte: compares any number of bytes against a pattern, very useful for comparing IPv6 addresses instead of creating 4 ANDed u32 matches.
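The NByte idea above can be sketched in plain userspace C. This is only an illustration under stated assumptions: u32_key_sketch, u32_keys_match() and nbyte_match() are hypothetical names, not the kernel's u32 or ematch API. It contrasts four ANDed 32-bit keys with a single 16-byte comparison for an IPv6 address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical u32-style key: (32-bit word at off & mask) == val. */
struct u32_key_sketch {
	size_t off;	/* byte offset into the packet */
	uint32_t mask;	/* bits to compare, in packet byte order */
	uint32_t val;	/* expected value, in packet byte order */
};

/* Return 1 if every key matches (the keys are ANDed), 0 otherwise. */
static int u32_keys_match(const uint8_t *pkt, size_t pkt_len,
			  const struct u32_key_sketch *keys, size_t nkeys)
{
	size_t i;

	for (i = 0; i < nkeys; i++) {
		uint32_t word;

		/* Bounds check before touching packet data. */
		if (keys[i].off > pkt_len ||
		    pkt_len - keys[i].off < sizeof(word))
			return 0;
		/* memcpy avoids unaligned loads. */
		memcpy(&word, pkt + keys[i].off, sizeof(word));
		if ((word & keys[i].mask) != keys[i].val)
			return 0;
	}
	return 1;
}

/* Hypothetical NByte-style match: compare len bytes at off with pattern. */
static int nbyte_match(const uint8_t *pkt, size_t pkt_len, size_t off,
		       const uint8_t *pattern, size_t len)
{
	if (off > pkt_len || pkt_len - off < len)
		return 0;
	return memcmp(pkt + off, pattern, len) == 0;
}
```

With an IPv6 source address at offset 8 of the header, nbyte_match(pkt, len, 8, addr, 16) expresses in one match what u32_keys_match() needs four full-mask keys for.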
From ak@muc.de Thu Dec 30 09:53:59 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 09:54:06 -0800 (PST) Received: from one.firstfloor.org (one.firstfloor.org [213.235.205.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUHrcpU025717 for ; Thu, 30 Dec 2004 09:53:59 -0800 Received: by one.firstfloor.org (Postfix, from userid 502) id 0E543D033E; Thu, 30 Dec 2004 19:00:25 +0100 (CET) To: Alan Cox Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: PATCH: kmalloc packet slab References: <1104156983.20944.25.camel@localhost.localdomain> From: Andi Kleen Date: Thu, 30 Dec 2004 19:00:25 +0100 In-Reply-To: <1104156983.20944.25.camel@localhost.localdomain> (Alan Cox's message of "Mon, 27 Dec 2004 14:16:23 +0000") Message-ID: User-Agent: Gnus/5.110002 (No Gnus v0.2) Emacs/21.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13254 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: ak@muc.de Precedence: bulk X-list: netdev Alan Cox writes: > The networking world runs in 1514 byte packets pretty much all the time. > This adds a 1620 byte slab for such objects and is one of the internally > generated Red Hat patches we use on things like Fedora Core 3. Original: > Arjan van de Ven. Doesn't this clash a bit with your and Arjan's no-prisoners-taken quest to get rid of order>0 allocations (4K stacks)? I implemented this long ago (in 2.1; bonus points if you can still find the leftover hook), but then gave up on it. I realized that to use it you would need order>0 allocations. In a single 4K page only two 1.5K slabs fit, but two 2K slabs fit as well. And there is already a handy 2K slab that works perfectly well.
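The page arithmetic above can be checked with a small sketch. It assumes the slab's own bookkeeping and alignment overhead can be ignored, which is a simplification; objs_per_slab() and slab_waste() are hypothetical helpers, not kernel functions.

```c
#include <stddef.h>

/*
 * Number of obj_size-byte objects that fit into 2^order contiguous
 * pages. Slab management overhead and alignment are ignored here.
 */
static size_t objs_per_slab(size_t page_size, unsigned int order,
			    size_t obj_size)
{
	if (obj_size == 0)
		return 0;
	return (page_size << order) / obj_size;
}

/* Bytes left unused after packing as many objects as possible. */
static size_t slab_waste(size_t page_size, unsigned int order,
			 size_t obj_size)
{
	size_t total = page_size << order;

	return total - objs_per_slab(page_size, order, obj_size) * obj_size;
}
```

Under these assumptions a 4K order-0 page holds two objects of either 1620 or 2048 bytes, so the smaller size gains nothing there. An order-1 allocation (8K) holds five 1620-byte objects against four 2K ones, and a 16K page holds ten against eight.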
IMHO it is useless except for architectures with PAGE_SIZE>4K or if you fix the VM to handle order>0 allocations really well. If you want to add it for sparc64/ia64/alpha etc. I would do it with an ifdef at least. -Andi From bdschuym@pandora.be Thu Dec 30 10:44:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 10:44:14 -0800 (PST) Received: from poros.telenet-ops.be (poros.telenet-ops.be [195.130.132.44]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUIhl5i028506 for ; Thu, 30 Dec 2004 10:44:08 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by poros.telenet-ops.be (Postfix) with SMTP id 7DF743BC159; Thu, 30 Dec 2004 19:50:56 +0100 (MET) Received: from 192.168.0.138 (D5763CA9.kabel.telenet.be [213.118.60.169]) by poros.telenet-ops.be (Postfix) with ESMTP id 37E243BC185; Thu, 30 Dec 2004 19:50:56 +0100 (MET) Subject: [PATCH][BRIDGE-NF] Fix wrong use of skb->protocol From: Bart De Schuymer To: "David S. Miller" Cc: netdev@oss.sgi.com Content-Type: text/plain Date: Thu, 30 Dec 2004 19:55:14 +0100 Message-Id: <1104432914.15601.19.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13255 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev Hi Dave, ip_sabotage_out() needs to distinguish IPv4 and IPv6 traffic. It currently does that by looking at skb->protocol. However, for locally originated packets, skb->protocol is not initialized. The patch below instead looks at the version number of the packet's data, which should be 4 or 6. Thanks to Pasha (Crazy AMD K7 ) for his patience. 
Signed-off-by: Bart De Schuymer --- linux-2.6.10/net/bridge/br_netfilter.c.old 2004-12-30 15:34:11.000000000 +0100 +++ linux-2.6.10/net/bridge/br_netfilter.c 2004-12-30 19:13:31.000000000 +0100 @@ -845,19 +845,6 @@ static unsigned int ip_sabotage_out(unsi { struct sk_buff *skb = *pskb; -#ifdef CONFIG_SYSCTL - if (!skb->nf_bridge) { - struct vlan_ethhdr *hdr = vlan_eth_hdr(skb); - - if (skb->protocol == __constant_htons(ETH_P_IP) || - IS_VLAN_IP) { - if (!brnf_call_iptables) - return NF_ACCEPT; - } else if (!brnf_call_ip6tables) - return NF_ACCEPT; - } -#endif - if ((out->hard_start_xmit == br_dev_xmit && okfn != br_nf_forward_finish && okfn != br_nf_local_out_finish && @@ -869,8 +856,24 @@ static unsigned int ip_sabotage_out(unsi ) { struct nf_bridge_info *nf_bridge; - if (!skb->nf_bridge && !nf_bridge_alloc(skb)) - return NF_DROP; + if (!skb->nf_bridge) { +#ifdef CONFIG_SYSCTL + /* This code is executed while in the IP(v6) stack, + the version should be 4 or 6. We can't use + skb->protocol because that isn't set on + PF_INET(6)/LOCAL_OUT. 
*/ + struct iphdr *ip = skb->nh.iph; + + if (ip->version == 4 && !brnf_call_iptables) + return NF_ACCEPT; + else if (ip->version == 6 && !brnf_call_ip6tables) + return NF_ACCEPT; +#endif + if (hook == NF_IP_POST_ROUTING) + return NF_ACCEPT; + if (!nf_bridge_alloc(skb)) + return NF_DROP; + } nf_bridge = skb->nf_bridge; From akpm@osdl.org Thu Dec 30 13:19:31 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 13:19:39 -0800 (PST) Received: from mail.osdl.org (fw.osdl.org [65.172.181.6]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBULJABj005107 for ; Thu, 30 Dec 2004 13:19:31 -0800 Received: from bix (build.pdx.osdl.net [172.20.1.2]) by mail.osdl.org (8.11.6/8.11.6) with SMTP id iBULRA628849 for ; Thu, 30 Dec 2004 13:27:10 -0800 Date: Thu, 30 Dec 2004 13:27:07 -0800 From: Andrew Morton To: netdev@oss.sgi.com Subject: Fw: [Bugme-new] [Bug 3973] New: Panic when using Miredo IPv6 tunnel (TUN/TAP) Message-Id: <20041230132707.253d0a79.akpm@osdl.org> X-Mailer: Sylpheed version 0.9.7 (GTK+ 1.2.10; i386-redhat-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13256 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: akpm@osdl.org Precedence: bulk X-list: netdev Begin forwarded message: Date: Thu, 30 Dec 2004 12:47:13 -0800 From: bugme-daemon@osdl.org To: bugme-new@lists.osdl.org Subject: [Bugme-new] [Bug 3973] New: Panic when using Miredo IPv6 tunnel (TUN/TAP) http://bugme.osdl.org/show_bug.cgi?id=3973 Summary: Panic when using Miredo IPv6 tunnel (TUN/TAP) Kernel Version: 2.6.10 Status: NEW Severity: high Owner: yoshfuji@linux-ipv6.org Submitter: kernelbug.20.jj05@spamgourmet.com Distribution: Debian (testing) Hardware Environment: Pentium III (Running under VMWare) Software Environment: Clean 
install, miredo 0.3.2, GCC 3.3.4 (Debian 1:3.3.4-13) Kernel: Linus' 2.6.10, configuration from arch/i386/defconfig, with the following main changes: CONFIG_PCNET_32=y CONFIG_TUN=m CONFIG_IPV6=m CONFIG_MPENTIUM3=y Problem Description: Kernel panic after creating a tunnel using miredo and waiting some time. Steps to reproduce: Using miredo 0.3.2 (http://www.simphalempin.com/dev/miredo/) 1. ./miredo -f -v teredo.via.ecp.fr 2. Wait about 15 minutes. Crash log: Linux version 2.6.10s (root@debian) (gcc version 3.3.4 (Debian 1:3.3.4-13)) #1 SMP Tue Dec 28 23:04:53 CET 2004 [snip] Universal TUN/TAP device driver 1.5 (C)1999-2002 Maxim Krasnyansky Unable to handle kernel paging request at virtual address 61696265 printing eip: c018c7d5 *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP Modules linked in: tun ipv6 CPU: 0 EIP: 0060:[] Not tainted VLI EFLAGS: 00010246 (2.6.10s) EIP is at remove_proc_entry+0x25/0x150 eax: 00000000 ebx: c2c92ec0 ecx: ffffffff edx: c344fb00 esi: c3fe0400 edi: 61696265 ebp: c04c6fc4 esp: c04c6f38 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c04c6000 task=c03f5b40) Stack: 00000000 c03f5b40 c04c6f70 c0118443 61696265 c2c92ec0 c3fe0400 00000000 c489e094 61696265 c344fb00 c2c92ec0 c487bd95 c2c92ec0 c031dd70 c1084960 c347b9c0 00000000 c346c900 c031e0de c2c92ec0 c346c040 c346c900 c04f98c0 Call Trace: [] scheduler_tick+0x123/0x4c0 [] snmp6_unregister_dev+0x44/0x60 [ipv6] [] in6_dev_finish_destroy+0x35/0xd0 [ipv6] [] dst_run_gc+0x0/0x110 [] dst_destroy+0xae/0xd0 [] dst_run_gc+0xe2/0x110 [] dst_run_gc+0x0/0x110 [] dst_run_gc+0x0/0x110 [] run_timer_softirq+0xda/0x1a0 [] __do_softirq+0xba/0xd0 [] do_softirq+0x4a/0x60 ======================= [] irq_exit+0x39/0x40 [] apic_timer_interrupt+0x1c/0x24 [] default_idle+0x0/0x40 [] default_idle+0x2c/0x40 [] cpu_idle+0x42/0x60 [] start_kernel+0x14f/0x170 [] unknown_bootoption+0x0/0x1e0 Code: b4 26 00 00 00 00 57 56 53 83 ec 14 8b 54 24 28 8b 4c 24 24 85 d2 89 4c 24 10 0f 84 08 01 00 00 8b 7c 24 10 31 c0 b9 
ff ff ff ff ae f7 d1 49 8b 42 34 89 ce 8d 5a 34 85 c0 74 2a 8b 3b 89 34 <0>Kernel panic - not syncing: Fatal exception in interrupt ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is. From acme@conectiva.com.br Thu Dec 30 14:01:26 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 14:01:34 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUM15Pg007272 for ; Thu, 30 Dec 2004 14:01:26 -0800 Received: from [200.138.51.5] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1Ck8WO-0008Sh-00; Thu, 30 Dec 2004 20:11:52 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 72E6775334; Thu, 30 Dec 2004 20:09:12 -0200 (BRST) Message-ID: <41D47D34.1000708@conectiva.com.br> Date: Thu, 30 Dec 2004 20:12:04 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy Cc: hadi@cyberus.ca, Maillist netdev Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> In-Reply-To: <41D3F5EC.9050808@trash.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13257 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Patrick McHardy wrote: > jamal wrote: > >> Patrick, >> Thanks for this cleanup. >> >> Questions/comments: >> >> 1)compiler or style issue? 
>> >> You have a few of fixes from >> >> a) >> if (..){ >> single statement here; >> } >> >> to: >> if (..) >> single statement here; >> >> I always add an extra pair of brace >> for lazy reasons (in the back of my mind: incase i want to add another >> statement ;->). > > > Just cleanup, I prefer not to waste too many lines. Saving > space increases readability. Agreed, whenever I have the chance, I remove such bloat ;) - Arnaldo From buytenh@wantstofly.org Thu Dec 30 14:16:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 14:16:15 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUMFlrg008230 for ; Thu, 30 Dec 2004 14:16:08 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 46F1E2B0EE; Thu, 30 Dec 2004 23:24:15 +0100 (MET) Date: Thu, 30 Dec 2004 23:24:15 +0100 From: Lennert Buytenhek To: Bart De Schuymer Cc: "David S. Miller" , netdev@oss.sgi.com, snort2004@mail.ru Subject: Re: [PATCH][BRIDGE-NF] Fix wrong use of skb->protocol Message-ID: <20041230222415.GB19587@xi.wantstofly.org> References: <1104432914.15601.19.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104432914.15601.19.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/638/Tue Dec 21 14:41:34 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13258 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Thu, Dec 30, 2004 at 07:55:14PM +0100, Bart De Schuymer wrote: > ip_sabotage_out() needs to distinguish IPv4 and IPv6 traffic. It > currently does that by looking at skb->protocol. However, for locally > originated packets, skb->protocol is not initialized. 
> The patch below instead looks at the version number of the packet's > data, which should be 4 or 6. A while ago there were a number of problems with bridging CIPE ethernet devices, which turned out to be the bridge code not initialising skb->protocol for locally originated STP frames. At the time I was told that initialising skb->protocol for locally originated packets is required, so that is how I fixed it then. cheers, Lennert From bdschuym@pandora.be Thu Dec 30 14:58:21 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 14:58:29 -0800 (PST) Received: from adicia.telenet-ops.be (adicia.telenet-ops.be [195.130.132.56]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUMw0vH002223 for ; Thu, 30 Dec 2004 14:58:21 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by adicia.telenet-ops.be (Postfix) with SMTP id 28BF8441FE; Fri, 31 Dec 2004 00:06:35 +0100 (MET) Received: from 192.168.0.138 (D5763CA9.kabel.telenet.be [213.118.60.169]) by adicia.telenet-ops.be (Postfix) with ESMTP id C3F324408A; Fri, 31 Dec 2004 00:06:34 +0100 (MET) Subject: Re: [PATCH][BRIDGE-NF] Fix wrong use of skb->protocol From: Bart De Schuymer To: Lennert Buytenhek Cc: "David S. 
Miller" , netdev@oss.sgi.com, snort2004@mail.ru In-Reply-To: <20041230222415.GB19587@xi.wantstofly.org> References: <1104432914.15601.19.camel@localhost.localdomain> <20041230222415.GB19587@xi.wantstofly.org> Content-Type: text/plain Date: Fri, 31 Dec 2004 00:10:48 +0100 Message-Id: <1104448248.15601.55.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13259 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev On Thu, 30-12-2004 at 23:24 +0100, Lennert Buytenhek wrote: > On Thu, Dec 30, 2004 at 07:55:14PM +0100, Bart De Schuymer wrote: > > > ip_sabotage_out() needs to distinguish IPv4 and IPv6 traffic. It > > currently does that by looking at skb->protocol. However, for locally > > originated packets, skb->protocol is not initialized. > > The patch below instead looks at the version number of the packet's > > data, which should be 4 or 6. > > A while ago there were a number of problems with bridging CIPE ethernet > devices, which turned out to be the bridge code not initialising > skb->protocol for locally originated STP frames. > > At the time I was told that initialising skb->protocol for locally > originated packets is required, so that is how I fixed it then. Hi Lennert, skb->protocol is not set for locally generated packets when the packet is still in the IP stack. I don't know what happens with it after the IP stack is finished with the packet. The comment in skbuff.h says "packet protocol from driver", from which I tend to conclude that skb->protocol is only set by drivers when a packet enters the box.
Too bad stuff like this isn't clearly spelled out, the FIXME for the dst field has been sitting there for probably more than a year too. Anyway, it wouldn't hurt if the skb->protocol field always held the right value. cheers, Bart From romieu@fr.zoreil.com Thu Dec 30 15:29:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 15:29:52 -0800 (PST) Received: from fr.zoreil.com (electric-eye.fr.zoreil.com [213.41.134.224]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBUNTMF5003966 for ; Thu, 30 Dec 2004 15:29:43 -0800 Received: from electric-eye.fr.zoreil.com (localhost.localdomain [127.0.0.1]) by fr.zoreil.com (8.13.1/8.12.1) with ESMTP id iBUNYn5X007726; Fri, 31 Dec 2004 00:34:49 +0100 Received: (from romieu@localhost) by electric-eye.fr.zoreil.com (8.13.1/8.13.1/Submit) id iBUNYiOS007725; Fri, 31 Dec 2004 00:34:44 +0100 Date: Fri, 31 Dec 2004 00:34:43 +0100 From: Francois Romieu To: David Dillow Cc: netdev@oss.sgi.com, linux-kernel@vger.kernel.org Subject: Re: [RFC 2.6.10 5/22] xfrm: Attempt to offload bundled xfrm_states for outbound xfrms Message-ID: <20041230233443.GB5247@electric-eye.fr.zoreil.com> References: <20041230035000.13@ori.thedillows.org> <20041230035000.14@ori.thedillows.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230035000.14@ori.thedillows.org> User-Agent: Mutt/1.4.1i X-Organisation: Land of Sunshine Inc. X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13260 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: romieu@fr.zoreil.com Precedence: bulk X-list: netdev David Dillow : [...] 
> diff -Nru a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c > --- a/net/xfrm/xfrm_policy.c 2004-12-30 01:11:18 -05:00 > +++ b/net/xfrm/xfrm_policy.c 2004-12-30 01:11:18 -05:00 > @@ -705,6 +705,31 @@ > }; > } > > +static void xfrm_accel_bundle(struct dst_entry *dst) > +{ > + struct xfrm_bundle_list bundle, *xbl, *tmp; > + struct net_device *dev = dst->dev; > + INIT_LIST_HEAD(&bundle.node); > + > + if (dev && netif_running(dev) && (dev->features & NETIF_F_IPSEC)) { > + while (dst) { > + xbl = kmalloc(sizeof(*xbl), GFP_ATOMIC); > + if (!xbl) > + goto out; > + > + xbl->dst = dst; > + list_add_tail(&xbl->node, &bundle.node); > + dst = dst->child; > + } > + > + dev->xfrm_bundle_add(dev, &bundle); > + } > + > +out: > + list_for_each_entry_safe(xbl, tmp, &bundle.node, node) > + kfree(xbl); > +} If the driver knows the maximum depth that is allowed, why not have it allocate its own bundle-like struct once and for all during initialization? Instead of pushing the bundle list, the device's own xyz_xfrm_bundle_add walks dst and stores the entries into this circular list; entries get overwritten if the dst chain is longer, and when the end of dst is reached, the bundle-like list is walked in reverse order. It avoids a few failure points imho. -- Ueimor From kaber@trash.net Thu Dec 30 16:25:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 16:26:02 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV0PYAf009407 for ; Thu, 30 Dec 2004 16:25:55 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CkAkJ-0004w7-2D; Fri, 31 Dec 2004 01:34:23 +0100 Message-ID: <41D49E4B.2020007@trash.net> Date: Fri, 31 Dec 2004 01:33:15 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Bart De Schuymer CC: Lennert Buytenhek , "David S.
Miller" , netdev@oss.sgi.com, snort2004@mail.ru Subject: Re: [PATCH][BRIDGE-NF] Fix wrong use of skb->protocol References: <1104432914.15601.19.camel@localhost.localdomain> <20041230222415.GB19587@xi.wantstofly.org> <1104448248.15601.55.camel@localhost.localdomain> In-Reply-To: <1104448248.15601.55.camel@localhost.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13261 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Bart De Schuymer wrote: > Hi Lennert, > > skb->protocol is not set for locally generated packets when the packet > is still in the IP stack. I don't know what happens with it after the IP > stack is finished with the packet. It is set shortly before the packet leaves the IP stack, in ip_finish_output. This is after LOCAL_OUT, but before POST_ROUTING. So your fix looks fine. > The comment in skbuff.h says "packet protocol from driver", from which I > tend to conclude that skb->protocol is only set by drivers when a packet > enters the box. Too bad stuff like this isn't clearly spelled out, the > FIXME for the dst field has been sitting there for probably more than a > year too. Anyway, it wouldn't hurt if the skb->protocol field always > held the right value. The IP stack knows it's IP anyway :) After that it has to hold the right value. 
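The version check discussed in this thread can be sketched in userspace C. It is a rough illustration, assuming only that the first byte of the network header is available; l3_ip_version(), skip_bridge_nf() and brnf_sysctl_sketch are hypothetical stand-ins, not the kernel code from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: return the IP version
 * (upper 4 bits of the first header byte), or 0 if there is no data.
 * Both IPv4 and IPv6 keep the version in that nibble.
 */
static unsigned int l3_ip_version(const uint8_t *l3_hdr, size_t len)
{
	if (l3_hdr == NULL || len == 0)
		return 0;
	return l3_hdr[0] >> 4;
}

/* Hypothetical stand-ins for the brnf_call_iptables/ip6tables sysctls. */
struct brnf_sysctl_sketch {
	int call_iptables;
	int call_ip6tables;
};

/*
 * Return 1 if bridge-nf handling should be skipped for this packet,
 * choosing the sysctl by header version instead of skb->protocol.
 */
static int skip_bridge_nf(const uint8_t *l3_hdr, size_t len,
			  const struct brnf_sysctl_sketch *ctl)
{
	unsigned int version = l3_ip_version(l3_hdr, len);

	if (version == 4 && !ctl->call_iptables)
		return 1;
	if (version == 6 && !ctl->call_ip6tables)
		return 1;
	return 0;
}
```

Reading the version from the header works at LOCAL_OUT because the IP stack has already built the header there, even though skb->protocol is only filled in later.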
Regards Patrick From sri@us.ibm.com Thu Dec 30 16:45:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 16:46:03 -0800 (PST) Received: from e34.co.us.ibm.com (e34.co.us.ibm.com [32.97.110.132]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV0jTnE010404 for ; Thu, 30 Dec 2004 16:45:55 -0800 Received: from d03relay04.boulder.ibm.com (d03relay04.boulder.ibm.com [9.17.195.106]) by e34.co.us.ibm.com (8.12.10/8.12.9) with ESMTP id iBV0rw9g475010 for ; Thu, 30 Dec 2004 19:53:58 -0500 Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay04.boulder.ibm.com (8.12.10/NCO/VER6.6) with ESMTP id iBV0rwRs324802 for ; Thu, 30 Dec 2004 17:53:58 -0700 Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBV0rv6f025770 for ; Thu, 30 Dec 2004 17:53:58 -0700 Received: from w-sridhar.beaverton.ibm.com (w-sridhar.beaverton.ibm.com [9.47.18.19]) by d03av02.boulder.ibm.com (8.12.11/8.12.11) with ESMTP id iBV0rv6m025752; Thu, 30 Dec 2004 17:53:57 -0700 Date: Thu, 30 Dec 2004 16:53:56 -0800 (PST) From: Sridhar Samudrala X-X-Sender: sridhar@w-sridhar.beaverton.ibm.com To: davem@davemloft.net cc: netdev@oss.sgi.com, lksctp-developers@lists.sourceforge.net Subject: [BK PATCH] 2.6 and 2.4 SCTP updates Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13262 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: sri@us.ibm.com Precedence: bulk X-list: netdev Dave, Please do a bk pull http://linux-lksctp.bkbits.net/lksctp-2.5.work & bk pull http://linux-lksctp.bkbits.net/lksctp-2.4.work to get the following SCTP updates to 2.6 and 2.4. 
The 2.4 csets include a patch that gets rid of sk_xxx macros that were causing backporting issues for other networking components. Thanks Sridhar # ChangeSet # 2004/12/30 15:50:53-08:00 sri@us.ibm.com # [SCTP] Fix sctp_getladdrs() to return valid local addresses on an endpoint # that is bound to INADDR_ANY or inaddr6_any. # # Signed-off-by: Sridhar Samudrala # # net/sctp/socket.c # # ChangeSet # 2004/12/29 11:35:11-08:00 sri@us.ibm.com # [SCTP] Fix misc. issues in SCTP_PEER_ADDR_PARAMS set socket option. # # Signed-off-by: Sridhar Samudrala # # net/sctp/transport.c # net/sctp/socket.c # net/sctp/sm_sideeffect.c # net/sctp/associola.c # include/net/sctp/structs.h # # ChangeSet # 2004/12/28 17:04:30-08:00 sri@us.ibm.com # [SCTP] Fix bug in setting ephemeral port in the bind address. # # Signed-off-by: Sridhar Samudrala # # net/sctp/socket.c # # ChangeSet # 2004/12/28 16:22:41-08:00 sri@us.ibm.com # [SCTP] Clean up the T3_rtx timer when deleting a transport. # # Signed-off-by: Vladislav Yasevich # Signed-off-by: Sridhar Samudrala # # net/sctp/transport.c # # ChangeSet # 2004/12/28 16:03:30-08:00 sri@us.ibm.com # [SCTP] Implementation of SCTP Implementer's Guide Section 2.35. # This code checks that the verification tag, source port and # destination port in the SCTP header matches the information # contained in the state cookie. # # Signed-off-by: Vladislav Yasevich # Signed-off-by: Sridhar Samudrala # # net/sctp/sm_make_chunk.c # net/sctp/associola.c # include/net/sctp/structs.h # include/net/sctp/constants.h # # ChangeSet # 2004/12/28 15:47:51-08:00 sri@us.ibm.com # [SCTP] Validate and respond to invalid chunk/parameter lengths. 
# # Signed-off-by: Vladislav Yasevich # Signed-off-by: Sridhar Samudrala # # net/sctp/sm_statefuns.c # net/sctp/sm_make_chunk.c # net/sctp/inqueue.c # net/sctp/input.c # net/sctp/endpointola.c # net/sctp/associola.c # include/net/sctp/sm.h # include/net/sctp/sctp.h # include/linux/sctp.h # # ChangeSet # 2004/12/27 14:13:06-08:00 sri@us.ibm.com # [SCTP] Treat ICMP protocol unreachable errors from non-SCTP capable hosts as ABORTs. # # Signed-off-by: Jerome Forrissier # Signed-off-by: Sridhar Samudrala # # net/sctp/sm_statetable.c # net/sctp/sm_statefuns.c # net/sctp/ipv6.c # net/sctp/input.c # net/sctp/debug.c # include/net/sctp/sm.h # include/net/sctp/sctp.h # include/net/sctp/constants.h # # ChangeSet # 2004/12/27 10:51:01-08:00 sri@us.ibm.com # [SCTP] Code cleanup: remove unused code and make needlessly global code static # # Signed-off-by: Adrian Bunk # Signed-off-by: Sridhar Samudrala # # net/sctp/ulpqueue.c # net/sctp/ulpevent.c # net/sctp/tsnmap.c # net/sctp/transport.c # net/sctp/ssnmap.c # net/sctp/socket.c # net/sctp/sm_statetable.c # net/sctp/sm_statefuns.c # net/sctp/sm_sideeffect.c # net/sctp/sm_make_chunk.c # net/sctp/protocol.c # net/sctp/proc.c # net/sctp/outqueue.c # net/sctp/objcnt.c # net/sctp/ipv6.c # net/sctp/inqueue.c # net/sctp/input.c # net/sctp/endpointola.c # net/sctp/debug.c # net/sctp/command.c # net/sctp/chunk.c # net/sctp/bind_addr.c # net/sctp/associola.c # include/net/sctp/ulpqueue.h # include/net/sctp/ulpevent.h # include/net/sctp/tsnmap.h # include/net/sctp/structs.h # include/net/sctp/sm.h # include/net/sctp/sctp.h # include/net/sctp/constants.h # include/net/sctp/command.h # # ChangeSet # 2004/12/24 22:33:13-08:00 sri@us.ibm.com # [SCTP] Fix potential null pointer dereference in sctp_err_lookup(). 
# # Signed-off-by: Vladislav Yasevich # Signed-off-by: Sridhar Samudrala # # net/sctp/input.c # From kaber@trash.net Thu Dec 30 16:54:17 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 16:54:24 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV0ruOP011014 for ; Thu, 30 Dec 2004 16:54:17 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CkBBF-0004zD-PL; Fri, 31 Dec 2004 02:02:14 +0100 Message-ID: <41D4A4D2.4000109@trash.net> Date: Fri, 31 Dec 2004 02:01:06 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: "David S. Miller" , Jamal Hadi Salim , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> In-Reply-To: <20041230123023.GO32419@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13263 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > +int > +tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, > + struct rtattr *rate_tlv, struct tcf_exts *exts, > + struct tcf_ext_map *map) > +{ > + memset(exts, 0, sizeof(*exts)); > + > +#ifdef CONFIG_NET_CLS_ACT > + int err; > + struct tc_action *act; > + > + if (map->police && tb[map->police-1] && rate_tlv) { > + act = tcf_action_init_1(tb[map->police-1], rate_tlv, "police", > + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); > + if (NULL == act) Please use act == NULL > +void > +tcf_exts_change(struct tcf_proto 
*tp, struct tcf_exts *dst, > + struct tcf_exts *src) > +{ > +#ifdef CONFIG_NET_CLS_ACT > + if (src->action) { > + if (dst->action) { > + struct tc_action *act; > + > + tcf_tree_lock(tp); > + act = xchg(&dst->action, src->action); > + tcf_tree_unlock(tp); > + > + tcf_action_destroy(act, TCA_ACT_UNBIND); > + } else > + dst->action = src->action; This isn't right (its also wrong in the current code). If the CPU reorders stores and another CPU looks at dst->action at the wrong time it might see an inconsistent structure. So at least smp_wmb is required for the unlocked case, but I think you should just remove it completely. I also wonder if anyone actually knows why we need the xchg (here and in all the other places), it looks totally useless. Regards Patrick From acme@conectiva.com.br Thu Dec 30 17:53:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 17:53:50 -0800 (PST) Received: from orion.netbank.com.br (orion.netbank.com.br [200.203.199.90]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV1rGoe013579 for ; Thu, 30 Dec 2004 17:53:37 -0800 Received: from [200.138.51.5] (helo=oops.ghostprotocols.net) by orion.netbank.com.br with asmtp (Exim 3.33 #1) id 1CkC9I-0000bD-00; Fri, 31 Dec 2004 00:04:16 -0200 Received: from [192.168.1.6] (amd64.kerneljanitors.org [192.168.1.6]) by oops.ghostprotocols.net (Postfix) with ESMTP id 89296753A9; Fri, 31 Dec 2004 00:01:45 -0200 (BRST) Message-ID: <41D4B3AD.9080605@conectiva.com.br> Date: Fri, 31 Dec 2004 00:04:29 -0200 From: Arnaldo Carvalho de Melo Organization: Conectiva S.A. User-Agent: Mozilla Thunderbird 0.9 (X11/20041103) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Patrick McHardy Cc: Thomas Graf , "David S. 
Miller" , Jamal Hadi Salim , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <41D4A4D2.4000109@trash.net> In-Reply-To: <41D4A4D2.4000109@trash.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13264 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: acme@conectiva.com.br Precedence: bulk X-list: netdev Patrick McHardy wrote: > Thomas Graf wrote: > >> +int >> +tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, >> + struct rtattr *rate_tlv, struct tcf_exts *exts, >> + struct tcf_ext_map *map) >> +{ >> + memset(exts, 0, sizeof(*exts)); >> + >> +#ifdef CONFIG_NET_CLS_ACT >> + int err; >> + struct tc_action *act; >> + >> + if (map->police && tb[map->police-1] && rate_tlv) { >> + act = tcf_action_init_1(tb[map->police-1], rate_tlv, "police", >> + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); >> + if (NULL == act) > > > Please use act == NULL Agreed, but I understand the people who like this (ugly) style, it becomes less likely to become an "if (act = NULL)", but hey, compilers already helps us with a nice warning. 
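For anyone following the style point, here is a minimal user-space sketch of the trade-off. It is not taken from the patch: `struct action`, `lookup()` and `validate()` are hypothetical stand-ins for `struct tc_action`, `tcf_action_init_1()` and the validate path. With GCC, `-Wall` turns on `-Wparentheses`, which flags an accidental `if (act = NULL)`. With the constant on the left, the same typo does not compile at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tc_action; not the kernel type. */
struct action {
	int type;
};

/* Hypothetical stand-in for tcf_action_init_1(): returns NULL and sets
 * *err on failure, otherwise hands back the action it was given. */
static struct action *lookup(struct action *candidate, int *err)
{
	assert(err != NULL);

	if (candidate == NULL) {
		*err = -22;	/* value of -EINVAL on Linux */
		return NULL;
	}
	*err = 0;
	return candidate;
}

/* Variable-on-the-left style.  A typo such as
 *	if (act = NULL)
 * still compiles, but gcc -Wall reports
 * "suggest parentheses around assignment used as truth value". */
static int validate(struct action *candidate)
{
	int err;
	struct action *act = lookup(candidate, &err);

	if (act == NULL)
		return err;

	/* With the "NULL == act" spelling, the same typo ("NULL = act")
	 * is a hard compile error, because NULL is not an lvalue. */
	return 0;
}
```

Either spelling generates the same code, so the choice is purely about which typos get caught and how.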
- Arnaldo From dave@thedillows.org Thu Dec 30 19:22:56 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 19:23:03 -0800 (PST) Received: from smtp.knology.net (smtp.knology.net [24.214.63.101]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBV3MZro016281 for ; Thu, 30 Dec 2004 19:22:55 -0800 Received: (qmail 4542 invoked by uid 0); 31 Dec 2004 03:31:32 -0000 Received: from user-69-1-45-93.knology.net (HELO ori.thedillows.org) (69.1.45.93) by smtp6.knology.net with SMTP; 31 Dec 2004 03:31:32 -0000 Received: from ori.thedillows.org (localhost [127.0.0.1]) by ori.thedillows.org (8.13.1/8.13.1) with ESMTP id iBV3V8u9010615; Thu, 30 Dec 2004 22:31:08 -0500 Received: (from il1@localhost) by ori.thedillows.org (8.13.1/8.13.1/Submit) id iBV3V7K1010614; Thu, 30 Dec 2004 22:31:07 -0500 X-Authentication-Warning: ori.thedillows.org: il1 set sender to dave@thedillows.org using -f Subject: Re: [RFC 2.6.10 5/22] xfrm: Attempt to offload bundled xfrm_states for outbound xfrms From: David Dillow To: Francois Romieu Cc: Netdev , linux-kernel@vger.kernel.org In-Reply-To: <20041230233443.GB5247@electric-eye.fr.zoreil.com> References: <20041230035000.13@ori.thedillows.org> <20041230035000.14@ori.thedillows.org> <20041230233443.GB5247@electric-eye.fr.zoreil.com> Content-Type: text/plain Content-Transfer-Encoding: 7bit Date: Thu, 30 Dec 2004 22:31:07 -0500 Message-Id: <1104463867.10590.2.camel@ori.thedillows.org> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13265 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: dave@thedillows.org Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 00:34 +0100, Francois Romieu wrote: > David Dillow : > [...] 
> > diff -Nru a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c > > --- a/net/xfrm/xfrm_policy.c 2004-12-30 01:11:18 -05:00 > > +++ b/net/xfrm/xfrm_policy.c 2004-12-30 01:11:18 -05:00 > > @@ -705,6 +705,31 @@ > > }; > > } > > > > +static void xfrm_accel_bundle(struct dst_entry *dst) > > +{ > > + struct xfrm_bundle_list bundle, *xbl, *tmp; > > + struct net_device *dev = dst->dev; > > + INIT_LIST_HEAD(&bundle.node); > > + > > + if (dev && netif_running(dev) && (dev->features & NETIF_F_IPSEC)) { > > + while (dst) { > > + xbl = kmalloc(sizeof(*xbl), GFP_ATOMIC); > > + if (!xbl) > > + goto out; > > + > > + xbl->dst = dst; > > + list_add_tail(&xbl->node, &bundle.node); > > + dst = dst->child; > > + } > > + > > + dev->xfrm_bundle_add(dev, &bundle); > > + } > > + > > +out: > > + list_for_each_entry_safe(xbl, tmp, &bundle.node, node) > > + kfree(xbl); > > +} > > If the driver knows the max depth which is allowed, why not have it > allocate its own bundle-like struct during initialization one for once ? > Instead of pushing the bundle list, dst is walked by the code of > the device's own xyz_xfrm_bundle_add into the said circular list, > entries get overwriten if the dst chain is longer and when the end of > dst is reached, the bundle-like list is walked in reverse order. > It avoids a few failure points imho. Good idea! I'll see if I can't code it up. I definitely want to get rid of that GFP_ATOMIC allocation. 
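A rough sketch of what Francois's suggestion might look like, as a user-space model rather than driver code. Everything here is an assumption made for illustration: `struct dst_entry` is cut down to its `child` link, and `XFRM_RING_DEPTH`, `struct xfrm_ring`, `xfrm_ring_fill()` and `xfrm_ring_pop()` are invented names. The point is only that a ring sized once at init time removes the per-packet allocation and its failure path.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down model of the kernel's struct dst_entry: only the link used here. */
struct dst_entry {
	struct dst_entry *child;
};

#define XFRM_RING_DEPTH 4	/* hypothetical max depth the device supports */

/* Hypothetical per-device ring, set up once at driver initialisation. */
struct xfrm_ring {
	struct dst_entry *slot[XFRM_RING_DEPTH];
	unsigned int head;	/* next slot to write */
	unsigned int count;	/* valid entries, at most XFRM_RING_DEPTH */
};

/* Walk the dst chain into the ring.  Nothing is allocated, so there is
 * no failure path; if the chain is longer than the ring, the oldest
 * entries are overwritten. */
static void xfrm_ring_fill(struct xfrm_ring *ring, struct dst_entry *dst)
{
	assert(ring != NULL);
	ring->head = 0;
	ring->count = 0;

	for (; dst != NULL; dst = dst->child) {
		ring->slot[ring->head] = dst;
		ring->head = (ring->head + 1) % XFRM_RING_DEPTH;
		if (ring->count < XFRM_RING_DEPTH)
			ring->count++;
	}
}

/* Hand back entries in reverse order (end of the chain first).
 * Returns NULL once the ring is empty. */
static struct dst_entry *xfrm_ring_pop(struct xfrm_ring *ring)
{
	if (ring->count == 0)
		return NULL;

	ring->head = (ring->head + XFRM_RING_DEPTH - 1) % XFRM_RING_DEPTH;
	ring->count--;
	return ring->slot[ring->head];
}
```

In a real driver the ring would presumably need the same locking as whatever calls the device's bundle-add hook; that is left out of the sketch.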
-- David Dillow From hadi@cyberus.ca Thu Dec 30 20:28:50 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:28:56 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV4SUZn021785 for ; Thu, 30 Dec 2004 20:28:50 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkEXE-0007j0-T0 for netdev@oss.sgi.com; Thu, 30 Dec 2004 23:37:08 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkEX7-0004wL-7r; Thu, 30 Dec 2004 23:37:01 -0500 Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , Patrick McHardy , netdev@oss.sgi.com In-Reply-To: <20041230123023.GO32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104467816.1049.181.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 23:36:57 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13266 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 07:30, Thomas Graf wrote: > The extensions are executed by the classifier by calling tcf_exts_exec > which must be done as the last thing after making sure the > filter matches. Note: A classifier might take further actions after > the execution to tcf_exts_exec such as correcting its own cache to > avoid caching results which could have been influenced by the extensions. Which cache? 
> tcf_exts_exec returns a negative error code if the filter must be > considered unmatched, 0 on normal execution or a positive classifier > return code (TC_ACT_*) which must be returned to the underlying layer > as-is. Look at the TC_ACT_*; i think they should be sufficient. > > Signed-off-by: Thomas Graf > > --- linux-2.6.10-bk2.orig/include/net/pkt_cls.h 2004-12-30 01:22:01.000000000 +0100 > +++ linux-2.6.10-bk2/include/net/pkt_cls.h 2004-12-30 01:22:39.000000000 +0100 > +/** > + * tcf_exts_exec - execute tc filter extensions > + * @skb: socket buffer > + * @exts: tc filter extensions handle > + * @res: desired result > + * > + * Executes all configured extensions. Returns 0 on a normal execution, > + * a negative number if the filter must be considered unmatched or > + * a positive action code (TC_ACT_*) which must be returned to the > + * underlying layer. TCA_ACT_OK is 0. > tcf_change_act_police(struct tcf_proto *tp, struct tc_action **action, > --- linux-2.6.10-bk2.orig/net/sched/cls_api.c 2004-12-30 01:22:01.000000000 +0100 > +++ linux-2.6.10-bk2/net/sched/cls_api.c 2004-12-30 00:47:06.000000000 +0100 > @@ -439,6 +439,162 @@ > return skb->len; > } > + > +int > +tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, > + struct rtattr *rate_tlv, struct tcf_exts *exts, > + struct tcf_ext_map *map) > +{ > + memset(exts, 0, sizeof(*exts)); > + > +#ifdef CONFIG_NET_CLS_ACT > + int err; > + struct tc_action *act; > + > + if (map->police && tb[map->police-1] && rate_tlv) { > + act = tcf_action_init_1(tb[map->police-1], rate_tlv, "police", > + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); > + if (NULL == act) > + return err; > + > + act->type = TCA_OLD_COMPAT; > + exts->action = act; > + } else if (map->action && tb[map->action-1] && rate_tlv) { > + act = tcf_action_init(tb[map->action-1], rate_tlv, NULL, > + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); > + if (NULL == act) > + return err; > + > + exts->action = act; > + } > +#elif defined CONFIG_NET_CLS_POLICE > + if 
(map->police && tb[map->police-1] && rate_tlv) { > + struct tcf_police *p; > + > + p = tcf_police_locate(tb[map->police-1], rate_tlv); > + if (NULL == p) > + return -EINVAL; > + > + exts->police = p; > + } else if (map->action && tb[map->action-1]) > + return -EOPNOTSUPP; > +#else > + if ((map->action && tb[map->action-1]) || > + (map->police && tb[map->police-1])) > + return -EOPNOTSUPP; > +#endif > + > + return 0; > +} Maybe a few DPRINTKs for debugging purposes? > +void > +tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, > + struct tcf_exts *src) > +{ > +#ifdef CONFIG_NET_CLS_ACT > + if (src->action) { > + if (dst->action) { > + struct tc_action *act; > + > + tcf_tree_lock(tp); > + act = xchg(&dst->action, src->action); > + tcf_tree_unlock(tp); > + > + tcf_action_destroy(act, TCA_ACT_UNBIND); > + } else > + dst->action = src->action; > + } > +#elif defined CONFIG_NET_CLS_POLICE > + if (src->police) { > + if (dst->police) { > + struct tcf_police *p; > + > + tcf_tree_lock(tp); > + p = xchg(&dst->police, src->police); > + tcf_tree_unlock(tp); > + > + tcf_police_release(p, TCA_ACT_UNBIND); > + } else > + dst->police = src->police; > + } > +#endif Patrick pointed this in other email: the xchg is sort of defeated by the else. Perhaps make the second one xchg as well. BTW Thomas: Hopefully these patches match closely what was already in place? 
(sorry didnt cross reference) i.e if any optimizations we should probably bring next phase cheers, jamal From hadi@cyberus.ca Thu Dec 30 20:34:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:34:46 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV4YIpN022441 for ; Thu, 30 Dec 2004 20:34:38 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkEcq-0002cd-Jm for netdev@oss.sgi.com; Thu, 30 Dec 2004 23:42:56 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkEcj-0005UN-CZ; Thu, 30 Dec 2004 23:42:49 -0500 Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , Patrick McHardy , netdev@oss.sgi.com In-Reply-To: <20041230140929.GY32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <1104414713.1047.130.camel@jzny.localdomain> <20041230140929.GY32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104468164.1049.197.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 23:42:44 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13267 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 09:09, Thomas Graf wrote: > * jamal <1104414713.1047.130.camel@jzny.localdomain> 2004-12-30 08:51 > > In current code you can have CONFIG_NET_CLS_ACT and not use new > > style policer, rather use old one i.e CONFIG_NET_CLS_POLICE. 
You seem to > > indicate presence of CONFIG_NET_CLS_ACT implies absence of > > NET_CLS_POLICE. > > Is this wrong? Current code: (u32) > > 2004/06/15 hadi | #ifdef CONFIG_NET_CLS_ACT > 2004/06/15 hadi | struct tc_action *action; > 2004/06/15 hadi | #else > 2002/02/05 torvalds | #ifdef CONFIG_NET_CLS_POLICE > 2002/02/05 torvalds | struct tcf_police *police; > 2002/02/05 torvalds | #endif > 2004/06/15 hadi | #endif > > > config NET_CLS_POLICE > > ... > > depends on NET_CLS && NET_QOS && NET_ACT_POLICE!=y && > > NET_ACT_POLICE!=m > > Hmm... doesn't make too much sense for me. What's the advantage of > allowing this mix? Ok, send a patch for the Kconfig then;-> Make sure that CLS_POLICE cant be selected if CONFIG_NET_CLS_ACT is on. [Agreed that doing it this way would allow killing the policer sooner] cheers, jamal From hadi@cyberus.ca Thu Dec 30 20:35:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:35:22 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV4YqLc022492 for ; Thu, 30 Dec 2004 20:35:14 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkEdP-0002t2-3S for netdev@oss.sgi.com; Thu, 30 Dec 2004 23:43:31 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkEdJ-0005YE-Kn; Thu, 30 Dec 2004 23:43:25 -0500 Subject: Re: [PATCH 3/9] PKT_SCHED: u32: make use of tcf_exts API From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , Patrick McHardy , netdev@oss.sgi.com In-Reply-To: <20041230123108.GP32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123108.GP32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104468201.1047.201.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 23:43:21 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13268 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 07:31, Thomas Graf wrote: > > +static struct tcf_ext_map u32_ext_map = { > + .action = TCA_U32_ACT, > + .police = TCA_U32_POLICE > +}; Repeated on all classifiers - perhaps a little macro magic in order? ;-> Generally rest all looking good - I need to stare some more at the route and tcindex one but dont see any show stoppers off hand; cheers, jamal From hadi@cyberus.ca Thu Dec 30 20:37:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:37:36 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV4b8Sv023463 for ; Thu, 30 Dec 2004 20:37:29 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkEfY-0007Zb-PP for netdev@oss.sgi.com; Thu, 30 Dec 2004 23:45:44 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkEfU-0005j9-HV; Thu, 30 Dec 2004 23:45:40 -0500 Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Maillist netdev In-Reply-To: <41D4286B.1060106@trash.net> References: 
<41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> <41D4286B.1060106@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104468336.1048.205.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 23:45:36 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13269 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 11:10, Patrick McHardy wrote: > Thinking about it, I think its better to reorganize the > patches by subject. While doing this I'm going to add > back the useful printks in the init paths as DPRINTKs. > I'm going to post the entire reorganized batch next week > when I return home, working with bitkeeper on my crappy > notebook is just to painful :) [I dont use bitkeeper for religious reasons (i hope i am not offending anyone)]. Ok, sounds like a good plan (gives me more time to work with the driver stuff which is getting out of control ;->): I think the patches may have goten a little messy. Maybe the order should be: you, Thomas with what he has now then myself with eaction and Thomas with extended matches (I dont think the order of the last two matters). I will just code against whatever latest release is and migrate later when all the otehr patches are in. cheers, jamal From parklee_sel@yahoo.com Thu Dec 30 20:42:57 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:43:04 -0800 (PST) Received: from web51502.mail.yahoo.com (web51502.mail.yahoo.com [206.190.38.194]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBV4gaWG024101 for ; Thu, 30 Dec 2004 20:42:56 -0800 Received: (qmail 28171 invoked by uid 60001); 31 Dec 2004 04:51:06 -0000 Comment: DomainKeys? 
See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=hIPL7VoUJu+0kawYLCMvBTn8ehZRIpVqk4E2NJ2msDClK9FhN52z4QwcINKt2DMXhcSdvJUwuKdhFVMjURP2K0U1NPRKlst53Nq7JwwPm05854ezDN8Fvs0S9uBUZNj4Fzs23yESy8apQWHy2tsbx2yWPWmP+JPc1gJoxFGufU0= ; Message-ID: <20041231045106.28169.qmail@web51502.mail.yahoo.com> Received: from [221.15.137.76] by web51502.mail.yahoo.com via HTTP; Thu, 30 Dec 2004 20:51:06 PST Date: Thu, 30 Dec 2004 20:51:06 -0800 (PST) From: Park Lee Subject: Issue on packets sending through ip_route_output_key() to xfrm_lookup() in native IPsec To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13270 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: parklee_sel@yahoo.com Precedence: bulk X-list: netdev Hi, In Linux native IPsec, there is a function xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, struct sock *sk, int flags) (in /usr/src/linux-2.6.5-1.358/net/xfrm/xfrm_policy.c). Whenever a packet is sending, kernel will call xfrm_lookup() to finds/creates a bundle for it. xfrm_lookup() can be called by many functions. one of these functions is ip_route_output_key(). we can see its definition as follows: int ip_route_output_key(struct rtable **rp, struct flowi *flp) { int err; if ((err = __ip_route_output_key(rp, flp)) != 0) return err; return flp->proto ? xfrm_lookup((struct dst_entry**)rp, flp, NULL, 0) : 0; } As ip_route_output_key() calls xfrm_lookup() with the argument sk set to NULL, Does this means that the packets sending through ip_route_output_key() to xfrm_lookup() have no corresponding local socket with them (because their sk is NULL)? Are these packets all created by special kernel socket (i.e. icmp_socket and tcp_socket)? Thank you very much. 
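For reference, here is the quoted function laid out with stub callees so the control flow can be compiled and stepped through in user space. The stubs are assumptions made for the example and are not the kernel implementations: the cut-down `struct rtable` and `struct flowi`, the bodies of `__ip_route_output_key()` and `xfrm_lookup()`, and the `route_err`, `xfrm_lookup_calls` and `last_sk` variables are all invented. The sketch shows only what the code itself says: xfrm_lookup() is reached when the route lookup succeeds and flp->proto is non-zero, and on this path it always receives sk == NULL. It does not settle which sockets the packets originate from.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only what the example needs. */
struct dst_entry { int unused; };
struct rtable    { struct dst_entry dst; };	/* dst first, so the cast below is valid */
struct flowi     { int proto; };
struct sock;

static int route_err;			/* stub knob: result of the route lookup */
static int xfrm_lookup_calls;		/* how many times xfrm_lookup() ran */
static struct sock *last_sk;		/* sk argument seen by xfrm_lookup() */

/* Stub: pretend the routing lookup returns route_err. */
static int __ip_route_output_key(struct rtable **rp, struct flowi *flp)
{
	(void)rp;
	(void)flp;
	return route_err;
}

/* Stub: record the call instead of finding or creating a bundle. */
static int xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl,
		       struct sock *sk, int flags)
{
	(void)dst_p;
	(void)fl;
	(void)flags;
	xfrm_lookup_calls++;
	last_sk = sk;
	return 0;
}

/* Same logic as the function quoted above. */
static int ip_route_output_key(struct rtable **rp, struct flowi *flp)
{
	int err;

	assert(flp != NULL);
	if ((err = __ip_route_output_key(rp, flp)) != 0)
		return err;

	return flp->proto ? xfrm_lookup((struct dst_entry **)rp, flp, NULL, 0) : 0;
}
```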
===== Best Regards, Park Lee __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From hadi@cyberus.ca Thu Dec 30 20:50:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:50:30 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV4o3C8024822 for ; Thu, 30 Dec 2004 20:50:23 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkEs5-0003uM-Ou for netdev@oss.sgi.com; Thu, 30 Dec 2004 23:58:41 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkErz-000745-LC; Thu, 30 Dec 2004 23:58:35 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041230174313.GB32419@postel.suug.ch> References: <20041228221021.GF32419@postel.suug.ch> <1104275197.1100.276.camel@jzny.localdomain> <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104469111.1049.219.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 30 Dec 2004 23:58:31 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13271 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: 
netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 12:43, Thomas Graf wrote: > * jamal <1104335620.1025.22.camel@jzny.localdomain> 2004-12-29 10:53 > > If i store some ID that would tell me "IP" when i dump then i can pretty > > print it in english in user space using ip_print(). > > Understood, we could store a map in userspace mapping those IDs to > pretty english match descriptions. I think avoiding to hardcode those > ids but rather just hold it for userspace is the best thing. We may be in sync: I was thinking of just teaching tc to stash something there that it understands on how to translate. Thinking about it now, this may not be sufficient: perhaps we need a few bits in the selector to identify the owner who installed the rule to begin with. Then it would be safe to interpret the meaning of the ID (by the same app). Did you say there were some unused bits in the selector? > OTOH, if > we give unique ids to matches we can use it instead of a separate ID. Note: The above id i was talking about is "opaque" i.e the real meaning of it is only known to the user space app that installed it (unloess you want to define things in kernel headers) > Unique in terms of parent classid + filter handle + match handle must > be unique per interface. Thoughts? I think all you need really is to say "this match starts at IP" i.e such a definition is global. handles per rule already exist - and you can actually specify them when installing a rule. Are those insufficient? > > Sounds good to me since we have a new sel. > > It may endup being tricky to be both fast and backward compat; but we'll > > see what fun awaits when you start coding. > > I started developing concrete thoughts. The current u32 match could be > made a generic match just like meta which would give us a u32 w/o hashtables > on a always-true classifier. The problem arises with exactly those > hashtables. 
u32 uses the same selector for this and furthermore even defines > stuff like hoff and hmask for this purpose. I have to read up again to > understand the hashing in full detail but this is the only issue I see that > we might face. > Why not make the always-true to be an extended match? actually a u32 match of 0 0 is always true. Those hashes are quiet tricky/flexible; i would rather we clone u32 and call it something else then speacilize it. > What I have in mind is, something like u32 but much simpler, w/o the > overhead of creating additional filters for hashing etc. Basically > this can be the always-true classifier which just implements the > generic matches tree. And have the existing u32 with the generic > matches added when hashing is required. > Whip anothe classifier is my opinion. Or write extended matches. > I planned to write these 2 additional generic matches so far. It's > pretty simple because I can just take over the code from EGP. > > KMP: Knuth-Morris-Pratt text search algorithm > NByte: Compares any number of bytes against a pattern, very useful > for comparing IPv6 addresses instead of creating 4 ANDed > u32 matches. Both sound very appealing. You plan to do them as extended matches, correct? KMP can be used for something like virus scanning? does it maintain state? cheers, jamal From parklee_sel@yahoo.com Thu Dec 30 20:59:00 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:59:07 -0800 (PST) Received: from web51507.mail.yahoo.com (web51507.mail.yahoo.com [206.190.38.199]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBV4wdcc025581 for ; Thu, 30 Dec 2004 20:59:00 -0800 Received: (qmail 11917 invoked by uid 60001); 31 Dec 2004 05:07:09 -0000 Comment: DomainKeys? 
See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=b6hc6PeWilX1ZXzoDq8LV284CTl08UzQWVBhLIm9wvwIcv74gOzOHyJehHXzq+TjJsQKXp0fZnB5XSA5eCwEKdvsdQ7pGfAVYHclYnIytqbBTjm4wrjECwjUkopjFNDbf0GO7xg2YuKIqX1uZkuZAAbyXQEhep87G9CjSNESP9k= ; Message-ID: <20041231050709.11915.qmail@web51507.mail.yahoo.com> Received: from [221.15.137.76] by web51507.mail.yahoo.com via HTTP; Thu, 30 Dec 2004 21:07:09 PST Date: Thu, 30 Dec 2004 21:07:09 -0800 (PST) From: Park Lee Subject: Issue on packets sending through ip_route_output_key() to xfrm_lookup() in native IPsec To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13272 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: parklee_sel@yahoo.com Precedence: bulk X-list: netdev Hi, In Linux native IPsec, there is a function xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, struct sock *sk, int flags) (in /usr/src/linux-2.6.5-1.358/net/xfrm/xfrm_policy.c). Whenever a packet is sending, kernel will call xfrm_lookup() to finds/creates a bundle for it. xfrm_lookup() can be called by many functions. one of these functions is ip_route_output_key(). we can see its definition as follows: int ip_route_output_key(struct rtable **rp, struct flowi *flp) { int err; if ((err = __ip_route_output_key(rp, flp)) != 0) return err; return flp->proto ? xfrm_lookup((struct dst_entry**)rp, flp, NULL, 0) : 0; } As ip_route_output_key() calls xfrm_lookup() with the argument sk set to NULL, Does this means that the packets sending through ip_route_output_key() to xfrm_lookup() have no corresponding local socket with them (because their sk is NULL)? Are these packets all created by special kernel socket (i.e. icmp_socket and tcp_socket)? Thank you very much. 
===== Best Regards, Park Lee __________________________________ Do you Yahoo!? Dress up your holiday email, Hollywood style. Learn more. http://celebrity.mail.yahoo.com From hadi@cyberus.ca Thu Dec 30 20:59:39 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 20:59:46 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV4xJAc025658 for ; Thu, 30 Dec 2004 20:59:39 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkF0v-0002OD-93 for netdev@oss.sgi.com; Fri, 31 Dec 2004 00:07:49 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkF0t-0008Lm-F1; Fri, 31 Dec 2004 00:07:47 -0500 Subject: Re: ing_filter debug messages From: jamal Reply-To: hadi@cyberus.ca To: Wichert Akkerman Cc: netdev@oss.sgi.com, tgraf@suug.ch In-Reply-To: <20041230160643.GD24603@wiggy.net> References: <20041230160643.GD24603@wiggy.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104469666.1049.231.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 00:07:46 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13273 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev The message is useful but debug (mostly). What's your ifconfig look like? Have some tunnels in there maybe? What's your netfilter setup? cheers, jamal On Thu, 2004-12-30 at 11:06, Wichert Akkerman wrote: > After upgrading a machine to (unpatched mainline) 2.6.10 my kernel log > is filled with ing_filter (debug?) 
messages: > > Dec 30 16:24:58 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 16:24:58 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 16:37:08 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 16:37:08 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 16:49:08 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 16:49:08 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 17:01:08 thunder kernel: ing_filter: fixed eth1 out eth1 > Dec 30 17:01:08 thunder kernel: ing_filter: fixed eth1 out eth1 > > the messages always come in pairs. eth1 is the externel interface which > has a standard wondershaper configuration attached to it. Relevant bits > of .config are below. > > Wichert. > > > # > # QoS and/or fair queueing > # > CONFIG_NET_SCHED=y > CONFIG_NET_SCH_CLK_JIFFIES=y > # CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set > # CONFIG_NET_SCH_CLK_CPU is not set > CONFIG_NET_SCH_CBQ=y > CONFIG_NET_SCH_HTB=y > # CONFIG_NET_SCH_HFSC is not set > # CONFIG_NET_SCH_PRIO is not set > # CONFIG_NET_SCH_RED is not set > CONFIG_NET_SCH_SFQ=y > # CONFIG_NET_SCH_TEQL is not set > CONFIG_NET_SCH_TBF=y > # CONFIG_NET_SCH_GRED is not set > # CONFIG_NET_SCH_DSMARK is not set > # CONFIG_NET_SCH_NETEM is not set > CONFIG_NET_SCH_INGRESS=y > CONFIG_NET_QOS=y > CONFIG_NET_ESTIMATOR=y > CONFIG_NET_CLS=y > CONFIG_NET_CLS_TCINDEX=y > CONFIG_NET_CLS_ROUTE4=y > CONFIG_NET_CLS_ROUTE=y > CONFIG_NET_CLS_FW=y > CONFIG_NET_CLS_U32=y > # CONFIG_CLS_U32_PERF is not set > # CONFIG_NET_CLS_IND is not set > CONFIG_NET_CLS_RSVP=y > CONFIG_NET_CLS_RSVP6=y > CONFIG_NET_CLS_ACT=y > CONFIG_NET_ACT_POLICE=y > CONFIG_NET_ACT_GACT=y > CONFIG_GACT_PROB=y > CONFIG_NET_ACT_MIRRED=y > CONFIG_NET_ACT_IPT=y > CONFIG_NET_ACT_PEDIT=y > From hadi@cyberus.ca Thu Dec 30 21:03:37 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 21:03:44 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id 
iBV53Htm026624 for ; Thu, 30 Dec 2004 21:03:37 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkF4l-00051n-M0 for netdev@oss.sgi.com; Fri, 31 Dec 2004 00:11:47 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkExv-0007pd-Dt; Fri, 31 Dec 2004 00:04:43 -0500 Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API From: jamal Reply-To: hadi@cyberus.ca To: Arnaldo Carvalho de Melo Cc: Patrick McHardy , Thomas Graf , "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <41D4B3AD.9080605@conectiva.com.br> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <41D4A4D2.4000109@trash.net> <41D4B3AD.9080605@conectiva.com.br> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104469479.1047.227.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 00:04:39 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13274 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 21:04, Arnaldo Carvalho de Melo wrote: > > > > Please use act == NULL > > Agreed, but I understand the people who like this (ugly) style, it becomes > less likely to become an "if (act = NULL)", but hey, compilers already > helps us with a nice warning. There are certain people who would kill you for doing the above ;-> Or maybe theyll just take away your coffee. I think its a taste issue. Code looks fine either way. 
cheers, jamal From parklee_sel@yahoo.com Thu Dec 30 21:14:11 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 21:14:17 -0800 (PST) Received: from web51507.mail.yahoo.com (web51507.mail.yahoo.com [206.190.38.199]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBV5DpUW027795 for ; Thu, 30 Dec 2004 21:14:11 -0800 Received: (qmail 15580 invoked by uid 60001); 31 Dec 2004 05:22:21 -0000 Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=rN0CgE3gI0+JrTIE7SHX+O9KBCWBUTbzjsGQJOtfSH1VKCefzcc6AziWfjaPoFE33N/CPfo5ZMthneEcnkxMvIVXAOn5nXHhrnjYPgOH6TiRaM8WxICNNtszTIPUCCQZtcS0vRtGKdLmhnKeZ6uiQJnDbe64bl9GiybjHeJI1Vo= ; Message-ID: <20041231052221.15578.qmail@web51507.mail.yahoo.com> Received: from [221.15.137.76] by web51507.mail.yahoo.com via HTTP; Thu, 30 Dec 2004 21:22:21 PST Date: Thu, 30 Dec 2004 21:22:21 -0800 (PST) From: Park Lee Subject: A sending packet and its socket and sock To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13275 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: parklee_sel@yahoo.com Precedence: bulk X-list: netdev Hi, In Linux IP network, when a packet is going to be sent out. Would you please tell me which one of the following items is right for the packet (Let's suppose struct sock *sk; struct socket *skt): 1), The packet has a sock (i.e. sk!=NULL), but doesn't has a socket (i.e. skt=NULL) corresponding to the packet? 2), The packet has a socket (i.e. skt!=NULL), but doesn't has a sock (i.e. sk=NULL) corresponding to the packet? 3), The packet doesn't has the both (i.e. sk=NULL and skt=NULL)? And, If some of them are right, Would you please also give me some examples of them? Thank you very much. 
===== Best Regards, Park Lee __________________________________ Do you Yahoo!? Yahoo! Mail - Easier than ever with enhanced search. Learn more. http://info.mail.yahoo.com/mail_250 From hadi@cyberus.ca Thu Dec 30 21:57:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Thu, 30 Dec 2004 21:57:57 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV5vRpa029990 for ; Thu, 30 Dec 2004 21:57:49 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkFvC-0001Ze-A0 for netdev@oss.sgi.com; Fri, 31 Dec 2004 01:05:58 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkEw2-0007bw-Dq; Fri, 31 Dec 2004 00:02:46 -0500 Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: Thomas Graf , "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <41D4A4D2.4000109@trash.net> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <41D4A4D2.4000109@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104469362.1049.224.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 00:02:42 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13276 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Thu, 2004-12-30 at 20:01, Patrick McHardy wrote: > > +#ifdef CONFIG_NET_CLS_ACT > > + if (src->action) { > > + if (dst->action) { > > + struct tc_action *act; > > + > > + tcf_tree_lock(tp); > > + act = xchg(&dst->action, src->action); > > + tcf_tree_unlock(tp); > 
> + > > + tcf_action_destroy(act, TCA_ACT_UNBIND); > > + } else > > + dst->action = src->action; > > This isn't right (its also wrong in the current code). If the > CPU reorders stores and another CPU looks at dst->action at > the wrong time it might see an inconsistent structure. I think an xchg around the else should fix this. > So at > least smp_wmb is required for the unlocked case, or maybe this. > but I think > you should just remove it completely. I also wonder if anyone > actually knows why we need the xchg (here and in all the other > places), it looks totally useless. All these were put in by Alexey and the LinuxWay(tm) took effect. an xchg puts almost a lock and ensures an atomic swap. I dont see any harm in leaving it as is - just needs fixing the else cheers, jamal From buytenh@wantstofly.org Fri Dec 31 00:25:38 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 00:25:50 -0800 (PST) Received: from xi.wantstofly.org (alephnull.demon.nl [212.238.201.82]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV8PHrx005414 for ; Fri, 31 Dec 2004 00:25:38 -0800 Received: by xi.wantstofly.org (Postfix, from userid 500) id 5D6322B0EE; Fri, 31 Dec 2004 09:33:52 +0100 (MET) Date: Fri, 31 Dec 2004 09:33:52 +0100 From: Lennert Buytenhek To: Bart De Schuymer Cc: "David S. 
Miller" , netdev@oss.sgi.com, snort2004@mail.ru Subject: Re: [PATCH][BRIDGE-NF] Fix wrong use of skb->protocol Message-ID: <20041231083352.GA25031@xi.wantstofly.org> References: <1104432914.15601.19.camel@localhost.localdomain> <20041230222415.GB19587@xi.wantstofly.org> <1104448248.15601.55.camel@localhost.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104448248.15601.55.camel@localhost.localdomain> User-Agent: Mutt/1.4.1i X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13277 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: buytenh@wantstofly.org Precedence: bulk X-list: netdev On Fri, Dec 31, 2004 at 12:10:48AM +0100, Bart De Schuymer wrote: > > A while ago there were a number of problems with bridging CIPE ethernet > > devices, which turned out to be the bridge code not initialising > > skb->protocol for locally originated STP frames. > > > > At the time I was told that initialising skb->protocol for locally > > originated packets is required, so that is how I fixed it then. > > Hi Lennert, Hello, > skb->protocol is not set for locally generated packets when the packet > is still in the IP stack. I don't know what happens with it after the IP > stack is finished with the packet. > The comment in skbuff.h says "packet protocol from driver", from which I > tend to conclude that skb->protocol is only set by drivers when a packet > enters the box. > Too bad stuff like this isn't clearly spelled out, This is what I thought back then too. Indeed, it's rather misleading. > the FIXME for the dst field has been sitting there for probably more > than a year too. Yes :( Just one more thing: AFAIK it is possible to inject a raw IPv4 packet with an invalid IPv4 header. 
So maybe the better 'fix' would be to have different hooks for PF_INET and PF_INET6, and distinguish v4/v6 packets that way instead of peeking into the header. (The hook you're talking about is a PF_INET* and not a PF_BRIDGE hook, right?) Then again, that would add yet another function onto the already rather deep call chains that we have in there. Too bad I don't see any cleaner way of integrating the whole bridging thing into the stack. I wonder if any of the *BSDs found a cleaner way of doing this. --L From wichert@levante.wiggy.net Fri Dec 31 01:30:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 01:30:24 -0800 (PST) Received: from mx1.wiggy.net (Debian-exim@levante.wiggy.net [195.85.225.139]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV9TsMx007360 for ; Fri, 31 Dec 2004 01:30:15 -0800 Received: from wichert by mx1.wiggy.net with local (Exim 4.34) id 1CkJEp-0000Y7-Er; Fri, 31 Dec 2004 10:38:27 +0100 Date: Fri, 31 Dec 2004 10:38:27 +0100 From: Wichert Akkerman To: jamal Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: Re: ing_filter debug messages Message-ID: <20041231093827.GG24603@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104469666.1049.231.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040907i X-SA-Exim-Connect-IP: X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13278 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wichert@wiggy.net Precedence: bulk X-list: netdev Previously jamal wrote: > The emssage is useful but debug (mostly). > Whats your ifconfig look like? Have some tunnels in there maybe? three ethernet interfaces and a ipv6/ip tunnel. 
Here is the ifconfig output: eth0 Link encap:Ethernet HWaddr 00:50:04:0B:DD:79 inet addr:192.168.10.1 Bcast:192.168.10.255 Mask:255.255.255.0 inet6 addr: fe80::250:4ff:fe0b:dd79/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:673540 errors:0 dropped:0 overruns:0 frame:0 TX packets:668357 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:229549898 (218.9 MiB) TX bytes:465650601 (444.0 MiB) Interrupt:11 Base address:0xec00 eth1 Link encap:Ethernet HWaddr 00:90:27:BE:60:55 inet addr:194.109.254.66 Bcast:255.255.255.255 Mask:255.255.255.0 inet6 addr: fe80::290:27ff:febe:6055/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1276385 errors:0 dropped:0 overruns:0 frame:0 TX packets:1172898 errors:0 dropped:0 overruns:0 carrier:0 collisions:1128 txqueuelen:1000 RX bytes:1332651102 (1.2 GiB) TX bytes:293617934 (280.0 MiB) eth2 Link encap:Ethernet HWaddr 00:90:27:BE:4B:EC inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0 inet6 addr: 2001:888:101d::1/64 Scope:Global inet6 addr: fe80::290:27ff:febe:4bec/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:713928 errors:0 dropped:0 overruns:0 frame:0 TX packets:752929 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:111560554 (106.3 MiB) TX bytes:860040830 (820.1 MiB) lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:33261 errors:0 dropped:0 overruns:0 frame:0 TX packets:33261 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:3401641 (3.2 MiB) TX bytes:3401641 (3.2 MiB) xs4all Link encap:IPv6-in-IPv4 inet6 addr: 2001:888:10:1d::2/64 Scope:Global inet6 addr: fe80::c26d:fe42/64 Scope:Link inet6 addr: fe80::c0a8:a01/64 Scope:Link inet6 addr: fe80::c0a8:102/64 Scope:Link inet6 addr: fe80::c0a8:101/64 Scope:Link UP POINTOPOINT RUNNING NOARP MTU:1480 Metric:1 RX packets:128 
errors:0 dropped:0 overruns:0 frame:0 TX packets:116 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:95545 (93.3 KiB) TX bytes:19613 (19.1 KiB) there is also a secondary address on eth2 which ifconfig does not show. > Whats your netfilter setup? One chain attached to the FORWARD chain and two to the OUTPUT chain, all pretty simple. Wichert. -- Wichert Akkerman It is simple to make things. http://www.wiggy.net/ It is hard to make things simple. From kaber@trash.net Fri Dec 31 01:46:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 01:46:15 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV9jlYY008228 for ; Fri, 31 Dec 2004 01:46:07 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CkJTw-0005XW-3Y; Fri, 31 Dec 2004 10:54:04 +0100 Message-ID: <41D52176.80703@trash.net> Date: Fri, 31 Dec 2004 10:52:54 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Thomas Graf , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <41D4A4D2.4000109@trash.net> <1104469362.1049.224.camel@jzny.localdomain> In-Reply-To: <1104469362.1049.224.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13279 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > On Thu, 2004-12-30 at 20:01, Patrick McHardy wrote: > >>This isn't right (its also wrong in the current code). If the >>CPU reorders stores and another CPU looks at dst->action at >>the wrong time it might see an inconsistent structure. > > > I think an xchg around the else should fix this. Yes. >>I also wonder if anyone >>actually knows why we need the xchg (here and in all the other >>places), it looks totally useless. > > All these were put in by Alexey and the LinuxWay(tm) took effect. > an xchg puts almost a lock and ensures an atomic swap. I dont see any > harm in leaving it as is - just needs fixing the else No real harm, but it still should be removed IMO, or used _instead_ of tcf_tree_lock in this place. I've asked myself multiple times what it is meant for and I've seen others do the same; this alone justifies removing it. Another reason is what you call LinuxWay(tm): strange things spread on their own and at some point you have to touch lots of files to get rid of them. So it's best to do it as early as possible. 
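For readers following the xchg question, here is a minimal user-space sketch of the idea being discussed: one atomic exchange publishes the new pointer and hands back the old one, so the "dst->action already set" and "dst->action was NULL" branches collapse into the same operation. It uses C11 atomic_exchange() as a stand-in for the kernel's xchg(); struct action, struct filter and filter_swap_action() are hypothetical names for illustration, not the tc code from the patch.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Stand-in for struct tc_action; purely illustrative. */
struct action {
	int id;
};

/* Stand-in for the filter's extension state holding the action pointer. */
struct filter {
	_Atomic(struct action *) action;
};

/*
 * Atomically install newact and return whatever was installed before
 * (possibly NULL). The caller is then free to destroy the old action,
 * because no later reader can obtain it through f->action.
 */
static struct action *filter_swap_action(struct filter *f,
					 struct action *newact)
{
	return atomic_exchange(&f->action, newact);
}
```

Whether the exchange should replace tcf_tree_lock or stay alongside it is exactly what is being debated above; the sketch only shows that the swap itself needs no separate else branch.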
Regards Patrick From kaber@trash.net Fri Dec 31 01:47:28 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 01:47:35 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBV9l9pk008558 for ; Fri, 31 Dec 2004 01:47:27 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CkJVn-0005Xc-S9; Fri, 31 Dec 2004 10:56:00 +0100 Message-ID: <41D521EA.2090603@trash.net> Date: Fri, 31 Dec 2004 10:54:50 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: Maillist netdev Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> <41D4286B.1060106@trash.net> <1104468336.1048.205.camel@jzny.localdomain> In-Reply-To: <1104468336.1048.205.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13280 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > Ok, sounds like a good plan (gives me more time to work with the driver > stuff which is getting out of control ;->): I think the patches may have > gotten a little messy. Maybe the order should be: you, Thomas with what > he has now then myself with eaction and Thomas with extended matches (I > don't think the order of the last two matters). I will just code against > whatever latest release is and migrate later when all the other patches > are in. Sounds good to me. 
Regards Patrick From emann@mrv.com Fri Dec 31 02:27:49 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 02:27:58 -0800 (PST) Received: from apollo.nbase.co.il ([194.90.137.2]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVARRCw010223 for ; Fri, 31 Dec 2004 02:27:48 -0800 Received: from [194.90.139.39] by apollo.nbase.co.il (Post.Office MTA v3.1.2 release (PO205-101c) ID# 0-44418U200L2S100) with ESMTP id AAA965; Fri, 31 Dec 2004 12:37:09 +0200 Message-ID: <41D52B72.3050803@mrv.com> Date: Fri, 31 Dec 2004 12:35:30 +0200 From: emann@mrv.com (Eran Mann) User-Agent: Mozilla Thunderbird 1.0 (X11/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: netdev@oss.sgi.com CC: jgrazik@pobox.com, Ganesh Venkatesan Subject: [patch] e100 MDI/MDIX bug(?) Content-Type: multipart/mixed; boundary="------------060600060702060607080608" X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13281 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: emann@mrv.com Precedence: bulk X-list: netdev This is a multi-part message in MIME format. --------------060600060702060607080608 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit e100_phy_init() contains the following code to conditionally enable MDI/MDI-X Autodetection on newer e100 chips: if((nic->mac >= mac_82550_D102) || ((nic->flags & ich) && (mdio_read(netdev, nic->mii.phy_id, MII_TPISTATUS) & 0x8000) && (nic->eeprom[eeprom_cnfg_mdix] & eeprom_mdix_enabled))) /* enable/disable MDI/MDI-X auto-switching */ mdio_write(netdev, nic->mii.phy_id, MII_NCONFIG, nic->mii.force_media ? 0 : NCONFIG_AUTO_SWITCH); This code seems to include 4 problems: 1. e100_eeprom_load was called after e100_phy_init... 2. If the chip revision is >= mac_82550_D102 it enables the feature unconditionally, disregarding the eeprom. 3. 
According to Intel's errata, MDI/MDI-X Autodetection should never be enabled on 82551ER or 82551QM chips (see http://developer.intel.com/design/network/specupdt/82551ER_si.pdf). 4. If the eeprom disables the feature, it should be actively disabled, and not left as default. suggested patch against 2.6.10 attached. An alternative would be to forcibly disable the feature for all chips with (nic->mac >= mac_82550_D102), as I'm not sure if there are chips for which it IS supported. Signed-off-by: Eran Mann -- Eran Mann MRV International www.mrv.com --------------060600060702060607080608 Content-Type: text/plain; name="e100-mdix-2.6.patch" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="e100-mdix-2.6.patch" --- linux-2.6.x/drivers/net/e100.c 2004-12-31 11:23:51.326587040 +0200 +++ linux-2.6.x-fixed/drivers/net/e100.c 2004-12-31 11:30:26.059578504 +0200 @@ -1074,11 +1074,19 @@ } if((nic->mac >= mac_82550_D102) || ((nic->flags & ich) && - (mdio_read(netdev, nic->mii.phy_id, MII_TPISTATUS) & 0x8000) && - (nic->eeprom[eeprom_cnfg_mdix] & eeprom_mdix_enabled))) - /* enable/disable MDI/MDI-X auto-switching */ - mdio_write(netdev, nic->mii.phy_id, MII_NCONFIG, - nic->mii.force_media ? 0 : NCONFIG_AUTO_SWITCH); + (mdio_read(netdev, nic->mii.phy_id, MII_TPISTATUS) + & 0x8000))) { + /* enable/disable MDI/MDI-X auto-switching. 
+ According to errata MDI/MDI-X auto-switching should + be disabled for 82551ER/QM chips */ + if ((nic->mac == mac_82551_E) || (nic->mac == mac_82551_F) || + (nic->mac == mac_82551_10) || (nic->mii.force_media) || + !(nic->eeprom[eeprom_cnfg_mdix] & eeprom_mdix_enabled)) + mdio_write(netdev, nic->mii.phy_id, MII_NCONFIG, 0); + else + mdio_write(netdev, nic->mii.phy_id, MII_NCONFIG, + NCONFIG_AUTO_SWITCH); + } return 0; } @@ -2248,11 +2256,11 @@ goto err_out_iounmap; } - e100_phy_init(nic); - if((err = e100_eeprom_load(nic))) goto err_out_free; + e100_phy_init(nic); + memcpy(netdev->dev_addr, nic->eeprom, ETH_ALEN); if(!is_valid_ether_addr(netdev->dev_addr)) { DPRINTK(PROBE, ERR, "Invalid MAC address from " --------------060600060702060607080608-- From bdschuym@pandora.be Fri Dec 31 02:38:27 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 02:38:34 -0800 (PST) Received: from asia.telenet-ops.be (asia.telenet-ops.be [195.130.132.59]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVAc6Vd010890 for ; Fri, 31 Dec 2004 02:38:27 -0800 Received: from localhost (localhost.localdomain [127.0.0.1]) by asia.telenet-ops.be (Postfix) with SMTP id 2CDF7224073; Fri, 31 Dec 2004 11:46:41 +0100 (MET) Received: from 192.168.0.138 (D5763CA9.kabel.telenet.be [213.118.60.169]) by asia.telenet-ops.be (Postfix) with ESMTP id BF49C224070; Fri, 31 Dec 2004 11:46:40 +0100 (MET) Subject: Re: [PATCH][BRIDGE-NF] Fix wrong use of skb->protocol From: Bart De Schuymer To: Lennert Buytenhek Cc: "David S. 
Miller" , netdev@oss.sgi.com, snort2004@mail.ru In-Reply-To: <20041231083352.GA25031@xi.wantstofly.org> References: <1104432914.15601.19.camel@localhost.localdomain> <20041230222415.GB19587@xi.wantstofly.org> <1104448248.15601.55.camel@localhost.localdomain> <20041231083352.GA25031@xi.wantstofly.org> Content-Type: text/plain Date: Fri, 31 Dec 2004 11:51:00 +0100 Message-Id: <1104490260.3373.15.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13282 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: bdschuym@pandora.be Precedence: bulk X-list: netdev Op vr, 31-12-2004 te 09:33 +0100, schreef Lennert Buytenhek: > Just one more thing: AFAIK it is possible to inject a raw IPv4 packet > with an invalid IPv4 header. So maybe the better 'fix' would be to have > different hooks for PF_INET and PF_INET6, and distinguish v4/v6 packets > that way instead of peeking into the header. (The hook you're talking > about is a PF_INET* and not a PF_BRIDGE hook, right?) That was my original plan, but it seems such a waste. Wouldn't injecting such an invalid IPv4 header also screw up iptables? Is there any reason why someone should be allowed to do this? I checked ip_tables.c::ipt_do_table() before using the IP version, and it looks at the IP header too without any precautions AFAICT. > Then again, that would add yet another function onto the already rather > deep call chains that we have in there. The netfilter scheme itself implies call chains. > Too bad I don't see any cleaner way of integrating the whole bridging > thing into the stack. I wonder if any of the *BSDs found a cleaner way > of doing this. 
How about adding something like NF_STOP, which acts like NF_STOLEN but still executes okfn in nf_hook_slow()? cheers, Bart From tgraf@suug.ch Fri Dec 31 03:00:03 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 03:00:12 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVAxgsZ013677 for ; Fri, 31 Dec 2004 03:00:03 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id EA382F; Fri, 31 Dec 2004 12:07:53 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 5843E1C0EA; Fri, 31 Dec 2004 12:08:36 +0100 (CET) Date: Fri, 31 Dec 2004 12:08:36 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. Message-ID: <20041231110836.GD32419@postel.suug.ch> References: <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104469111.1049.219.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13283 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104469111.1049.219.camel@jzny.localdomain> 2004-12-30 23:58 > On 
Thu, 2004-12-30 at 12:43, Thomas Graf wrote:
> > * jamal <1104335620.1025.22.camel@jzny.localdomain> 2004-12-29 10:53
> > > If i store some ID that would tell me "IP" when i dump then i can pretty
> > > print it in english in user space using ip_print().
> >
> > Understood, we could store a map in userspace mapping those IDs to
> > pretty english match descriptions. I think avoiding to hardcode those
> > ids but rather just hold it for userspace is the best thing.
>
> We may be in sync:
> I was thinking of just teaching tc to stash something there that it
> understands on how to translate. Thinking about it now, this may not
> be sufficient: perhaps we need a few bits in the selector to identify
> the owner who installed the rule to begin with. Then it would be safe to
> interpret the meaning of the ID (by the same app). Did you say there
> were some unused bits in the selector?

Right, but why not do this in userspace by having a global map somewhere
in a file? A u32 config could have been modified by multiple pids and it
would get really messy to store a pid for every possible changeable item.

> I think all you need really is to say "this match starts at IP" i.e such
> a definition is global.
> handles per rule already exist - and you can actually specify them when
> installing a rule. Are those insufficient?

Those are absolutely sufficient. I was thinking of giving a match a 16bit
ID which can be used for both, identifying and mapping, i.e:

	__u8  kind;    /* match type, for lookup in matchers table */
	__u8  flags;   /* Invert Flag + Relations */
	__u16 handle;  /* must be unique per selector, may be autogenerated */

I want to have those matches be as small as possible, so no nested TLVs
but rather this u32 + matcher specific data form a TLV together. A selector
consists of a TLV array of such matches. The first TLV, type=1 becomes a
header with the possibility to transfer classifier specific options (such
as hash table configuration for u32).
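To make the layout above concrete, a rough sketch of how the three fields pack into 32 bits and sit directly in front of the matcher-specific data. Only kind, flags and handle come from the mail; the struct name em_hdr, the flag bit values and the em_put() helper are assumptions made up for illustration, and the outer TLV length/type framing is left out.

```c
#include <stdint.h>
#include <string.h>

/*
 * Per-match header as proposed in the mail: 8 + 8 + 16 bits.
 * The struct name is hypothetical.
 */
struct em_hdr {
	uint8_t  kind;    /* match type, for lookup in matchers table */
	uint8_t  flags;   /* invert flag + relations */
	uint16_t handle;  /* unique per selector, may be autogenerated */
};

/* Hypothetical bit assignments; the mail does not define them. */
#define EM_F_INVERT  0x01
#define EM_F_REL_AND 0x02
#define EM_F_REL_OR  0x04

/*
 * Copy the header followed by the matcher-specific data into buf as
 * one contiguous record (the payload of a single TLV).
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t em_put(uint8_t *buf, size_t buflen, const struct em_hdr *hdr,
		     const void *data, size_t datalen)
{
	if (buflen < sizeof(*hdr) + datalen)
		return 0;
	memcpy(buf, hdr, sizeof(*hdr));
	if (datalen)
		memcpy(buf + sizeof(*hdr), data, datalen);
	return sizeof(*hdr) + datalen;
}
```

Keeping the header at a fixed 4 bytes means a selector can be walked as a flat array of records without any nested attribute parsing, which seems to be the point of avoiding nested TLVs.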
> Why not make the always-true to be an extended match? actually a u32
> match of 0 0 is always true. Those hashes are quite tricky/flexible;
> i would rather we clone u32 and call it something else then specialize
> it.

Agreed, I don't want to change u32, but I want to introduce ematches in u32 as well so we can benefit from the hashing. For those who don't need hashing, u32 is already bloat, so we can do a simple always-true classifier which does nothing more than evaluate the ematches. I want to have the u32 match be an ematch as well, so the always-true classifier would become a u32 alternative but without the hashing overhead.

> Both sound very appealing. You plan to do them as extended matches,
> correct?

Exactly.

> KMP can be used for something like virus scanning? does it
> maintain state?

It requires the following parameters:
 - start offset
 - end offset
 - pattern
 - prefix table

and then will simply start at `start` and scan until `end`, looking for the pattern with the help of the prefix table.
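For illustration, a userspace sketch of a KMP scan taking the four parameters listed above. The function names `kmp_build_prefix` and `kmp_search` are hypothetical and this is not the proposed kernel code; it only shows the textbook algorithm over a flat buffer.

```c
#include <stddef.h>

/* Build the prefix table: prefix[i] is the length of the longest proper
 * prefix of pat[0..i] that is also a suffix of pat[0..i]. */
static void kmp_build_prefix(const unsigned char *pat, size_t plen,
			     size_t *prefix)
{
	size_t k = 0;

	if (plen == 0)
		return;
	prefix[0] = 0;
	for (size_t i = 1; i < plen; i++) {
		while (k > 0 && pat[i] != pat[k])
			k = prefix[k - 1];
		if (pat[i] == pat[k])
			k++;
		prefix[i] = k;
	}
}

/* Scan data[start..end) for pat. Returns the offset of the first match,
 * or -1 if the pattern does not occur in that range. */
static long kmp_search(const unsigned char *data, size_t start, size_t end,
		       const unsigned char *pat, size_t plen,
		       const size_t *prefix)
{
	size_t k = 0;	/* number of pattern bytes matched so far */

	if (plen == 0)
		return -1;
	for (size_t i = start; i < end; i++) {
		while (k > 0 && data[i] != pat[k])
			k = prefix[k - 1];	/* fall back, never re-read data */
		if (data[i] == pat[k])
			k++;
		if (k == plen)
			return (long)(i + 1 - plen);
	}
	return -1;
}
```

In this sketch the only state is the local counter `k`, which lives for the duration of one scan; nothing is carried over between calls.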
Again, I'm not sure what you mean by state ;-> From hadi@cyberus.ca Fri Dec 31 03:03:43 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 03:03:49 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVB3N3F014234 for ; Fri, 31 Dec 2004 03:03:43 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CkKhG-0001YZ-Rd for netdev@oss.sgi.com; Fri, 31 Dec 2004 06:11:54 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkKhE-0004gG-TD; Fri, 31 Dec 2004 06:11:53 -0500 Subject: Re: ing_filter debug messages From: jamal Reply-To: hadi@cyberus.ca To: Wichert Akkerman Cc: netdev@oss.sgi.com, tgraf@suug.ch In-Reply-To: <20041231093827.GG24603@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> Content-Type: multipart/mixed; boundary="=-x0GzCbij3ytyMxBNNDtZ" Organization: jamalopolous Message-Id: <1104491510.1047.234.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 06:11:51 -0500 X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13284 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-x0GzCbij3ytyMxBNNDtZ Content-Type: text/plain Content-Transfer-Encoding: 7bit The sit tunnel is on top of eth1? Does attached patch fix it? cheers, jamal On Fri, 2004-12-31 at 04:38, Wichert Akkerman wrote: > Previously jamal wrote: > > The emssage is useful but debug (mostly). > > Whats your ifconfig look like? Have some tunnels in there maybe? > > three ethernet interfaces and a ipv6/ip tunnel. 
Here is the ifconfig > output: > > eth0 Link encap:Ethernet HWaddr 00:50:04:0B:DD:79 > inet addr:192.168.10.1 Bcast:192.168.10.255 Mask:255.255.255.0 > inet6 addr: fe80::250:4ff:fe0b:dd79/64 Scope:Link > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:673540 errors:0 dropped:0 overruns:0 frame:0 > TX packets:668357 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:1000 > RX bytes:229549898 (218.9 MiB) TX bytes:465650601 (444.0 MiB) > Interrupt:11 Base address:0xec00 > > eth1 Link encap:Ethernet HWaddr 00:90:27:BE:60:55 > inet addr:194.109.254.66 Bcast:255.255.255.255 Mask:255.255.255.0 > inet6 addr: fe80::290:27ff:febe:6055/64 Scope:Link > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:1276385 errors:0 dropped:0 overruns:0 frame:0 > TX packets:1172898 errors:0 dropped:0 overruns:0 carrier:0 > collisions:1128 txqueuelen:1000 > RX bytes:1332651102 (1.2 GiB) TX bytes:293617934 (280.0 MiB) > > eth2 Link encap:Ethernet HWaddr 00:90:27:BE:4B:EC > inet addr:192.168.1.1 Bcast:192.168.1.255 Mask:255.255.255.0 > inet6 addr: 2001:888:101d::1/64 Scope:Global > inet6 addr: fe80::290:27ff:febe:4bec/64 Scope:Link > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:713928 errors:0 dropped:0 overruns:0 frame:0 > TX packets:752929 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:1000 > RX bytes:111560554 (106.3 MiB) TX bytes:860040830 (820.1 MiB) > > lo Link encap:Local Loopback > inet addr:127.0.0.1 Mask:255.0.0.0 > inet6 addr: ::1/128 Scope:Host > UP LOOPBACK RUNNING MTU:16436 Metric:1 > RX packets:33261 errors:0 dropped:0 overruns:0 frame:0 > TX packets:33261 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:0 > RX bytes:3401641 (3.2 MiB) TX bytes:3401641 (3.2 MiB) > > xs4all Link encap:IPv6-in-IPv4 > inet6 addr: 2001:888:10:1d::2/64 Scope:Global > inet6 addr: fe80::c26d:fe42/64 Scope:Link > inet6 addr: fe80::c0a8:a01/64 Scope:Link > inet6 addr: fe80::c0a8:102/64 Scope:Link > inet6 addr: 
fe80::c0a8:101/64 Scope:Link > UP POINTOPOINT RUNNING NOARP MTU:1480 Metric:1 > RX packets:128 errors:0 dropped:0 overruns:0 frame:0 > TX packets:116 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:0 > RX bytes:95545 (93.3 KiB) TX bytes:19613 (19.1 KiB) > > there is also a secondary address on eth2 which ifconfig does not show. > > > Whats your netfilter setup? > > One chain attached to the FORWARD chain and two to the OUTPUT chain, > all pretty simple. > > Wichert. --=-x0GzCbij3ytyMxBNNDtZ Content-Disposition: attachment; filename=sit-p Content-Type: text/plain; name=sit-p; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- a/net/ipv6/sit.c 2004/12/31 11:03:32 1.1 +++ b/net/ipv6/sit.c 2004/12/31 11:06:50 @@ -385,7 +385,7 @@ skb->pkt_type = PACKET_HOST; tunnel->stat.rx_packets++; tunnel->stat.rx_bytes += skb->len; - skb->dev = tunnel->dev; + skb->input_dev = skb->dev = tunnel->dev; dst_release(skb->dst); skb->dst = NULL; nf_reset(skb); --=-x0GzCbij3ytyMxBNNDtZ-- From wichert@levante.wiggy.net Fri Dec 31 03:06:54 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 03:07:01 -0800 (PST) Received: from mx1.wiggy.net (Debian-exim@levante.wiggy.net [195.85.225.139]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVB6Xv4014803 for ; Fri, 31 Dec 2004 03:06:53 -0800 Received: from wichert by mx1.wiggy.net with local (Exim 4.34) id 1CkKkL-0001C3-Uc; Fri, 31 Dec 2004 12:15:06 +0100 Date: Fri, 31 Dec 2004 12:15:05 +0100 From: Wichert Akkerman To: jamal Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: Re: ing_filter debug messages Message-ID: <20041231111505.GJ24603@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104491510.1047.234.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040907i X-SA-Exim-Connect-IP: 
X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13285 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wichert@wiggy.net Precedence: bulk X-list: netdev Previously jamal wrote: > The sit tunnel is on top of eth1? Yes. > Does attached patch fix it? I'll give it a spin later today, have to figure out what to make for dinner tonight and do the associated shopping first :) Wichert. -- Wichert Akkerman It is simple to make things. http://www.wiggy.net/ It is hard to make things simple. From tgraf@suug.ch Fri Dec 31 03:10:19 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 03:10:27 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVB9wU7015339 for ; Fri, 31 Dec 2004 03:10:18 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 14864F; Fri, 31 Dec 2004 12:18:11 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 887A31C0EA; Fri, 31 Dec 2004 12:18:53 +0100 (CET) Date: Fri, 31 Dec 2004 12:18:53 +0100 From: Thomas Graf To: Patrick McHardy Cc: hadi@cyberus.ca, "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041231111853.GE32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <41D4A4D2.4000109@trash.net> <1104469362.1049.224.camel@jzny.localdomain> <41D52176.80703@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41D52176.80703@trash.net> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13286 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41D52176.80703@trash.net> 2004-12-31 10:52 > jamal wrote: > >On Thu, 2004-12-30 at 20:01, Patrick McHardy wrote: > > > >>This isn't right (its also wrong in the current code). If the > >>CPU reorders stores and another CPU looks at dst->action at > >>the wrong time it might see an inconsistent structure. > > > > > >I think an xchg around the else should fix this. Agreed, we do have a problem when changing a existing filter and adding a action/police. Thanks Patrick. > >>I also wonder if anyone > >>actually knows why we need the xchg (here and in all the other > >>places), it looks totally useless. > > > >All these were put in by Alexey and the LinuxWay(tm) took effect. > >an xchg puts almost a lock and ensures an atomic swap. I dont see any > >harm in leaving it as is - just needs fixing the else > > No real harm, but it still should be removed IMO, or used _instead_ > of tcf_tree_lock in this place. I've asked myself multiple times what > it is meant for and I've seen others do the same, this alone justifies > removing it. Another reason is what you call LinuxWay(tm), strange > things spread on their own and at some time you have to touch lots of > files to get rid them. 
So its best to do it as early as possible. There are many of those spread all around. Alexey used them right and now some have been surrounded by locks. I wanted to fix this since ages but didn't get around yet. I'll remove the tcf lock in my patch and change the other occurances later. From tgraf@suug.ch Fri Dec 31 03:17:30 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 03:17:37 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVBH9Je016294 for ; Fri, 31 Dec 2004 03:17:30 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 2DE56F; Fri, 31 Dec 2004 12:25:22 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 3ABC81C0EA; Fri, 31 Dec 2004 12:26:05 +0100 (CET) Date: Fri, 31 Dec 2004 12:26:05 +0100 From: Thomas Graf To: Patrick McHardy Cc: hadi@cyberus.ca, Maillist netdev Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes Message-ID: <20041231112605.GF32419@postel.suug.ch> References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> <41D4286B.1060106@trash.net> <1104468336.1048.205.camel@jzny.localdomain> <41D521EA.2090603@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41D521EA.2090603@trash.net> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13287 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41D521EA.2090603@trash.net> 2004-12-31 10:54 > jamal wrote: > > >Ok, sounds like a good plan (gives me more time to work with the 
driver
> >stuff which is getting out of control ;->): I think the patches may have
> >gotten a little messy. Maybe the order should be: you, Thomas with what
> >he has now then myself with eaction and Thomas with extended matches (I
> >dont think the order of the last two matters). I will just code against
> >whatever latest release is and migrate later when all the other patches
> >are in.
>
> Sounds good to me.

Fine with me. Jamal, we might want to share some code for the logic relations, such as the macros to access the bits and the check for whether the current result already decides the whole expression.

From tgraf@suug.ch Fri Dec 31 03:55:10 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 03:55:17 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVBsnUi017677 for ; Fri, 31 Dec 2004 03:55:10 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4AD4AF; Fri, 31 Dec 2004 13:03:02 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id B470A1C0EA; Fri, 31 Dec 2004 13:03:44 +0100 (CET) Date: Fri, 31 Dec 2004 13:03:44 +0100 From: Thomas Graf To: jamal Cc: "David S. 
Miller" , Patrick McHardy , netdev@oss.sgi.com Subject: Re: [PATCH 3/9] PKT_SCHED: u32: make use of tcf_exts API Message-ID: <20041231120344.GH32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123108.GP32419@postel.suug.ch> <1104468201.1047.201.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104468201.1047.201.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13288 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104468201.1047.201.camel@jzny.localdomain> 2004-12-30 23:43 > On Thu, 2004-12-30 at 07:31, Thomas Graf wrote: > > > > > +static struct tcf_ext_map u32_ext_map = { > > + .action = TCA_U32_ACT, > > + .police = TCA_U32_POLICE > > +}; > > Repeated on all classifiers - perhaps a little macro magic in order? ;-> I thought about this but couldn't come up with one that makes it actually easier. I don't want to hardcode action/police in a macro because we might have extensions only needed for specific classifiers. > Generally rest all looking good - I need to stare some more at the route > and tcindex one but dont see any show stoppers off hand; Yes please, those are the critical points. 
From tgraf@suug.ch Fri Dec 31 05:02:04 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 05:02:13 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVD1iGc023356 for ; Fri, 31 Dec 2004 05:02:04 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 3741A85; Fri, 31 Dec 2004 14:09:56 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 7B6A51C0EA; Fri, 31 Dec 2004 14:10:39 +0100 (CET) Date: Fri, 31 Dec 2004 14:10:39 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , Patrick McHardy , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041231131039.GI32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <1104467816.1049.181.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104467816.1049.181.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13289 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104467816.1049.181.camel@jzny.localdomain> 2004-12-30 23:36 > On Thu, 2004-12-30 at 07:30, Thomas Graf wrote: > > > The extensions are executed by the classifier by calling tcf_exts_exec > > which must be done as the last thing after making sure the > > filter matches. Note: A classifier might take further actions after > > the execution to tcf_exts_exec such as correcting its own cache to > > avoid caching results which could have been influenced by the extensions. 
>
> Which cache?

The route classifier maintains a fastmap to cache results. It may only make use of the cache if no extension is involved in the matching; instead of disabling caching completely, it only caches results where no extensions are involved.

> > tcf_exts_exec returns a negative error code if the filter must be
> > considered unmatched, 0 on normal execution or a positive classifier
> > return code (TC_ACT_*) which must be returned to the underlying layer
> > as-is.
>
> Look at the TC_ACT_*; i think they should be sufficient.

I know them, I should have written TC_ACT_OK instead of 0. We mean the same thing. I defined it a little more generically to allow further extensions to make use of it.

> Maybe a few DPRINTKs for debugging purposes?

Yes, it might be a good idea to introduce an overall debugging level config option to put in all the generic debug messages such as the ing_filter indev correction printk.

> Patrick pointed this in other email: the xchg is sort of defeated by the
> else. Perhaps make the second one xchg as well.

Agreed, I changed it to:

	struct tc_action *act;
	act = xchg(&dst->action, src->action);
	if (act)
		tcf_action_destroy(act, TCA_ACT_UNBIND);

Am I missing something?

I'll resend patch 2 with the cosmetic and locking fixes and patch 9 to fix Kconfig police dependencies.

> BTW Thomas: Hopefully these patches match closely what was already in
> place? (sorry didnt cross reference)

They match as closely as possible, I even inherited the bugs as you can see ;->
The major change is that the actions/police get initialized before changing the classifier itself and might eventually be destroyed again if any change to the classifier fails.

> i.e if any optimizations we should probably bring next phase

No optimizations; there are a few cosmetic fixes, such as not using __u* in kernel-only space, and I inlined the route hashing functions.
From wichert@levante.wiggy.net Fri Dec 31 05:07:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 05:07:48 -0800 (PST) Received: from mx1.wiggy.net (Debian-exim@levante.wiggy.net [195.85.225.139]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVD7KlW023938 for ; Fri, 31 Dec 2004 05:07:41 -0800 Received: from wichert by mx1.wiggy.net with local (Exim 4.34) id 1CkMdG-0001xc-55; Fri, 31 Dec 2004 14:15:54 +0100 Date: Fri, 31 Dec 2004 14:15:54 +0100 From: Wichert Akkerman To: jamal Cc: netdev@oss.sgi.com, tgraf@suug.ch Subject: Re: ing_filter debug messages Message-ID: <20041231131553.GA7460@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104491510.1047.234.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040907i X-SA-Exim-Connect-IP: X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13290 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wichert@wiggy.net Precedence: bulk X-list: netdev Previously jamal wrote: > Does attached patch fix it? No change at all I'm afraid . While rebooting to the kernel with that patch applied it got stuck in the shutdown sequence while repeating this line: unregister_netdevice: waiting for xs4all for become free, Usage count = 1 Wichert. -- Wichert Akkerman It is simple to make things. http://www.wiggy.net/ It is hard to make things simple. 
From linux_lover2004@yahoo.com Fri Dec 31 05:23:32 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 05:23:38 -0800 (PST) Received: from web52210.mail.yahoo.com (web52210.mail.yahoo.com [206.190.39.92]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBVDNB1v024685 for ; Fri, 31 Dec 2004 05:23:31 -0800 Received: (qmail 92495 invoked by uid 60001); 31 Dec 2004 13:31:41 -0000 Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=1MnkMN1hDm5vFvi1kLHMoYBUdYj0I8Gpf2ut3igt4I3Odio6GIvhoxCSS5NnN6mWHBeycK0HndCtxIjoUnAPQ9d5fMREBMwV6jSqhX3zK6VbfFAroTEiosEStzV3cXWWy+1GNQLHxIBdUPonhlGivJQ7mtt6s0E3Xq8x3WFxY9M= ; Message-ID: <20041231133141.92493.qmail@web52210.mail.yahoo.com> Received: from [202.56.231.117] by web52210.mail.yahoo.com via HTTP; Fri, 31 Dec 2004 05:31:41 PST Date: Fri, 31 Dec 2004 05:31:41 -0800 (PST) From: linux lover Subject: Is that possible.... To: netdev@oss.sgi.com MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13291 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: linux_lover2004@yahoo.com Precedence: bulk X-list: netdev Hello all, I write a netfilter kernel module which prints skb->len at NF_IP_POST_ROUTING. When i complie it on Fedora core 1 and insmod, i got results for ping localhost that skb->len is 0. But if i compile and insmod it same on 2.4.20 kernel it works? why??? regards, linux_lover. __________________________________ Do you Yahoo!? The all-new My Yahoo! - What will yours do? 
http://my.yahoo.com From tgraf@suug.ch Fri Dec 31 06:04:15 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:04:21 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVE3rlN026247 for ; Fri, 31 Dec 2004 06:04:14 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 0B7D785; Fri, 31 Dec 2004 15:12:06 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 622AD1C0EA; Fri, 31 Dec 2004 15:12:49 +0100 (CET) Date: Fri, 31 Dec 2004 15:12:49 +0100 From: Thomas Graf To: "David S. Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [RESEND 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041231141249.GK32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <20041230163359.GA32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230163359.GA32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13292 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Resend: Fixes missing locking when exchanging action/police pointers and removes unneded locking and uses xchg instead as noted by Patrick McHardy. The tcf_exts API abstracts extensions such as actions/policers into a generic layer and reduces the knowledge inside classifiers to the minimum required. 
It isolates the validation code into its own function so that classifiers can validate all input data before making changes, which avoids the need to undo changes if an extension configuration request cannot be fulfilled. As a nice side effect, using this API removes the existing ifdef clutter.

Usage: The classifier holds a struct tcf_exts, which may be empty if no extensions are compiled in. When a change request is received, it calls tcf_exts_validate and provides a temporary tcf_exts copy to store the requested changes. If validation succeeds, the classifier may change its own parameters and, at the end, call tcf_exts_change to commit the changes and replace the existing extension configuration with the new one. The classifier is responsible for destroying its temporary copy if any of its own validation checks fail. The classifier-specific TLV types must be exported to the extensions API via tcf_ext_map. Destroying the extensions is as easy as calling tcf_exts_destroy.

The extensions are executed by the classifier by calling tcf_exts_exec, which must be done as the last step after making sure the filter matches. Note: A classifier might take further actions after the call to tcf_exts_exec, such as correcting its own cache to avoid caching results which could have been influenced by the extensions.

tcf_exts_exec returns a negative error code if the filter must be considered unmatched, 0 on normal execution, or a positive classifier return code (TC_ACT_*) which must be returned to the underlying layer as-is.
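The validate-then-commit flow described above can be mocked in userspace roughly as follows. This is only a sketch under stated assumptions: `struct exts`, `struct action`, `exts_validate`, `exts_change`, `exts_destroy` and the `destroyed` counter are hypothetical stand-ins, and C11 `atomic_exchange` is used in place of the kernel's xchg().

```c
#include <stdatomic.h>
#include <stdlib.h>

struct action { int id; };			/* stand-in for struct tc_action */
struct exts { _Atomic(struct action *) action; };/* stand-in for struct tcf_exts */

static int destroyed;				/* counts destroy calls, for illustration */

static void action_destroy(struct action *a)
{
	free(a);
	destroyed++;
}

/* Step 1: validate a request into a temporary copy. The live
 * configuration is never touched here, so failure needs no undo. */
static int exts_validate(struct exts *tmp, int requested_id)
{
	atomic_init(&tmp->action, NULL);
	if (requested_id < 0)
		return -1;			/* invalid request */
	if (requested_id == 0)
		return 0;			/* no extension requested */

	struct action *a = malloc(sizeof(*a));
	if (a == NULL)
		return -1;
	a->id = requested_id;
	atomic_store(&tmp->action, a);
	return 0;
}

/* Step 2: commit. Swap the new pointer in atomically and release the
 * old one afterwards, mirroring the xchg-based replacement. */
static void exts_change(struct exts *dst, struct exts *src)
{
	struct action *new = atomic_load(&src->action);

	if (new) {
		struct action *old = atomic_exchange(&dst->action, new);

		atomic_store(&src->action, NULL);	/* ownership moved to dst */
		if (old)
			action_destroy(old);
	}
}

/* Release whatever the handle still owns, e.g. a temporary copy after
 * the caller's own validation failed. */
static void exts_destroy(struct exts *e)
{
	struct action *old = atomic_exchange(&e->action, NULL);

	if (old)
		action_destroy(old);
}
```

A caller would run exts_validate into a temporary, perform its own checks, and then either call exts_change to commit or exts_destroy on the temporary to back out.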
Signed-off-by: Thomas Graf --- linux-2.6.10-bk2.orig/include/net/pkt_cls.h 2004-12-31 14:10:54.000000000 +0100 +++ linux-2.6.10-bk2/include/net/pkt_cls.h 2004-12-30 17:17:56.000000000 +0100 @@ -62,6 +62,93 @@ tp->q->ops->cl_ops->unbind_tcf(tp->q, cl); } +struct tcf_exts +{ +#ifdef CONFIG_NET_CLS_ACT + struct tc_action *action; +#elif defined CONFIG_NET_CLS_POLICE + struct tcf_police *police; +#endif +}; + +/* Map to export classifier specific extension TLV types to the + * generic extensions API. Unsupported extensions must be set to 0. + */ +struct tcf_ext_map +{ + int action; + int police; +}; + +/** + * tcf_exts_is_predicative - check if a predicative extension is present + * @exts: tc filter extensions handle + * + * Returns 1 if a predicative extension is present, i.e. an extension which + * might cause further actions and thus overrule the regular tcf_result. + */ +static inline int +tcf_exts_is_predicative(struct tcf_exts *exts) +{ +#ifdef CONFIG_NET_CLS_ACT + return !!exts->action; +#elif defined CONFIG_NET_CLS_POLICE + return !!exts->police; +#else + return 0; +#endif +} + +/** + * tcf_exts_is_available - check if at least one extension is present + * @exts: tc filter extensions handle + * + * Returns 1 if at least one extension is present. + */ +static inline int +tcf_exts_is_available(struct tcf_exts *exts) +{ + /* All non-predicative extensions must be added here. */ + return tcf_exts_is_predicative(exts); +} + +/** + * tcf_exts_exec - execute tc filter extensions + * @skb: socket buffer + * @exts: tc filter extensions handle + * @res: desired result + * + * Executes all configured extensions. Returns 0 on a normal execution, + * a negative number if the filter must be considered unmatched or + * a positive action code (TC_ACT_*) which must be returned to the + * underlying layer. 
+ */ +static inline int +tcf_exts_exec(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_result *res) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) + return tcf_action_exec(skb, exts->action, res); +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) + return tcf_police(skb, exts->police); +#endif + + return 0; +} + +extern int tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, + struct rtattr *rate_tlv, struct tcf_exts *exts, + struct tcf_ext_map *map); +extern void tcf_exts_destroy(struct tcf_proto *tp, struct tcf_exts *exts); +extern void tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, + struct tcf_exts *src); +extern int tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map); +extern int tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map); + #ifdef CONFIG_NET_CLS_ACT static inline int tcf_change_act_police(struct tcf_proto *tp, struct tc_action **action, --- linux-2.6.10-bk2.orig/net/sched/cls_api.c 2004-12-24 22:34:26.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/cls_api.c 2004-12-31 13:44:59.000000000 +0100 @@ -439,6 +439,150 @@ return skb->len; } +void +tcf_exts_destroy(struct tcf_proto *tp, struct tcf_exts *exts) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) { + tcf_action_destroy(exts->action, TCA_ACT_UNBIND); + exts->action = NULL; + } +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) { + tcf_police_release(exts->police, TCA_ACT_UNBIND); + exts->police = NULL; + } +#endif +} + + +int +tcf_exts_validate(struct tcf_proto *tp, struct rtattr **tb, + struct rtattr *rate_tlv, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ + memset(exts, 0, sizeof(*exts)); + +#ifdef CONFIG_NET_CLS_ACT + int err; + struct tc_action *act; + + if (map->police && tb[map->police-1] && rate_tlv) { + act = tcf_action_init_1(tb[map->police-1], rate_tlv, "police", + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); + if (act == NULL) + return err; + + act->type = 
TCA_OLD_COMPAT; + exts->action = act; + } else if (map->action && tb[map->action-1] && rate_tlv) { + act = tcf_action_init(tb[map->action-1], rate_tlv, NULL, + TCA_ACT_NOREPLACE, TCA_ACT_BIND, &err); + if (act == NULL) + return err; + + exts->action = act; + } +#elif defined CONFIG_NET_CLS_POLICE + if (map->police && tb[map->police-1] && rate_tlv) { + struct tcf_police *p; + + p = tcf_police_locate(tb[map->police-1], rate_tlv); + if (p == NULL) + return -EINVAL; + + exts->police = p; + } else if (map->action && tb[map->action-1]) + return -EOPNOTSUPP; +#else + if ((map->action && tb[map->action-1]) || + (map->police && tb[map->police-1])) + return -EOPNOTSUPP; +#endif + + return 0; +} + +void +tcf_exts_change(struct tcf_proto *tp, struct tcf_exts *dst, + struct tcf_exts *src) +{ +#ifdef CONFIG_NET_CLS_ACT + if (src->action) { + struct tc_action *act; + act = xchg(&dst->action, src->action); + if (act) + tcf_action_destroy(act, TCA_ACT_UNBIND); + } +#elif defined CONFIG_NET_CLS_POLICE + if (src->police) { + struct tcf_police *p; + p = xchg(&dst->police, src->police); + if (p) + tcf_police_release(p, TCA_ACT_UNBIND); + } +#endif +} + +int +tcf_exts_dump(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ +#ifdef CONFIG_NET_CLS_ACT + if (map->action && exts->action) { + /* + * again for backward compatible mode - we want + * to work with both old and new modes of entering + * tc data even if iproute2 was newer - jhs + */ + struct rtattr * p_rta = (struct rtattr*) skb->tail; + + if (exts->action->type != TCA_OLD_COMPAT) { + RTA_PUT(skb, map->action, 0, NULL); + if (tcf_action_dump(skb, exts->action, 0, 0) < 0) + goto rtattr_failure; + p_rta->rta_len = skb->tail - (u8*)p_rta; + } else if (map->police) { + RTA_PUT(skb, map->police, 0, NULL); + if (tcf_action_dump_old(skb, exts->action, 0, 0) < 0) + goto rtattr_failure; + p_rta->rta_len = skb->tail - (u8*)p_rta; + } + } +#elif defined CONFIG_NET_CLS_POLICE + if (map->police && exts->police) { + 
struct rtattr * p_rta = (struct rtattr*) skb->tail; + + RTA_PUT(skb, map->police, 0, NULL); + + if (tcf_police_dump(skb, exts->police) < 0) + goto rtattr_failure; + + p_rta->rta_len = skb->tail - (u8*)p_rta; + } +#endif + return 0; +rtattr_failure: __attribute__ ((unused)) + return -1; +} + +int +tcf_exts_dump_stats(struct sk_buff *skb, struct tcf_exts *exts, + struct tcf_ext_map *map) +{ +#ifdef CONFIG_NET_CLS_ACT + if (exts->action) + if (tcf_action_copy_stats(skb, exts->action) < 0) + goto rtattr_failure; +#elif defined CONFIG_NET_CLS_POLICE + if (exts->police) + if (tcf_police_dump_stats(skb, exts->police) < 0) + goto rtattr_failure; +#endif + return 0; +rtattr_failure: __attribute__ ((unused)) + return -1; +} static int __init tc_filter_init(void) { @@ -461,3 +605,8 @@ EXPORT_SYMBOL(register_tcf_proto_ops); EXPORT_SYMBOL(unregister_tcf_proto_ops); +EXPORT_SYMBOL(tcf_exts_validate); +EXPORT_SYMBOL(tcf_exts_destroy); +EXPORT_SYMBOL(tcf_exts_change); +EXPORT_SYMBOL(tcf_exts_dump); +EXPORT_SYMBOL(tcf_exts_dump_stats); From tgraf@suug.ch Fri Dec 31 06:09:06 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:09:13 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVE8kpl026870 for ; Fri, 31 Dec 2004 06:09:06 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id F36F3F; Fri, 31 Dec 2004 15:16:58 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 353AD1C0EA; Fri, 31 Dec 2004 15:17:21 +0100 (CET) Date: Fri, 31 Dec 2004 15:17:21 +0100 From: Thomas Graf To: "David S. 
Miller" Cc: Jamal Hadi Salim , Patrick McHardy , netdev@oss.sgi.com Subject: [RESEND 9/9] PKT_SCHED: Actions are now available for all classifiers & Fix police Kconfig dependencies Message-ID: <20041231141721.GL32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123617.GV32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041230123617.GV32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13293 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev Removes outdated comment and make action and old compat policer mutually exclusive to reflect the code. Noted by Jamal Hadi Salim. Signed-off-by: Thomas Graf --- linux-2.6.10-bk2.orig/net/sched/Kconfig 2004-12-31 14:10:54.000000000 +0100 +++ linux-2.6.10-bk2/net/sched/Kconfig 2004-12-31 15:04:01.000000000 +0100 @@ -381,7 +381,6 @@ ---help--- This option requires you have a new iproute2. It enables tc extensions which can be used with tc classifiers. - Only the u32 and fw classifiers are supported at the moment. You MUST NOT turn this on if you dont have an update iproute2. config NET_ACT_POLICE @@ -392,13 +391,6 @@ below to select a policer. You MUST NOT turn this on if you dont have an update iproute2. -config NET_CLS_POLICE - bool "Traffic policing (needed for in/egress)" - depends on NET_CLS && NET_QOS && NET_ACT_POLICE!=y && NET_ACT_POLICE!=m - help - Say Y to support traffic policing (bandwidth limits). Needed for - ingress and egress rate limiting. 
- config NET_ACT_GACT tristate "generic Actions" depends on NET_CLS_ACT @@ -432,3 +424,11 @@ ---help--- requires new iproute2 This allows for packets to be generically edited + +config NET_CLS_POLICE + bool "Traffic policing (needed for in/egress)" + depends on NET_CLS && NET_QOS && NET_CLS_ACT!=y + help + Say Y to support traffic policing (bandwidth limits). Needed for + ingress and egress rate limiting. + From kaber@trash.net Fri Dec 31 06:12:07 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:12:14 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVEBkIO027384 for ; Fri, 31 Dec 2004 06:12:07 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CkNdK-0005jp-5q; Fri, 31 Dec 2004 15:20:02 +0100 Message-ID: <41D55FC9.6040605@trash.net> Date: Fri, 31 Dec 2004 15:18:49 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: Thomas Graf CC: jamal , "David S. 
Miller" , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <1104467816.1049.181.camel@jzny.localdomain> <20041231131039.GI32419@postel.suug.ch> In-Reply-To: <20041231131039.GI32419@postel.suug.ch> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13294 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev Thomas Graf wrote: > * jamal <1104467816.1049.181.camel@jzny.localdomain> 2004-12-30 23:36 > > >>Patrick pointed this in other email: the xchg is sort of defeated by the >>else. Perhaps make the second one xchg as well. > > Agreed, I changed it to: > > struct tc_action *act; > act = xchg(&dst->action, src->action); > if (act) > tcf_action_destroy(act, TCA_ACT_UNBIND); > > Am I missing something? Yes, the xchg without locking is only right in case we don't have an existing action that needs to be destroyed. Someone might still be looking at the old action in a softirq. If you want to keep the "lockless" variant, you need to call synchronize_net() before tcf_action_destroy(). 
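
The ordering described above (swap the pointer, wait for readers, only then destroy the old action) can be sketched in userspace C as follows. This is an illustration under stated assumptions, not the kernel code: struct action, wait_for_readers(), action_destroy(), action_replace() and action_new() are hypothetical names, C11 atomic_exchange() stands in for xchg(), and wait_for_readers() is only a stub marking where synchronize_net() would go.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct tc_action; not the kernel type. */
struct action {
	int id;
};

static int grace_periods;	/* how often we waited for readers */
static int destroyed;		/* how many old actions were freed */

/*
 * Stand-in for synchronize_net(). In the kernel that call returns only
 * after softirq readers that might still hold the old pointer are done;
 * this stub merely records that the wait happened.
 */
static void wait_for_readers(void)
{
	grace_periods++;
}

static void action_destroy(struct action *act)
{
	destroyed++;
	free(act);
}

/* Publish new_act, then retire the previous action once readers are done. */
static void action_replace(struct action *_Atomic *slot, struct action *new_act)
{
	struct action *old = atomic_exchange(slot, new_act);

	if (old) {
		wait_for_readers();	/* must come before the destroy */
		action_destroy(old);
	}
}

/* Allocate an action for the example; aborts on allocation failure. */
static struct action *action_new(int id)
{
	struct action *act = malloc(sizeof(*act));

	assert(act != NULL);
	act->id = id;
	return act;
}
```

When the slot was empty there is nothing to retire and no wait is needed, which is the one case where the bare exchange is sufficient.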
Regards Patrick From tgraf@suug.ch Fri Dec 31 06:26:47 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:26:54 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVEQQEx028189 for ; Fri, 31 Dec 2004 06:26:47 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 6D303F; Fri, 31 Dec 2004 15:34:39 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id D2BC61C0EA; Fri, 31 Dec 2004 15:35:22 +0100 (CET) Date: Fri, 31 Dec 2004 15:35:22 +0100 From: Thomas Graf To: Patrick McHardy Cc: jamal , "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PATCH 2/9] PKT_SCHED: tc filter extension API Message-ID: <20041231143522.GM32419@postel.suug.ch> References: <20041230122652.GM32419@postel.suug.ch> <20041230123023.GO32419@postel.suug.ch> <1104467816.1049.181.camel@jzny.localdomain> <20041231131039.GI32419@postel.suug.ch> <41D55FC9.6040605@trash.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <41D55FC9.6040605@trash.net> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13295 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Patrick McHardy <41D55FC9.6040605@trash.net> 2004-12-31 15:18 > Thomas Graf wrote: > > struct tc_action *act; > > act = xchg(&dst->action, src->action); > > if (act) > > tcf_action_destroy(act, TCA_ACT_UNBIND); > > > >Am I missing something? > > Yes, the xchg without locking is only right in case we don't have > an existing action that needs to be destroyed. 
Someone might still
> be looking at the old action in a softirq. If you want to keep the
> "lockless" variant, you need to call synchronize_net() before
> tcf_action_destroy().

You're absolutely right. I will put locks around it again to avoid
having the pointer exchanged in the middle of a classification.
A synchronize_net will avoid the crash but will not prevent a possible
corruption of the classification result if the action is used more
than once.

From hadi@cyberus.ca Fri Dec 31 06:50:55 2004
Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:51:02 -0800 (PST)
Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVEoY7A029181 for ; Fri, 31 Dec 2004 06:50:54 -0800
Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CkOF9-0004uO-9D for netdev@oss.sgi.com; Fri, 31 Dec 2004 09:59:07 -0500
Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkOF7-0001Z9-7E; Fri, 31 Dec 2004 09:59:05 -0500
Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier.
From: jamal
Reply-To: hadi@cyberus.ca
To: Thomas Graf
Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041231110836.GD32419@postel.suug.ch> References: <20041228231916.GG32419@postel.suug.ch> <1104277165.1100.291.camel@jzny.localdomain> <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104505142.1048.262.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 09:59:02 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13296 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 06:08, Thomas Graf wrote: > * jamal <1104469111.1049.219.camel@jzny.localdomain> 2004-12-30 23:58 > > perhaps we need a few bits in the selector to identify > > the owner who installed the rule to begin with. Then it would be safe to > > interpret the meaning of the ID (by the same app). Did you say there > > were some unused bits in the selector? > > Right, but why not do this in userspace by having a global map > somewhere in a file? A u32 config could have been modified by > multiple pids and it would get really messy to store a pid for > every possible changeable item. The "PID" is per app - same as it is for rtm where the different known routing apps like zebra are given an ID and certain IDs are left blank for general use. So if you run multiple tcs they will all have the same ID. 
The way the id is used in the rtm historically has been for allowing
multiple route protocols to install route entries (this way for example
OSPF does/nt announce RIP routes etc). In our case this would mean:
"this rule was installed by tc - only it knows what the opaque value
10 means". In that case tc would be responsible to decode 10 which would
mean to it "this is an ip match therefore use ip_print() to pretty
print" and 10 is global to tc only and in that case stored in some
header file belonging to tc.

> > I think all you need really is to say "this match starts at IP" i.e such
> > a definition is global.
> > handles per rule already exist - and you can actually specify them when
> > installing a rule. Are those insufficient?
>
> Those are absolutely sufficient. I was thinking of giving a match a
> 16bit ID which can be used for both, identifying and mapping, i.e:
>
> __u8 kind; /* match type, for lookup in matchers table */

255 possible matchers max?

> __u8 flags; /* Invert Flag + Relations */
> __u16 handle; /* must be unique per selector, may be autogenerated */

Ok this is the one used to store the opaque IDs - unique per app
so may be the same across multiple selectors.
Probably steal a few bits from here and use them in nkind.

> I want to have those matches be as small as possible, so no nested
> TLVs but rather this u32 + matcher specific data form a TLV together.
>
> A selector consists of a TLV array of such matches. The first TLV,
> type=1 becomes a header with the possibility to transfer classifier
> specific options (such as hash table configuration for u32).

Maybe i misunderstood you. You are going to have:

SEL2
 |
 |
 +--match1
 |    |
 |    + -- extended match1
 .    .
 .    .
 .    .
 .    +---- extended matchn
 |
 +--- matchn


Why does the first one have to be special?
One of the mistakes in u32 is the tight packing of the matches. ie the
match1-n above are packed together. If they were put in TLVs probably
only new thing that will be needed is MATCH2 TLV.
No harm in transporting an extra 32 bits for a TLV - it's not like you
are going to have a million matches and need to save space.
So i would suggest everything under a TLV (SEL2->MATCH(ES)->EMATCH(ES))

> > Why not make the always-true to be an extended match? actually a u32
> > match of 0 0 is always true. Those hashes are quiet tricky/flexible;
> > i would rather we clone u32 and call it something else then speacilize
> > it.
>
> Agreed, I don't want to change u32 but I want to introduce ematches
> in u32 as well so we can benefit from the hashing but for those who
> don't need hashing u32 is already bloat so we can do a simple
> always-true classifier which does nothing more than evaluating the
> ematches. I want to have the u32 match be a ematch as well so the
> always-true classifier would become a u32 alternative but without
> the hashing overhead.

Sounds fine to me.
so a u32 match 00 ematches here ..

> It requires the following parameters:
> - start offset
> - end offset
> - pattern
> - prefix table
>
> and then will simply start at `start` and scans until `end` looking
> for pattern with the help of the prefix table. Again, I'm not sure what
> you mean by state ;->

A virus would span several packets. So the state/knowledge of whether
something is a virus is spread across several packets. That knowledge
typically needs to be accumulated before making a call. Is this thing
capable of remembering?
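
The four parameters quoted above (start offset, end offset, pattern, prefix table) are what a Knuth-Morris-Pratt style scan over one buffer needs. A minimal sketch, assuming the buffer is a plain byte array rather than an skb; the function names kmp_prefix_table() and kmp_search() are made up for illustration and are not from any posted patch:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Build the KMP prefix table: tbl[i] is the length of the longest proper
 * prefix of pat[0..i] that is also a suffix of it.
 */
static void kmp_prefix_table(const unsigned char *pat, size_t len, size_t *tbl)
{
	size_t k = 0;

	assert(len > 0);
	tbl[0] = 0;
	for (size_t i = 1; i < len; i++) {
		while (k > 0 && pat[i] != pat[k])
			k = tbl[k - 1];
		if (pat[i] == pat[k])
			k++;
		tbl[i] = k;
	}
}

/*
 * Scan data[start..end) for pat. Returns the offset of the first match
 * or -1. The match position k is a local variable, so nothing is
 * remembered once the call returns.
 */
static long kmp_search(const unsigned char *data, size_t start, size_t end,
		       const unsigned char *pat, size_t len, const size_t *tbl)
{
	size_t k = 0;

	assert(len > 0);
	for (size_t i = start; i < end; i++) {
		while (k > 0 && data[i] != pat[k])
			k = tbl[k - 1];
		if (data[i] == pat[k])
			k++;
		if (k == len)
			return (long)(i + 1 - len);
	}
	return -1;
}
```

Because k starts at zero on every call, a pattern split across two buffers is not found. Carrying k over between packets of the same flow would be one way to accumulate the state asked about here, but that is an assumption of this sketch and not something proposed in the thread.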
cheers, jamal From hadi@cyberus.ca Fri Dec 31 06:52:41 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:52:47 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVEqL4c029574 for ; Fri, 31 Dec 2004 06:52:41 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkOGr-0006LY-Ug for netdev@oss.sgi.com; Fri, 31 Dec 2004 10:00:53 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkOGq-0001nr-AQ; Fri, 31 Dec 2004 10:00:52 -0500 Subject: Re: [PATCH PKT_SCHED 0/17]: tc action cleanup + fixes From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: Patrick McHardy , Maillist netdev In-Reply-To: <20041231112605.GF32419@postel.suug.ch> References: <41D3785F.3040909@trash.net> <1104382562.1048.39.camel@jzny.localdomain> <41D3F5EC.9050808@trash.net> <41D4286B.1060106@trash.net> <1104468336.1048.205.camel@jzny.localdomain> <41D521EA.2090603@trash.net> <20041231112605.GF32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104505249.1049.266.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 10:00:50 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13297 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 06:26, Thomas Graf wrote: > Jamal, we might want to share some code to do the logic > relations like sharing the macros to access the bits, checking if the > the current result already makes the whole expression obvious etc. 
Sure we can discuss - you may actually end up doing yours first and i will use the LinuxWay(tm) ;-> aka inherit all your bugs ;-> I started working on it but too distracted finding some exciting stuff on this e1000. cheers, jamal From hadi@cyberus.ca Fri Dec 31 06:55:48 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 06:55:55 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVEtRYt030268 for ; Fri, 31 Dec 2004 06:55:47 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CkOJq-0000bM-5g for netdev@oss.sgi.com; Fri, 31 Dec 2004 10:03:58 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkOJn-0002OD-A9; Fri, 31 Dec 2004 10:03:55 -0500 Subject: Re: ing_filter debug messages From: jamal Reply-To: hadi@cyberus.ca To: Wichert Akkerman Cc: netdev@oss.sgi.com, tgraf@suug.ch In-Reply-To: <20041231131553.GA7460@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> <20041231131553.GA7460@wiggy.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104505432.1047.270.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 10:03:53 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13298 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 08:15, Wichert Akkerman wrote: > Previously jamal wrote: > > Does attached patch fix it? > > No change at all I'm afraid . 
While rebooting to the kernel with that > patch applied it got stuck in the shutdown sequence while repeating this > line: > > unregister_netdevice: waiting for xs4all for become free, Usage count = 1 I doubt this has anything to do with what i sent you. What i sent you is certainly needed. Let me poke around for a few minutes - I am begining to think theres a relation; the only way to be sure is to compile out netfilter. cheers, jamal From hadi@cyberus.ca Fri Dec 31 07:02:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 07:02:36 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVF29gd030931 for ; Fri, 31 Dec 2004 07:02:29 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkOQN-0000Vb-7j for netdev@oss.sgi.com; Fri, 31 Dec 2004 10:10:43 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkOQK-0003Go-Cu; Fri, 31 Dec 2004 10:10:40 -0500 Subject: Re: ing_filter debug messages From: jamal Reply-To: hadi@cyberus.ca To: Wichert Akkerman Cc: netdev@oss.sgi.com In-Reply-To: <20041231131553.GA7460@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> <20041231131553.GA7460@wiggy.net> Content-Type: multipart/mixed; boundary="=-XcQdU1SCo9jgv+Y8JYL7" Organization: jamalopolous Message-Id: <1104505838.1048.273.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 10:10:38 -0500 X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13299 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: 
hadi@cyberus.ca Precedence: bulk X-list: netdev --=-XcQdU1SCo9jgv+Y8JYL7 Content-Type: text/plain Content-Transfer-Encoding: 7bit Wichert, Try also the attached patch with netfilter on and your rules installed. cheers, jamal --=-XcQdU1SCo9jgv+Y8JYL7 Content-Disposition: attachment; filename=indev2-p Content-Type: text/plain; name=indev2-p; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- a/net/ipv4/ip_output.c 2004/12/31 14:26:08 1.1 +++ b/net/ipv4/ip_output.c 2004/12/31 14:27:53 @@ -111,6 +111,7 @@ #ifdef CONFIG_NETFILTER_DEBUG nf_debug_ip_loopback_xmit(newskb); #endif + newskb->input_dev = newskb->dev; netif_rx(newskb); return 0; } --- a/net/ipv6/ip6_output.c 2004-12-24 16:33:51.000000000 -0500 +++ b/net/ipv6/ip6_output.c 2004-12-31 10:29:47.505392096 -0500 @@ -102,7 +102,7 @@ newskb->pkt_type = PACKET_LOOPBACK; newskb->ip_summed = CHECKSUM_UNNECESSARY; BUG_TRAP(newskb->dst); - + newskb->input_dev = newskb->dev; netif_rx(newskb); return 0; } --=-XcQdU1SCo9jgv+Y8JYL7-- From parklee_sel@yahoo.com Fri Dec 31 07:15:14 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 07:15:20 -0800 (PST) Received: from web51507.mail.yahoo.com (web51507.mail.yahoo.com [206.190.38.199]) by oss.sgi.com (8.13.0/8.13.0) with SMTP id iBVFEnFx031710 for ; Fri, 31 Dec 2004 07:15:13 -0800 Received: (qmail 46689 invoked by uid 60001); 31 Dec 2004 15:23:19 -0000 Comment: DomainKeys? 
See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; b=hB4H9NWjQXxxAsOITU0ElvXI2ZqMDdC6KufDNkoFsHtESX1Oid5ce/dpR+Jmzv3A2l21bB97FHYndcsc7FPE/QJbF3L7/BiQUWi+QXw/xe04HBqUKhk6mHHL4a+hVIJajX6351h+aOm6lcmoQEWldtjm5zWU7SPY4C+FDLPti5U= ; Message-ID: <20041231152319.46687.qmail@web51507.mail.yahoo.com> Received: from [221.15.137.76] by web51507.mail.yahoo.com via HTTP; Fri, 31 Dec 2004 07:23:19 PST Date: Fri, 31 Dec 2004 07:23:19 -0800 (PST) From: Park Lee Subject: Issue on ip_route_output_key() and ip_route_output_flow() To: netdev@oss.sgi.com Cc: linux-kernel@vger.kernel.org, Linus.Torvalds@helsinki.fi, gw4pts@gw4pts.ampr.org, bir7@leland.Stanford.Edu, kuznet@ms2.inr.ac.ru, waltje@uWalt.NL.Mugnet.ORG MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13300 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: parklee_sel@yahoo.com Precedence: bulk X-list: netdev Hi, In /usr/src/linux/net/ipv4/route.c, there are two functions, one is ip_route_output_key(), the other is ip_route_output_flow(). Would you please tell me what circumstances those two function are used for? and What the difference between the two functions is? As for ip_route_output_key() here, what is the meaning of the "_key" in its name? Thank you. ===== Best Regards, Park Lee __________________________________ Do you Yahoo!? All your favorites on one personal page – Try My Yahoo! 
http://my.yahoo.com From tgraf@suug.ch Fri Dec 31 07:30:55 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 07:31:02 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVFUZMg032518 for ; Fri, 31 Dec 2004 07:30:55 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 53DFDF; Fri, 31 Dec 2004 16:38:47 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 5885C1C0EA; Fri, 31 Dec 2004 16:39:30 +0100 (CET) Date: Fri, 31 Dec 2004 16:39:30 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. Message-ID: <20041231153930.GN32419@postel.suug.ch> References: <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> <1104505142.1048.262.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104505142.1048.262.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13301 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104505142.1048.262.camel@jzny.localdomain> 2004-12-31 09:59 > On Fri, 2004-12-31 at 06:08, Thomas Graf wrote: > > * jamal <1104469111.1049.219.camel@jzny.localdomain> 
2004-12-30 23:58 > > > > perhaps we need a few bits in the selector to identify > > > the owner who installed the rule to begin with. Then it would be safe to > > > interpret the meaning of the ID (by the same app). Did you say there > > > were some unused bits in the selector? > > > > Right, but why not do this in userspace by having a global map > > somewhere in a file? A u32 config could have been modified by > > multiple pids and it would get really messy to store a pid for > > every possible changeable item. > > The "PID" is per app - same as it is for rtm where the different known > routing apps like zebra are given an ID and certain IDs are left blank > for general use. So if you run multiple tcs they will all have the same > ID. The way the id is used in the rtm historically has been for allowing > multiple route protocols to install route entries (this way for example > OSPF does/nt announce RIP routes etc). In our case this would mean: > "this rule was installed by tc - only it knows what the opaque value > 10 means". In that case tc would be responsible to decode 10 which would > mean to it "this is an ip match therefore use ip_print() to pretty > print" and 10 is global to tc only and in that case stored in some > header file belonging to tc. Exactly so we don't need any PIDs stored. /etc/iproute2/tc_matches. > > __u8 kind; /* match type, for lookup in matchers table */ > > 255 possible matchers max? 256 not enough? > > __u8 flags; /* Invert Flag + Relations */ > > __u16 handle; /* must be unique per selector, may be autogenerated */ > > Ok this is the one used to store the opaque IDs - unique per app > so may be the same across multiple selectors. No, it must be unique, we will return EINVAL if it isn't. Most apps will set it to 0 which will autogenerate it with a generate && !get just like in u32. > Probably steal a few bits from here and use them in nkind. Fair, we can also steal a few bits from flags although I'd like to keep at least 3 empty. 
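
The handle rule stated above (unique per selector, EINVAL on a duplicate, 0 meaning "autogenerate") could look roughly like this. It is a userspace sketch only: struct match_hdr, handle_in_use() and match_assign_handle() are hypothetical names, the field layout follows the kind/flags/handle proposal as quoted, and the lowest-free-handle policy and the ENOSPC return are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit match header as proposed in the thread (kind, flags, handle). */
struct match_hdr {
	uint8_t  kind;		/* match type, for lookup in a matcher table */
	uint8_t  flags;		/* invert flag + relations */
	uint16_t handle;	/* unique per selector, 0 = autogenerate */
};

static int handle_in_use(const struct match_hdr *m, size_t n, uint16_t h)
{
	for (size_t i = 0; i < n; i++)
		if (m[i].handle == h)
			return 1;
	return 0;
}

/*
 * Validate or generate the handle of new_match against the n matches
 * already in the selector. Returns 0 on success, -EINVAL if the caller
 * supplied a handle that is already taken, -ENOSPC if none is free.
 */
static int match_assign_handle(const struct match_hdr *existing, size_t n,
			       struct match_hdr *new_match)
{
	assert(new_match != NULL);

	if (new_match->handle != 0)
		return handle_in_use(existing, n, new_match->handle) ?
			-EINVAL : 0;

	for (uint32_t h = 1; h <= UINT16_MAX; h++) {
		if (!handle_in_use(existing, n, (uint16_t)h)) {
			new_match->handle = (uint16_t)h;
			return 0;
		}
	}
	return -ENOSPC;
}
```

If bits are moved from handle to kind as discussed, only the field widths and the upper bound of the search loop would change.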
> Maybe i misunderstood you. You are going to have:
>
> SEL2
>  |
>  |
>  +--match1
>  |    |
>  |    + -- extended match1
>  .    .
>  .    .
>  .    .
>  .    +---- extnded matchn
>  |
>  +--- matchn
>
>
> Why does the first one have to be speacial?

No, I'm going to have everything be ematches.

> One of the mistakes in u32 is the tight packing of the matches. ie the
> match1-n above are packet together. If they were put in TLVs probably
> only new thing that will be needed is MATCH2 TLV.
> No harm in transporting an extra 32 bits for a TLV - its not like you
> are going to have a million matches and need to save space.
> So i would suggest everything under a TLV (SEL2->MATCH(ES)->EMATCH(ES))

SEL2:

  TLV (TCA_U32_SEL2)
  +-------------------+
  | Selector header   |
  +-------------------+
  | Match 1 TLV       |
  +-------------------+
  | ...               |
  +-------------------+
  | Match N TLV       |
  +-------------------+

where Match TLV:

  +--------------------+
  | Match Header (u32) |
  +- - - - - - - - - - +
  | Match Data         |
  +--------------------+

Where Header:
  u8 kind;
  u8 flags;
  u16 handle;

  (or some more for kind and less for handle)

where

  kind := { 0 | match type }
  handle := { 0 | unique }

  where kind == 0 means that the match is a container and data
  contains a u32 pointing into the match array.

  where kind != 0 means a match to be looked up in the matcher
  ops table.

  with the following matches:
   - u32: {u8|u16|u32} at offset match
   - meta
   - kmp
   - nbyte
   - ...

In case SEL1 TLV is provided we simply create a flat
index with no containers and all AND relations.

Which means we can do:

Sel2
  \__match u32 at 2 16 0xff
      |
      + and match meta nfmark 2
      |
      + and container
          \__match u32 at 4 11 0xf0
              |
              + or nbyte "::1" at 12

Thoughts?

> A virus would span several packets. So the state/knowldge of whether
> something is a virus is spread across several packets. That knowledge
> typically needs to be accumulated before making a call. Is this thing
> capable of remembering?
Not and I would rather do it outside in a separate ematch to match stateful information. I have to think some more about this though. From wichert@levante.wiggy.net Fri Dec 31 07:40:33 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 07:40:39 -0800 (PST) Received: from mx1.wiggy.net (Debian-exim@levante.wiggy.net [195.85.225.139]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVFeBLu000646 for ; Fri, 31 Dec 2004 07:40:33 -0800 Received: from wichert by mx1.wiggy.net with local (Exim 4.34) id 1CkP1A-0002zy-Hv; Fri, 31 Dec 2004 16:48:44 +0100 Date: Fri, 31 Dec 2004 16:48:44 +0100 From: Wichert Akkerman To: jamal Cc: netdev@oss.sgi.com Subject: Re: ing_filter debug messages Message-ID: <20041231154844.GA11511@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> <20041231131553.GA7460@wiggy.net> <1104505838.1048.273.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104505838.1048.273.camel@jzny.localdomain> User-Agent: Mutt/1.5.6+20040907i X-SA-Exim-Connect-IP: X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13302 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: wichert@wiggy.net Precedence: bulk X-list: netdev Previously jamal wrote: > Try also the attached patch with netfilter on and your rules installed. That seems to do the trick: I no longer see the debug messages appear. The tunnel still works as well. However, the unregister_netdev problem still persists. Wichert. -- Wichert Akkerman It is simple to make things. http://www.wiggy.net/ It is hard to make things simple. 
From hadi@cyberus.ca Fri Dec 31 08:37:08 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 08:37:14 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVGamME006134 for ; Fri, 31 Dec 2004 08:37:08 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CkPtx-00066R-1o for netdev@oss.sgi.com; Fri, 31 Dec 2004 11:45:21 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkPtZ-0005yt-Fo; Fri, 31 Dec 2004 11:44:57 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <20041231153930.GN32419@postel.suug.ch> References: <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> <1104505142.1048.262.camel@jzny.localdomain> <20041231153930.GN32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104511494.1048.303.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 11:44:54 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13303 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 10:39, Thomas Graf wrote: > * jamal 
<1104505142.1048.262.camel@jzny.localdomain> 2004-12-31 09:59 > Exactly so we don't need any PIDs stored. /etc/iproute2/tc_matches. Sure - that would be a fine place to store the IDs. Although i am not sure you want to call it matches - rather its u32 program identifiers. > > > __u8 kind; /* match type, for lookup in matchers table */ > > > > 255 possible matchers max? > > 256 not enough? I dont know, possible given how easy it would be to add a new match that it wont be sufficient. > > > __u8 flags; /* Invert Flag + Relations */ > > > __u16 handle; /* must be unique per selector, may be autogenerated */ > > > > Ok this is the one used to store the opaque IDs - unique per app > > so may be the same across multiple selectors. > > No, it must be unique, we will return EINVAL if it isn't. Most > apps will set it to 0 which will autogenerate it with a > generate && !get just like in u32. > We may be talking about different things - please double check. The misunderstanding seems to be on scoping: PID is global and the opaque ID is per match. IOW, theres a hierachy: A program(such as tc) installs a filter rule - we need to be able to identify the program - this is the PID. Unique across all Linux. A filter rule constitutes one or more matches. Different programs may install different u32 rules. For most users its a single program - tc. Each program that installs a rule will need to be identified by the PID. Main purpose is so that it can decode what the second level number means. This second level number is what i said was opaque. Its meaning is per app. So as an example PID 0x10 identifies application tc and opaque code 0x20 for tc translates to mean "thats an ip match, so no need to dump raw data - just dump it in english using ip_print()". 0x10 belongs to the selector; 0x20 is per match. 
0x21 could mean "this is a TCP match with options" etc.
The ematches on the other hand are purely decodable via user space
without needing the opaque numbers - the kind/type serves these just fine.

> > Probably steal a few bits from here and use them in nkind.
>
> Fair, we can also steal a few bits from flags although I'd like to
> keep at least 3 empty.

The 16 bits for the match handle sound a little too generous; still,
4 bits or so from there should be fine.

> SEL2:
>
> TLV (TCA_U32_SEL2)
> +-------------------+
> | Selector header   |
> +-------------------+
> | Match 1 TLV       |
> +-------------------+
> | ...               |
> +-------------------+
> | Match N TLV       |
> +-------------------+

nod.

> where Match TLV:
>
> +--------------------+
> | Match Header (u32) |
> +- - - - - - - - - - +
> | Match Data         |
> +--------------------+
>
> Where Header:
>   u8 kind;
>   u8 flags;
>   u16 handle;
>
> (or some more for kind and less for handle)

nod.

> where
>
> kind := { 0 | match type }
> handle := { 0 | unique }
>
> where kind == 0 means that the match is a container and data
> contains a u32 pointing into the match array.
>

So essentially good old u32 here?

> where kind != 0 means a match to be looked up in the matcher
> ops table.
>
> with the following matches:
>  - u32: {u8|u16|u32} at offset match
>  - meta
>  - kmp
>  - nbyte
>  - ...

Everything is almost the same as what i was saying - except i see
the u32 again there; isnt kind = 0 covering this?

> In case SEL1 TLV is provided we simply create a flat
> index with no containers and all AND relations.

No choice there.

> Which means we can do:
>
> Sel2
>   \__match u32 at 2 16 0xff
>   |
>   + and match meta nfmark 2
>   |
>   + and container
>       \__match u32 at 4 11 0xf0
>       |
>       + or nbyte "::1" at 12
>
> Thoughts?
>

Ok, I may have understood what you are talking about in the second u32
where kind != 0.
I think we are in sync. Go nuts.

> > A virus would span several packets.
So the state/knowldge of whether > > something is a virus is spread across several packets. That knowledge > > typically needs to be accumulated before making a call. Is this thing > > capable of remembering? > > Not and I would rather do it outside in a separate ematch to match > stateful information. I have to think some more about this though. Its a non-trivial problem. Its ok to defer it for now but keep it in the back of your mind. cheers, jamal From hadi@cyberus.ca Fri Dec 31 08:40:09 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 08:40:16 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVGdnZs006659 for ; Fri, 31 Dec 2004 08:40:09 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkPwt-00078d-M0 for netdev@oss.sgi.com; Fri, 31 Dec 2004 11:48:23 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkPwp-0006Kk-PF; Fri, 31 Dec 2004 11:48:20 -0500 Subject: unregister_netdev Annoyance WAS(Re: ing_filter debug messages From: jamal Reply-To: hadi@cyberus.ca To: Wichert Akkerman Cc: netdev@oss.sgi.com In-Reply-To: <20041231154844.GA11511@wiggy.net> References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> <20041231131553.GA7460@wiggy.net> <1104505838.1048.273.camel@jzny.localdomain> <20041231154844.GA11511@wiggy.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104511697.1048.308.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 11:48:17 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13304 
X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 10:48, Wichert Akkerman wrote: > Previously jamal wrote: > > Try also the attached patch with netfilter on and your rules installed. > > That seems to do the trick: I no longer see the debug messages appear. > The tunnel still works as well. > > However, the unregister_netdev problem still persists. I am pretty sure its a different problem. Quick scan shows the register/unregister state machine may be at fault. Just changed the subject because there have been threads on this topic that other people have been discussing that i havent followed. Lets see if this gets their attention - If it doesnt i will poke around. cheers, jamal From hadi@cyberus.ca Fri Dec 31 09:08:29 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 09:08:36 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVH898u007907 for ; Fri, 31 Dec 2004 09:08:29 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkQOF-0004qJ-8a for netdev@oss.sgi.com; Fri, 31 Dec 2004 12:16:39 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkQOB-00011v-V9; Fri, 31 Dec 2004 12:16:36 -0500 Subject: patch: tunnels not setting inputdev From: jamal Reply-To: hadi@cyberus.ca To: "David S. 
Miller" Cc: netdev@oss.sgi.com, Wichert Akkerman Content-Type: multipart/mixed; boundary="=-qFQt6HEKMQ274NM9KnwZ" Organization: jamalopolous Message-Id: <1104513392.1048.316.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 12:16:33 -0500 X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13305 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev --=-qFQt6HEKMQ274NM9KnwZ Content-Type: text/plain Content-Transfer-Encoding: 7bit Dave, Patch attached that has tunnels setting input dev correctly. Incorporates what ive sent to Wichert already. A lot of the stuff the tunnels do is very similar, so maybe wiser to have something like tunnel_type_trans(). cheers, jamal --=-qFQt6HEKMQ274NM9KnwZ Content-Disposition: attachment; filename=indev-patch Content-Type: text/plain; name=indev-patch; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit --- 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:00:25 1.1 +++ 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:01:05 @@ -142,6 +142,7 @@ dst_release(skb->dst); skb->dst = NULL; } + skb->input_dev = skb->dev; netif_rx(skb); return 0; } else { --- 2610-bk1/net/ipv4/ipmr.c 2004/12/31 17:01:24 1.1 +++ 2610-bk1/net/ipv4/ipmr.c 2004/12/31 17:01:54 @@ -1461,6 +1461,7 @@ ((struct net_device_stats*)reg_dev->priv)->rx_bytes += skb->len; ((struct net_device_stats*)reg_dev->priv)->rx_packets++; nf_reset(skb); + skb->input_dev = skb->dev; netif_rx(skb); dev_put(reg_dev); return 0; @@ -1516,6 +1517,7 @@ ((struct net_device_stats*)reg_dev->priv)->rx_bytes += skb->len; ((struct net_device_stats*)reg_dev->priv)->rx_packets++; skb->dst = NULL; + skb->input_dev = skb->dev; nf_reset(skb); netif_rx(skb); dev_put(reg_dev); --- 2610-bk1/net/ipv4/ip_gre.c 2004/12/31 17:02:02 1.1 +++ 2610-bk1/net/ipv4/ip_gre.c 
2004/12/31 17:03:08 @@ -655,6 +655,7 @@ skb->dst = NULL; nf_reset(skb); ipgre_ecn_decapsulate(iph, skb); + skb->input_dev = skb->dev; netif_rx(skb); read_unlock(&ipgre_lock); return(0); --- 2610-bk1/net/ipv4/ipip.c 2004/12/31 17:03:24 1.1 +++ 2610-bk1/net/ipv4/ipip.c 2004/12/31 17:03:52 @@ -497,6 +497,7 @@ skb->dst = NULL; nf_reset(skb); ipip_ecn_decapsulate(iph, skb); + skb->input_dev = skb->dev; netif_rx(skb); read_unlock(&ipip_lock); return 0; --- 2610-bk1/net/ipv6/ip6_tunnel.c 2004/12/31 17:06:12 1.1 +++ 2610-bk1/net/ipv6/ip6_tunnel.c 2004/12/31 17:07:03 @@ -547,6 +547,7 @@ ip6ip6_ecn_decapsulate(ipv6h, skb); t->stat.rx_packets++; t->stat.rx_bytes += skb->len; + skb->input_dev = skb->dev; netif_rx(skb); read_unlock(&ip6ip6_lock); return 0; --- 2610-bk1/net/ipv6/xfrm6_input.c 2004/12/31 17:07:18 1.1 +++ 2610-bk1/net/ipv6/xfrm6_input.c 2004/12/31 17:07:45 @@ -126,6 +126,7 @@ dst_release(skb->dst); skb->dst = NULL; } + skb->input_dev = skb->dev; netif_rx(skb); return -1; } else { --- a/net/ipv4/ip_output.c 2004/12/31 14:26:08 1.1 +++ b/net/ipv4/ip_output.c 2004/12/31 14:27:53 @@ -111,6 +111,7 @@ #ifdef CONFIG_NETFILTER_DEBUG nf_debug_ip_loopback_xmit(newskb); #endif + newskb->input_dev = newskb->dev; netif_rx(newskb); return 0; } --- a/net/ipv6/ip6_output.c 2004-12-24 16:33:51.000000000 -0500 +++ b/net/ipv6/ip6_output.c 2004-12-31 10:29:47.505392096 -0500 @@ -102,7 +102,7 @@ newskb->pkt_type = PACKET_LOOPBACK; newskb->ip_summed = CHECKSUM_UNNECESSARY; BUG_TRAP(newskb->dst); - + newskb->input_dev = newskb->dev; netif_rx(newskb); return 0; } --- a/net/ipv6/sit.c 2004/12/31 11:03:32 1.1 +++ b/net/ipv6/sit.c 2004/12/31 11:06:50 @@ -385,7 +385,7 @@ skb->pkt_type = PACKET_HOST; tunnel->stat.rx_packets++; tunnel->stat.rx_bytes += skb->len; - skb->dev = tunnel->dev; + skb->input_dev = skb->dev = tunnel->dev; dst_release(skb->dst); skb->dst = NULL; nf_reset(skb); --=-qFQt6HEKMQ274NM9KnwZ-- From hadi@cyberus.ca Fri Dec 31 09:24:45 2004 Received: with ECARTIS (v1.0.0; list 
netdev); Fri, 31 Dec 2004 09:24:51 -0800 (PST) Received: from mx03.cybersurf.com (mx03.cybersurf.com [209.197.145.106]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVHOOg2008747 for ; Fri, 31 Dec 2004 09:24:45 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx03.cybersurf.com with esmtp (Exim 4.30) id 1CkQe3-0007b4-Lf for netdev@oss.sgi.com; Fri, 31 Dec 2004 12:32:59 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkQdy-0002vQ-TO; Fri, 31 Dec 2004 12:32:55 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. Miller" , netdev@oss.sgi.com In-Reply-To: <1104511494.1048.303.camel@jzny.localdomain> References: <20041229000928.GH32419@postel.suug.ch> <1104282811.1090.314.camel@jzny.localdomain> <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> <1104505142.1048.262.camel@jzny.localdomain> <20041231153930.GN32419@postel.suug.ch> <1104511494.1048.303.camel@jzny.localdomain> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104514372.1047.326.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 12:32:52 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13306 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 11:44, jamal wrote: > > with the following matches: One thing i just remembered: You need to know 
the length of the matches and their data in order to store them. This is why i was earlier preaching putting them in TLVs. Some things dont need the datalen like u32 - however i suspect most will. So either need a length somewhere in the header or use TLVs for the ematches in which you can make T=kind - still have 32 bit inside body but reserve bits not used for flags for future use. Thoughts?. BTW, for deleting - should not allow deleting one of N matches. Deletion should be at selector level. Replace can be used to pretend a single match was deleted. cheers, jamal From kaber@trash.net Fri Dec 31 09:55:20 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 09:55:28 -0800 (PST) Received: from kaber.coreworks.de ([62.206.217.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVHt0H7010592 for ; Fri, 31 Dec 2004 09:55:20 -0800 Received: from eru.coreworks.de ([172.16.0.2] helo=trash.net) by kaber.coreworks.de with esmtp (Exim 4.34) id 1CkR7M-0005xP-73; Fri, 31 Dec 2004 19:03:16 +0100 Message-ID: <41D5941C.8060001@trash.net> Date: Fri, 31 Dec 2004 19:02:04 +0100 From: Patrick McHardy User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040413 Debian/1.6-5 X-Accept-Language: en MIME-Version: 1.0 To: hadi@cyberus.ca CC: "David S. Miller" , netdev@oss.sgi.com, Wichert Akkerman Subject: Re: patch: tunnels not setting inputdev References: <1104513392.1048.316.camel@jzny.localdomain> In-Reply-To: <1104513392.1048.316.camel@jzny.localdomain> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13307 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: kaber@trash.net Precedence: bulk X-list: netdev jamal wrote: > Dave, > > Patch attached that has tunnels setting input dev correctly. 
> > Incorporates what ive sent to Wichert already. > > A lot of the stuff the tunnels do is very similar, so maybe wiser to > have something like tunnel_type_trans(). > > cheers, > jamal > > > ------------------------------------------------------------------------ > > --- 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:00:25 1.1 > +++ 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:01:05 > @@ -142,6 +142,7 @@ > dst_release(skb->dst); > skb->dst = NULL; > } > + skb->input_dev = skb->dev; This is not necessary, xfrm4_input doesn't change anything regarding devices, so if it was correct before, it is still correct. For the remaining changes, why not simply set input_dev in netif_receive_skb before the call to ing_filter ? > netif_rx(skb); > return 0; > } else { Another question - why is ing_filter exported when CONFIG_NET_CLS_ACT is defined ? Nobody uses it currently outside of dev.c. Regards Patrick From tgraf@suug.ch Fri Dec 31 10:03:18 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 10:03:24 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVI2vqj011368 for ; Fri, 31 Dec 2004 10:03:18 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 4105CF; Fri, 31 Dec 2004 19:11:10 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 54F051C0EA; Fri, 31 Dec 2004 19:11:53 +0100 (CET) Date: Fri, 31 Dec 2004 19:11:53 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. 
Message-ID: <20041231181153.GP32419@postel.suug.ch> References: <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> <1104505142.1048.262.camel@jzny.localdomain> <20041231153930.GN32419@postel.suug.ch> <1104511494.1048.303.camel@jzny.localdomain> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1104514372.1047.326.camel@jzny.localdomain> <1104511494.1048.303.camel@jzny.localdomain> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13308 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * jamal <1104511494.1048.303.camel@jzny.localdomain> 2004-12-31 11:44 > We may be talking about different things - please double check. > The misunderstanding seems to be on scoping: PID is global and > the opaque ID is per match. > IOW, theres a hierachy: > A program(such as tc) installs a filter rule - we need to be able to > identify the program - this is the PID. Unique across all Linux. > A filter rule constitutes one or more matches. Different programs may > install different u32 rules. For most users its a single program - tc. > Each program that installs a rule will need to be identified by the PID. > Main purpose is so that it can decode what the second level number > means. This second level number is what i said was opaque. Its meaning > is per app. > So as an example PID 0x10 identifies application tc and opaque code > 0x20 for tc translates to mean "thats an ip match, so no need to dump > raw data - just dump it in english using ip_print()". 
> 0x10 belongs to the selector; 0x20 is per match. 0x21 could mean "this > is a TCP match with options" etc > The ematches on the hand are purely decodable via user space without > needing the opaque numbers - the kind/type serves these just fine. Agreed I just don't get the reason for the PID. tc is usually called as a new process instance when dumping. For me there are two possible options, we can either introduce a ID system where an ID is assigned to a match string in either a config file or a header file or we can have tc write id -> desc maps to a global file somewhere where id means match id + u32 handle + parent + dev. The first is probably the better way. We could extend the match header to 64bit: u16 handle u16 matchID u16 kind u8 flags u8 pad > Its a non-trivial problem. Its ok to defer it for now but keep it in the > back of your mind. Agreed. * jamal <1104514372.1047.326.camel@jzny.localdomain> 2004-12-31 12:32 > One thing i just remembered: You need to know the length of the matches > and their data in order to store them. This is why i was earlier > preaching putting them in TLVs. Some things dont need the datalen > like u32 - however i suspect most will. It might not have been obvious, but every match is indeed in its own TLV. The part I don't want to use own TLVs is to separate the match header and match data. Match header is always the same size and match data can be aligned as well. We need len attributes for things like meta indev match, nbyte and kmp though. 
An Nbyte config TLV would look like:

TCA_EMATCH
+-------------------------+
| struct tcf_ematch_hdr   |
| - - - - - - - - - - - --|
| ematch data             |
+-------------------------+

where ematch data contains nested TLVs such as

TCA_EMATCH_NBYTE_HDR      header, contains length of pattern + possibly more
TCA_EMATCH_NBYTE_START    lower limit of searching range (offset)
TCA_EMATCH_NBYTE_END      upper limit of searching range (offset)
TCA_EMATCH_NBYTE_PATTERN  searching pattern, u8 []

The length in the header is required because we can't use
L from TCA_EMATCH_NBYTE_PATTERN since it might be padded.

Same would go for meta:

TCA_EMATCH_META_HDR
TCA_EMATCH_META_LVALUE
TCA_EMATCH_META_RVALUE

If needed we can put in match specific stats via a _STATS TLV.

> So either need a length somewhere in the header or use TLVs for the
> ematches in which you can make T=kind - still have 32 bit inside body
> but reserve bits not used for flags for future use. Thoughts?.

I thought about the following ordering in the selector TLV:

T=1 generic selector header
T=2 classifier specific selector header (u32 hashing stuff goes here)
T=3 ematch 1
T=N ematch N

> BTW, for deleting - should not allow deleting one of N matches. Deletion
> should be at selector level. Replace can be used to pretend a single
> match was deleted.

Agreed.
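
A rough sketch of the layout discussed in this message, under stated assumptions: the struct names, align4(), put_pattern() and ematch_index_from_type() are hypothetical, and nothing here is an existing kernel or iproute2 API. It shows the proposed 64-bit match header, a pattern payload padded to a 4-byte boundary with its exact length kept in a separate header, and the numbering in which T=3 carries ematch 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-bit match header, fields in the order proposed above. */
struct ematch_hdr_sketch {
	uint16_t handle;   /* unique per selector, 0 = autogenerate */
	uint16_t match_id; /* opaque identifier chosen by user space */
	uint16_t kind;     /* match type, 0 = container */
	uint8_t  flags;    /* invert flag + relation bits */
	uint8_t  pad;      /* keeps the header at 8 bytes */
};

static_assert(sizeof(struct ematch_hdr_sketch) == 8,
	      "match header is meant to stay 64 bits wide");

/* Hypothetical nbyte header: records the exact, unpadded pattern length. */
struct nbyte_hdr_sketch {
	uint16_t pattern_len;
	uint16_t pad;
};

/* Round a payload length up to the next multiple of 4. */
static size_t align4(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/*
 * Selector layout assumed here: T=1 generic header, T=2 classifier
 * specific header, T>=3 ematches. Returns the 1-based ematch number,
 * or -1 if the type is not an ematch.
 */
static int ematch_index_from_type(unsigned int type)
{
	return type >= 3 ? (int)type - 2 : -1;
}

/*
 * Copy a pattern into dst, zero-padding it to a 4-byte boundary, and
 * store the exact length in hdr. Returns the padded size written, or 0
 * if the pattern does not fit.
 */
static size_t put_pattern(uint8_t *dst, size_t cap,
			  const uint8_t *pat, size_t len,
			  struct nbyte_hdr_sketch *hdr)
{
	size_t padded = align4(len);

	if (len > UINT16_MAX || padded > cap)
		return 0;

	memset(dst, 0, padded);
	memcpy(dst, pat, len);
	hdr->pattern_len = (uint16_t)len;
	hdr->pad = 0;
	return padded;
}
```

With the exact length stored in the header, a reader of the payload never has to guess how many trailing bytes are padding.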
From tgraf@suug.ch Fri Dec 31 10:10:58 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 10:11:05 -0800 (PST) Received: from b.mx.projectdream.org (eth0-0.arisu.projectdream.org [194.158.4.191]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVIAbsX012039 for ; Fri, 31 Dec 2004 10:10:58 -0800 Received: from postel.suug.ch (unknown [195.134.158.23]) (using TLSv1 with cipher EDH-RSA-DES-CBC3-SHA (168/168 bits)) (No client certificate requested) by b.mx.projectdream.org (Postfix) with ESMTP id 339A3F; Fri, 31 Dec 2004 19:18:50 +0100 (CET) Received: by postel.suug.ch (Postfix, from userid 10001) id 1AC4A1C0EA; Fri, 31 Dec 2004 19:19:33 +0100 (CET) Date: Fri, 31 Dec 2004 19:19:33 +0100 From: Thomas Graf To: jamal Cc: "David S. Miller" , netdev@oss.sgi.com Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. Message-ID: <20041231181933.GQ32419@postel.suug.ch> References: <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> <1104505142.1048.262.camel@jzny.localdomain> <20041231153930.GN32419@postel.suug.ch> <1104511494.1048.303.camel@jzny.localdomain> <20041231181153.GP32419@postel.suug.ch> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20041231181153.GP32419@postel.suug.ch> X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13309 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: tgraf@suug.ch Precedence: bulk X-list: netdev * Thomas Graf <20041231181153.GP32419@postel.suug.ch> 2004-12-31 19:11 > T=1 generic selector header > T=2 classifier specific selector header (u32 hashsing stuff goes here) > T=3 ematch 1 > T=N ematch N 
Of course this should have been: T=N ematch N-2 From andre@tomt.net Fri Dec 31 10:18:23 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 10:18:30 -0800 (PST) Received: from mx1.skjellin.no (mail1.skjellin.no [80.239.42.67]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVIHwKq012667 for ; Fri, 31 Dec 2004 10:18:23 -0800 Received: from localhost (localhost [127.0.0.1]) by mx1.skjellin.no (Postfix) with ESMTP id F3977884E1; Fri, 31 Dec 2004 19:26:32 +0100 (CET) Received: from puppen.pasop.tomt.net (gw-fe-1.pasop.tomt.net [10.255.1.1]) by mail1.skjellin.no (Postfix) with ESMTP id 780A68849A; Fri, 31 Dec 2004 19:26:31 +0100 (CET) Received: from [10.255.1.10] (slurv.pasop.tomt.net [10.255.1.10]) by puppen.pasop.tomt.net (Postfix) with ESMTP id 5B9A7228B5; Fri, 31 Dec 2004 19:26:31 +0100 (CET) Message-ID: <41D5995E.1070005@tomt.net> Date: Fri, 31 Dec 2004 19:24:30 +0100 From: Andre Tomt User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: hadi@cyberus.ca Cc: Wichert Akkerman , netdev@oss.sgi.com Subject: Re: unregister_netdev Annoyance WAS(Re: ing_filter debug messages References: <20041230160643.GD24603@wiggy.net> <1104469666.1049.231.camel@jzny.localdomain> <20041231093827.GG24603@wiggy.net> <1104491510.1047.234.camel@jzny.localdomain> <20041231131553.GA7460@wiggy.net> <1104505838.1048.273.camel@jzny.localdomain> <20041231154844.GA11511@wiggy.net> <1104511697.1048.308.camel@jzny.localdomain> In-Reply-To: <1104511697.1048.308.camel@jzny.localdomain> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at skjellin.no X-Virus-Status: Clean X-archive-position: 13310 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: andre@tomt.net Precedence: 
bulk X-list: netdev jamal wrote: > On Fri, 2004-12-31 at 10:48, Wichert Akkerman wrote: > >>Previously jamal wrote: >> >>>Try also the attached patch with netfilter on and your rules installed. >> >>That seems to do the trick: I no longer see the debug messages appear. >>The tunnel still works as well. >> >>However, the unregister_netdev problem still persists. > > > I am pretty sure its a different problem. Quick scan shows the > register/unregister state machine may be at fault. > Just changed the subject because there have been threads on this topic > that other people have been discussing that i havent followed. > Lets see if this gets their attention - If it doesnt i will poke around. Just had this happen on a router yesterday, while booting for testing a netfilter bugfix. The dot1q VLAN interfaces got stuck waiting to become free on ifdown (part of the system shutdown process in this case) There have been several such bugs in recent 2.6 kernel versions, and several refcounting leaks have been plugged since, but somehow it keeps coming back to hunt me from time to time. I'm about to disable ifdown -a on shutdown, but I find that a rather silly workaround to such a problem ;-) Kernel is 2.6.10, ipv6 and ip_conntrack loaded, running zebra/ospfd/ospf6d. 
ipip and sch/cls modules loaded, but currently not in use (and wasn't since boot) # lsmod Module Size Used by dm_mod 44668 0 sch_htb 18816 0 sch_sfq 4480 0 cls_u32 6788 0 softdog 4368 0 ip6table_filter 2048 1 ip6t_limit 1920 0 ip6t_LOG 5888 0 ip6_tables 14976 3 ip6table_filter,ip6t_limit,ip6t_LOG ip_conntrack_irc 70320 0 ip_conntrack_ftp 70960 0 iptable_filter 2944 1 ipt_limit 1920 6 ipt_REJECT 5120 2 ipt_LOG 5376 3 ipt_state 1536 252 ip_conntrack 34164 3 ip_conntrack_irc,ip_conntrack_ftp,ipt_state ip_tables 14336 5 iptable_filter,ipt_limit,ipt_REJECT,ipt_LOG,ipt_state ipip 7396 0 xfrm4_tunnel 2820 1 ipip 8021q 14728 0 8139too 17664 0 mii 3712 1 8139too crc32 3968 1 8139too ipv6 189568 23 rtc 8760 0 af_packet 14344 0 unix 19124 145 ext3 99208 6 jbd 41496 1 ext3 mbcache 5636 1 ext3 # cat /etc/sysctl.conf # # /etc/sysctl.conf - Configuration file for setting system variables # See sysctl.conf (5) for information. net.ipv4.icmp_ignore_bogus_error_responses=1 net.ipv4.icmp_echo_ignore_broadcasts=1 net.ipv4.conf.default.send_redirects=0 net.ipv4.conf.default.accept_redirects=0 net.ipv4.conf.default.secure_redirects=0 net.ipv4.conf.default.shared_media=0 net.ipv6.conf.default.forwarding=1 net.ipv6.conf.default.accept_ra=0 net.ipv6.conf.default.accept_redirects=0 net.ipv6.conf.default.autoconf=0 net.ipv6.conf.default.router_solicitations=0 net.ipv6.conf.all.forwarding=1 net.ipv6.conf.all.accept_ra=0 net.ipv6.conf.all.accept_redirects=0 net.ipv6.conf.all.autoconf=0 net.ipv6.conf.all.router_solicitations=0 kernel.panic=60 kernel.panic_on_oops=1 vm.overcommit_memory=2 net.ipv4.netfilter.ip_conntrack_max=131072 net.ipv4.netfilter.ip_conntrack_log_invalid=6 From hadi@cyberus.ca Fri Dec 31 12:03:25 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 12:03:38 -0800 (PST) Received: from mx02.cybersurf.com (mx02.cybersurf.com [209.197.145.105]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVK35Wi015385 for ; Fri, 31 Dec 2004 12:03:25 -0800 Received: from 
mail.cyberus.ca ([209.197.145.21]) by mx02.cybersurf.com with esmtp (Exim 4.30) id 1CkT7b-00007s-GU for netdev@oss.sgi.com; Fri, 31 Dec 2004 15:11:39 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkT7Y-0004Me-2b; Fri, 31 Dec 2004 15:11:36 -0500 Subject: Re: patch: tunnels not setting inputdev From: jamal Reply-To: hadi@cyberus.ca To: Patrick McHardy Cc: "David S. Miller" , netdev@oss.sgi.com, Wichert Akkerman In-Reply-To: <41D5941C.8060001@trash.net> References: <1104513392.1048.316.camel@jzny.localdomain> <41D5941C.8060001@trash.net> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104523892.1047.338.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 15:11:32 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13311 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 13:02, Patrick McHardy wrote: > jamal wrote: > > --- 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:00:25 1.1 > > +++ 2610-bk1/net/ipv4/xfrm4_input.c 2004/12/31 17:01:05 > > @@ -142,6 +142,7 @@ > > dst_release(skb->dst); > > skb->dst = NULL; > > } > > + skb->input_dev = skb->dev; > > This is not necessary, xfrm4_input doesn't change anything > regarding devices, so if it was correct before, it is still > correct. Same logic can apply to xfrm6_input then - since both are not exactly tunnels. I didnt check the code path - iam gonna assume you are right and kill those two. > For the remaining changes, why not simply set > input_dev in netif_receive_skb before the call to ing_filter ? 
> You want to be able to filter on indev at ingress - it is safer for whoever calls netif_rx() to do the setting. The packet could be looped from egress multiple times as well (redirected). > > netif_rx(skb); > > return 0; > > } else { > > Another question - why is ing_filter exported when > CONFIG_NET_CLS_ACT is defined ? Nobody uses it currently > outside of dev.c. Go ahead kill it there and in net/pkt_cls.h cheers, jamal From hadi@cyberus.ca Fri Dec 31 12:43:44 2004 Received: with ECARTIS (v1.0.0; list netdev); Fri, 31 Dec 2004 12:43:50 -0800 (PST) Received: from mx01.cybersurf.com (mx01.cybersurf.com [209.197.145.104]) by oss.sgi.com (8.13.0/8.13.0) with ESMTP id iBVKhNZI019985 for ; Fri, 31 Dec 2004 12:43:44 -0800 Received: from mail.cyberus.ca ([209.197.145.21]) by mx01.cybersurf.com with esmtp (Exim 4.30) id 1CkTkb-0002LN-PM for netdev@oss.sgi.com; Fri, 31 Dec 2004 15:51:57 -0500 Received: from cpe0030ab124d2f-cm014500000962.cpe.net.cable.rogers.com ([24.103.99.32] helo=[10.0.0.9]) by mail.cyberus.ca with esmtp (Exim 4.20) id 1CkTkY-0000ck-Dt; Fri, 31 Dec 2004 15:51:54 -0500 Subject: Re: [PKT_SCHED]: Allow using nfmark as key in U32 classifier. From: jamal Reply-To: hadi@cyberus.ca To: Thomas Graf Cc: "David S. 
Miller" , netdev@oss.sgi.com In-Reply-To: <20041231181153.GP32419@postel.suug.ch> References: <20041229124842.GI32419@postel.suug.ch> <1104330054.1089.329.camel@jzny.localdomain> <20041229150140.GJ32419@postel.suug.ch> <1104335620.1025.22.camel@jzny.localdomain> <20041230174313.GB32419@postel.suug.ch> <1104469111.1049.219.camel@jzny.localdomain> <20041231110836.GD32419@postel.suug.ch> <1104505142.1048.262.camel@jzny.localdomain> <20041231153930.GN32419@postel.suug.ch> <1104511494.1048.303.camel@jzny.localdomain> <20041231181153.GP32419@postel.suug.ch> Content-Type: text/plain Organization: jamalopolous Message-Id: <1104526311.1047.379.camel@jzny.localdomain> Mime-Version: 1.0 X-Mailer: Ximian Evolution 1.2.2 Date: 31 Dec 2004 15:51:51 -0500 Content-Transfer-Encoding: 7bit X-Virus-Scanned: ClamAV 0.80/645/Mon Dec 27 14:56:20 2004 clamav-milter version 0.80j on 127.0.0.1 X-Virus-Status: Clean X-archive-position: 13312 X-ecartis-version: Ecartis v1.0.0 Sender: netdev-bounce@oss.sgi.com Errors-to: netdev-bounce@oss.sgi.com X-original-sender: hadi@cyberus.ca Precedence: bulk X-list: netdev On Fri, 2004-12-31 at 13:11, Thomas Graf wrote: > * jamal <1104511494.1048.303.camel@jzny.localdomain> 2004-12-31 11:44 > Agreed I just don't get the reason for the PID. tc is usually called as > a new process instance when dumping. Indeed. A new instance of tc should be able to delete or understand what an old instance with different process ID installed. The P in pid here stands for "program" not "process". Looking at it from another angle it is the "owner" of that rule. I gave an example of routes as a comparison: Example: [root@jzny root]# ip r ls 10.1.0.25 via 10.0.0.90 dev eth0 proto zebra 10.0.0.0/24 dev eth0 proto kernel scope link src 10.0.0.9 127.0.0.0/8 dev lo scope link default via 10.0.0.1 dev eth0 [root@jzny root]# See the "proto" field? Same logic - if tc installed those rules that should read "tc". Same thinking for new u32 display with sel2. 
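
The two-level scheme described above (a program identifier on the selector, an opaque code on each match) can be sketched as a small lookup table. This is only an illustration: the struct and function names are hypothetical, and the values 0x10, 0x20 and 0x21 are the example numbers used earlier in this thread, not real assignments.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one opaque per-match code and what it means to its owner. */
struct opaque_code {
	uint16_t code;
	const char *desc;
};

/* Hypothetical: a program ("owner") and the codes it can decode. */
struct program_entry {
	uint16_t prog_id;
	const char *name;
	const struct opaque_code *codes;
	size_t ncodes;
};

/* Example values from the thread: 0x20 and 0x21 as understood by tc. */
static const struct opaque_code tc_codes[] = {
	{ 0x20, "ip match" },
	{ 0x21, "tcp match with options" },
};

/* Example value from the thread: program id 0x10 stands for tc. */
static const struct program_entry programs[] = {
	{ 0x10, "tc", tc_codes, sizeof(tc_codes) / sizeof(tc_codes[0]) },
};

/*
 * Return a description for (program id, opaque code), or NULL when the
 * pair is unknown, in which case a dumper would fall back to raw data.
 */
static const char *describe_match(uint16_t prog_id, uint16_t code)
{
	size_t i, j;

	for (i = 0; i < sizeof(programs) / sizeof(programs[0]); i++) {
		if (programs[i].prog_id != prog_id)
			continue;
		for (j = 0; j < programs[i].ncodes; j++) {
			if (programs[i].codes[j].code == code)
				return programs[i].codes[j].desc;
		}
	}
	return NULL;
}
```

The opaque code is only looked up inside the table of the program that owns the rule, which is why the same code can mean different things to different programs.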
More importantly though: if the u32 rule was installed by tc, then tc can understand what the code/opaque ID _in the match_ means. If I knew how tc used those opaque values, then I too could interpret them in my program when I dump, even though I didn't install the rule.

> For me there are two possible
> options, we can either introduce an ID system where an ID is assigned
> to a match string in either a config file or a header file or we can
> have tc write id -> desc maps to a global file somewhere where id
> means match id + u32 handle + parent + dev. The first is probably
> the better way. We could extend the match header to 64bit:
>
> u16 handle
> u16 matchID
> u16 kind
> u8 flags
> u8 pad

We need to know who installed the rule so we can interpret what the ID in the match is. Unless you see that also knowing the device and the parent helps towards that goal, I am afraid this will overcomplicate things. There are probably other things you could gain by having all those fields; you don't need them for this simple case.

> > Its a non-trivial problem. Its ok to defer it for now but keep it in the
> > back of your mind.
>
> Agreed.

Let's review at a future date though.

> * jamal <1104514372.1047.326.camel@jzny.localdomain> 2004-12-31 12:32
> > One thing i just remembered: You need to know the length of the matches
> > and their data in order to store them. This is why i was earlier
> > preaching putting them in TLVs. Some things dont need the datalen
> > like u32 - however i suspect most will.
>
> It might not have been obvious, but every match is indeed in its
> own TLV.

Ok, cool. To recall:

> TLV (TCA_U32_SEL2)
> +-------------------+
> | Selector header   |
> +-------------------+
> | Match 1 TLV       |
> +-------------------+
> | ...               |
> +-------------------+
> | Match N TLV       |
> +-------------------+

You may actually need those Ts enumerated as if they are array indices.
Look at the way I transfer actions using "order".

> The part I don't want to use own TLVs is to separate the
> match header and match data. Match header is always the same size
> and match data can be aligned as well. We need len attributes for
> things like meta indev match, nbyte and kmp though. An Nbyte config
> TLV would look like:
>
> TCA_EMATCH
> +-------------------------+
> | struct tcf_ematch_hdr   |
> | - - - - - - - - - - - --|
> | ematch data             |
> +-------------------------+

I was more worried about the matches not being TLVs. So this looks good.

> where ematch data contains nested TLVs such as
>
> TCA_EMATCH_NBYTE_HDR      header, contains length of pattern + possibly more
> TCA_EMATCH_NBYTE_START    lower limit of searching range (offset)
> TCA_EMATCH_NBYTE_END      upper limit of searching range (offset)
> TCA_EMATCH_NBYTE_PATTERN  searching pattern, u8 []
>
> The length in the header is required because we can't use
> L from TCA_EMATCH_NBYTE_PATTERN since it might be padded.
>

My view was that length is also a common field. There's also another reason why you want the length viewable in a dumb way:
--> you don't really want to force people to write dumpers for these ematchers (goal: keep this interface as simple as it can be); i.e. no need for any pretty formatter in the kernel. If you have a length then you can reconstruct the TCA_EMATCH easily without caring about the content.
This is the path I started taking in eactions. Refer to the notes I sent earlier on the mythical one page ematch/eaction. If someone wants funky stuff - write a classifier.

> Same would go for meta:
>
> TCA_EMATCH_META_HDR
> TCA_EMATCH_META_LVALUE
> TCA_EMATCH_META_RVALUE
>
> If needed we can put in match specific stats via a _STATS TLV.

Stats are the other thing that adds complexity to the API. If you can make it optional then that would be best - I was thinking to not even have it in.
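
A minimal sketch of the point above about a plain length field: with it, a dumper can walk and copy each TLV without understanding the payload. It is written in Python for brevity and assumes rtattr-style framing (16-bit length, 16-bit type, length covering header plus payload, payload padded to a 4-byte boundary). The names pack_tlv and walk_tlvs are made up for illustration and are not part of the proposed interface.

```python
import struct

# Assumed rtattr-style header: u16 length, u16 type, native byte order.
TLV_HDR = struct.Struct("=HH")
ALIGNTO = 4


def align(n):
    """Round n up to the next multiple of ALIGNTO."""
    return (n + ALIGNTO - 1) & ~(ALIGNTO - 1)


def pack_tlv(kind, payload):
    """Frame payload as one TLV; the length field excludes trailing padding."""
    length = TLV_HDR.size + len(payload)
    padding = b"\x00" * (align(length) - length)
    return TLV_HDR.pack(length, kind) + payload + padding


def walk_tlvs(buf):
    """Yield (type, payload) for each TLV, treating every payload as opaque."""
    off = 0
    while off + TLV_HDR.size <= len(buf):
        length, kind = TLV_HDR.unpack_from(buf, off)
        if length < TLV_HDR.size or off + length > len(buf):
            raise ValueError("truncated or malformed TLV")
        yield kind, buf[off + TLV_HDR.size:off + length]
        # Step over the padding to reach the next aligned TLV.
        off += align(length)
```

Re-emitting pack_tlv(kind, payload) for each pair that walk_tlvs yields reproduces the original buffer, which is the "reconstruct without caring about the content" property.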
> > So either need a length somewhere in the header or use TLVs for the
> > ematches in which you can make T=kind - still have 32 bit inside body
> > but reserve bits not used for flags for future use. Thoughts?
>
> I thought about the following ordering in the selector TLV:
>
> T=1 generic selector header
>
> T=2 classifier specific selector header (u32 hashing stuff goes here)
> T=3 ematch 1
> T=N ematch N

I thought we already agreed on the layout: SEL2, which may nest E/MATCH TLVs, with Sel2 not being very different from the original selector. Maybe I didn't follow.

cheers,
jamal
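
For reference, a rough sketch of the SEL2 layout as recalled in this thread: a selector header followed by nested match TLVs whose T is the match's order (1..N), each carrying the proposed 64-bit match header (handle, matchID, kind, flags, pad) in front of its data. It is Python for brevity; the TCA_U32_SEL2 value, the 8-byte selector header used in the usage note and all function names are assumptions for illustration only.

```python
import struct

TLV_HDR = struct.Struct("=HH")       # assumed rtattr-style: u16 length, u16 type
MATCH_HDR = struct.Struct("=HHHBx")  # u16 handle, u16 matchID, u16 kind, u8 flags, u8 pad
TCA_U32_SEL2 = 100                   # hypothetical attribute number


def pad4(data):
    """Pad data with zero bytes up to a 4-byte boundary."""
    return data + b"\x00" * (-len(data) % 4)


def tlv(kind, payload):
    """Frame payload as one TLV; the length field excludes trailing padding."""
    return pad4(TLV_HDR.pack(TLV_HDR.size + len(payload), kind) + payload)


def pack_match(handle, match_id, kind, flags, data=b""):
    """Fixed-size match header followed by match-specific data."""
    return MATCH_HDR.pack(handle, match_id, kind, flags) + data


def build_sel2(selector_hdr, matches):
    """Nest the selector header and order-indexed match TLVs inside SEL2."""
    body = pad4(selector_hdr)
    for order, match in enumerate(matches, start=1):
        body += tlv(order, match)  # T is the order, like an array index
    return tlv(TCA_U32_SEL2, body)
```

Usage would look like build_sel2(bytes(8), [pack_match(1, 7, 3, 0, b"ab"), pack_match(2, 8, 3, 0)]), giving one outer TLV that a receiver can split back into ordered matches by length alone.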